Alternative splicing (AS) and fusion transcripts greatly expand transcriptome and proteome diversity. However, the reliability of these events and the extent of the underlying epigenetic mechanisms have not been adequately addressed because of uncertainty about the complete structure of mRNA. Here we combined single-molecule real-time sequencing, Illumina RNA-seq and DNA methylation data to characterize the landscape of DNA methylation in AS, fusion isoform formation and lncRNA features, and to further unveil the transcriptome complexity of the pig. Our analysis identified an unprecedented number of high-quality full-length isoforms, including over 28,127 novel isoforms from 26,881 novel genes. More than 92,000 novel AS events were detected; intron retention was the predominant AS mode, followed by exon skipping. Interestingly, we found that DNA methylation played an important role in generating various AS isoforms by regulating splicing sites, promoter regions and first exons. Furthermore, we identified a large number of fusion transcripts and novel lncRNAs, and found that DNA methylation of the promoter and gene body could regulate lncRNA expression. Our results significantly improve existing pig gene models and reveal that AS and epigenetic modification in the pig are more complex than previously thought.
Identification and analysis of glutathione S-transferase gene family in sweet potato reveal divergent GST-mediated networks in aboveground and underground tissues in response to abiotic stresses.
Sweet potato, a hexaploid species lacking a reference genome, is one of the most important crops in many developing countries, where abiotic stresses are a primary cause of reduction of crop yield. Glutathione S-transferases (GSTs) are multifunctional enzymes that play important roles in oxidative stress tolerance and cellular detoxification. A total of 42 putative full-length GST genes were identified from two local transcriptome databases and validated by molecular cloning and Sanger sequencing. Sequence and intraspecific phylogenetic analyses revealed extensive differentiation in their coding sequences and divided them into eight subfamilies. Interspecific phylogenetic and comparative analyses indicated that most examined GST paralogs might originate and diverge before the speciation of sweet potato. Results from large-scale RNA-seq and quantitative real-time PCR experiments exhibited extensive variation in gene-expression profiles across different tissues and varieties, which implied strong evolutionary divergence in their gene-expression regulation. Moreover, we performed five manipulated stress experiments and uncovered highly divergent stress-response patterns of sweet potato GST genes in aboveground and underground tissues. Our study identified a large number of sweet potato GST genes, systematically investigated their evolutionary diversification, and provides new insights into the GST-mediated stress-response mechanisms in this worldwide crop.
Dissemination of KPC-2-encoding IncX6 plasmids among multiple Enterobacteriaceae species in a single Chinese hospital.
Forty-five KPC-producing Enterobacteriaceae strains were isolated from multiple departments in a Chinese public hospital from 2014 to 2015. Genome sequencing of four representative strains, namely Proteus mirabilis GN2, Serratia marcescens GN26, Morganella morganii GN28, and Klebsiella aerogenes E20, indicated the presence of blaKPC-2-carrying IncX6 plasmids pGN2-KPC, pGN26-KPC, pGN28-KPC, and pE20-KPC in the four strains, respectively. These plasmids were genetically closely related to one another and to the only previously sequenced IncX6 plasmid, pKPC3_SZ. Each of the plasmids carried a single accessory module containing the blaKPC-2/3-carrying ΔTn6296 derivatives. The ΔTn6292 element from pGN26-KPC also contained qnrS, which was absent from all other plasmids. Overall, pKPC3_SZ-like blaKPC-carrying IncX6 plasmids were detected by PCR in 44.4% of the KPC-producing isolates, which included K. aerogenes, P. mirabilis, S. marcescens, M. morganii, Escherichia coli, and Klebsiella pneumoniae, and were obtained from six different departments of the hospital. Data presented herein provided insights into the genomic diversity and evolution of IncX6 plasmids, as well as the dissemination and epidemiology of blaKPC-carrying IncX6 plasmids among Enterobacteriaceae in a hospital setting.
Analysis of the Gli-D2 locus identifies a genetic target for simultaneously improving the breadmaking and health-related traits of common wheat.
Gliadins are a major component of wheat seed proteins. However, the complex homoeologous Gli-2 loci (Gli-A2, -B2 and -D2) that encode the α-gliadins in commercial wheat are still poorly understood. Here we analyzed the Gli-D2 locus of Xiaoyan 81 (Xy81), a winter wheat cultivar. A total of 421.091 kb of the Gli-D2 sequence was assembled from sequencing multiple bacterial artificial chromosome clones, and 10 α-gliadin genes were annotated. Comparative genomic analysis showed that Xy81 carried only eight of the α-gliadin genes of the D genome donor Aegilops tauschii, with two of them each experiencing a tandem duplication. A mutant line lacking Gli-D2 (DLGliD2) consistently exhibited better breadmaking quality and dough functionalities than its progenitor Xy81, but without penalties in other agronomic traits. It also had an elevated lysine content in the grains. Transcriptome analysis verified the lack of Gli-D2 α-gliadin gene expression in DLGliD2. Furthermore, the transcript and protein levels of protein disulfide isomerase were both upregulated in DLGliD2 grains. Consistent with this finding, DLGliD2 had increased disulfide content in the flour. Our work sheds light on the structure and function of Gli-D2 in commercial wheat, and suggests that the removal of Gli-D2 and the gliadins specified by it is likely to be useful for simultaneously enhancing the end-use and health-related traits of common wheat. Because gliadins and homologous proteins are widely present in grass species, the strategy and information reported here may be broadly useful for improving the quality traits of diverse cereal crops. © 2018 The Authors. The Plant Journal © 2018 John Wiley & Sons Ltd.
Bacterial isolate X39 was isolated from a community-acquired pneumonia patient in Beijing, China. A phylogenetic tree based on rpoB genes and average nucleotide identity data confirmed that isolate X39 belonged to Klebsiella variicola. The genome of K. variicola X39 contained one circular chromosome and nine plasmids. Comparative genomic analyses with other K. variicola isolates revealed that K. variicola X39 contained the most unique genes. Of these unique genes, many were prophages and transposases. Many virulence factors were shared between K. variicola X39 and Klebsiella pneumoniae F1. The pathogenicity of K. variicola X39 was compared with that of K. pneumoniae F1 in an abdominal infection model. The results indicated that K. variicola X39 was less virulent than typical clinical K. pneumoniae F1. The genome of K. variicola X39 also contained some genes involved in plant colonization, nitrogen fixation, and defense against oxidative stress. GFP-labeled K. variicola X39 could colonize maize as an endophytic bacterium. We concluded that K. variicola X39 was a kingdom-crossing strain.
Precision methylome characterization of Mycobacterium tuberculosis complex (MTBC) using PacBio single-molecule real-time (SMRT) technology.
Tuberculosis (TB) remains one of the most common infectious diseases caused by Mycobacterium tuberculosis complex (MTBC). To panoramically analyze MTBC’s genomic methylation, we completed the genomes of 12 MTBC strains (Mycobacterium bovis; M. bovis BCG; M. microti; M. africanum; M. tuberculosis H37Rv; H37Ra; and 6 M. tuberculosis clinical isolates) belonging to different lineages and characterized their methylomes using single-molecule real-time (SMRT) technology. We identified three (m6)A sequence motifs and their corresponding methyltransferase (MTase) genes, including the previously reported mamA and hsdM and a newly discovered mamB. We also experimentally verified the methylated motifs and functions of HsdM and MamB. Our analysis indicated that MTase activities varied among the 12 strains due to mutations/deletions. Furthermore, by measuring ‘the methylated-motif-site ratio’ and ‘the methylated-read ratio’, we explored the methylation status of each modified site and sequence read to obtain the ‘precision methylome’ of the MTBC strains, which enabled intricate analysis of MTase activity at whole-genome scale. Most unmodified sites overlapped with transcription-factor binding regions, which might protect these sites from methylation. Overall, our findings show the enormous potential of the SMRT platform for investigating the precise character of the methylome, and significantly enhance our understanding of the function of DNA MTases. © The Author(s) 2015. Published by Oxford University Press on behalf of Nucleic Acids Research.
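The two ratios named in the abstract above are not defined there, so the following Python sketch rests on stated assumptions: the methylated-motif-site ratio is taken to be the fraction of a motif's occurrences that are detected as methylated, and the methylated-read ratio is taken to be the fraction of reads covering one site that carry the modification. The motif `GATC`, the toy sequence, the methylated positions and the read counts are hypothetical placeholders rather than data from the study, and only one strand is scanned.

```python
def motif_sites(sequence, motif):
    """Return 0-based start positions of every motif occurrence (overlaps allowed)."""
    sites = []
    start = sequence.find(motif)
    while start != -1:
        sites.append(start)
        start = sequence.find(motif, start + 1)
    return sites


def methylated_site_ratio(sites, methylated_positions):
    """Assumed definition: methylated motif sites / all motif sites."""
    if not sites:
        return 0.0
    hits = sum(1 for site in sites if site in methylated_positions)
    return hits / len(sites)


def methylated_read_ratio(modified_reads, total_reads):
    """Assumed definition: reads showing the modification / reads covering the site."""
    if total_reads == 0:
        return 0.0
    return modified_reads / total_reads


# Hypothetical toy data, for illustration only.
toy_sequence = "AAGATCTTGATCGGGATCAA"
toy_motif = "GATC"            # placeholder motif, not one reported in the abstract
toy_methylated = {2, 14}      # positions assumed to be called as methylated

sites = motif_sites(toy_sequence, toy_motif)
site_ratio = methylated_site_ratio(sites, toy_methylated)
read_ratio = methylated_read_ratio(modified_reads=18, total_reads=20)
print(sites, round(site_ratio, 3), read_ratio)
```

Under these assumptions, a site ratio below 1 for a motif would flag unmodified sites, and a per-site read ratio would show how consistently a given site is modified across the sequenced molecules.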
Genomic analyses of primitive, wild and cultivated citrus provide insights into asexual reproduction.
The emergence of apomixis (the transition from sexual to asexual reproduction) is a prominent feature of modern citrus. Here we de novo sequenced and comprehensively studied the genomes of four representative citrus species. Additionally, we sequenced 100 accessions of primitive, wild and cultivated citrus. Comparative population analysis suggested that genomic regions harboring energy- and reproduction-associated genes are probably under selection in cultivated citrus. We also narrowed the genetic locus responsible for citrus polyembryony, a form of apomixis, to an 80-kb region containing 11 candidate genes. One of these, CitRWP, is expressed at higher levels in ovules of polyembryonic cultivars. We found a miniature inverted-repeat transposable element insertion in the promoter region of CitRWP that cosegregated with polyembryony. This study provides new insights into citrus apomixis and constitutes a promising resource for the mining of agriculturally important genes.
Genome sequence analysis of the naphthenic acid degrading and metal resistant bacterium Cupriavidus gilardii CR3.
Cupriavidus spp. are generally heavy metal tolerant bacteria with the ability to degrade a variety of aromatic hydrocarbon compounds, although the degradation pathways and substrate versatilities remain largely unknown. Here we studied the bacterium Cupriavidus gilardii strain CR3, which was isolated from a natural asphalt deposit and was shown to utilize naphthenic acids as a sole carbon source. Genome sequencing of C. gilardii CR3 was carried out to elucidate possible mechanisms for the naphthenic acid biodegradation. The genome of C. gilardii CR3 was composed of two circular chromosomes, chr1 and chr2, of 3,539,530 bp and 2,039,213 bp, respectively. The genome of strain CR3 encoded 4,502 putative protein-coding genes, 59 tRNA genes, and many other non-coding genes. Many genes were associated with xenobiotic biodegradation and metal resistance functions. Pathway prediction for degradation of cyclohexanecarboxylic acid, a representative naphthenic acid, suggested that naphthenic acid undergoes initial ring-cleavage, after which the ring fission products can be degraded via several plausible degradation pathways including a mechanism similar to that used for fatty acid oxidation. The final metabolic products of these pathways are unstable or volatile compounds that were not toxic to CR3. Strain CR3 was also shown to have tolerance to at least 10 heavy metals, which was mainly achieved by self-detoxification through ion efflux, metal-complexation and metal-reduction, and a powerful DNA self-repair mechanism. Our genomic analysis suggests that CR3 is well adapted to survive the harsh environment in natural asphalts containing naphthenic acids and high concentrations of heavy metals.
Genomic reconnaissance of clinical isolates of emerging human pathogen Mycobacterium abscessus reveals high evolutionary potential.
Mycobacterium abscessus (Ma) is an emerging human pathogen that causes both soft tissue infections and systemic disease. We present the first comparative whole-genome study of Ma strains isolated from patients of wide geographical origin. We found a high proportion of accessory strain-specific genes indicating an open, non-conservative pan-genome structure, and clear evidence of rapid phage-mediated evolution. Although we found fewer virulence factors in Ma compared to M. tuberculosis, our data indicated that Ma evolves rapidly and therefore should be monitored closely for the acquisition of more pathogenic traits. This comparative study provides a better understanding of Ma and forms the basis for future functional work on this important pathogen.
We report here a new type of plasmid that carries the mcr-1 gene, the pMCR-1-P3 plasmid, harbored in an Escherichia coli strain isolated from a pig farm in China. pMCR-1-P3 belongs to the IncY incompatibility group and is a phage-like plasmid that contains a large portion of phage-related sequences. The backbone of this plasmid is different from that of other mcr-1-carrying plasmids reported previously. Copyright © 2017 American Society for Microbiology.
The origin, diversification and adaptation of a major mangrove clade (Rhizophoreae) revealed by whole-genome sequencing
Mangroves invade some very marginal habitats for woody plants—at the interface between land and sea. Since mangroves anchor tropical coastal communities globally, their origin, diversification and adaptation are of scientific significance, particularly at a time of global climate change. In this study, a combination of single-molecule long reads and the more conventional short reads are generated from Rhizophora apiculata for the de novo assembly of its genome to a near chromosome level. The longest scaffold, N50 and N90 for the R. apiculata genome, are 13.3 Mb, 5.4 Mb and 1.0 Mb, respectively. Short reads for the genomes and transcriptomes of eight related species are also generated. We find that the ancestor of Rhizophoreae experienced a whole-genome duplication ~70 Myrs ago, which is followed rather quickly by colonization and species diversification. Mangroves exhibit pan-exome modifications of amino acid (AA) usage as well as unusual AA substitutions among closely related species. The usage and substitution of AAs, unique among plants surveyed, is correlated with the rapid evolution of proteins in mangroves. A small subset of these substitutions is associated with mangroves’ highly specialized traits (vivipary and red bark) thought to be adaptive in the intertidal habitats. Despite the many adaptive features, mangroves are among the least genetically diverse plants, likely the result of continual habitat turnovers caused by repeated rises and falls of sea level in the geologically recent past. Mangrove genomes thus inform about their past evolutionary success as well as portend a possibly difficult future.
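The N50 and N90 statistics quoted in the abstract above follow a standard definition: Nx is the length of the shortest scaffold in the smallest set of longest scaffolds that together cover at least x% of the assembly. A minimal Python sketch of that calculation is given here; the function name `nx` and the toy scaffold lengths are hypothetical and are not the R. apiculata data.

```python
def nx(lengths, percent):
    """Return the Nx statistic for a list of scaffold lengths.

    Scaffolds are visited from longest to shortest; the function returns the
    length of the scaffold at which the running total first reaches
    `percent` percent of the total assembly size.
    """
    if not lengths:
        raise ValueError("no scaffold lengths given")
    ordered = sorted(lengths, reverse=True)
    total = sum(ordered)
    running = 0
    for length in ordered:
        running += length
        # Integer comparison avoids floating-point rounding at the threshold.
        if running * 100 >= percent * total:
            return length
    return ordered[-1]


# Hypothetical toy assembly (arbitrary units), for illustration only.
toy_scaffolds = [100, 80, 60, 40, 20]
n50 = nx(toy_scaffolds, 50)
n90 = nx(toy_scaffolds, 90)
print(n50, n90)
```

Because N90 requires covering more of the assembly, it can never exceed N50, which is consistent with the ordering of the values reported in the abstract (5.4 Mb and 1.0 Mb).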
Bacterial endophytes with the capacity to promote plant growth and improve plant tolerance against biotic and abiotic stresses are important in agricultural practice and phytoremediation. A plant growth-promoting endophyte named Klebsiella sp. LTGPAF-6F, which was isolated from the roots of the desert plant Alhagi sparsifolia in north-west China, exhibits the ability to enhance the growth of wheat under drought stress. The complete genome sequence of this strain consists of one circular chromosome and two circular plasmids. From the genome, we identified genes related to plant growth promotion and stress tolerance, such as those involved in nitrogen fixation and in the production of indole-3-acetic acid, acetoin, 2,3-butanediol, spermidine and trehalose. This genome sequence provides a basis for understanding the beneficial interactions between LTGPAF-6F and host plants, and will facilitate its application as a biotechnological agent in agriculture. Copyright © 2017 Elsevier B.V. All rights reserved.
Bioinformatics analysis and characterization of highly efficient polyvinyl alcohol (PVA)-degrading enzymes from the novel PVA degrader Stenotrophomonas rhizophila QL-P4.
Polyvinyl alcohol (PVA) is used widely in industry, and associated environmental pollution is a serious problem. Herein, we report a novel, efficient PVA degrader, Stenotrophomonas rhizophila QL-P4, isolated from fallen leaves from virgin forest in the Qinling Mountains. The complete genome was obtained using single-molecule real-time (SMRT) technology and corrected using Illumina sequencing. Bioinformatics analysis revealed eight PVA/OVA (vinyl alcohol oligomer)-degrading genes. Of these, seven genes were predicted to be involved in the classical intracellular PVA/OVA degradation pathway, and one (BAY15_3292) was identified as a novel PVA oxidase. Five PVA/OVA-degrading enzymes were purified and characterised. Among these, BAY15_1712, a PVA dehydrogenase (PVADH), displayed high catalytic efficiency towards both PVA and OVA substrates, whereas all previously reported PVADHs have only PVA-degrading ability. Most importantly, we discovered a novel PVA oxidase (BAY15_3292) that exhibited higher PVA-degrading efficiency than the reported PVADHs. Further investigation indicated that BAY15_3292 plays a crucial role in PVA degradation in S. rhizophila QL-P4. Knocking out BAY15_3292 resulted in a significant decline in PVA-degrading activity in S. rhizophila QL-P4. Interestingly, we found that BAY15_3292 possesses exocrine activity, which distinguishes it from classical PVADHs. Transparent circle experiments further proved that BAY15_3292 greatly affects extracellular PVA degradation in S. rhizophila QL-P4. The exocrine characteristics of BAY15_3292 facilitate its potential application to PVA bioremediation. In addition, we report three new efficient secondary alcohol dehydrogenases (SADHs) with OVA-degrading ability in S. rhizophila QL-P4, compared with only one OVA-degrading SADH reported previously. Importance: With the widespread application of PVA in industry, PVA-related environmental pollution is an increasingly serious issue.
Because PVA is difficult to degrade, it accumulates in aquatic environments and causes chronic toxicity to aquatic organisms. Biodegradation of PVA, as an economical and environment-friendly method, has attracted much interest. To date, effective and applicable PVA-degrading bacteria/enzymes have not been reported. Herein, we report a new efficient PVA degrader (S. rhizophila QL-P4) that has five PVA/OVA-degrading enzymes with high catalytic efficiency, among which BAY15_1712 is the only reported PVADH with both PVA- and OVA-degrading abilities. Importantly, we discovered a novel PVA oxidase (BAY15_3292) that is not only more efficient than other reported PVA-degrading PVADHs, but also has exocrine activity. Overall, our findings provide new insight into PVA-degrading pathways in microorganisms, and suggest S. rhizophila QL-P4 and its enzymes have potential for application to PVA bioremediation to reduce or eliminate PVA-related environmental pollution. Copyright © 2017 American Society for Microbiology.
Comparative whole-genomic analysis of an ancient L2 lineage Mycobacterium tuberculosis reveals a novel phylogenetic clade and common genetic determinants of hypervirulent strains.
Background: Development of improved therapeutics against tuberculosis (TB) is hindered by an inadequate understanding of the relationship between disease severity and genetic diversity of its causative agent, Mycobacterium tuberculosis. We previously isolated a hypervirulent M. tuberculosis strain H112 from an HIV-negative patient with an aggressive disease progression from pulmonary TB to tuberculous meningitis, the most severe manifestation of tuberculosis. A human macrophage challenge experiment demonstrated that strain H112 exhibited significantly better intracellular survivability and induced a lower level of TNF-α than the reference virulent strain H37Rv and 123 other clinical isolates. Aim: The present study aimed to identify the potential genetic determinants of mycobacterial virulence that were common to strain H112 and hypervirulent M. tuberculosis strains of the same phylogenetic clade isolated in other global regions. Methods: A low-virulence M. tuberculosis strain H54, which belonged to the same phylogenetic lineage (L2) as strain H112, was selected from a collection of 115 clinical isolates. Both H112 and H54 were whole-genome-sequenced using PacBio sequencing technology. A comparative genomics approach was adopted to identify mutations present in strain H112 but absent in strain H54. Subsequently, an extensive phylogenetic analysis was conducted by including all publicly available M. tuberculosis genomes. Single-nucleotide polymorphisms (SNPs) and structural variations (SVs) common to hypervirulent strains in the global collection of genomes were considered as potential genetic determinants of hypervirulence. Results: Sequencing data revealed that both H112 and H54 were members of the same sub-lineage L2.2.1. After excluding the lineage-related mutations shared between H112 and H54, we analyzed the phylogenetic relatedness of H112 with a global collection of M. tuberculosis genomes (n = 4,338), and identified a novel phylogenetic clade in which four hypervirulent strains isolated from geographically diverse regions were clustered together. All hypervirulent strains in the clade shared 12 SNPs and 5 SVs with H112, including those affecting key virulence-associated loci, notably a deleterious SNP (rv0178 p.D150E) within the mce1 operon and an intergenic deletion (854259_854261delCC) in close proximity to phoP. Conclusion: The present study identified common genetic factors in a novel phylogenetic clade of hypervirulent M. tuberculosis. The causative role of these mutations in mycobacterial virulence should be validated in future studies.
Sequencing plant genomes is often challenging because of their complex architecture and high content of repetitive sequences. Sugarcane has one of the most complex genomes. It is highly polyploid, preserves intact homeologous chromosomes from its parental species and contains >55% repetitive sequences. Although bacterial artificial chromosome (BAC) libraries have emerged as an alternative for accessing the sugarcane genome, sequencing individual clones is laborious and expensive. Here, we present a strategy for sequencing and assembling reads produced from the DNA of pooled BAC clones. A set of 178 BAC clones, randomly sampled from the SP80-3280 sugarcane BAC library, was pooled and sequenced using the Illumina HiSeq2000 and PacBio platforms. A hybrid assembly strategy was used to generate 2,451 scaffolds comprising 19.2 Mb of assembled genome sequence. Scaffolds of ≥20 kb corresponded to 80% of the assembled sequences, and the full sequences of forty BACs were recovered in one or two contigs. Alignment of the BAC scaffolds with the chromosome sequences of sorghum showed a high degree of collinearity and conserved gene order. The alignment of the BAC scaffolds to the 10 sorghum chromosomes suggests that the genome of the SP80-3280 sugarcane variety is ~19% contracted in relation to the sorghum genome. In conclusion, our data show that sequencing pools composed of high numbers of BAC clones may help to construct a reference scaffold map of the sugarcane genome.