Menu
September 22, 2019  |  

Abiotic stresses modulate landscape of poplar transcriptome via alternative splicing differential intron retention, and isoform ratio switching.

Abiotic stresses affect plant physiology, development, growth, and alter pre-mRNA splicing. Western poplar is a model woody tree and a potential bioenergy feedstock. To investigate the extent of stress-regulated alternative splicing (AS), we conducted an in-depth survey of leaf, root, and stem xylem transcriptomes under drought, salt, or temperature stress. Analysis of approximately one billion of genome-aligned RNA-Seq reads from tissue- or stress-specific libraries revealed over fifteen millions of novel splice junctions. Transcript models supported by both RNA-Seq and single molecule isoform sequencing (Iso-Seq) data revealed a broad array of novel stress- and/or tissue-specific isoforms. Analysis of Iso-Seq data also resulted in the discovery of 15,087 novel transcribed regions of which 164 show AS. Our findings demonstrate that abiotic stresses profoundly perturb transcript isoform profiles and trigger widespread intron retention (IR) events. Stress treatments often increased or decreased retention of specific introns – a phenomenon described here as differential intron retention (DIR). Many differentially retained introns were regulated in a stress- and/or tissue-specific manner. A subset of transcripts harboring super stress-responsive DIR events showed persisting fluctuations in the degree of IR across all treatments and tissue types. To investigate coordinated dynamics of intron-containing transcripts in the study we quantified absolute copy number of isoforms of two conserved transcription factors (TFs) using Droplet Digital PCR. This case study suggests that stress treatments can be associated with coordinated switches in relative ratios between fully spliced and intron-retaining isoforms and may play a role in adjusting transcriptome to abiotic stresses.


September 22, 2019  |  

Capturing single cell genomes of active polysaccharide degraders: an unexpected contribution of Verrucomicrobia.

Microbial hydrolysis of polysaccharides is critical to ecosystem functioning and is of great interest in diverse biotechnological applications, such as biofuel production and bioremediation. Here we demonstrate the use of a new, efficient approach to recover genomes of active polysaccharide degraders from natural, complex microbial assemblages, using a combination of fluorescently labeled substrates, fluorescence-activated cell sorting, and single cell genomics. We employed this approach to analyze freshwater and coastal bacterioplankton for degraders of laminarin and xylan, two of the most abundant storage and structural polysaccharides in nature. Our results suggest that a few phylotypes of Verrucomicrobia make a considerable contribution to polysaccharide degradation, although they constituted only a minor fraction of the total microbial community. Genomic sequencing of five cells, representing the most predominant, polysaccharide-active Verrucomicrobia phylotype, revealed significant enrichment in genes encoding a wide spectrum of glycoside hydrolases, sulfatases, peptidases, carbohydrate lyases and esterases, confirming that these organisms were well equipped for the hydrolysis of diverse polysaccharides. Remarkably, this enrichment was on average higher than in the sequenced representatives of Bacteroidetes, which are frequently regarded as highly efficient biopolymer degraders. These findings shed light on the ecological roles of uncultured Verrucomicrobia and suggest specific taxa as promising bioprospecting targets. The employed method offers a powerful tool to rapidly identify and recover discrete genomes of active players in polysaccharide degradation, without the need for cultivation.


September 22, 2019  |  

De novo assembly and characterizing of the culm-derived meta-transcriptome from the polyploid sugarcane genome based on coding transcripts

Sugarcane biomass has been used for sugar, bioenergy and biomaterial production. The majority of the sugarcane biomass comes from the culm, which makes it important to understand the genetic control of biomass production in this part of the plant. A meta-transcriptome of the culm was obtained in an earlier study by using about one billion paired-end (150 bp) reads of deep RNA sequencing of samples from 20 diverse sugarcane genotypes and combining de novo assemblies from different assemblers and different settings. Although many genes could be recovered, this resulted in a large combined assembly which created the need for clustering to reduce transcript redundancy while maintaining gene content. Here, we present a comprehensive analysis of the effect of different assembly settings and clustering methods on de novo assembly, annotation and transcript profiling focusing especially on the coding transcripts from the highly polyploid sugarcane genome. The new coding sequence-based transcript clustering resulted in a better representation of transcripts compared to the earlier approach, having 121,987 contigs, which included 78,052 main and 43,935 alternative transcripts. About 73%, 67%, 61% and 10% of the transcriptome was annotated against the NCBI NR protein database, GO terms, orthologous groups and KEGG orthologies, respectively. Using this set for a differential gene expression analysis between the young and mature sugarcane culm tissues, a total of 822 transcripts were found to be differentially expressed, including key transcripts involved in sugar/fiber accumulation in sugarcane. In the context of the lack of a whole genome sequence for sugarcane, the availability of a well annotated culm-derived meta-transcriptome through deep sequencing provides useful information on coding genes specific to the sugarcane culm and will certainly contribute to understanding the process of carbon partitioning, and biomass accumulation in the sugarcane culm.


September 22, 2019  |  

Global analysis of epigenetic regulation of gene expression in response to drought stress in Sorghum.

Abiotic stresses including drought are major limiting factors of crop yields and cause significant crop losses. Acquisition of stress tolerance to abiotic stresses requires coordinated regulation of a multitude of biochemical and physiological changes, and most of these changes depend on alterations in gene expression. The goal of this work is to perform global analysis of differential regulation of gene expression and alternative splicing, and their relationship with chromatin landscape in drought sensitive and tolerant cultivars. our Iso-Seq study revealed transcriptome-wide full-length isoforms at an unprecedented scale with over 11000 novel splice isoforms. Additionally, we uncovered alternative polyadenylation sites of ~11000 expressed genes and many novel genes. Overall, Iso-Seq results greatly enhanced sorghum gene annotations that are not only useful in analyentified differentially expressed genes and splicing events that are correlated with tzing all our RNA-seq, ChIP-seq and ATAC-seq data but also serve as a great resource to the plant biology community. Our studies idhe drought-resistant phenotype. An association between alternative splicing and chromatin accessibility was also revealed. Several computational tools developed here (TAPIS and iDiffIR) have been made freely available to the research community in analyzing alternative splicing and differential alternative splicing.


September 22, 2019  |  

De novo transcriptome assembly of drought tolerant CAM plants, Agave deserti and Agave tequilana.

Agaves are succulent monocotyledonous plants native to xeric environments of North America. Because of their adaptations to their environment, including crassulacean acid metabolism (CAM, a water-efficient form of photosynthesis), and existing technologies for ethanol production, agaves have gained attention both as potential lignocellulosic bioenergy feedstocks and models for exploring plant responses to abiotic stress. However, the lack of comprehensive Agave sequence datasets limits the scope of investigations into the molecular-genetic basis of Agave traits.Here, we present comprehensive, high quality de novo transcriptome assemblies of two Agave species, A. tequilana and A. deserti, built from short-read RNA-seq data. Our analyses support completeness and accuracy of the de novo transcriptome assemblies, with each species having a minimum of approximately 35,000 protein-coding genes. Comparison of agave proteomes to those of additional plant species identifies biological functions of gene families displaying sequence divergence in agave species. Additionally, a focus on the transcriptomics of the A. deserti juvenile leaf confirms evolutionary conservation of monocotyledonous leaf physiology and development along the proximal-distal axis.Our work presents a comprehensive transcriptome resource for two Agave species and provides insight into their biology and physiology. These resources are a foundation for further investigation of agave biology and their improvement for bioenergy development.


September 22, 2019  |  

Multiplex amplicon sequencing for microbe identification in community-based culture collections.

Microbiome analysis using metagenomic sequencing has revealed a vast microbial diversity associated with plants. Identifying the molecular functions associated with microbiome-plant interaction is a significant challenge concerning the development of microbiome-derived technologies applied to agriculture. An alternative to accelerate the discovery of the microbiome benefits to plants is to construct microbial culture collections concomitant with accessing microbial community structure and abundance. However, traditional methods of isolation, cultivation, and identification of microbes are time-consuming and expensive. Here we describe a method for identification of microbes in culture collections constructed by picking colonies from primary platings that may contain single or multiple microorganisms, which we named community-based culture collections (CBC). A multiplexing 16S rRNA gene amplicon sequencing based on two-step PCR amplifications with tagged primers for plates, rows, and columns allowed the identification of the microbial composition regardless if the well contains single or multiple microorganisms. The multiplexing system enables pooling amplicons into a single tube. The sequencing performed on the PacBio platform led to recovery near-full-length 16S rRNA gene sequences allowing accurate identification of microorganism composition in each plate well. Cross-referencing with plant microbiome structure and abundance allowed the estimation of diversity and abundance representation of microorganism in the CBC.


September 22, 2019  |  

Generation and comparative analysis of full-length transcriptomes in sweetpotato and its putative wild ancestor I. trifida.

Sweetpotato [Ipomoea batatas (L.) Lam.] is one of the most important crops in many developing countries and provides a candidate source of bioenergy. However, neither high-quality reference genome nor large-scale full-length cDNA sequences for this outcrossing hexaploid are still lacking, which in turn impedes progress in research studies in sweetpotato functional genomics and molecular breeding. In this study, we apply a combination of second- and third-generation sequencing technologies to sequence full-length transcriptomes in sweetpotato and its putative ancestor I. trifida. In total, we obtained 53,861/51,184 high-quality transcripts, which includes 34,963/33,637 putative full-length cDNA sequences, from sweetpotato/I. trifida. Amongst, we identified 104,540/94,174 open reading frames, 1476/1475 transcription factors, 25,315/27,090 simple sequence repeats, 417/531 long non-coding RNAs out of the sweetpotato/I. trifida dataset. By utilizing public available genomic contigs, we analyzed the gene features (including exon number, exon size, intron number, intron size, exon-intron structure) of 33,119 and 32,793 full-length transcripts in sweetpotato and I. trifida, respectively. Furthermore, comparative analysis between our transcript datasets and other large-scale cDNA datasets from different plant species enables us assessing the quality of public datasets, estimating the genetic similarity across relative species, and surveyed the evolutionary pattern of genes. Overall, our study provided fundamental resources of large-scale full-length transcripts in sweetpotato and its putative ancestor, for the first time, and would facilitate structural, functional and comparative genomics studies in this important crop.


September 22, 2019  |  

A survey of the sorghum transcriptome using single-molecule long reads.

Alternative splicing and alternative polyadenylation (APA) of pre-mRNAs greatly contribute to transcriptome diversity, coding capacity of a genome and gene regulatory mechanisms in eukaryotes. Second-generation sequencing technologies have been extensively used to analyse transcriptomes. However, a major limitation of short-read data is that it is difficult to accurately predict full-length splice isoforms. Here we sequenced the sorghum transcriptome using Pacific Biosciences single-molecule real-time long-read isoform sequencing and developed a pipeline called TAPIS (Transcriptome Analysis Pipeline for Isoform Sequencing) to identify full-length splice isoforms and APA sites. Our analysis reveals transcriptome-wide full-length isoforms at an unprecedented scale with over 11,000 novel splice isoforms. Additionally, we uncover APA of ~11,000 expressed genes and more than 2,100 novel genes. These results greatly enhance sorghum gene annotations and aid in studying gene regulation in this important bioenergy crop. The TAPIS pipeline will serve as a useful tool to analyse Iso-Seq data from any organism.


September 22, 2019  |  

Complete genome of Cobetia marina JCM 21022T and phylogenomic analysis of the family Halomonadaceae

Cobetia marina is a model proteobacteria in researches on marine biofouling. Its taxonomic nomenclature has been revised many times over the past few decades. To better understand the role of the surface-associated lifestyle of C. marina and the phylogeny of the family Halomonadaceae, we sequenced the entire genome of C. marina JCM 21022T using single molecule real-time sequencing technology (SMRT) and performed comparative genomics and phylogenomics analyses. The circular chromosome was 4 176 300 bp with an average GC content of 62.44% and contained 3 611 predicted coding sequences, 72 tRNA genes, and 21 rRNA genes. The C. marina JCM 21022T genome contained a set of crucial genes involved in surface colonization processes. The comparative genome analysis indicated the significant diff erences between C. marina JCM 21022T and Cobetia amphilecti KMM 296 (formerly named C. marina KMM 296) resulted from sequence insertions or deletions and chromosomal recombination. Despite these diff erences, pan and core genome analysis showed similar gene functions between the two strains. The phylogenomic study of the family Halomonadaceae is reported here for the first time. We found that the relationships were well resolved among every genera tested, including Chromohalobacter, Halomonas, Cobetia, Kushneria, Zymobacter, and Halotalea.


September 22, 2019  |  

Functional genomics of lipid metabolism in the oleaginous yeast Rhodosporidium toruloides.

The basidiomycete yeast Rhodosporidium toruloides (also known as Rhodotorula toruloides) accumulates high concentrations of lipids and carotenoids from diverse carbon sources. It has great potential as a model for the cellular biology of lipid droplets and for sustainable chemical production. We developed a method for high-throughput genetics (RB-TDNAseq), using sequence-barcoded Agrobacterium tumefaciens T-DNA insertions. We identified 1,337 putative essential genes with low T-DNA insertion rates. We functionally profiled genes required for fatty acid catabolism and lipid accumulation, validating results with 35 targeted deletion strains. We identified a high-confidence set of 150 genes affecting lipid accumulation, including genes with predicted function in signaling cascades, gene expression, protein modification and vesicular trafficking, autophagy, amino acid synthesis and tRNA modification, and genes of unknown function. These results greatly advance our understanding of lipid metabolism in this oleaginous species and demonstrate a general approach for barcoded mutagenesis that should enable functional genomics in diverse fungi.


September 22, 2019  |  

A combinatorial approach to synthetic transcription factor-promoter combinations for yeast strain engineering.

Despite the need for inducible promoters in strain development efforts, the majority of engineering in Saccharomyces cerevisiae continues to rely on a few constitutively active or inducible promoters. Building on advances that use the modular nature of both transcription factors and promoter regions, we have built a library of hybrid promoters that are regulated by a synthetic transcription factor. The hybrid promoters consist of native S. cerevisiae promoters, in which the operator regions have been replaced with sequences that are recognized by the bacterial LexA DNA binding protein. Correspondingly, the synthetic transcription factor (TF) consists of the DNA binding domain of the LexA protein, fused with the human estrogen binding domain and the viral activator domain, VP16. The resulting system with a bacterial DNA binding domain avoids the transcription of native S. cerevisiae genes, and the hybrid promoters can be induced using estradiol, a compound with no detectable impact on S. cerevisiae physiology. Using combinations of one, two or three operator sequence repeats and a set of native S. cerevisiae promoters, we obtained a series of hybrid promoters that can be induced to different levels, using the same synthetic TF and a given estradiol. This set of promoters, in combination with our synthetic TF, has the potential to regulate numerous genes or pathways simultaneously, to multiple desired levels, in a single strain.© 2017 The Authors. Yeast published by John Wiley & Sons, Ltd.


September 22, 2019  |  

Deciphering lignocellulose deconstruction by the white rot fungus Irpex lacteus based on genomic and transcriptomic analyses.

Irpex lacteus is one of the most potent white rot fungi for biological pretreatment of lignocellulose for second biofuel production. To elucidate the underlying molecular mechanism involved in lignocellulose deconstruction, genomic and transcriptomic analyses were carried out for I. lacteus CD2 grown in submerged fermentation using ball-milled corn stover as the carbon source.Irpex lacteus CD2 efficiently decomposed 74.9% lignin, 86.3% cellulose, and 83.5% hemicellulose in corn stover within 9 days. Manganese peroxidases were rapidly induced, followed by accumulation of cellulase and hemicellulase. Genomic analysis revealed that I. lacteus CD2 possessed a complete set of lignocellulose-degrading enzyme system composed mainly of class II peroxidases, dye-decolorizing peroxidases, auxiliary enzymes, and 182 glycoside hydrolases. Comparative transcriptomic analysis substantiated the notion of a selection mode of degradation. These analyses also suggested that free radicals, derived either from MnP-organic acid interplay or from Fenton reaction involving Fe2+ and H2O2, could play an important role in lignocellulose degradation.The selective strategy employed by I. lacteus CD2, in combination with low extracellular glycosidases cleaving plant cell wall polysaccharides into fermentable sugars, may account for high pretreatment efficiency of I. lacteus. Our study also hints the importance of free radicals for future designing of novel, robust lignocellulose-degrading enzyme cocktails.


September 22, 2019  |  

Genomic changes associated with the evolutionary transitions of Nostoc to a plant symbiont.

Cyanobacteria belonging to the genus Nostoc comprise free-living strains and also facultative plant symbionts. Symbiotic strains can enter into symbiosis with taxonomically diverse range of host plants. Little is known about genomic changes associated with evolutionary transition of Nostoc from free-living to plant symbiont. Here, we compared the genomes derived from 11 symbiotic Nostoc strains isolated from different host plants and infer phylogenetic relationships between strains. Phylogenetic reconstructions of 89 Nostocales showed that symbiotic Nostoc strains with a broad host range, entering epiphytic and intracellular or extracellular endophytic interactions, form a monophyletic clade indicating a common evolutionary history. A polyphyletic origin was found for Nostoc strains which enter only extracellular symbioses, and inference of transfer events implied that this trait was likely acquired several times in the evolution of the Nostocales. Symbiotic Nostoc strains showed enriched functions in transport and metabolism of organic sulfur, chemotaxis and motility, as well as the uptake of phosphate, branched-chain amino acids, and ammonium. The genomes of the intracellular clade differ from that of other Nostoc strains, with a gain/enrichment of genes encoding proteins to generate l-methionine from sulfite and pathways for the degradation of the plant metabolites vanillin and vanillate, and of the macromolecule xylan present in plant cell walls. These compounds could function as C-sources for members of the intracellular clade. Molecular clock analysis indicated that the intracellular clade emerged ca. 600 Ma, suggesting that intracellular Nostoc symbioses predate the origin of land plants and the emergence of their extant hosts.


September 22, 2019  |  

Genus-wide assessment of lignocellulose utilization in the extremely thermophilic Caldicellulosiruptor by genomic, pan-genomic and metagenomic analysis

Metagenomic data from Obsidian Pool (Yellowstone National Park, USA) and 13 genome sequences were used to reassess genus-wide biodiversity for the extremely thermophilic Caldicellulosiruptor The updated core genome contains 1,401 ortholog groups (average genome size for 13 species = 2,516 genes). The pangenome, which remains open with a revised total of 3,493 ortholog groups, encodes a variety of multidomain glycoside hydrolases (GHs). These include three cellulases with GH48 domains that are colocated in the glucan degradation locus (GDL) and are specific determinants for microcrystalline cellulose utilization. Three recently sequenced species, Caldicellulosiruptor sp. strain Rt8.B8 (renamed here Caldicellulosiruptor morganii), Thermoanaerobacter cellulolyticus strain NA10 (renamed here Caldicellulosiruptor naganoensis), and Caldicellulosiruptor sp. strain Wai35.B1 (renamed here Caldicellulosiruptor danielii), degraded Avicel and lignocellulose (switchgrass). C. morganii was more efficient than Caldicellulosiruptor bescii in this regard and differed from the other 12 species examined, both based on genome content and organization and in the specific domain features of conserved GHs. Metagenomic analysis of lignocellulose-enriched samples from Obsidian Pool revealed limited new information on genus biodiversity. Enrichments yielded genomic signatures closely related to that of Caldicellulosiruptor obsidiansis, but there was also evidence for other thermophilic fermentative anaerobes (Caldanaerobacter, Fervidobacterium, Caloramator, and Clostridium). One enrichment, containing 89.8% Caldicellulosiruptor and 9.7% Caloramator, had a capacity for switchgrass solubilization comparable to that of C. bescii These results refine the known biodiversity of Caldicellulosiruptor and indicate that microcrystalline cellulose degradation at temperatures above 70°C, based on current information, is limited to certain members of this genus that produce GH48 domain-containing enzymes.IMPORTANCE The genus Caldicellulosiruptor contains the most thermophilic bacteria capable of lignocellulose deconstruction, which are promising candidates for consolidated bioprocessing for the production of biofuels and bio-based chemicals. The focus here is on the extant capability of this genus for plant biomass degradation and the extent to which this can be inferred from the core and pangenomes, based on analysis of 13 species and metagenomic sequence information from environmental samples. Key to microcrystalline hydrolysis is the content of the glucan degradation locus (GDL), a set of genes encoding glycoside hydrolases (GHs), several of which have GH48 and family 3 carbohydrate binding module domains, that function as primary cellulases. Resolving the relationship between the GDL and lignocellulose degradation will inform efforts to identify more prolific members of the genus and to develop metabolic engineering strategies to improve this characteristic. Copyright © 2018 American Society for Microbiology.


September 22, 2019  |  

A mosaic monoploid reference sequence for the highly complex genome of sugarcane.

Sugarcane (Saccharum spp.) is a major crop for sugar and bioenergy production. Its highly polyploid, aneuploid, heterozygous, and interspecific genome poses major challenges for producing a reference sequence. We exploited colinearity with sorghum to produce a BAC-based monoploid genome sequence of sugarcane. A minimum tiling path of 4660 sugarcane BAC that best covers the gene-rich part of the sorghum genome was selected based on whole-genome profiling, sequenced, and assembled in a 382-Mb single tiling path of a high-quality sequence. A total of 25,316 protein-coding gene models are predicted, 17% of which display no colinearity with their sorghum orthologs. We show that the two species, S. officinarum and S. spontaneum, involved in modern cultivars differ by their transposable elements and by a few large chromosomal rearrangements, explaining their distinct genome size and distinct basic chromosome numbers while also suggesting that polyploidization arose in both lineages after their divergence.


Talk with an expert

If you have a question, need to check the status of an order, or are interested in purchasing an instrument, we're here to help.