Menu
September 22, 2019

Association of gene expression with biomass content and composition in sugarcane.

About 64% of the total aboveground biomass in sugarcane production is from the culm, of which ~90% is present in fiber and sugars. Understanding the transcriptome in the sugarcane culm, and the transcripts that are associated with the accumulation of the sugar and fiber components would facilitate the modification of biomass composition for enhanced biofuel and biomaterial production. The Sugarcane Iso-Seq Transcriptome (SUGIT) database was used as a reference for RNA-Seq analysis of variation in gene expression between young and mature tissues, and between 10 genotypes with varying fiber content. Global expression analysis suggests that each genotype displayed a unique expression pattern, possibly due to different chromosome combinations and maturation amongst these genotypes. Apart from direct sugar- and fiber-related transcripts, the differentially expressed (DE) transcripts in this study belonged to various supporting pathways that are not obviously involved in the accumulation of these major biomass components. The analysis revealed 1,649 DE transcripts between the young and mature tissues, while 555 DE transcripts were found between the low and high fiber genotypes. Of these, 151 and 23 transcripts respectively, were directly involved in sugar and fiber accumulation. Most of the transcripts identified were up-regulated in the young tissues (2 to 22-fold, FDR adjusted p-value <0.05), which could be explained by the more active metabolism in the young tissues compared to the mature tissues in the sugarcane culm. The results of analysis of the contrasting genotypes suggests that due to the large number of genes contributing to these traits, some of the critical DE transcripts could display less than 2-fold differences in expression and might not be easily identified. However, this transcript profiling analysis identified full-length candidate transcripts and pathways that were likely to determine the differences in sugar and fiber accumulation between tissue types and contrasting genotypes.


September 22, 2019

Assessing the gene content of the megagenome: sugar pine (Pinus lambertiana).

Sugar pine (Pinus lambertiana Douglas) is within the subgenus Strobus with an estimated genome size of 31 Gbp. Transcriptomic resources are of particular interest in conifers due to the challenges presented in their megagenomes for gene identification. In this study, we present the first comprehensive survey of the P. lambertiana transcriptome through deep sequencing of a variety of tissue types to generate more than 2.5 billion short reads. Third generation, long reads generated through PacBio Iso-Seq has been included for the first time in conifers to combat the challenges associated with de novo transcriptome assembly. A technology comparison is provided here contribute to the otherwise scarce comparisons of 2nd and 3rd generation transcriptome sequencing approaches in plant species. In addition, the transcriptome reference was essential for gene model identification and quality assessment in the parallel project responsible for sequencing and assembly of the entire genome. In this study, the transcriptomic data was also used to address some of the questions surrounding lineage-specific Dicer-like proteins in conifers. These proteins play a role in the control of transposable element proliferation and the related genome expansion in conifers. Copyright © 2016 Author et al.


September 22, 2019

Long-read isoform sequencing reveals a hidden complexity of the transcriptional landscape of Herpes Simplex Virus Type 1.

In this study, we used the amplified isoform sequencing technique from Pacific Biosciences to characterize the poly(A)(+) fraction of the lytic transcriptome of the herpes simplex virus type 1 (HSV-1). Our analysis detected 34 formerly unidentified protein-coding genes, 10 non-coding RNAs, as well as 17 polycistronic and complex transcripts. This work also led us to identify many transcript isoforms, including 13 splice and 68 transcript end variants, as well as several transcript overlaps. Additionally, we determined previously unascertained transcriptional start and polyadenylation sites. We analyzed the transcriptional activity from the complementary DNA strand in five convergent HSV gene pairs with quantitative RT-PCR and detected antisense RNAs in each gene. This part of the study revealed an inverse correlation between the expressions of convergent partners. Our work adds new insights for understanding the complexity of the pervasive transcriptional overlaps by suggesting that there is a crosstalk between adjacent and distal genes through interaction between their transcription apparatuses. We also identified transcripts overlapping the HSV replication origins, which may indicate an interplay between the transcription and replication machineries. The relative abundance of HSV-1 transcripts has also been established by using a novel method based on the calculation of sequencing reads for the analysis.


September 22, 2019

Comprehensive profiling of rhizome-associated alternative splicing and alternative polyadenylation in moso bamboo (Phyllostachys edulis).

Moso bamboo (Phyllostachys edulis) represents one of the fastest-spreading plants in the world, due in part to its well-developed rhizome system. However, the post-transcriptional mechanism for the development of the rhizome system in bamboo has not been comprehensively studied. We therefore used a combination of single-molecule long-read sequencing technology and polyadenylation site sequencing (PAS-seq) to re-annotate the bamboo genome, and identify genome-wide alternative splicing (AS) and alternative polyadenylation (APA) in the rhizome system. In total, 145 522 mapped full-length non-chimeric (FLNC) reads were analyzed, resulting in the correction of 2241 mis-annotated genes and the identification of 8091 previously unannotated loci. Notably, more than 42 280 distinct splicing isoforms were derived from 128 667 intron-containing full-length FLNC reads, including a large number of AS events associated with rhizome systems. In addition, we characterized 25 069 polyadenylation sites from 11 450 genes, 6311 of which have APA sites. Further analysis of intronic polyadenylation revealed that LTR/Gypsy and LTR/Copia were two major transposable elements within the intronic polyadenylation region. Furthermore, this study provided a quantitative atlas of poly(A) usage. Several hundred differential poly(A) sites in the rhizome-root system were identified. Taken together, these results suggest that post-transcriptional regulation may potentially have a vital role in the underground rhizome-root system.© 2017 The Authors The Plant Journal © 2017 John Wiley & Sons Ltd.


September 22, 2019

Full-length transcriptome sequencing and modular organization analysis of naringin/neoeriocitrin related gene expression pattern in Drynaria roosii.

Drynaria roosii (Nakaike) is a traditional Chinese medicinal fern, known as ‘GuSuiBu’. The effective components, naringin and neoeriocitrin, share a highly similar chemical structure and medicinal function. Our HPLC-tandem mass spectrometry (MS/MS) results showed that the accumulation of naringin/neoeriocitrin depended on specific tissues or ages. However, little was known about the expression patterns of naringin/neoeriocitrin-related genes involved in their regulatory pathways. Due to a lack of basic genetic information, we applied a combination of single molecule real-time (SMRT) sequencing and second-generation sequencing (SGS) to generate the complete and full-length transcriptome of D. roosii. According to the SGS data, the differentially expressed gene (DEG)-based heat map analysis revealed that naringin/neoeriocitrin-related gene expression exhibited obvious tissue- and time-specific transcriptomic differences. Using the systems biology method of modular organization analysis, we clustered 16,472 DEGs into 17 gene modules and studied the relationships between modules and tissue/time point samples, as well as modules and naringin/neoeriocitrin contents. We found that naringin/neoeriocitrin-related DEGs distributed in nine distinct modules, and DEGs in these modules showed significantly different patterns of transcript abundance to be linked to specific tissues or ages. Moreover, weighted gene co-expression network analysis (WGCNA) results further identified that PAL, 4CL and C4H, and C3H and HCT acted as the major hub genes involved in naringin and neoeriocitrin synthesis, respectively, and exhibited high co-expression with MYB- and basic helix-leucine-helix (bHLH)-regulated genes. In this work, modular organization and co-expression networks elucidated the tissue and time specificity of the gene expression pattern, as well as hub genes associated with naringin/neoeriocitrin synthesis in D. roosii. Simultaneously, the comprehensive transcriptome data set provided important genetic information for further research on D. roosii.


September 22, 2019

Effect of Chinese rice wine sludge on the production of Chinese steamed buns

Chinese rice wine sludge (CRWS), analogous to beer yeast sludge, is the filter cake remaining after squeezing the fermentation mash of Chinese rice wine. CRWS contains high levels of protein (44.74%), nonstructural carbohydrates (37.33%), crude fiber (13.5%), and essential amino acids, which could enhance the trophic value of Chinese steamed buns. In our research, the microbiota of CRWS (mainly Saccharomyces cerevisiae and Lactobacillus sp.) was analyzed at the species level by single-molecule real-time DNA sequencing technology. Interestingly, the microbiota of CRWS was similar to that of the starter dough typically used to prepare Chinese steamed buns. Incorporation of CRWS significantly influenced the pasting properties and farinograph characteristics of the dough, which control the texture of the Chinese steamed buns, and supplementation with 5~30% CRWS caused the properties of the resulting buns to be more similar to those of northern-style steamed buns. CRWS addition also significantly enhanced the content of aroma compounds in the Chinese steamed buns.


September 22, 2019

Use of a draft genome of coffee (Coffea arabica) to identify SNPs associated with caffeine content.

Arabica coffee (Coffea arabica) has a small gene pool limiting genetic improvement. Selection for caffeine content within this gene pool would be assisted by identification of the genes controlling this important trait. Sequencing of DNA bulks from 18 genotypes with extreme high- or low-caffeine content from a population of 232 genotypes was used to identify linked polymorphisms. To obtain a reference genome, a whole genome assembly of arabica coffee (variety K7) was achieved by sequencing using short read (Illumina) and long-read (PacBio) technology. Assembly was performed using a range of assembly tools resulting in 76 409 scaffolds with a scaffold N50 of 54 544 bp and a total scaffold length of 1448 Mb. Validation of the genome assembly using different tools showed high completeness of the genome. More than 99% of transcriptome sequences mapped to the C. arabica draft genome, and 89% of BUSCOs were present. The assembled genome annotated using AUGUSTUS yielded 99 829 gene models. Using the draft arabica genome as reference in mapping and variant calling allowed the detection of 1444 nonsynonymous single nucleotide polymorphisms (SNPs) associated with caffeine content. Based on Kyoto Encyclopaedia of Genes and Genomes pathway-based analysis, 65 caffeine-associated SNPs were discovered, among which 11 SNPs were associated with genes encoding enzymes involved in the conversion of substrates, which participate in the caffeine biosynthesis pathways. This analysis demonstrated the complex genetic control of this key trait in coffee.© 2018 The Authors. Plant Biotechnology Journal published by Society for Experimental Biology and The Association of Applied Biologists and John Wiley & Sons Ltd.


September 22, 2019

MCF-7 breast cancer cell line PacBio generated transcriptome has ~300 novel transcribed regions, un-annotated in both RefSeq and GENCODE, and absent in the liver, heart and brain transcriptomes

Illuminating the “dark” regions of the human genome remains an ongoing effort, a decade and a half after the human genome was sequenced – RefSeq and GENCODE being two of the major annotation databases. Pacific Biosciences (PacBio) has provided open access to the transcriptome of MCF-7, a breast cancer cell line that has provided significant therapeutic advancement in breast cancer research since the 1970s. PacBio sequencing generates much longer reads compared to second-generation sequencing technologies, with a trade-off of lower throughput, higher error rate and more cost per base. Here, this transcriptome was analyzed using the YeATS pipeline, with additionally introduced kmer based algorithms, reducing computational times to a few hours on a simple workstation. Out of ~300 transcripts that have no match in both RefSeq and GENCODE, ~250 are absent in the transcriptomes of the heart, liver and brain, also provided by PacBio. Also, ~200 transcripts are absent in a recent catalogue of un-annotated long non-coding RNAs from 6,503 samples (~43 Terabases of sequence data) [1], and only two present in common in an experimental workflow RACE-Seq that reported 2,556 novel transcripts [2]. ~100 transcripts have >100 amino acid open reading frames, and have the potential of being protein coding genes. ORF based annotation also identified few bacterial transcripts in the PacBio database mapped to the human genome, and one human transcript that has been annotated as bacterial in the NCBI database. The current work reiterates the under-utilization of transcriptomes for annotating genomes. It also provides new leads for investigating breast cancer by virtue of exclusively expressed transcripts not expressed in other tissues, which have the prospects of breast cancer biomarkers based on further investigations.


September 22, 2019

Full-length transcriptome sequences of ephemeral plant Arabidopsis pumila provides insight into gene expression dynamics during continuous salt stress.

Arabidopsis pumila is native to the desert region of northwest China and it is extraordinarily well adapted to the local semi-desert saline soil, thus providing a candidate plant system for environmental adaptation and salt-tolerance gene mining. However, understanding of the salt-adaptation mechanism of this species is limited because of genomic sequences scarcity. In the present study, the transcriptome profiles of A. pumila leaf tissues treated with 250 mM NaCl for 0, 0.5, 3, 6, 12, 24 and 48 h were analyzed using a combination of second-generation sequencing (SGS) and third-generation single-molecule real-time (SMRT) sequencing.Correction of SMRT long reads by SGS short reads resulted in 59,328 transcripts. We found 8075 differentially expressed genes (DEGs) between salt-stressed tissues and controls, of which 483 were transcription factors and 1157 were transport proteins. Most DEGs were activated within 6 h of salt stress and their expression stabilized after 48 h; the number of DEGs was greatest within 12 h of salt stress. Gene annotation and functional analyses revealed that expression of genes associated with the osmotic and ionic phases rapidly and coordinately changed during the continuous salt stress in this species, and salt stress-related categories were highly enriched among these DEGs, including oxidation-reduction, transmembrane transport, transcription factor activity and ion channel activity. Orphan, MYB, HB, bHLH, C3H, PHD, bZIP, ARF and NAC TFs were most enriched in DEGs; ABCB1, CLC-A, CPK30, KEA2, KUP9, NHX1, SOS1, VHA-A and VP1 TPs were extensively up-regulated in salt-stressed samples, suggesting that they play important roles in slat tolerance. Importantly, further experimental studies identified a mitogen-activated protein kinase (MAPK) gene MAPKKK18 as continuously up-regulated throughout salt stress, suggesting its crucial role in salt tolerance. The expression patterns of the salt-responsive 24 genes resulted from quantitative real-time PCR were basically consistent with their transcript abundance changes identified by RNA-Seq.The full-length transcripts generated in this study provide a more accurate depiction of gene transcription of A. pumila. We identified potential genes involved in salt tolerance of A. pumila. These data present a genetic resource and facilitate better understanding of salt-adaptation mechanism for ephemeral plants.


September 22, 2019

Comparative genome and transcriptome analysis reveals distinctive surface characteristics and unique physiological potentials of Pseudomonas aeruginosa ATCC 27853.

Pseudomonas aeruginosa ATCC 27853 was isolated from a hospital blood specimen in 1971 and has been widely used as a model strain to survey antibiotics susceptibilities, biofilm development, and metabolic activities of Pseudomonas spp.. Although four draft genomes of P. aeruginosa ATCC 27853 have been sequenced, the complete genome of this strain is still lacking, hindering a comprehensive understanding of its physiology and functional genome.Here we sequenced and assembled the complete genome of P. aeruginosa ATCC 27853 using the Pacific Biosciences SMRT (PacBio) technology and Illumina sequencing platform. We found that accessory genes of ATCC 27853 including prophages and genomic islands (GIs) mainly contribute to the difference between P. aeruginosa ATCC 27853 and other P. aeruginosa strains. Seven prophages were identified within the genome of P. aeruginosa ATCC 27853. Of the predicted 25 GIs, three contain genes that encode monoxoygenases, dioxygenases and hydrolases that could be involved in the metabolism of aromatic compounds. Surveying virulence-related genes revealed that a series of genes that encode the B-band O-antigen of LPS are lacking in ATCC 27853. Distinctive SNPs in genes of cellular adhesion proteins such as type IV pili and flagella biosynthesis were also observed in this strain. Colony morphology analysis confirmed an enhanced biofilm formation capability of ATCC 27853 on solid agar surface compared to Pseudomonas aeruginosa PAO1. We then performed transcriptome analysis of ATCC 27853 and PAO1 using RNA-seq and compared the expression of orthologous genes to understand the functional genome and the genomic details underlying the distinctive colony morphogenesis. These analyses revealed an increased expression of genes involved in cellular adhesion and biofilm maturation such as type IV pili, exopolysaccharide and electron transport chain components in ATCC 27853 compared with PAO1. In addition, distinctive expression profiles of the virulence genes lecA, lasB, quorum sensing regulators LasI/R, and the type I, III and VI secretion systems were observed in the two strains.The complete genome sequence of P. aeruginosa ATCC 27853 reveals the comprehensive genetic background of the strain, and provides genetic basis for several interesting findings about the functions of surface associated proteins, prophages, and genomic islands. Comparative transcriptome analysis of P. aeruginosa ATCC 27853 and PAO1 revealed several classes of differentially expressed genes in the two strains, underlying the genetic and molecular details of several known and yet to be explored morphological and physiological potentials of P. aeruginosa ATCC 27853.


September 22, 2019

A comparative transcriptional landscape of maize and sorghum obtained by single-molecule sequencing.

Maize and sorghum are both important crops with similar overall plant architectures, but they have key differences, especially in regard to their inflorescences. To better understand these two organisms at the molecular level, we compared expression profiles of both protein-coding and noncoding transcripts in 11 matched tissues using single-molecule, long-read, deep RNA sequencing. This comparative analysis revealed large numbers of novel isoforms in both species. Evolutionarily young genes were likely to be generated in reproductive tissues and usually had fewer isoforms than old genes. We also observed similarities and differences in alternative splicing patterns and activities, both among tissues and between species. The maize subgenomes exhibited no bias in isoform generation; however, genes in the B genome were more highly expressed in pollen tissue, whereas genes in the A genome were more highly expressed in endosperm. We also identified a number of splicing events conserved between maize and sorghum. In addition, we generated comprehensive and high-resolution maps of poly(A) sites, revealing similarities and differences in mRNA cleavage between the two species. Overall, our results reveal considerable splicing and expression diversity between sorghum and maize, well beyond what was reported in previous studies, likely reflecting the differences in architecture between these two species.© 2018 Wang et al.; Published by Cold Spring Harbor Laboratory Press.


September 22, 2019

Young genes have distinct gene structure, epigenetic profiles, and transcriptional regulation.

Species-specific, new, or “orphan” genes account for 10%-30% of eukaryotic genomes. Although initially considered to have limited function, an increasing number of orphan genes have been shown to provide important phenotypic innovation. How new genes acquire regulatory sequences for proper temporal and spatial expression is unknown. Orphan gene regulation may rely in part on origination in open chromatin adjacent to preexisting promoters, although this has not yet been assessed by genome-wide analysis of chromatin states. Here, we combine taxon-rich nematode phylogenies with Iso-Seq, RNA-seq, ChIP-seq, and ATAC-seq to identify the gene structure and epigenetic signature of orphan genes in the satellite model nematode Pristionchus pacificus Consistent with previous findings, we find young genes are shorter, contain fewer exons, and are on average less strongly expressed than older genes. However, the subset of orphan genes that are expressed exhibit distinct chromatin states from similarly expressed conserved genes. Orphan gene transcription is determined by a lack of repressive histone modifications, confirming long-held hypotheses that open chromatin is important for new gene formation. Yet orphan gene start sites more closely resemble enhancers defined by H3K4me1, H3K27ac, and ATAC-seq peaks, in contrast to conserved genes that exhibit traditional promoters defined by H3K4me3 and H3K27ac. Although the majority of orphan genes are located on chromosome arms that contain high recombination rates and repressive histone marks, strongly expressed orphan genes are more randomly distributed. Our results support a model of new gene origination by rare integration into open chromatin near enhancers.© 2018 Werner et al.; Published by Cold Spring Harbor Laboratory Press.


September 22, 2019

A community-based culture collection for targeting novel plant growth-promoting bacteria from the sugarcane microbiome.

The soil-plant ecosystem harbors an immense microbial diversity that challenges investigative approaches to study traits underlying plant-microbe association. Studies solely based on culture-dependent techniques have overlooked most microbial diversity. Here we describe the concomitant use of culture-dependent and -independent techniques to target plant-beneficial microbial groups from the sugarcane microbiome. The community-based culture collection (CBC) approach was used to access microbes from roots and stalks. The CBC recovered 399 unique bacteria representing 15.9% of the rhizosphere core microbiome and 61.6-65.3% of the endophytic core microbiomes of stalks. By cross-referencing the CBC (culture-dependent) with the sugarcane microbiome profile (culture-independent), we designed a synthetic community comprised of naturally occurring highly abundant bacterial groups from roots and stalks, most of which has been poorly explored so far. We then used maize as a model to probe the abundance-based synthetic inoculant. We show that when inoculated in maize plants, members of the synthetic community efficiently colonize plant organs, displace the natural microbiota and dominate at 53.9% of the rhizosphere microbial abundance. As a result, inoculated plants increased biomass by 3.4-fold as compared to uninoculated plants. The results demonstrate that abundance-based synthetic inoculants can be successfully applied to recover beneficial plant microbes from plant microbiota.


September 22, 2019

The state of play in higher eukaryote gene annotation.

A genome sequence is worthless if it cannot be deciphered; therefore, efforts to describe – or ‘annotate’ – genes began as soon as DNA sequences became available. Whereas early work focused on individual protein-coding genes, the modern genomic ocean is a complex maelstrom of alternative splicing, non-coding transcription and pseudogenes. Scientists – from clinicians to evolutionary biologists – need to navigate these waters, and this has led to the design of high-throughput, computationally driven annotation projects. The catalogues that are being produced are key resources for genome exploration, especially as they become integrated with expression, epigenomic and variation data sets. Their creation, however, remains challenging.


September 22, 2019

Analyses of intestinal microbiota: culture versus sequencing.

Analyzing human as well as animal microbiota composition has gained growing interest because structural components and metabolites of microorganisms fundamentally influence all aspects of host physiology. Originally dominated by culture-dependent methods for exploring these ecosystems, the development of molecular techniques such as high throughput sequencing has dramatically increased our knowledge. Because many studies of the microbiota are based on the bacterial 16S ribosomal RNA (rRNA) gene targets, they can, at least in principle, be compared to determine the role of the microbiome composition for developmental processes, host metabolism, and physiology as well as different diseases. In our review, we will summarize differences and pitfalls in current experimental protocols, including all steps from nucleic acid extraction to bioinformatical analysis which may produce variation that outweighs subtle biological differences. Future developments, such as integration of metabolomic, transcriptomic, and metagenomic data sets and standardization of the procedures, will be discussed. © The Author 2015. Published by Oxford University Press on behalf of the Institute for Laboratory Animal Research. All rights reserved. For permissions, please email: journals.permissions@oup.com.


Talk with an expert

If you have a question, need to check the status of an order, or are interested in purchasing an instrument, we're here to help.