September 22, 2019

Transcriptomic study of Herpes simplex virus type-1 using full-length sequencing techniques

Herpes simplex virus type-1 (HSV-1) is a human pathogenic member of the Alphaherpesvirinae subfamily of herpesviruses. The HSV-1 genome is a large double-stranded DNA specifying about 85 protein coding genes. The latest surveys have demonstrated that the HSV-1 transcriptome is much more complex than previously thought. Here, we provide a long-read sequencing dataset, which was generated by using the RSII and Sequel systems from Pacific Biosciences (PacBio), as well as the MinION sequencing system from Oxford Nanopore Technologies (ONT). This dataset contains 39,096 reads of inserts (ROIs) mapped to the HSV-1 genome (X14112) in RSII sequencing, while Sequel sequencing yielded 77,851 ROIs. The MinION cDNA sequencing altogether resulted in 158,653 reads, while the direct RNA-seq produced 16,516 reads. This dataset can be utilized for the identification of novel HSV RNAs and transcript isoforms, as well as for the comparison of the quality and length of the sequencing reads derived from the currently available long-read sequencing platforms. The various library preparation approaches can also be compared with each other.


September 22, 2019

Hybrid sequencing of full-length cDNA transcripts of stems and leaves in Dendrobium officinale.

Dendrobium officinale is an extremely valuable orchid used in traditional Chinese medicine, so sought after that it has a higher market value than gold. Although the expression profiles of some genes involved in the polysaccharide synthesis have previously been investigated, little research has been carried out on their alternatively spliced isoforms in D. officinale. In addition, information regarding the translocation of sugars from leaves to stems in D. officinale also remains limited. We analyzed the polysaccharide content of D. officinale leaves and stems, and completed in-depth transcriptome sequencing of these two diverse tissue types using second-generation sequencing (SGS) and single-molecule real-time (SMRT) sequencing technology. The results of this study yielded a digital inventory of gene and mRNA isoform expressions. A comparative analysis of both transcriptomes uncovered a total of 1414 differentially expressed genes, including 844 that were up-regulated and 570 that were down-regulated in stems. Of these genes, one sugars will eventually be exported transporter (SWEET) and one sucrose transporter (SUT) are expressed to a greater extent in D. officinale stems than in leaves. Two glycosyltransferase (GT) and four cellulose synthase (Ces) genes undergo a distinct degree of alternative splicing. In the stems, the content of polysaccharides is twice as much as that in the leaves. The differentially expressed GT and transcription factor (TF) genes will be the focus of further study. The genes DoSWEET4 and DoSUT1 are significantly expressed in the stem, and are likely to be involved in sugar loading in the phloem.


September 22, 2019

A survey of the complex transcriptome from the highly polyploid sugarcane genome using full-length isoform sequencing and de novo assembly from short read sequencing.

Despite the economic importance of sugarcane in sugar and bioenergy production, there is not yet a reference genome available. Most of the sugarcane transcriptomic studies have been based on Saccharum officinarum gene indices (SoGI), expressed sequence tags (ESTs) and de novo assembled transcript contigs from short-reads; hence knowledge of the sugarcane transcriptome is limited in relation to transcript length and number of transcript isoforms. The sugarcane transcriptome was sequenced using PacBio isoform sequencing (Iso-Seq) of a pooled RNA sample derived from leaf, internode and root tissues, of different developmental stages, from 22 varieties, to explore the potential for capturing full-length transcript isoforms. A total of 107,598 unique transcript isoforms were obtained, representing about 71% of the total number of predicted sugarcane genes. The majority of this dataset (92%) matched the plant protein database, while just over 2% was novel transcripts, and over 2% was putative long non-coding RNAs. About 56% and 23% of total sequences were annotated against the gene ontology and KEGG pathway databases, respectively. Comparison with de novo contigs from Illumina RNA-Sequencing (RNA-Seq) of the internode samples from the same experiment and public databases showed that the Iso-Seq method recovered more full-length transcript isoforms, had a higher N50 and average length of the largest 1,000 proteins; whereas a greater representation of the gene content and RNA diversity was captured in RNA-Seq. Only 62% of PacBio transcript isoforms matched 67% of de novo contigs, while the non-matched proportions were attributed to the inclusion of leaf/root tissues and the normalization in PacBio, and the representation of more gene content and RNA classes in the de novo assembly, respectively. About 69% of PacBio transcript isoforms and 41% of de novo contigs aligned with the sorghum genome, indicating the high conservation of orthologs in the genic regions of the two genomes. The transcriptome dataset should contribute to improved sugarcane gene models and sugarcane protein predictions; and will serve as a reference database for analysis of transcript expression in sugarcane.


September 22, 2019

Indoleacrylic acid produced by commensal Peptostreptococcus species suppresses inflammation.

Host factors in the intestine help select for bacteria that promote health. Certain commensals can utilize mucins as an energy source, thus promoting their colonization. However, health conditions such as inflammatory bowel disease (IBD) are associated with a reduced mucus layer, potentially leading to dysbiosis associated with this disease. We characterize the capability of commensal species to cleave and transport mucin-associated monosaccharides and identify several Clostridiales members that utilize intestinal mucins. One such mucin utilizer, Peptostreptococcus russellii, reduces susceptibility to epithelial injury in mice. Several Peptostreptococcus species contain a gene cluster enabling production of the tryptophan metabolite indoleacrylic acid (IA), which promotes intestinal epithelial barrier function and mitigates inflammatory responses. Furthermore, metagenomic analysis of human stool samples reveals that the genetic capability of microbes to utilize mucins and metabolize tryptophan is diminished in IBD patients. Our data suggest that stimulating IA production could promote anti-inflammatory responses and have therapeutic benefits. Copyright © 2017 Elsevier Inc. All rights reserved.


September 22, 2019

Association of gene expression with biomass content and composition in sugarcane.

About 64% of the total aboveground biomass in sugarcane production is from the culm, of which ~90% is present in fiber and sugars. Understanding the transcriptome in the sugarcane culm, and the transcripts that are associated with the accumulation of the sugar and fiber components would facilitate the modification of biomass composition for enhanced biofuel and biomaterial production. The Sugarcane Iso-Seq Transcriptome (SUGIT) database was used as a reference for RNA-Seq analysis of variation in gene expression between young and mature tissues, and between 10 genotypes with varying fiber content. Global expression analysis suggests that each genotype displayed a unique expression pattern, possibly due to different chromosome combinations and maturation amongst these genotypes. Apart from direct sugar- and fiber-related transcripts, the differentially expressed (DE) transcripts in this study belonged to various supporting pathways that are not obviously involved in the accumulation of these major biomass components. The analysis revealed 1,649 DE transcripts between the young and mature tissues, while 555 DE transcripts were found between the low and high fiber genotypes. Of these, 151 and 23 transcripts, respectively, were directly involved in sugar and fiber accumulation. Most of the transcripts identified were up-regulated in the young tissues (2 to 22-fold, FDR adjusted p-value <0.05), which could be explained by the more active metabolism in the young tissues compared to the mature tissues in the sugarcane culm. The results of analysis of the contrasting genotypes suggest that due to the large number of genes contributing to these traits, some of the critical DE transcripts could display less than 2-fold differences in expression and might not be easily identified. However, this transcript profiling analysis identified full-length candidate transcripts and pathways that were likely to determine the differences in sugar and fiber accumulation between tissue types and contrasting genotypes.


September 22, 2019

Assessing the gene content of the megagenome: sugar pine (Pinus lambertiana).

Sugar pine (Pinus lambertiana Douglas) is within the subgenus Strobus with an estimated genome size of 31 Gbp. Transcriptomic resources are of particular interest in conifers due to the challenges presented in their megagenomes for gene identification. In this study, we present the first comprehensive survey of the P. lambertiana transcriptome through deep sequencing of a variety of tissue types to generate more than 2.5 billion short reads. Third-generation long reads generated through PacBio Iso-Seq have been included for the first time in conifers to combat the challenges associated with de novo transcriptome assembly. A technology comparison is provided here to contribute to the otherwise scarce comparisons of 2nd and 3rd generation transcriptome sequencing approaches in plant species. In addition, the transcriptome reference was essential for gene model identification and quality assessment in the parallel project responsible for sequencing and assembly of the entire genome. In this study, the transcriptomic data was also used to address some of the questions surrounding lineage-specific Dicer-like proteins in conifers. These proteins play a role in the control of transposable element proliferation and the related genome expansion in conifers. Copyright © 2016 Author et al.


September 22, 2019

Iso-Seq analysis of Nepenthes ampullaria, Nepenthes rafflesiana and Nepenthes × hookeriana for hybridisation study in pitcher plants.

Tropical pitcher plants in the species-rich Nepenthaceae family of carnivorous plants possess unique pitcher organs. Hybridisation, natural or artificial, is extensive in this family, resulting in pitchers with diverse features. The pitcher functions as a passive insect trap with digestive fluid for nutrient acquisition in nitrogen-poor habitats. This organ shows specialisation according to the dietary habit of different Nepenthes species. In this study, we performed the first single-molecule real-time isoform sequencing (Iso-Seq) analysis of full-length cDNA from Nepenthes ampullaria, which can feed on leaf litter, compared to carnivorous Nepenthes rafflesiana, and their carnivorous hybrid Nepenthes × hookeriana. This allows the comparison of pitcher transcriptomes from the parents and the hybrid to understand how hybridisation could shape the evolution of dietary habit in Nepenthes. Raw reads have been deposited in the SRA database with the accession numbers SRX2692198 (N. ampullaria), SRX2692197 (N. rafflesiana), and SRX2692196 (N. × hookeriana).


September 22, 2019

Comprehensive profiling of rhizome-associated alternative splicing and alternative polyadenylation in moso bamboo (Phyllostachys edulis).

Moso bamboo (Phyllostachys edulis) represents one of the fastest-spreading plants in the world, due in part to its well-developed rhizome system. However, the post-transcriptional mechanism for the development of the rhizome system in bamboo has not been comprehensively studied. We therefore used a combination of single-molecule long-read sequencing technology and polyadenylation site sequencing (PAS-seq) to re-annotate the bamboo genome, and identify genome-wide alternative splicing (AS) and alternative polyadenylation (APA) in the rhizome system. In total, 145 522 mapped full-length non-chimeric (FLNC) reads were analyzed, resulting in the correction of 2241 mis-annotated genes and the identification of 8091 previously unannotated loci. Notably, more than 42 280 distinct splicing isoforms were derived from 128 667 intron-containing FLNC reads, including a large number of AS events associated with rhizome systems. In addition, we characterized 25 069 polyadenylation sites from 11 450 genes, 6311 of which have APA sites. Further analysis of intronic polyadenylation revealed that LTR/Gypsy and LTR/Copia were two major transposable elements within the intronic polyadenylation region. Furthermore, this study provided a quantitative atlas of poly(A) usage. Several hundred differential poly(A) sites in the rhizome-root system were identified. Taken together, these results suggest that post-transcriptional regulation may have a vital role in the underground rhizome-root system. © 2017 The Authors. The Plant Journal © 2017 John Wiley & Sons Ltd.


September 22, 2019

Full-length transcriptome sequencing and modular organization analysis of naringin/neoeriocitrin related gene expression pattern in Drynaria roosii.

Drynaria roosii (Nakaike) is a traditional Chinese medicinal fern, known as ‘GuSuiBu’. The effective components, naringin and neoeriocitrin, share a highly similar chemical structure and medicinal function. Our HPLC-tandem mass spectrometry (MS/MS) results showed that the accumulation of naringin/neoeriocitrin depended on specific tissues or ages. However, little was known about the expression patterns of naringin/neoeriocitrin-related genes involved in their regulatory pathways. Due to a lack of basic genetic information, we applied a combination of single molecule real-time (SMRT) sequencing and second-generation sequencing (SGS) to generate the complete and full-length transcriptome of D. roosii. According to the SGS data, the differentially expressed gene (DEG)-based heat map analysis revealed that naringin/neoeriocitrin-related gene expression exhibited obvious tissue- and time-specific transcriptomic differences. Using the systems biology method of modular organization analysis, we clustered 16,472 DEGs into 17 gene modules and studied the relationships between modules and tissue/time point samples, as well as modules and naringin/neoeriocitrin contents. We found that naringin/neoeriocitrin-related DEGs were distributed in nine distinct modules, and DEGs in these modules showed significantly different patterns of transcript abundance linked to specific tissues or ages. Moreover, weighted gene co-expression network analysis (WGCNA) results further identified that PAL, 4CL and C4H, and C3H and HCT acted as the major hub genes involved in naringin and neoeriocitrin synthesis, respectively, and exhibited high co-expression with MYB- and basic helix-loop-helix (bHLH)-regulated genes. In this work, modular organization and co-expression networks elucidated the tissue and time specificity of the gene expression pattern, as well as hub genes associated with naringin/neoeriocitrin synthesis in D. roosii. Simultaneously, the comprehensive transcriptome data set provided important genetic information for further research on D. roosii.


September 22, 2019

Species groups distributed across elevational gradients reveal convergent and continuous genetic adaptation to high elevations.

Although many cases of genetic adaptations to high elevations have been reported, the processes driving these modifications and the pace of their evolution remain unclear. Many high-elevation adaptations (HEAs) are thought to have arisen in situ as populations rose with growing mountains. In contrast, most high-elevation lineages of the Qinghai-Tibetan Plateau appear to have colonized from low-elevation areas. These lineages provide an opportunity for studying recent HEAs and comparing them with ancestral low-elevation alternatives. Herein, we compare four frogs (three species of Nanorana and a close lowland relative) and four lizards (Phrynocephalus) that inhabit a range of elevations on or along the slopes of the Qinghai-Tibetan Plateau. The sequential cladogenesis of these species across an elevational gradient allows us to examine the gradual accumulation of HEA at increasing elevations. Many adaptations to high elevations appear to arise gradually and evolve continuously with increasing elevational distributions. Numerous related functions, especially DNA repair and energy metabolism pathways, exhibit rapid change and continuous positive selection with increasing elevations. Although the two studied genera are distantly related, they exhibit numerous convergent evolutionary changes, especially at the functional level. This functional convergence appears to be more extensive than convergence at the individual gene level, although we found 32 homologous genes undergoing positive selection for change in both high-elevation groups. We argue that species groups distributed along a broad elevational gradient provide a more powerful system for testing adaptations to high-elevation environments compared with studies that compare only pairs of high-elevation versus low-elevation species.


September 22, 2019

Avian transcriptomics: opportunities and challenges

Recent developments in next-generation sequencing technologies have greatly facilitated the study of whole transcriptomes in model and non-model species. Studying the transcriptome and how it changes across a variety of biological conditions has had major implications for our understanding of how the genome is regulated in different contexts, and how to interpret adaptations and the phenotype of an organism. The aim of this review is to highlight the potential of these new technologies for the study of avian transcriptomics, and to summarise how transcriptomics has been applied in ornithology. A total of 81 peer-reviewed scientific articles that used transcriptomics to answer questions within a broad range of study areas in birds are used as examples throughout the review. We further provide a quick guide to highlight the most important points which need to be taken into account when planning a transcriptomic study in birds, and discuss how researchers with little background in molecular biology can avoid potential pitfalls. Suggestions for further reading are supplied throughout. We also discuss possible future developments in the technology platforms used for ribonucleic acid sequencing. By summarising how these novel technologies can be used to answer questions that have long been asked by ornithologists, we hope to bridge the gap between traditional ornithology and genomics, and to stimulate more interdisciplinary research.


September 22, 2019

MCF-7 breast cancer cell line PacBio generated transcriptome has ~300 novel transcribed regions, un-annotated in both RefSeq and GENCODE, and absent in the liver, heart and brain transcriptomes

Illuminating the “dark” regions of the human genome remains an ongoing effort, a decade and a half after the human genome was sequenced – RefSeq and GENCODE being two of the major annotation databases. Pacific Biosciences (PacBio) has provided open access to the transcriptome of MCF-7, a breast cancer cell line that has provided significant therapeutic advancement in breast cancer research since the 1970s. PacBio sequencing generates much longer reads compared to second-generation sequencing technologies, with a trade-off of lower throughput, higher error rate and more cost per base. Here, this transcriptome was analyzed using the YeATS pipeline, with additionally introduced kmer based algorithms, reducing computational times to a few hours on a simple workstation. Out of ~300 transcripts that have no match in both RefSeq and GENCODE, ~250 are absent in the transcriptomes of the heart, liver and brain, also provided by PacBio. Also, ~200 transcripts are absent in a recent catalogue of un-annotated long non-coding RNAs from 6,503 samples (~43 Terabases of sequence data) [1], and only two present in common in an experimental workflow RACE-Seq that reported 2,556 novel transcripts [2]. ~100 transcripts have >100 amino acid open reading frames, and have the potential of being protein coding genes. ORF based annotation also identified few bacterial transcripts in the PacBio database mapped to the human genome, and one human transcript that has been annotated as bacterial in the NCBI database. The current work reiterates the under-utilization of transcriptomes for annotating genomes. It also provides new leads for investigating breast cancer by virtue of exclusively expressed transcripts not expressed in other tissues, which have the prospects of breast cancer biomarkers based on further investigations.


September 22, 2019

Full-length transcriptome sequences of ephemeral plant Arabidopsis pumila provides insight into gene expression dynamics during continuous salt stress.

Arabidopsis pumila is native to the desert region of northwest China and it is extraordinarily well adapted to the local semi-desert saline soil, thus providing a candidate plant system for environmental adaptation and salt-tolerance gene mining. However, understanding of the salt-adaptation mechanism of this species is limited because of the scarcity of genomic sequences. In the present study, the transcriptome profiles of A. pumila leaf tissues treated with 250 mM NaCl for 0, 0.5, 3, 6, 12, 24 and 48 h were analyzed using a combination of second-generation sequencing (SGS) and third-generation single-molecule real-time (SMRT) sequencing. Correction of SMRT long reads by SGS short reads resulted in 59,328 transcripts. We found 8075 differentially expressed genes (DEGs) between salt-stressed tissues and controls, of which 483 were transcription factors and 1157 were transport proteins. Most DEGs were activated within 6 h of salt stress and their expression stabilized after 48 h; the number of DEGs was greatest within 12 h of salt stress. Gene annotation and functional analyses revealed that expression of genes associated with the osmotic and ionic phases rapidly and coordinately changed during the continuous salt stress in this species, and salt stress-related categories were highly enriched among these DEGs, including oxidation-reduction, transmembrane transport, transcription factor activity and ion channel activity. Orphan, MYB, HB, bHLH, C3H, PHD, bZIP, ARF and NAC TFs were most enriched in DEGs; ABCB1, CLC-A, CPK30, KEA2, KUP9, NHX1, SOS1, VHA-A and VP1 TPs were extensively up-regulated in salt-stressed samples, suggesting that they play important roles in salt tolerance. Importantly, further experimental studies identified a mitogen-activated protein kinase (MAPK) gene MAPKKK18 as continuously up-regulated throughout salt stress, suggesting its crucial role in salt tolerance. The expression patterns of 24 salt-responsive genes obtained by quantitative real-time PCR were basically consistent with their transcript abundance changes identified by RNA-Seq. The full-length transcripts generated in this study provide a more accurate depiction of gene transcription of A. pumila. We identified potential genes involved in salt tolerance of A. pumila. These data present a genetic resource and facilitate better understanding of the salt-adaptation mechanism for ephemeral plants.


September 22, 2019

A comparative transcriptional landscape of maize and sorghum obtained by single-molecule sequencing.

Maize and sorghum are both important crops with similar overall plant architectures, but they have key differences, especially in regard to their inflorescences. To better understand these two organisms at the molecular level, we compared expression profiles of both protein-coding and noncoding transcripts in 11 matched tissues using single-molecule, long-read, deep RNA sequencing. This comparative analysis revealed large numbers of novel isoforms in both species. Evolutionarily young genes were likely to be generated in reproductive tissues and usually had fewer isoforms than old genes. We also observed similarities and differences in alternative splicing patterns and activities, both among tissues and between species. The maize subgenomes exhibited no bias in isoform generation; however, genes in the B genome were more highly expressed in pollen tissue, whereas genes in the A genome were more highly expressed in endosperm. We also identified a number of splicing events conserved between maize and sorghum. In addition, we generated comprehensive and high-resolution maps of poly(A) sites, revealing similarities and differences in mRNA cleavage between the two species. Overall, our results reveal considerable splicing and expression diversity between sorghum and maize, well beyond what was reported in previous studies, likely reflecting the differences in architecture between these two species. © 2018 Wang et al.; Published by Cold Spring Harbor Laboratory Press.


September 22, 2019

Comparative transcriptome analysis of genes involved in Na+ transport in the leaves of halophyte Halogeton glomeratus.

Compartmentalization of Na+ into vacuoles is considered to be the most critical aspect of salt tolerance in H. glomeratus, an annual, succulent halophyte. Previous analyses of the transcriptome involved in the H. glomeratus salt stress response relied on next-generation sequencing technologies that limit the capture of accurately spliced, full-length isoforms. To gain deeper insights into its salt stress response, we used the H. glomeratus Iso-Seq transcriptome database as a reference; subsequent next-generation sequencing of leaves from plants subjected to various NaCl concentrations revealed 115 upregulated and 87 downregulated differentially expressed isoforms (core DEIs). The majority of the core DEIs were involved in carbohydrate metabolism and energy production and conversion. In contrast, levels of known isoforms encoding Na+ transporters did not change significantly under salt stress. However, 16 core DEIs of unknown function were predicted to possess transmembrane domains, suggesting that these candidate isoforms could be involved in Na+ transport in H. glomeratus. These results suggest a potential means for identification of novel Na+ transporters, in addition to providing a foundation for further investigation of Na+ transport networks in halophytes. Copyright © 2018. Published by Elsevier B.V.

