September 22, 2019  |  

A comparative transcriptional landscape of maize and sorghum obtained by single-molecule sequencing.

Maize and sorghum are both important crops with similar overall plant architectures, but they have key differences, especially in regard to their inflorescences. To better understand these two organisms at the molecular level, we compared expression profiles of both protein-coding and noncoding transcripts in 11 matched tissues using single-molecule, long-read, deep RNA sequencing. This comparative analysis revealed large numbers of novel isoforms in both species. Evolutionarily young genes were likely to be generated in reproductive tissues and usually had fewer isoforms than old genes. We also observed similarities and differences in alternative splicing patterns and activities, both among tissues and between species. The maize subgenomes exhibited no bias in isoform generation; however, genes in the B genome were more highly expressed in pollen tissue, whereas genes in the A genome were more highly expressed in endosperm. We also identified a number of splicing events conserved between maize and sorghum. In addition, we generated comprehensive and high-resolution maps of poly(A) sites, revealing similarities and differences in mRNA cleavage between the two species. Overall, our results reveal considerable splicing and expression diversity between sorghum and maize, well beyond what was reported in previous studies, likely reflecting the differences in architecture between these two species. © 2018 Wang et al.; Published by Cold Spring Harbor Laboratory Press.


September 22, 2019  |  

Transcriptional fates of human-specific segmental duplications in brain.

Despite the importance of duplicate genes for evolutionary adaptation, accurate gene annotation is often incomplete, incorrect, or lacking in regions of segmental duplication. We developed an approach combining long-read sequencing and hybridization capture to yield full-length transcript information and confidently distinguish between nearly identical genes/paralogs. We used biotinylated probes to enrich for full-length cDNA from duplicated regions, which were then amplified, size-fractionated, and sequenced using single-molecule, long-read sequencing technology, permitting us to distinguish between highly identical genes by virtue of multiple paralogous sequence variants. We examined 19 gene families as expressed in developing and adult human brain, selected for their high sequence identity (average >99%) and overlap with human-specific segmental duplications (SDs). We characterized the transcriptional differences between related paralogs to better understand the birth-death process of duplicate genes and particularly how the process leads to gene innovation. In 48% of the cases, we find that the expressed duplicates have changed substantially from their ancestral models due to novel sites of transcription initiation, splicing, and polyadenylation, as well as fusion transcripts that connect duplication-derived exons with neighboring genes. We detect unannotated open reading frames in genes currently annotated as pseudogenes, while relegating other duplicates to nonfunctional status. Our method significantly improves gene annotation, specifically defining full-length transcripts, isoforms, and open reading frames for new genes in highly identical SDs. The approach will be more broadly applicable to genes in structurally complex regions of other genomes where the duplication process creates novel genes important for adaptive traits. © 2018 Dougherty et al.; Published by Cold Spring Harbor Laboratory Press.


September 22, 2019  |  

The state of play in higher eukaryote gene annotation.

A genome sequence is worthless if it cannot be deciphered; therefore, efforts to describe – or ‘annotate’ – genes began as soon as DNA sequences became available. Whereas early work focused on individual protein-coding genes, the modern genomic ocean is a complex maelstrom of alternative splicing, non-coding transcription and pseudogenes. Scientists – from clinicians to evolutionary biologists – need to navigate these waters, and this has led to the design of high-throughput, computationally driven annotation projects. The catalogues that are being produced are key resources for genome exploration, especially as they become integrated with expression, epigenomic and variation data sets. Their creation, however, remains challenging.


September 22, 2019  |  

A transcriptome atlas of rabbit revealed by PacBio single-molecule long-read sequencing.

It is widely acknowledged that transcriptional diversity contributes substantially to biological regulation in eukaryotes. Since the advent of second-generation sequencing technologies, a large number of RNA sequencing studies have considerably improved our understanding of transcriptome complexity. However, obtaining full-length transcripts remains a major challenge because of the difficulties of short-read-based assembly. In the present study, we employ PacBio single-molecule long-read sequencing technology for whole-transcriptome profiling in rabbit (Oryctolagus cuniculus). In total, we obtain 36,186 high-confidence transcripts from 14,474 genic loci, among which more than 23% of genic loci and 66% of isoforms have not yet been annotated in the current reference genome. Furthermore, about 17% of transcripts are computationally predicted to be non-coding RNAs. Up to 24,797 alternative splicing (AS) events and 11,184 alternative polyadenylation (APA) events are detected within this de novo constructed transcriptome. The results provide a comprehensive set of reference transcripts and hence contribute to the improved annotation of the rabbit genome.


September 22, 2019  |  

Single-cell mRNA isoform diversity in the mouse brain.

Alternative mRNA isoform usage is an important source of protein diversity in mammalian cells. This phenomenon has been extensively studied in bulk tissues; however, it remains unclear how this diversity is reflected in single cells. Here we use long-read sequencing technology combined with unique molecular identifiers (UMIs) to reveal patterns of alternative full-length isoform expression in single cells from the mouse brain. We found a surprising amount of isoform diversity, even after applying a conservative definition of what constitutes an isoform. Genes tend to have one or a few isoforms highly expressed and a larger number of isoforms expressed at a low level. However, for many genes, nearly every sequenced mRNA molecule was unique, and many events affected coding regions, suggesting previously unknown protein diversity in single cells. Exon junctions in coding regions were less prone to splicing errors than those in non-coding regions, indicating purifying selection on splice donor and acceptor efficiency. Our findings indicate that mRNA isoform diversity is an important source of biological variability in single cells as well.


September 22, 2019  |  

High-quality reference transcript datasets hold the key to transcript-specific RNA-sequencing analysis in plants.

Re-programming of the transcriptome involves both transcription and alternative splicing (AS). Some genes are regulated only at the AS level with no change in expression at the gene level. AS data must be incorporated as an essential aspect of the regulation of gene expression. RNA-sequencing (RNA-seq) can deliver both transcriptional and AS information, but accurate methods to analyse the added complexity in RNA-seq data are needed. The construction of a comprehensive reference transcript dataset (RTD) for a specific plant species, variety or accession, from all available sequence data, will immediately allow more robust analysis of RNA-seq data. RTDs will continually evolve and improve, a process that will be more efficient if resources across a community are shared and pooled. © 2016 The Authors. New Phytologist © 2016 New Phytologist Trust.


September 22, 2019  |  

The expressed portion of the barley genome

In this chapter, we refer to the expressed portion of the barley genome as the relatively small fraction of the total cellular DNA that either contains the genes that ultimately produce proteins, or that directly/indirectly controls the level, location and/or timing of when these genes are expressed and proteins are produced. We start by describing the dynamics of tissue and time-dependent gene expression and how common patterns across multiple samples can provide clues about gene networks involved in common biological processes. We then describe some of the complexities of how a single mRNA template can be differentially processed by alternative splicing to generate multiple different proteins or provide a mechanism to regulate the amount of functional gene product in a cell at a given point in time. We extend our analysis, using a number of biological examples, to address how diverse families of small non-coding microRNAs specifically regulate gene expression, and complete our appraisal by looking at the physical/molecular environment around genes that can result in either the promotion or repression of gene expression. We conclude by assessing some of the issues that remain around our ability to fully exploit the depth and power of current approaches for analysing gene expression and propose improvements that could be made using new but available sequencing and bioinformatics technologies.


September 22, 2019  |  

Transcriptome characterization of moso bamboo (Phyllostachys edulis) seedlings in response to exogenous gibberellin applications.

Moso bamboo (Phyllostachys edulis) is a well-known bamboo species of high economic value in the textile industry due to its rapid growth. Phytohormones, which are master regulators of growth and development, serve as important endogenous signals. However, the mechanisms through which phytohormones regulate growth in moso bamboo remain unknown to date. Here, we report that exogenous gibberellin (GA) applications resulted in significantly increased internode length and lignin condensation. Transcriptome sequencing revealed that photosynthesis-related genes were enriched in the GA-repressed gene class, which was consistent with the decrease in leaf chlorophyll concentrations and the lower rate of photosynthesis following GA treatment. Exogenous GA applications on seedlings are relatively easy to perform; thus, we used 4-week-old whole bamboo seedlings for GA treatment followed by high-throughput sequencing. In this study, we identified 932 cis-natural antisense transcripts (cis-NATs) and 22,196 alternative splicing (AS) events in total. Among them, 42 cis-NATs and 442 AS events were differentially expressed upon exposure to exogenous GA3, suggesting that post-transcriptional regulation might also be involved in the GA3 response. Targets of the differentially expressed cis-NATs included genes involved in hormone receptor, photosynthesis and cell wall biogenesis. For example, LAC4 and its corresponding cis-NATs were GA3-induced, and may be involved in the accumulation of lignin, thus affecting cell wall composition. This study provides novel insights illustrating how GA alters post-transcriptional regulation and will shed light on the underlying mechanism of growth modulated by GA in moso bamboo.


September 22, 2019  |  

Analysis of transcripts and splice isoforms in red clover (Trifolium pratense L.) by single-molecule long-read sequencing.

Red clover (Trifolium pratense L.) is an important cool-season legume and the most widely planted forage legume after alfalfa. Although a draft genome sequence has already been published, the sequences and complete structures of mRNA transcripts remain unclear, which limits further research on red clover. In this study, the red clover transcriptome was sequenced using single-molecule long-read sequencing to identify full-length splice isoforms, and 29,730 novel isoforms from known genes and 2194 novel isoforms from novel genes were identified. A total of 5492 alternative splicing events were identified, and the majority of alternative splicing events in red clover were classified as intron retention. In addition, of the 15,229 genes detected by SMRT, 8719, including 186,517 transcripts, have at least one poly(A) site. Furthermore, we identified 4333 long non-coding RNAs and 3762 fusion transcripts. We analyzed the full-length transcriptome of red clover with PacBio SMRT. These new findings provide important information for improving the red clover draft genome annotation and for fully characterizing the red clover transcriptome.


September 22, 2019  |  

Transcriptome-wide survey of pseudorabies virus using next- and third-generation sequencing platforms.

Pseudorabies virus (PRV) is an alphaherpesvirus of swine. PRV has a large double-stranded DNA genome and, as the latest investigations have revealed, a very complex transcriptome. Here, we present a large RNA-Seq dataset, derived from both short- and long-read sequencing. The dataset contains 1.3 million 100 bp paired-end reads that were obtained from the Illumina random-primed libraries, as well as 10 million 50 bp single-end reads generated by the Illumina polyA-seq. The Pacific Biosciences RSII non-amplified method yielded 57,021 reads of inserts (ROIs) aligned to the viral genome, the amplified method resulted in 158,396 PRV-specific ROIs, while we obtained 12,555 ROIs using the Sequel platform. The Oxford Nanopore MinION device generated 44,006 reads using the regular cDNA-sequencing method, whereas 29,832 and 120,394 reads were produced by using the direct RNA-sequencing and the Cap-selection protocols, respectively. The raw reads were aligned to the PRV reference genome (KJ717942.1). The dataset can be used to compare different sequencing approaches and library preparation methods, as well as to validate and test bioinformatic pipelines.


September 22, 2019  |  

Analyses of alternative polyadenylation: from old school biochemistry to high-throughput technologies.

Alternations in usage of polyadenylation sites during transcription termination yield transcript isoforms from a gene. Recent findings of transcriptome-wide alternative polyadenylation (APA) as a molecular response to changes in biology position APA not only as a molecular event of early transcriptional termination but also as a cellular regulatory step affecting various biological pathways. With the development of high-throughput profiling technologies at single-nucleotide resolution and their application to the 3′-end of mRNAs, dynamics in the landscape of mRNA 3′-ends are measurable at a global scale. In this review, methods and technologies that have been adopted to study APA events are discussed. In addition, various bioinformatics algorithms for APA isoform analysis using publicly available RNA-seq datasets are introduced. [BMB Reports 2017; 50(4): 201-207].


September 22, 2019  |  

Global analysis of epigenetic regulation of gene expression in response to drought stress in Sorghum.

Abiotic stresses including drought are major limiting factors of crop yields and cause significant crop losses. Acquisition of tolerance to abiotic stresses requires coordinated regulation of a multitude of biochemical and physiological changes, and most of these changes depend on alterations in gene expression. The goal of this work is to perform global analysis of differential regulation of gene expression and alternative splicing, and their relationship with chromatin landscape in drought sensitive and tolerant cultivars. Our Iso-Seq study revealed transcriptome-wide full-length isoforms at an unprecedented scale with over 11000 novel splice isoforms. Additionally, we uncovered alternative polyadenylation sites of ~11000 expressed genes and many novel genes. Overall, Iso-Seq results greatly enhanced sorghum gene annotations that are not only useful in analyzing all our RNA-seq, ChIP-seq and ATAC-seq data but also serve as a great resource to the plant biology community. Our studies identified differentially expressed genes and splicing events that are correlated with the drought-resistant phenotype. An association between alternative splicing and chromatin accessibility was also revealed. Several computational tools developed here (TAPIS and iDiffIR) have been made freely available to the research community for analyzing alternative splicing and differential alternative splicing.


September 22, 2019  |  

Single-cell RNAseq for the study of isoforms-how is that possible?

Single-cell RNAseq and alternative splicing studies have recently become two of the most prominent applications of RNAseq. However, the combination of both is still challenging, and few research efforts have been dedicated to the intersection between them. Cell-level insight on isoform expression is required to fully understand the biology of alternative splicing, but it is still an open question to what extent isoform expression analysis at the single-cell level is actually feasible. Here, we establish a set of four conditions that are required for a successful single-cell-level isoform study and evaluate how these conditions are met by these technologies in published research.


September 22, 2019  |  

Full-length RNA sequencing reveals unique transcriptome composition in bermudagrass.

Bermudagrass [Cynodon dactylon (L.) Pers.] is an important perennial warm-season turfgrass species with great economic value. However, the reference genome and transcriptome information are still deficient in bermudagrass, which severely impedes functional and molecular breeding studies. In this study, through analyzing a mixture sample of leaves, stolons, shoots, roots and flowers with single-molecule long-read sequencing technology from Pacific Biosciences (PacBio), we reported the first full-length transcriptome dataset of bermudagrass (C. dactylon cultivar Yangjiang) comprising 78,192 unigenes. Among the unigenes, 66,409 were functionally annotated, whereas 27,946 were found to have two or more isoforms. The annotated full-length unigenes provided many new insights into gene sequence characteristics and systematic phylogeny of bermudagrass. By comparison with transcriptome dataset in nine grass species, KEGG pathway analyses further revealed that C4 photosynthesis-related genes, notably the phosphoenolpyruvate carboxylase and pyruvate, phosphate dikinase genes, are specifically enriched in bermudagrass. These results not only explained the possible reason why bermudagrass flourishes in warm areas but also provided a solid basis for future studies in this important turfgrass species. Copyright © 2018 Elsevier Masson SAS. All rights reserved.


September 22, 2019  |  

Single-molecule long-read sequencing facilitates shrimp transcriptome research.

Although shrimp are of great economic importance, few full-length shrimp transcriptomes are available. Here, we used Pacific Biosciences single-molecule real-time (SMRT) long-read sequencing technology to generate transcripts from the Pacific white shrimp (Litopenaeus vannamei). We obtained 322,600 full-length non-chimeric reads, from which we generated 51,367 high-quality unique full-length transcripts. We corrected errors in the SMRT sequences by comparison with Illumina-produced short reads. We successfully annotated 81.72% of all unique SMRT transcripts against the NCBI non-redundant database, 58.63% against Swiss-Prot, 45.38% against Gene Ontology, 32.57% against Clusters of Orthologous Groups of proteins (COG), and 47.83% against Kyoto Encyclopedia of Genes and Genomes (KEGG) databases. Across all transcripts, we identified 3,958 long non-coding RNAs (lncRNAs) and 80,650 simple sequence repeats (SSRs). Our study provides a rich set of full-length cDNA sequences for L. vannamei, which will greatly facilitate shrimp transcriptome research.

