April 21, 2020  |  

The landscape of SNCA transcripts across synucleinopathies: New insights from long reads sequencing analysis

Dysregulation of alpha-synuclein expression has been implicated in the pathogenesis of synucleinopathies, in particular Parkinsontextquoterights Disease (PD) and Dementia with Lewy bodies (DLB). Previous studies have shown that the alternatively spliced isoforms of the SNCA gene are differentially expressed in different parts of the brain for PD and DLB patients. Similarly, SNCA isoforms with skipped exons can have a functional impact on the protein domains. The large intronic region of the SNCA gene was also shown to harbor structural variants that affect transcriptional levels. Here we apply the first study of using long read sequencing with targeted capture of both the gDNA and cDNA of the SNCA gene in brain tissues of PD, DLB, and control samples using the PacBio Sequel system. The targeted full-length cDNA (Iso-Seq) data confirmed complex usage of known alternative start sites and variable 3textquoteright UTR lengths, as well as novel 5textquoteright starts and 3textquoteright ends not previously described. The targeted gDNA data allowed phasing of up to 81% of the ~114kb SNCA region, with the longest phased block excedding 54 kb. We demonstrate that long gDNA and cDNA reads have the potential to reveal long-range information not previously accessible using traditional sequencing methods. This approach has a potential impact in studying disease risk genes such as SNCA, providing new insights into the genetic etiologies, including perturbations to the landscape the gene transcripts, of human complex diseases such as synucleinopathies.


April 21, 2020  |  

Targeted Long-Read RNA Sequencing Demonstrates Transcriptional Diversity Driven by Splice-Site Variation in MYBPC3.

To date, clinical sequencing has focused on genomic DNA using targeted panels and exome sequencing. Sequencing of a large hypertrophic cardiomyopathy (HCM) cohort revealed that positive identification of a disease-associated variant was returned in only 32% of patients, with an additional 15% receiving inconclusive results. When genome sequencing fails to reveal causative variants, the transcriptome may provide additional diagnostic clarity. A recent study examining patients with genetically undiagnosed muscle disorders found that RNA sequencing, when used as a complement to exome and whole genome sequencing, had an overall diagnosis rate of 35%.


April 21, 2020  |  

Alternative Splicing of the Delta-Opioid Receptor Gene Suggests Existence of New Functional Isoforms.

The delta-opioid receptor (DOPr) participates in mediating the effects of opioid analgesics. However, no selective agonists have entered clinical care despite potential to ameliorate many neurological and psychiatric disorders. In an effort to address the drug development challenges, the functional contribution of receptor isoforms created by alternative splicing of the three-exonic coding gene, OPRD1, has been overlooked. We report that the gene is transcriptionally more diverse than previously demonstrated, producing novel protein isoforms in humans and mice. We provide support for the functional relevance of splice variants through context-dependent expression profiling (tissues, disease model) and conservation of the transcriptional landscape in closely related vertebrates. The conserved alternative transcriptional events have two distinct patterns. First, cassette exon inclusions between exons 1 and 2 interrupt the reading frame, producing truncated receptor fragments comprising only the first transmembrane (TM) domain, despite the lack of exact exon orthologues between distant species. Second, a novel promoter and transcriptional start site upstream of exon 2 produces a transcript of an N-terminally truncated 6TM isoform. However, a fundamental difference in the exonic landscaping as well as translation and translation products poses limits for modelling the human DOPr receptor system in mice.


April 21, 2020  |  

The developmental dynamics of the Populus stem transcriptome.

The Populus shoot undergoes primary growth (longitudinal growth) followed by secondary growth (radial growth), which produces biomass that is an important source of energy worldwide. We adopted joint PacBio Iso-Seq and RNA-seq analysis to identify differentially expressed transcripts along a developmental gradient from the shoot apex to the fifth internode of Populus Nanlin895. We obtained 87 150 full-length transcripts, including 2081 new isoforms and 62 058 new alternatively spliced isoforms, most of which were produced by intron retention, that were used to update the Populus annotation. Among these novel isoforms, there are 1187 long non-coding RNAs and 356 fusion genes. Using this annotation, we found 15 838 differentially expressed transcripts along the shoot developmental gradient, of which 1216 were transcription factors (TFs). Only a few of these genes were reported previously. The differential expression of these TFs suggests that they may play important roles in primary and secondary growth. AP2, ARF, YABBY and GRF TFs are highly expressed in the apex, whereas NAC, bZIP, PLATZ and HSF TFs are likely to be important for secondary growth. Overall, our findings provide evidence that long-read sequencing can complement short-read sequencing for cataloguing and quantifying eukaryotic transcripts and increase our understanding of the vital and dynamic process of shoot development. © 2018 The Authors. Plant Biotechnology Journal published by Society for Experimental Biology and The Association of Applied Biologists and John Wiley & Sons Ltd.


April 21, 2020  |  

Comprehensive identification of the full-length transcripts and alternative splicing related to the secondary metabolism pathways in the tea plant (Camellia sinensis).

Flavonoids, theanine and caffeine are the main secondary metabolites of the tea plant (Camellia sinensis), which account for the tea’s unique flavor quality and health benefits. The biosynthesis pathways of these metabolites have been extensively studied at the transcriptional level, but the regulatory mechanisms are still unclear. In this study, to explore the transcriptome diversity and complexity of tea plant, PacBio Iso-Seq and RNA-seq analysis were combined to obtain full-length transcripts and to profile the changes in gene expression during the leaf development. A total of 1,388,066 reads of insert (ROI) were generated with an average length of 1,762?bp, and more than 54% (755,716) of the ROIs were full-length non-chimeric (FLNC) reads. The Benchmarking Universal Single-Copy Orthologue (BUSCO) completeness was 92.7%. A total of 93,883 non-redundant transcripts were obtained, and 87,395 (93.1%) were new alternatively spliced isoforms. Meanwhile, 7,650 differential expression transcripts (DETs) were identified. A total of 28,980 alternative splicing (AS) events were predicted, including 1,297 differential AS (DAS) events. The transcript isoforms of the key genes involved in the flavonoid, theanine and caffeine biosynthesis pathways were characterized. Additionally, 5,777 fusion transcripts and 9,052 long non-coding RNAs (lncRNAs) were also predicted. Our results revealed that AS potentially plays a crucial role in the regulation of the secondary metabolism of the tea plant. These findings enhanced our understanding of the complexity of the secondary metabolic regulation of tea plants and provided a basis for the subsequent exploration of the regulatory mechanisms of flavonoid, theanine and caffeine biosynthesis in tea plants.


September 22, 2019  |  

A global survey of alternative splicing in allopolyploid cotton: landscape, complexity and regulation.

Alternative splicing (AS) is a crucial regulatory mechanism in eukaryotes, which acts by greatly increasing transcriptome diversity. The extent and complexity of AS has been revealed in model plants using high-throughput next-generation sequencing. However, this technique is less effective in accurately identifying transcript isoforms in polyploid species because of the high sequence similarity between coexisting subgenomes. Here we characterize AS in the polyploid species cotton. Using Pacific Biosciences single-molecule long-read isoform sequencing (Iso-Seq), we developed an integrated pipeline for Iso-Seq transcriptome data analysis (https://github.com/Nextomics/pipeline-for-isoseq). We identified 176 849 full-length transcript isoforms from 44 968 gene models and updated gene annotation. These data led us to identify 15 102 fibre-specific AS events and estimate that c. 51.4% of homoeologous genes produce divergent isoforms in each subgenome. We reveal that AS allows differential regulation of the same gene by miRNAs at the isoform level. We also show that nucleosome occupancy and DNA methylation play a role in defining exons at the chromatin level. This study provides new insights into the complexity and regulation of AS, and will enhance our understanding of AS in polyploid species. Our methodology for Iso-Seq data analysis will be a useful reference for the study of AS in other species.© 2017 The Authors. New Phytologist © 2017 New Phytologist Trust.


September 22, 2019  |  

The state of long non-coding RNA biology.

Transcriptomic studies have demonstrated that the vast majority of the genomes of mammals and other complex organisms is expressed in highly dynamic and cell-specific patterns to produce large numbers of intergenic, antisense and intronic long non-protein-coding RNAs (lncRNAs). Despite well characterized examples, their scaling with developmental complexity, and many demonstrations of their association with cellular processes, development and diseases, lncRNAs are still to be widely accepted as major players in gene regulation. This may reflect an underappreciation of the extent and precision of the epigenetic control of differentiation and development, where lncRNAs appear to have a central role, likely as organizational and guide molecules: most lncRNAs are nuclear-localized and chromatin-associated, with some involved in the formation of specialized subcellular domains. I suggest that a reassessment of the conceptual framework of genetic information and gene expression in the 4-dimensional ontogeny of spatially organized multicellular organisms is required. Together with this and further studies on their biology, the key challenges now are to determine the structure?function relationships of lncRNAs, which may be aided by emerging evidence of their modular structure, the role of RNA editing and modification in enabling epigenetic plasticity, and the role of RNA signaling in transgenerational inheritance of experience.


September 22, 2019  |  

Global dissection of alternative splicing uncovers transcriptional diversity in tissues and associates with the flavonoid pathway in tea plant (Camellia sinensis).

Alternative splicing (AS) regulates mRNA at the post-transcriptional level to change gene function in organisms. However, little is known about the AS and its roles in tea plant (Camellia sinensis), widely cultivated for making a popular beverage tea.In our study, the AS landscape and dynamics were characterized in eight tissues (bud, young leaf, summer mature leaf, winter old leaf, stem, root, flower, fruit) of tea plant by Illumina RNA-Seq and confirmed by Iso-Seq. The most abundant AS (~?20%) was intron retention and involved in RNA processes. The some alternative splicings were found to be tissue specific in stem and root etc. Thirteen co-expressed modules of AS transcripts were identified, which revealed a similar pattern between the bud and young leaves as well as a distinct pattern between seasons. AS events of structural genes including anthocyanidin reductase and MYB transcription factors were involved in biosynthesis of flavonoid, especially in vegetative tissues. The AS isoforms rather than the full-length ones were the major transcripts involved in flavonoid synthesis pathway, and is positively correlated with the catechins content conferring the tea taste. We propose that the AS is an important functional mechanism in regulating flavonoid metabolites.Our study provides the insight into the AS events underlying tea plant’s uniquely different developmental process and highlights the important contribution and efficacy of alternative splicing regulatory function to biosynthesis of flavonoids.


September 22, 2019  |  

Genome-wide transcriptome profiling of the medicinal plant Zanthoxylum planispinum using a single-molecule direct RNA sequencing approach.

High-throughput RNA sequencing has revolutionized transcriptome-based studies of candidate genes, key pathways and gene regulation in non-model organisms. We analyzed full-length cDNA sequences in Zanthoxylum planispinum (Z. planispinum), a medicinal herb in major parts of East Asia. The full-length mRNA derived from tissues of leaf, early fruit and maturing fruit stage were sequenced using PacBio RSII platform to identify isoform transcriptome. We obtained 51,402 unigenes, with average 1781?bp per gene in 82.473?Mb gene lengths. Among 51,402, 3963 unigenes showed variety of isoform. By selection of one representative gene among each of the various isoforms, we finalized 46,306 unique gene set for this herb. We identified 76 cytochrome P450 (CYP450) and related isoforms that are of the wide diversity in the molecular function and biological process. These transcriptome data of Z. planispinum will provide a good resource to study metabolic engineering for the production of valuable medicinal drugs and phytochemicals. Copyright © 2018. Published by Elsevier Inc.


September 22, 2019  |  

SparseIso: a novel Bayesian approach to identify alternatively spliced isoforms from RNA-seq data.

Recent advances in high-throughput RNA sequencing (RNA-seq) technologies have made it possible to reconstruct the full transcriptome of various types of cells. It is important to accurately assemble transcripts or identify isoforms for an improved understanding of molecular mechanisms in biological systems.We have developed a novel Bayesian method, SparseIso, to reliably identify spliced isoforms from RNA-seq data. A spike-and-slab prior is incorporated into the Bayesian model to enforce the sparsity for isoform identification, effectively alleviating the problem of overfitting. A Gibbs sampling procedure is further developed to simultaneously identify and quantify transcripts from RNA-seq data. With the sampling approach, SparseIso estimates the joint distribution of all candidate transcripts, resulting in a significantly improved performance in detecting lowly expressed transcripts and multiple expressed isoforms of genes. Both simulation study and real data analysis have demonstrated that the proposed SparseIso method significantly outperforms existing methods for improved transcript assembly and isoform identification.The SparseIso package is available at http://github.com/henryxushi/SparseIso.xuan@vt.edu.Supplementary data are available at Bioinformatics online.© The Author (2017). Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions@oup.com


September 22, 2019  |  

Full-length transcriptome sequences and splice variants obtained by a combination of sequencing platforms applied to different root tissues of Salvia miltiorrhiza and tanshinone biosynthesis.

Danshen, Salvia miltiorrhiza Bunge, is one of the most widely used herbs in traditional Chinese medicine, wherein its rhizome/roots are particularly valued. The corresponding bioactive components include the tanshinone diterpenoids, the biosynthesis of which is a subject of considerable interest. Previous investigations of the S. miltiorrhiza transcriptome have relied on short-read next-generation sequencing (NGS) technology, and the vast majority of the resulting isotigs do not represent full-length cDNA sequences. Moreover, these efforts have been targeted at either whole plants or hairy root cultures. Here, we demonstrate that the tanshinone pigments are produced and accumulate in the root periderm, and apply a combination of NGS and single-molecule real-time (SMRT) sequencing to various root tissues, particularly including the periderm, to provide a more complete view of the S. miltiorrhiza transcriptome, with further insight into tanshinone biosynthesis as well. In addition, the use of SMRT long-read sequencing offered the ability to examine alternative splicing, which was found to occur in approximately 40% of the detected gene loci, including several involved in isoprenoid/terpenoid metabolism.© 2015 The Authors The Plant Journal © 2015 John Wiley & Sons Ltd.


September 22, 2019  |  

Chromosome-level reference genome and alternative splicing atlas of moso bamboo (Phyllostachys edulis).

Bamboo is one of the most important nontimber forestry products worldwide. However, a chromosome-level reference genome is lacking, and an evolutionary view of alternative splicing (AS) in bamboo remains unclear despite emerging omics data and improved technologies.Here, we provide a chromosome-level de novo genome assembly of moso bamboo (Phyllostachys edulis) using additional abundance sequencing data and a Hi-C scaffolding strategy. The significantly improved genome is a scaffold N50 of 79.90 Mb, approximately 243 times longer than the previous version. A total of 51,074 high-quality protein-coding loci with intact structures were identified using single-molecule real-time sequencing and manual verification. Moreover, we provide a comprehensive AS profile based on the identification of 266,711 unique AS events in 25,225 AS genes by large-scale transcriptomic sequencing of 26 representative bamboo tissues using both the Illumina and Pacific Biosciences sequencing platforms. Through comparisons with orthologous genes in related plant species, we observed that the AS genes are concentrated among more conserved genes that tend to accumulate higher transcript levels and share less tissue specificity. Furthermore, gene family expansion, abundant AS, and positive selection were identified in crucial genes involved in the lignin biosynthetic pathway of moso bamboo.These fundamental studies provide useful information for future in-depth analyses of comparative genome and AS features. Additionally, our results highlight a global perspective of AS during evolution and diversification in bamboo.


September 22, 2019  |  

Distinguishing highly similar gene isoforms with a clustering-based bioinformatics analysis of PacBio single-molecule long reads.

Gene isoforms are commonly found in both prokaryotes and eukaryotes. Since each isoform may perform a specific function in response to changing environmental conditions, studying the dynamics of gene isoforms is important in understanding biological processes and disease conditions. However, genome-wide identification of gene isoforms is technically challenging due to the high degree of sequence identity among isoforms. Traditional targeted sequencing approach, involving Sanger sequencing of plasmid-cloned PCR products, has low throughput and is very tedious and time-consuming. Next-generation sequencing technologies such as Illumina and 454 achieve high throughput but their short read lengths are a critical barrier to accurate assembly of highly similar gene isoforms, and may result in ambiguities and false joining during sequence assembly. More recently, the third generation sequencer represented by the PacBio platform offers sufficient throughput and long reads covering the full length of typical genes, thus providing a potential to reliably profile gene isoforms. However, the PacBio long reads are error-prone and cannot be effectively analyzed by traditional assembly programs.We present a clustering-based analysis pipeline integrated with PacBio sequencing data for profiling highly similar gene isoforms. This approach was first evaluated in comparison to de novo assembly of 454 reads using a benchmark admixture containing 10 known, cloned msg genes encoding the major surface glycoprotein of Pneumocystis jirovecii. All 10 msg isoforms were successfully reconstructed with the expected length (~1.5 kb) and correct sequence by the new approach, while 454 reads could not be correctly assembled using various assembly programs. When using an additional benchmark admixture containing 22 known P. jirovecii msg isoforms, this approach accurately reconstructed all but 4 these isoforms in their full-length (~3 kb); these 4 isoforms were present in low concentrations in the admixture. Finally, when applied to the original clinical sample from which the 22 known msg isoforms were cloned, this approach successfully identified not only all known isoforms accurately (~3 kb each) but also 48 novel isoforms.PacBio sequencing integrated with the clustering-based analysis pipeline achieves high-throughput and high-resolution discrimination of highly similar sequences, and can serve as a new approach for genome-wide characterization of gene isoforms and other highly repetitive sequences.


September 22, 2019  |  

A comprehensive analysis of alternative splicing in paleopolyploid maize.

Identifying and characterizing alternative splicing (AS) enables our understanding of the biological role of transcript isoform diversity. This study describes the use of publicly available RNA-Seq data to identify and characterize the global diversity of AS isoforms in maize using the inbred lines B73 and Mo17, and a related species, sorghum. Identification and characterization of AS within maize tissues revealed that genes expressed in seed exhibit the largest differential AS relative to other tissues examined. Additionally, differences in AS between the two genotypes B73 and Mo17 are greatest within genes expressed in seed. We demonstrate that changes in the level of alternatively spliced transcripts (intron retention and exon skipping) do not solely reflect differences in total transcript abundance, and we present evidence that intron retention may act to fine-tune gene expression across seed development stages. Furthermore, we have identified temperature sensitive AS in maize and demonstrate that drought-induced changes in AS involve distinct sets of genes in reproductive and vegetative tissues. Examining our identified AS isoforms within B73 × Mo17 recombinant inbred lines (RILs) identified splicing QTL (sQTL). The 43.3% of cis-sQTL regulated junctions are actually identified as alternatively spliced junctions in our analysis, while 10 Mb windows on each side of 48.2% of trans-sQTLs overlap with splicing related genes. Using sorghum as an out-group enabled direct examination of loss or conservation of AS between homeologous genes representing the two subgenomes of maize. We identify several instances where AS isoforms that are conserved between one maize homeolog and its sorghum ortholog are absent from the second maize homeolog, suggesting that these AS isoforms may have been lost after the maize whole genome duplication event. This comprehensive analysis provides new insights into the complexity of AS in maize.


September 22, 2019  |  

Global identification of alternative splicing via comparative analysis of SMRT- and Illumina-based RNA-seq in strawberry.

Alternative splicing (AS) is a key post-transcriptional regulatory mechanism, yet little information is known about its roles in fruit crops. Here, AS was globally analyzed in the wild strawberry Fragaria vesca genome with RNA-seq data derived from different stages of fruit development. The AS landscape was characterized and compared between the single-molecule, real-time (SMRT) and Illumina RNA-seq platform. While SMRT has a lower sequencing depth, it identifies more genes undergoing AS (57.67% of detected multiexon genes) when it is compared with Illumina (33.48%), illustrating the efficacy of SMRT in AS identification. We investigated different modes of AS in the context of fruit development; the percentage of intron retention (IR) is markedly reduced whereas that of alternative acceptor sites (AA) is significantly increased post-fertilization when compared with pre-fertilization. When all the identified transcripts were combined, a total of 66.43% detected multiexon genes in strawberry undergo AS, some of which lead to a gain or loss of conserved domains in the gene products. The work demonstrates that SMRT sequencing is highly powerful in AS discovery and provides a rich data resource for later functional studies of different isoforms. Further, shifting AS modes may contribute to rapid changes of gene expression during fruit set.© 2016 The Authors The Plant Journal © 2016 John Wiley & Sons Ltd.


Talk with an expert

If you have a question, need to check the status of an order, or are interested in purchasing an instrument, we're here to help.