June 1, 2021  |  

Rapid full-length Iso-Seq cDNA sequencing of rice mRNA to facilitate annotation and identify splice-site variation.

PacBio’s new Iso-Seq technology allows for rapid generation of full-length cDNA sequences without the need for assembly steps. The technology was tested on leaf mRNA from two model O. sativa ssp. indica cultivars – Minghui 63 and Zhenshan 97. Even though each transcriptome was not exhaustively sequenced, several thousand isoforms described genes over a wide size range, most of which are not present in any currently available FL cDNA collection. In addition, the lack of an assembly requirement provides direct and immediate access to complete mRNA sequences and rapid unraveling of biological novelties.


June 1, 2021  |  

Complete microbial genomes, epigenomes, and transcriptomes using long-read PacBio Sequencing.

For comprehensive metabolic reconstructions and a resulting understanding of the pathways leading to natural products, it is desirable to obtain complete information about the genetic blueprint of the organisms used. Traditional Sanger and next-generation, short-read sequencing technologies have shortcomings with respect to read lengths and DNA-sequence context bias, leading to fragmented and incomplete genome information. The development of long-read, single molecule, real-time (SMRT) DNA sequencing from Pacific Biosciences, with >10,000 bp average read lengths and a lack of sequence context bias, now allows for the generation of complete genomes in a fully automated workflow. In addition to the genome sequence, DNA methylation is characterized in the process of sequencing. PacBio® sequencing has also been applied to microbial transcriptomes. Long reads enable sequencing of full-length cDNAs allowing for identification of complete gene and operon sequences without the need for transcript assembly. We will highlight several examples where these capabilities have been leveraged in the areas of industrial microbiology, including biocommodities, biofuels, bioremediation, new bacteria with potential commercial applications, antibiotic discovery, and livestock/plant microbiome interactions.


June 1, 2021  |  

Single Molecule, Real-Time sequencing of full-length cDNA transcripts uncovers novel alternatively spliced isoforms.

In higher eukaryotic organisms, the majority of multi-exon genes are alternatively spliced. Different mRNA isoforms from the same gene can produce proteins that have distinct properties such as structure, function, or subcellular localization. Thus, the importance of understanding the full complement of transcript isoforms with potential phenotypic impact cannot be underscored. While microarrays and other NGS-based methods have become useful for studying transcriptomes, these technologies yield short, fragmented transcripts that remain a challenge for accurate, complete reconstruction of splice variants. The Iso-Seq protocol developed at PacBio offers the only solution for direct sequencing of full-length, single-molecule cDNA sequences to survey transcriptome isoform diversity useful for gene discovery and annotation. Knowledge of the complete isoform repertoire is also key for accurate quantification of isoform abundance. As most transcripts range from 1 – 10 kb, fully intact RNA molecules can be sequenced using SMRT Sequencing (avg. read length: 10-15 kb) without requiring fragmentation or post-sequencing assembly. Our open-source computational pipeline delivers high-quality, non-redundant sequences for unambiguous identification of alternative splicing events, alternative transcriptional start sites, polyA tail, and gene fusion events. The standard Iso-Seq protocol workflow available for all researchers is presented using a deep dataset of full- length cDNA sequences from the MCF-7 cancer cell line, and multiple tissues (brain, heart, and liver). Detected novel transcripts approaching 10 kb and alternative splicing events are highlighted. Even in extensively profiled samples, the method uncovered large numbers of novel alternatively spliced isoforms and previously unannotated genes.


June 1, 2021  |  

Full-length cDNA sequencing of alternatively spliced isoforms provides insight into human diseases.

The majority of human genes are alternatively spliced, making it possible for most genes to generate multiple proteins. The process of alternative splicing is highly regulated in a developmental-stage and tissue-specific manner. Perturbations in the regulation of these events can lead to disease in humans. Alternative splicing has been shown to play a role in human cancer, muscular dystrophy, Alzheimer’s, and many other diseases. Understanding these diseases requires knowing the full complement of mRNA isoforms. Microarrays and high-throughput cDNA sequencing have become highly successful tools for studying transcriptomes, however these technologies only provide small fragments of transcripts and building complete transcript isoforms has been very challenging. We have developed the Iso-Seq technique, which is capable of sequencing full-length, single-molecule cDNA sequences. The method employs SMRT Sequencing to generate individual molecules with average read lengths of more than 10 kb and some as long as 40 kb. As most transcripts are from 1 to 10 kb, we can sequence through entire RNA molecules, requiring no fragmentation or post-sequencing assembly. Jointly with the sequencing method, we developed a computational pipeline that polishes these full-length transcript sequences into high-quality, non-redundant transcript consensus sequences. Iso-Seq sequencing enables unambiguous identification of alternative splicing events, alternative transcriptional start and poly-A sites, and transcripts from gene fusion events. Knowledge of the complete set of isoforms from a sample of interest is key for accurate quantification of isoform abundance when using any technology for transcriptome studies. Here we characterize the full-length transcriptome of normal human tissues, paired tumor/normal samples from breast cancer, and a brain sample from a patient with Alzheimer’s using deep Iso-Seq sequencing. We highlight numerous discoveries of novel alternatively spliced isoforms, gene-fusions events, and previously unannotated genes that will improve our understanding of human diseases.


June 1, 2021  |  

Full-length cDNA sequencing of alternatively spliced isoforms provides insight into human cancer

The majority of human genes are alternatively spliced, making it possible for most genes to generate multiple proteins. The process of alternative splicing is highly regulated in a developmental-stage and tissue-specific manner. Perturbations in the regulation of these events can lead to disease in humans (1). Alternative splicing has been shown to play a role in human cancer, muscular dystrophy, Alzheimer’s, and many other diseases. Understanding these diseases requires knowing the full complement of mRNA isoforms. Microarrays and high-throughput cDNA sequencing have become highly successful tools for studying transcriptomes, however these technologies only provide small fragments of transcripts and building complete transcript isoforms has been very challenging (2). We have developed a technique, called Iso-Seq sequencing, that is capable of sequencing full-length, single-molecule cDNA sequences. The method employs SMRT Sequencing from PacBio, which can sequence individual molecules with read lengths that average more than 10 kb and can reach as long as 40 kb. As most transcripts are from 1 – 10 kb, we can sequence through entire RNA molecules, requiring no fragmentation or post-sequencing assembly. Jointly with the sequencing method, we developed a computational pipeline that polishes these full-length transcript sequences into high-quality, non-redundant transcript consensus sequences. Iso-Seq sequencing enables unambiguous identification of alternative splicing events, alternative transcriptional start and polyA sites, and transcripts from gene fusion events. Knowledge of the complete set of isoforms from a sample of interest is key for accurate quantification of isoform abundance when using any technology for transcriptome studies (3). Here we characterize the full-length transcriptome of paired tumor/normal samples from breast cancer using deep Iso-Seq sequencing. We highlight numerous discoveries of novel alternatively spliced isoforms, gene-fusion events, and previously unannotated genes that will improve our understanding of human cancer. (1) Faustino NA and Cooper TA. Genes and Development. 2003. 17: 419-437(2) Steijger T, et al. Nat Methods. 2013 Dec;10(12):1177-84.(3) Au KF, et al. Proc Natl Acad Sci U S A. 2013 Dec 10;110(50):E4821-30.


June 1, 2021  |  

Full-length cDNA sequencing for genome annotation and analysis of alternative splicing

In higher eukaryotic organisms, the majority of multi-exon genes are alternatively spliced. Different mRNA isoforms from the same gene can produce proteins that have distinct properties and functions. Thus, the importance of understanding the full complement of transcript isoforms with potential phenotypic impact cannot be understated. While microarrays and other NGS-based methods have become useful for studying transcriptomes, these technologies yield short, fragmented transcripts that remain a challenge for accurate, complete reconstruction of splice variants. The Iso-Seq protocol developed at PacBio offers the only solution for direct sequencing of full-length, single-molecule cDNA sequences to survey transcriptome isoform diversity useful for gene discovery and annotation. Knowledge of the complete isoform repertoire is also key for accurate quantification of isoform abundance. As most transcripts range from 1 – 10 kb, fully intact RNA molecules can be sequenced using SMRT Sequencing without requiring fragmentation or post-sequencing assembly. Our open-source computational pipeline delivers high-quality, non-redundant sequences for unambiguous identification of alternative splicing events, alternative transcriptional start sites, polyA tail, and gene fusion events. We applied the Iso-Seq method to the maize (Zea mays) inbred line B73. Full-length cDNAs from six diverse tissues were barcoded and sequenced across multiple size-fractionated SMRTbell libraries. A total of 111,151 unique transcripts were identified. More than half of these transcripts (57%) represented novel, sometimes tissue-specific, isoforms of known genes. In addition to the 2250 novel coding genes and 860 lncRNAs discovered, the Iso-Seq dataset corrected errors in existing gene models, highlighting the value of full-length transcripts for whole gene annotations.


June 1, 2021  |  

Application specific barcoding strategies for SMRT Sequencing

Over the last few years, several advances were implemented in the PacBio RS II System to maximize throughput and efficiency while reducing the cost per sample. The number of useable bases per SMRT Cell now exceeds 1 Gb with the latest P6-C4 chemistry and 6-hour movies. For applications such as microbial sequencing, targeted sequencing, Iso-Seq (full-length isoform sequencing) and Nimblegen’s target enrichment method, current SMRT Cell yields could be an excess relative to project requirements. To this end, barcoding is a viable option for multiplexing samples. For microbial sequencing, multiplexing can be accomplished by tagging sheared genomic DNA during library construction with modified SMRTbell adapters. We studied the performance of 2- to 8-plex microbial sequencing. For full-length amplicon sequencing such as HLA typing, amplicons as large as 5 kb may be barcoded during amplification using barcoded locus-specific primers. Alternatively, amplicons may be barcoded during SMRTbell library construction using barcoded SMRTbell adapters. The preferred barcoding strategy depends on the user’s existing workflow and flexibility to changing and/or updating existing workflows. Using barcoded adapters, five Class I and II genes (3.3 – 5.8 kb) x 96 patients can be multiplexed and typed. For Iso-Seq full-length cDNA sequencing, barcodes are incorporated during 1st-strand synthesis and are enabled by tailing the oligo-dT primer with any PacBio published 16-bp barcode sequences. RNA samples from 6 maize tissues were multiplexed to generate barcoded cDNA libraries. The NimbleGen SeqCap Target Enrichment method, combined with PacBio’s long-read sequencing, provides comprehensive view of multi-kilobase contiguous regions, both exonic and intronic regions. To make this cost effective, we recommend barcoding samples for pooling prior to target enrichment and capture. Here, we present specific examples of strategies and best practices for multiplexing samples for different applications for SMRT Sequencing. Additionally, we describe recommendations for analyzing barcoded samples.


June 1, 2021  |  

Full-length cDNA sequencing on the PacBio Sequel platform

The protein coding potential of most plant and animal genomes is dramatically increased via alternative splicing. Identification and annotation of expressed mRNA isoforms is critical to the understanding of these complex organisms. While microarrays and other NGS-based methods have become useful for studying transcriptomes, these technologies yield short, fragmented transcripts that remain a challenge for accurate, complete reconstruction of splice variants. The Iso-Seq protocol developed at PacBio offers the only solution for direct sequencing of full-length, single-molecule cDNA sequences to survey transcriptome isoform diversity useful for gene discovery and annotation. Knowledge of the complete isoform repertoire is also key for accurate quantification of isoform abundance. As most transcripts range from 1 – 10 kb, fully intact RNA molecules can be sequenced using SMRT Sequencing without requiring fragmentation or post-sequencing assembly. The PacBio Sequel platform has improved throughput thereby increasing the number of full-length transcripts per SMRT Cell. Furthermore, loading enhancements on the Sequel instrument have decreased the need for size fractionation steps. We have optimized the Iso-Seq library preparation process for use on the Sequel platform. Here, we demonstrate the capabilities of the Iso-Seq method on the Sequel system using cDNAs from the maize (Zea mays) inbred line B73. Full-length cDNA from six diverse tissues were barcoded, pooled, and sequenced on the PacBio Sequel system using a combination of size-selected and non-size-selected SMRTbell libraries. The results highlight the value of full-length transcripts for genome annotations and analysis of alternative splicing.


June 1, 2021  |  

Full-length cDNA sequencing of prokaryotic transcriptome and metatranscriptome samples

Next-generation sequencing has become a useful tool for studying transcriptomes. However, these methods typically rely on sequencing short fragments of cDNA, then attempting to assemble the pieces into full-length transcripts. Here, we describe a method that uses PacBio long reads to sequence full-length cDNAs from individual transcriptomes and metatranscriptome samples. We have adapted the PacBio Iso-Seq protocol for use with prokaryotic samples by incorporating RNA polyadenylation and rRNA-depletion steps. In conjunction with SMRT Sequencing, which has average readlengths of 10-15 kb, we are able to sequence entire transcripts, including polycistronic RNAs, in a single read. Here, we show full-length bacterial transcriptomes with the ability to visualize transcription of operons. In the area of metatranscriptomics, long reads reveal unambiguous gene sequences without the need for post-sequencing transcript assembly. We also show full-length bacterial transcripts sequenced after being treated with NEB’s Cappable-Seq, which is an alternative method for depleting rRNA and enriching for full-length transcripts with intact 5’ ends. Combining Cappable-Seq with PacBio long reads allows for the detection of transcription start sites, with the additional benefit of sequencing entire transcripts.


Talk with an expert

If you have a question, need to check the status of an order, or are interested in purchasing an instrument, we're here to help.