Since the launch of the Iso-Seq protocol in SMRT Analysis in 2014, the analysis pipeline has seen several improvements, as well as a host of new tools. Please see our updated post to learn more about current applications of this increasingly popular method.
With the recent launch of SMRT Analysis v2.2, we’re excited to introduce analysis software support for the new Iso-Seq™ method for sequencing full-length transcripts and gene isoforms, with no assembly required! Today we’ll take a deeper look at the Iso-Seq method to explain its unique scientific value and review publications from those already applying Single Molecule, Real-Time (SMRT®) Sequencing to this exciting area of research.
In the genomes of plants, animals, and all other higher eukaryotes, the majority of genes are alternatively spliced to produce multiple transcript isoforms. In humans, for example, there is evidence for alternative splicing of more than 95% of genes [1], with an average of more than five isoforms per gene. Gene regulation through alternative splicing can dramatically increase the protein-coding potential of a genome that contains a limited number of protein-coding genes. Somewhat surprisingly, alternatively spliced isoforms from a single gene can also have very different, even antagonistic, functions [2]. Therefore, understanding the functional biology of a genome requires knowing the full complement of isoforms. Microarrays and high-throughput cDNA sequencing have become incredibly useful tools for studying transcriptomes, yet these technologies provide only small snippets of transcripts, and building complete transcripts to study gene isoforms has been challenging.
Thanks to the extraordinarily long reads available with PacBio® sequencing, the new Iso-Seq method provides full-length reads spanning entire transcript isoforms all the way from the polyA-tail to the 5′ end. It is no longer necessary to reconstruct transcripts or infer isoforms based on combining local information since each sequence represents an individual full-length cDNA molecule. The method combines isoform-level resolution with the best of whole-transcriptome sequencing to enable direct gene isoform sequencing across an entire transcriptome. We’re pleased to report that scientists are now using the Iso-Seq method to routinely sequence full-length isoforms in a wide variety of organisms and are applying the approach to improve annotations in reference genomes, characterize gene isoforms in important gene families, and find novel genes even in the most comprehensively studied human cell lines.
Here is a selected list of recent publications, presentations, and sample data:
• Tilgner et al. (2014) Defining a personal, allele-specific, and single-molecule long-read transcriptome. PNAS June 24, 2014.
• Au et al. (2013) Characterization of the human ESC transcriptome by hybrid sequencing. PNAS 110: E4821-4830.
• Sharon et al. (2013) A single-molecule long-read survey of the human transcriptome. Nature Biotechnol 31: 1009-1014.
• Thomas et al. (2014) Long-read sequencing of chicken transcripts and identification of new transcript isoforms. PLoS One. 9: e94650.
• Treutlein et al. (2014) Cartography of neurexin alternative splicing mapped by single-molecule long-read mRNA sequencing. PNAS 111: E1291-1299.
• Zhang et al. (2014) PacBio sequencing of gene families – a case study with wheat gluten genes. Gene 533: 541-546.
• Larsen et al. (2014) Next-generation approaches to advancing eco-immunogenomic research in critically endangered primates. Molecular Ecology Resources.
• Brinzevich et al. (2014) HIV-1 Interacts with Human Endogenous Retrovirus K(HML-2) Envelopes Derived from Human Primary Lymphocytes. J Virology 88: 6213-6223.
• Larsen et al. (2012) Application of circular consensus sequencing and network analysis to characterize the bovine IgG repertoire. BMC Immunology 13: 52.
• Webinar: No Assembly Required – Extremely Long Reads for Full-length Transcript Isoform Sequencing
• Human MCF-7 Iso-Seq Dataset
For more detailed information on cDNA sequencing with PacBio, don’t miss this primer on GitHub and the shared protocol on SampleNet.
[1] Pan et al. (2008) Deep surveying of alternative splicing complexity in the human transcriptome by high-throughput sequencing. Nature Genetics 40: 1413-1415.
[2] Boise et al. (1993) Bcl-x, a bcl-2-related gene that functions as a dominant regulator of apoptotic cell death. Cell 74: 597-608.
June 2, 2014 | General