Most genes in eukaryotic organisms produce alternative isoforms, broadening the diversity of proteins and non-coding RNAs encoded by the genome. In contrast to RNA sequencing platforms that rely on short reads, long, accurate reads from PacBio Single Molecule, Real-Time (SMRT) Sequencing can characterize full-length transcripts without the need for assembly and inference. The PacBio isoform sequencing (Iso-Seq) method generates full-length sequences for transcripts up to 10 kb in length, with scalable throughput using barcoding approaches. The Iso-Seq application can be employed for a wide variety of studies, including improvement of gene annotation, identification of novel isoforms and fusion transcripts, and analysis of differential isoform expression across tissues and cell types. Long-read sequencing combined with hybridization capture provides a powerful way to enrich candidate transcripts of interest using biotinylated capture probes. In addition to the comprehensive isoform characterization that this approach offers for targeted genes, long reads provide other advantages. For example, in our studies, cDNA from two Alzheimer’s disease (AD) subjects, captured with a customized AD panel, revealed heterozygous variants in the targeted genes, allowing transcript isoforms to be phased and sorted into their respective haplotypes using the IsoPhase pipeline. We will also discuss how we rapidly applied the capture-based long-read RNA-seq concept to recent SARS-CoV-2 viral sequencing efforts. Although PCR is widely used in COVID-19 studies, primer design and PCR conditions are sensitive to sample quality and viral titer, resulting in uneven coverage and dropouts. Here we present an alternative approach with probe-based enrichment and long-read sequencing. 
Viral transcripts in the RNA sample are reverse transcribed into cDNA, just as in the standard Iso-Seq workflow, but a custom panel of IDT xGen Lockdown probes tiling the SARS-CoV-2 viral genome is then used for capture. The method affords complete genomic sequence determination, shows more even coverage than traditional RT-PCR, and is robust to RNA quality and quantity. Furthermore, we took advantage of unique molecular indexes (UMIs) to separate founder molecules and detect PCR artifacts during sample preparation. This work provides an orthogonal approach for researchers elucidating the virology of this novel coronavirus, and we foresee that this workflow can be easily modified to capture long-read sequences from other viruses as well.
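As a rough illustration of how UMIs can separate founder molecules and flag possible PCR artifacts, the Python sketch below groups reads by exact UMI match and counts reads that disagree with the most common sequence in each group. This is not the pipeline used in this work: the function names, the (UMI, sequence) input format, and the toy reads are all hypothetical, and a real workflow would typically tolerate UMI sequencing errors and build a consensus rather than rely on exact string matches.

```python
from collections import Counter, defaultdict


def group_by_umi(reads):
    """Group (umi, sequence) pairs by exact UMI match.

    Each group is assumed to descend from one founder molecule.
    """
    groups = defaultdict(list)
    for umi, seq in reads:
        groups[umi].append(seq)
    return dict(groups)


def summarize_umi_groups(groups):
    """For each UMI, report the most common sequence and how many reads
    differ from it (candidate PCR or sequencing artifacts)."""
    summary = {}
    for umi, seqs in groups.items():
        representative, support = Counter(seqs).most_common(1)[0]
        summary[umi] = {
            "representative": representative,
            "n_reads": len(seqs),
            "n_discordant": len(seqs) - support,
        }
    return summary


# Toy input (hypothetical): three reads share one UMI, one of them
# carrying a single-base difference; a fourth read has its own UMI.
reads = [
    ("AACGT", "ATGGCC"),
    ("AACGT", "ATGGCC"),
    ("AACGT", "ATGGCT"),
    ("GGTCA", "ATGGCC"),
]
summary = summarize_umi_groups(group_by_umi(reads))
for umi, info in summary.items():
    print(umi, info)
```

In this toy input, the two UMIs are treated as two founder molecules even though three of the four reads carry the same sequence, and the one discordant read under the first UMI is flagged for inspection rather than counted as a distinct variant.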
Learn more at pacb.com/COVID-19