Understanding the myriad cell types, their states, and their functions is essential to understanding biology. Cell atlas projects and consortia aim to map the cell types in tissues, organs, and organisms, generating a reference resource that can help us better understand human health and disease. RNA sequencing technologies are key to these initiatives, and increasingly, researchers find that moving to full-length transcript sequencing gives a more detailed and complete view of cell types, moving the field from gene-level to transcript isoform-level analysis. Bulk and single-cell long-read isoform sequencing with PacBio provides a more comprehensive view of the transcriptome, making it a powerful tool for cell atlas initiatives.
A new era of transcriptomics
A striking example of this paradigm shift is described in a new preprint titled “Cell-type-specificity of isoform diversity in the developing human neocortex informs mechanisms of neurodevelopmental disorders” from researchers at UCLA, UPenn, and collaborators, who summarize their work as: “A novel cell-specific atlas of gene isoform expression reshapes our understanding of brain development and disease.”
The study highlights the limitations of short reads for this research, stating, “due to technological limitations of short-read scRNA-Seq, previous genomic characterization has been largely limited to changes at the gene-level, unable to capture the full complexity of alternative splicing and resulting isoform diversity present in the human brain.” To overcome these limitations, the researchers used HiFi sequencing in the form of both bulk and single-cell Iso-Seq methods: “we leverage single molecule long-read sequencing to deeply profile the full-length transcriptome of the germinal zone (GZ) and cortical plate (CP) regions of the developing human neocortex at tissue and single-cell resolution.” They find that Iso-Seq reads characterize the transcriptome with higher quality, observing,
“…over 99% of the full-length reads were confidently aligned to the human reference genome, a marked improvement over the 85% mapping rate of short read RNA-Seq, enhancing discovery of novel splice isoforms.”
As a result, the researchers made remarkable new discoveries, including:
- Detection of 214,516 unique isoforms covering 22,391 genes; 72.6% of the isoforms are novel, and many are predicted to have a functional impact at the protein-coding level and are supported by peptide evidence. In total, this new resource expanded the proteome by over 92,000 proteoforms.
- Identification of 27 Mb of the human genome that is transcriptionally active in the developing human brain but not currently annotated in Gencode.
- Characterization of novel isoform switches during cortical neurogenesis, implicating previously uncharacterized RNA-binding protein-mediated and other regulatory mechanisms in cellular identity and disease.
- Identification, through isoform-based single-cell clustering, of sub-clusters and transitory states not discernible using gene-based single-cell clustering.
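To see why isoform-level data can separate cells that gene-level data cannot, consider a minimal toy sketch in Python. It is not the preprint's analysis pipeline: the count matrix, the single made-up gene, and the `pairwise_distances` helper are all hypothetical and chosen only for illustration. Three cells express the same total amount of one gene but use its two isoforms differently.

```python
import numpy as np

# Hypothetical toy counts (not data from the study).
# Rows are cells, columns are two isoforms of the same made-up gene.
isoform_counts = np.array([
    [10, 0],   # cell A expresses only isoform 1
    [0, 10],   # cell B expresses only isoform 2
    [5, 5],    # cell C expresses both isoforms equally
])

# Gene-level quantification sums the counts of all isoforms of the gene.
gene_counts = isoform_counts.sum(axis=1)


def pairwise_distances(matrix):
    """Euclidean distance between every pair of rows (cells)."""
    m = np.asarray(matrix, dtype=float)
    if m.ndim == 1:
        # Treat a 1-D vector as a single-feature matrix.
        m = m[:, None]
    diff = m[:, None, :] - m[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))


gene_dist = pairwise_distances(gene_counts)
isoform_dist = pairwise_distances(isoform_counts)

print("Gene-level distances:\n", gene_dist)
print("Isoform-level distances:\n", isoform_dist)
```

At the gene level every cell has the same total count, so all pairwise distances are zero and no clustering method could tell the cells apart. At the isoform level the three cells are clearly separated, with cell C lying between A and B, which is the kind of structure a distance-based clustering step could then pick up.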
Leveraging this new resource, the researchers proceeded to “re-prioritize thousands of rare de novo risk variants associated with neurodevelopmental disorders and reveal that risk genes are strongly associated with the number of unique isoforms observed per gene.” Their work:
“…uncovers a substantial contribution of transcript-isoform diversity in cellular identity in the developing neocortex, elucidates novel genetic risk mechanisms for neurodevelopmental and neuropsychiatric disorders, and provides a comprehensive isoform-centric gene annotation for the developing human brain.”
Full-length transcript isoform analysis reveals new insights into biology and disease
These findings represent a common theme across many studies in the field – single-cell Iso-Seq data has the power to greatly expand our knowledge of expressed transcripts. One such study, titled “Single-cell long-read mRNA isoform regulation is pervasive across mammalian brain regions, cell types, and development,” from researchers at Weill Cornell, comprehensively analyzed full-length RNA isoforms in multiple mouse brain regions, cell subtypes, and developmental timepoints. Similarly, they found that “high accuracy single-cell PacBio reads identified novel splice sites, enhancing the GENCODE annotation by 22.1% (40,184 transcripts),” and that “for 75% of genes, full-length isoform expression varies along one or more axes of phenotypic origin, underscoring the pervasiveness of isoform regulation across multiple scales.” And just last week, in the area of cancer research, another preprint, entitled “An isoform-resolution transcriptomic atlas of colorectal cancer from long-read single-cell sequencing,” from researchers in Saudi Arabia and Singapore, described “the first isoform-resolution CRC [colorectal cancer] transcriptomic atlas”, identifying several hundred dysregulated transcript structures in tumor cells, including nearly three hundred resulting from various combinations of multiple splicing events.
The Iso-Seq method can also enhance single-cell atlas studies in other ways, one notable example being its ability to improve genome annotation. This was highlighted in a recent study titled “Single-cell sequencing unravels the cellular diversity that shapes neuro- and gliogenesis in the fast aging killifish (N. furzeri) brain” from researchers at Leuven, Belgium. Here, the researchers note that “the Iso-Seq method produces full-length transcripts, thus attaining high accuracy with better genome coverage.” This in turn improved the mapping of short RNA-Seq reads “from 45 % to 69 %,” helping to “improve our final genomic annotation file and ensured higher mapping accuracy and coverage for performing Seurat-based single-cell sequencing analysis.” Notably, the Iso-Seq method enabled higher cell atlas resolution: the initial study (version 1 of the preprint) identified 17 cell types, a number that increased to 25 once Iso-Seq data were included.
If you’re interested in building a more comprehensive and accurate cell atlas, HiFi sequencing may be just the solution you’ve been searching for. Our new Revio system and MAS-Seq kit for single-cell studies offer exceptional accuracy and efficiency, and will soon be complemented by the recently announced throughput increases for bulk Iso-Seq applications. To learn more about the possibilities, please don’t hesitate to connect with us.
We look forward to helping you take your research to the next level!