PacBio HiFi sequencing technology continues to play an increasingly pivotal role in advancing critical research across the life sciences. In this blog series, we explore some of the latest and most exciting scientific papers and preprints that demonstrate the power of HiFi sequencing in unraveling new insights in areas as diverse as rare disease research, neurobiology, epigenetics, and population genetics.
Rare disease/Iso-Seq method
University College London researchers used PacBio long-read RNA sequencing datasets from the ENCODE database to comprehensively study the full range of transcripts from the CLN3 gene. CLN3 is associated with Batten disease, a group of inherited nervous system disorders that begin early in development. The team evaluated full-length CLN3 transcripts across various tissues and cell types in human control samples.
- The researchers found more than 100 novel CLN3 transcripts and observed no dominantly expressed CLN3 transcript.
- 48 CLN3 open reading frames (ORFs) were observed, 26 of which are new to science.
- Identical ORFs were found with alternative untranslated regions (UTRs).
- The authors noted that “around one-third of CLN3 transcripts encode protein isoforms with different stretches of amino acids.”
A team of scientists based in Germany and Switzerland used HiFi sequencing to present a “new avenue for assessing gene function in cell fate commitment” by examining the variation of inferred protein structures that arise from alternative splicing (AS). This represents a significant departure from the standard approach of measuring gene expression alone.
- “In the context of mammalian brain development, … little attention was paid to the fact that different transcripts can arise from any given gene through alternative splicing (AS).”
- Leveraging PacBio Iso-Seq, the authors were able to reconstruct cell type-specific transcriptome diversity during brain development and assess AS events quantitatively.
- The authors “describe nearly 50,000 new transcripts including novel exons, splice sites and/or microexons, thus, uncovering the full spectrum of splicing dynamics accompanying [cell] fate transitions.”
- The team observed a “profound remodeling of the transcriptional profile of specific cortical cell types.”
- The “biological significance of AS on protein structure” was computationally inferred using AlphaFold2. Using this approach:
- “…nearly 40% of isoform pairs originating from the same gene exhibited large global conformational changes including fold switches.”
- The authors noted “the occurrence of regions with identical sequence yet adopting profoundly different secondary structures… depending on distant AS events, thus, revealing that even negligible changes in exon usage can induce large conformational changes influencing the functional properties of proteins.”
- This “study reveals that AS has a greater potential to impact protein diversity and function than previously thought independently from changes in gene expression.”
Researchers from multiple institutions in San Francisco describe RASAM (Replication-Aware Single-molecule Accessibility Mapping), a method for mapping the structure of nascent chromatin fibers immediately following replication. HiFi sequencing simultaneously detects labeled regions of open chromatin (using m6A) and sites of recent replication (using BrdU). Using RASAM, the team observed hyperaccessibility of newly replicated chromatin and discovered different modes of its resolution, pointing to a “unique organization of newly replicated chromatin that must be reset by active processes, providing a substrate for epigenetic reprogramming.”
- Previous approaches “suffer from technical limitations of Illumina sequencing.” They “fail to capture the connectivity of protein-DNA interactions on individual molecules,” are challenged by ‘unmappable’ repetitive regions, have library prep limitations that “potentially underestimate DNA accessibility,” and “require population-averaging of short-read sequencing data which has led to diverging biological interpretations.”
- “PacBio sequencing can coincidentally measure BrdU-incorporation and chromatin accessibility on long individual DNA molecules purified from mammalian cells.”
- “Our pilot datasets provide the highest-resolution views to date of individual nucleosomes and TFs on nascent chromatin fibers at genome scale, including within ‘unmappable’ repetitive regions. Given the broad utility of nucleotide-analog-labeling, we anticipate the application of RASAM to myriad topics.”
US National Institutes of Health scientists recently generated HiFi genome assemblies for two Indigenous American individuals (one female and one male) from the state of Arizona.
- “Each assembly included ~17 Mb of DNA sequence not present (non-reference sequence; NRS) in hg38”
- The team generated a modified hg38-NRS reference genome and, using it, found ~50,000 SNVs that were present in at least 5% of a cohort of ~400 NGS whole genome sequencing samples but were not detected with the standard hg38 reference.
- This study shows that the “inclusion of population-specific NRSs can dramatically change the variant profile in an under-represented ethnic groups[sic] and thereby lead to the discovery of previously missed common variations.”
Ready to kickstart breakthroughs of your own?
These recent publications showcase the versatility and power of PacBio sequencing technology to help scientists around the world advance research across multiple domains. From rare disease research and neurobiology to epigenetics and population genetics, PacBio long-read sequencing continues to facilitate discoveries that shape our understanding of complex biological processes.
Be it through a purchase, financing, or a certified service provider, PacBio sequencing is now more accessible than ever for research teams great and small. To find out how to generate PacBio data for your project:
Connect with a PacBio scientist