April 1, 2026  |  Featured

Powered by PacBio:
Selected publications from January – March 2026

 

Our latest roundup highlights the diverse ways researchers are using PacBio HiFi sequencing to uncover new insights across rare disease, cancer, and transcriptomics.

The year has started strongly, with studies that reveal cryptic transcript biology in ALS, power new machine learning approaches in cancer, and serve as a foundation for discovery in rare disease research.

We typically share these publications month by month, but with so much going on in 2026 we’re bringing you our first quarterly edition so that you can keep up with the research shaping the year ahead.

Keep reading for a closer look at these featured studies from January, February, and March:

 

Jump to topic:

ALS + transcript biology | Rare disease | PGx | RNA sequencing | Cancer genomics


 

ALS + transcript biology

TDP-43 dysfunction leads to the accumulation of cryptic transposable element-derived exons, crypTEs, in iPSC derived neurons and ALS/FTD patient tissues 

This preprint from NYU, NYGC, and the University of Edinburgh (UK) describes “a novel mechanism by which transposable element dysregulation impacts Amyotrophic Lateral Sclerosis (ALS)”, showing that TDP-43 dysfunction leads to the accumulation of cryptic transposable element-derived exons (crypTEs) in iPSC-derived neurons and ALS/FTD patient tissues.

Key highlights:

  • The Iso-Seq method was used for the “efficient capture of full-length gene mRNAs as well as full-length TE transcripts. In particular, the low error rates of the PacBio Iso-seq platform enabled the resolution of TE-derived transcripts to the specific originating TE locus.”
  • Researchers “identified hundreds of … cryptic gene-TE fusion events as a result of mis-splicing of TE sequences into gene transcripts”, including:
    • “TEs that provide alternate gene promoters/5’UTRs”,
    • “TEs that act as cassette exons inside host gene mRNAs”
    • “TEs that provide alternate transcript 3’ ends.”
  • These cryptic gene-TE fusions are predicted “to induce aberrant expression of ALS relevant genes, nonsense mediated decay (NMD) products, as well as novel peptides from gene-TE fusions within the gene coding sequence.”
  • Combined Iso-Seq and single-nucleus RNA-seq data from postmortem ALS tissues “further verified that many of these crypTE transcripts are enriched in frontal cortex samples from ALS donors with cognitive involvement (ALSci) and associated with altered expression of those genes in deep layer cortical excitatory neurons.”

 

Conclusion:

Leading neurological research centers around the world, such as NYGC, are using Iso-Seq and Kinnex to uncover disease-relevant RNA transcripts that short-read methods miss. These tools give researchers clearer insight into the biology of ALS and help drive the discovery of new disease mechanisms, biomarkers, and therapeutic targets.

 


 

Rare disease

Clinical long-read genome sequencing for rare disease diagnostics 

This preprint from Radboud (Netherlands) highlights the potential for PacBio WGS “as a feasible and effective first-tier test for rare disease diagnostics.”

Key highlights:

HiFi WGS (30x) of 1,000 clinical samples (832 index cases, 84 trios) was compared against standard-of-care testing (9 cytogenetic/molecular tests):

  • “overall concordance was 96.4% … improved or refined genetic diagnoses in 3.4% of cases”.
  • Only one case was not detected with PacBio: a low-frequency (4%) somatic mutation.
  • A second case was initially missed at only half the target coverage but was confidently detected at subsequent nominal coverage (Fig. S2C, page 13).
  • Modeling the implementation of HiFi WGS as a first-tier test showed that it:
    • “would improve diagnostic outcomes in 521/15,150 index cases (3.4%)”
    • would “significantly increase the conclusive diagnostic yield from 16.4% to 18.9%”.
  • The study also found: “Reducing coverage to 20× would result in only a marginal loss of clinically relevant findings, given the 99.6% recall rate for SNVs/InDels”.
  • Importantly, “lrGS data can be retained for future reinterpretation, offering a superior substrate compared with srGS because of its greater completeness and near haplotype-resolved genome representation.”
  • Researchers concluded: “our data support a more comprehensive assessment of individual clinical genomes and the feasibility of large-scale deployment in health care systems”.
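
For readers who want to sanity-check the modeled figures quoted above, here is a minimal arithmetic sketch. It uses only the numbers cited in this roundup; the variable names are illustrative and are not taken from the preprint, and the relative-gain calculation is our own framing rather than a figure reported by the authors.

```python
# Figures as quoted in the roundup; variable names are illustrative only.
improved_cases = 521        # index cases with a modeled improved outcome
index_cases = 15_150        # total index cases in the modeled cohort
yield_before = 16.4         # conclusive diagnostic yield, percent
yield_after = 18.9          # modeled yield with HiFi WGS first-tier, percent

# Share of index cases with an improved outcome, in percent.
share_improved = 100 * improved_cases / index_cases

# Gain in conclusive yield, in percentage points and relative to baseline.
absolute_gain = yield_after - yield_before
relative_gain = 100 * absolute_gain / yield_before

print(f"Improved outcomes: {share_improved:.1f}% of index cases")
print(f"Yield gain: {absolute_gain:.1f} percentage points")
print(f"Relative yield gain: {relative_gain:.1f}%")
```

The share of improved cases rounds to the 3.4% quoted above, and the change in conclusive yield works out to 2.5 percentage points, or roughly a 15% relative increase over the baseline.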

 

Conclusion:

By consolidating multiple assays into a single, comprehensive approach, researchers can potentially improve diagnostic yield while retaining complete genomic data for future reinterpretation.

 

HiFi long-read RNA sequencing enhances clinical diagnostics in rare disorders

In this study, researchers from U Southampton, Salisbury Dist Hosp, and U Hosp Birmingham NHS (UK); PacBio; and UMC Rotterdam and GenomeScan (Netherlands) found that long-read sequencing “enhances detection and interpretation of clinically relevant splicing events, supporting its integration into diagnostic workflows for rare diseases”.

Key highlights:

Splice-disrupting variants are a major contributor to disease, yet often missed:

  • “Splice-disrupting variants are estimated to account for one-third of disease-causing variants, yet many remain underrepresented in clinical databases due to limitations in detecting splicing changes beyond canonical splice sites”

Using PacBio Kinnex (n=25), researchers demonstrated improved detection and interpretation:

  • “captured 21 confirmed known events, and revealed additional transcript-level effects in eight cases”
  • including “intron retention, multiple exon skipping, leaky splicing, variant phasing, and isoform switching”

Compared to short-read RNA-seq, long-read sequencing enables:

  • “better detect intron retention events with less noise and events that span multiple exons”
  • “full-length transcript detection, quantification of transcript diversity, and allele-specific expression quantification”

The study also highlights gaps in existing annotations:

  • “at least 10% of detected transcripts were novel, not in the catalogue”
  • and “relying solely on SR RNA-seq quantification of specific isoforms can be misleading”

 

Conclusion:

Many rare disease cases are caused by complex splicing defects, but short-read RNA-seq often misses these events, leaving important causes of disease a mystery. HiFi full-length RNA sequencing allows detection of complex splicing changes and transcript effects, including novel isoforms. This more complete view of transcript biology can help diagnose rare disease cases that were previously unsolved.

 


 

PGx

Pharmacokinetic recall study of Estonian Biobank participants with novel genetic variants in CYP2C19 and CYP2D6

A study from researchers in Estonia, Norway, Sweden, Germany, and China “elucidates the functional consequences of rare and structurally complex variants in CYP2C19 and CYP2D6.”

PacBio sequencing was used to characterize these variants at high resolution across 114 participants:

  • PacBio “enabled high-resolution star allele calling” for 114 Estonian Biobank participants, identifying novel variants and new carriers, and “offers enhanced resolution for more complex structural variants, like CYP2D6-2D7 hybrids and phased additional copies of CYP2D6”.

To understand the clinical impact of these variants, the study incorporated in vivo phenotyping:

  • In vivo phenotyping “to evaluate the functional impact of rare or novel single-nucleotide and structural variants in the CYP2C19 and CYP2D6 genes using omeprazole and metoprolol as respective probe drugs”
  • “First in vivo confirmation that partial gene and intragenic deletions in CYP2C19 …, enriched in Estonians and Finns, are associated with poor metaboliser phenotypes”

The authors highlight the broader implication for pharmacogenomics:

  • “Our findings emphasise the importance of identifying genetic variants in CYP2C19 and CYP2D6 beyond commonly assessed star alleles”
  • “Long-read technologies, increasingly cost-effective, are vital for characterizing such variants”

 

Conclusion:

This study highlights the importance of looking beyond commonly assessed variants to fully understand drug response. HiFi sequencing enables highly accurate characterization of complex and rare variants, providing a more complete view of pharmacogenomic profiles that can better inform precision medicine approaches.

 


 

RNA sequencing

Systematic evaluation of long- and short-read RNA-seq for human peripheral blood

This study from researchers in Japan provides the “first … comparison of long-read, short-read, and microarray-based transcriptome profiling using identical RNA samples from peripheral blood mononuclear cells”.

Key highlights:

Despite about a 72-fold difference in read depth, Iso-Seq demonstrated strong advantages in transcript discovery and resolution:

  • “outperformed short-read sequencing in detecting complex alternative splicing events, novel transcript isoforms, and full-length immune receptor sequences”
  • “more comprehensive and consistent identification of gene isoforms, lncRNAs, and fusion transcripts”
  • “additional microRNA variants and complex isoforms, underscoring its advantages for non-coding RNA analyses”
  • “more precise reconstruction of full-length V(D)J recombination sequences and CDR3 motifs, particularly within immunoglobulin heavy chains, thereby enhancing clonotype resolution”
  • “more novel spliced transcripts from known gene loci, and those transcripts exhibited higher completeness, indicating greater structural fidelity”

Iso-Seq also showed strong agreement with conventional approaches:

  • “demonstrated strong concordance with short-read data in both absolute expression levels and gene ranking”
  • providing “reproducible expression profiles comparable to conventional methods”

Importantly, long-read-specific findings captured biology not seen with short reads:

  • “long-read-specific transcripts were associated with diverse and biologically significant processes, including innate immune responses, cell cycle regulation, and autophagy – categories that were absent from the short-read-specific gene sets”

The study notes remaining differences:

  • “Short-read sequencing retained superior quantification accuracy for highly expressed genes and stronger concordance with microarray data”
  • while also highlighting that “microarray measurements rely on short oligonucleotide probes designed from established gene annotations, most of which were originally derived from short-read sequencing data”

 

Conclusion:

Full-length RNA sequencing at scale with Kinnex kits on Revio and Vega platforms offers complete transcript reads that outperform partial reads. Just as short-read RNA-seq replaced microarrays, this study shows that Iso-Seq is the next evolution in delivering a more complete and biologically meaningful view of the human transcriptome.

 


 

Cancer genomics

Genome-wide classification of tumor-derived reads from bulk long-read sequencing 

This preprint from UCLA describes “a major step forward in the analysis of bulk tumors with long-reads,” enabling accurate and sensitive identification of reads with specific cell types of origin genome-wide.

Key highlights:

Because tumor sequencing is confounded by non-cancerous cells, the study leveraged long-read methylation signal:

  • A single 20 kb read can capture the methylation status of an average of 190 CpG sites in the human genome, compared to just 0-5 CpG sites in 300 bp Illumina paired-end reads
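
To make the per-read CpG comparison concrete, here is a small, hedged sketch. The `count_cpg` helper and the toy sequence are hypothetical illustrations, not part of ROCIT or any PacBio tool; the only figures taken from the text are the 190 CpG sites per 20 kb read and the 0-5 sites per 300 bp short read.

```python
def count_cpg(sequence: str) -> int:
    """Count CpG dinucleotides (a C immediately followed by a G) in a read."""
    # "CG" cannot overlap with itself, so a plain substring count is sufficient.
    return sequence.upper().count("CG")


# Toy sequence for illustration only; real reads are far longer.
toy_read = "ACGTCGCG"
toy_cpgs = count_cpg(toy_read)

# Figures quoted in the roundup.
long_read_cpgs = 190          # average CpG sites in one 20 kb read
long_read_length_bp = 20_000
short_read_max_cpgs = 5       # upper end of the 0-5 range for 300 bp reads

# Average CpG density of the long read, per kilobase.
cpgs_per_kb = long_read_cpgs * 1000 / long_read_length_bp

# How many times more CpG sites one long read carries than the best-case short read.
fold_more_sites = long_read_cpgs / short_read_max_cpgs

print(f"Toy read CpG count: {toy_cpgs}")
print(f"Long-read CpG density: {cpgs_per_kb} per kb")
print(f"CpG sites per read vs. best-case short read: {fold_more_sites:.0f}x")
```

Even against the upper end of the short-read range, a single long read carries dozens of times more methylation observations, which is the kind of per-read signal a read-level classifier can draw on.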

Using this information, researchers developed ROCIT (Read Origin Classification In Tumors), a machine learning model that:

  • “can correctly identify the majority of individual tumor-derived reads from bulk biopsies across the entire genome”
  • “uses read-level methylation patterns to accurately classify reads from anywhere in the genome without requiring the adjacent normal tissue or the explicit identification of tumor differentially methylated regions”
  • “demonstrate high classification accuracy across the entire genome”
  • improves detection of tumor somatic mutations

 

Conclusion:

By leveraging rich, long-read methylation information through deep bulk HiFi sequencing, researchers developed a machine learning model that accurately classifies tumor-derived reads across the entire genome without requiring paired normal tissue or predefined tumor-specific regions. This study demonstrates how HiFi sequencing generates comprehensive, high-quality data that can power more effective AI-driven cancer genomics.

 


 

Ready to make discoveries of your own?

These studies show that when researchers can access more complete data – across sequence, structure, methylation, and full-length transcripts – they can ask better questions and get clearer answers.

HiFi long-read sequencing is helping researchers move past fragmented views of the genome with more comprehensive, biologically meaningful information.

We’ll be back next quarter with more. Until then, explore the studies above and see how others are applying HiFi sequencing across a growing range of research areas.

Ready to see how you can use HiFi sequencing for your next project? Let’s get started.
