April 18, 2024  |  Cancer research

Long-read sequencing myths: debunked.
Part 3 — cancer genomics


 


The complexities of cancer biology are famously tricky to decipher. Peering into the murky cancer genome and making sense of tumor mechanisms has long been the goal of researchers everywhere, and technology has often felt one step behind. But the genomics landscape is evolving quickly. Now, researchers no longer need to be limited by the sequencing technology of yesterday.

Today, cancer researchers are moving away from using only short-read sequencing and instead harnessing the power of long reads to see into the cancer genome, transcriptome, and epigenome to help characterize tumor biology. PacBio HiFi reads provide comprehensive discovery power that can help reveal differences in drug response, identify novel biomarkers and drug targets, uncover RNA isoforms, detect fusion genes, and more.

This is part three of our six-part myth-busting series, where we dispel common myths and misconceptions about PacBio sequencing in cancer research applications. (New to the series? Check out Part 1 – HiFi sequencing and Part 2 – human genomics.)

Keep reading to find out the truth of how you can demystify cancer biology more than ever before with the power of HiFi long reads!


Myth #1:

Short-read RNA-seq captures all transcriptomic dysregulation in cancer.

Fact:

This statement is shortsighted.


Scientists are well aware that accuracy is crucial for all fields of science. Sensitivity and specificity are arguably even more important in cancer research, when variants occur at lower allele frequencies compared to inherited diseases and are thus harder to detect.
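To make the allele-frequency point concrete, here is a minimal sketch that treats the number of variant-supporting reads at a site as a binomial draw. The 30x depth, the 5% and 50% variant allele fractions, and the three-read threshold are illustrative assumptions rather than figures from a specific study, and sequencing error is ignored entirely.

```python
from math import comb

def prob_at_least_k_alt_reads(depth: int, vaf: float, k: int) -> float:
    """P(X >= k) for X ~ Binomial(depth, vaf), ignoring sequencing error."""
    p_fewer = sum(
        comb(depth, i) * vaf**i * (1 - vaf) ** (depth - i) for i in range(k)
    )
    return 1.0 - p_fewer

# Illustrative values only: 30x depth, at least 3 supporting reads required.
germline_het = prob_at_least_k_alt_reads(depth=30, vaf=0.50, k=3)
somatic_low = prob_at_least_k_alt_reads(depth=30, vaf=0.05, k=3)

print(f"heterozygous germline variant: {germline_het:.4f}")
print(f"somatic variant at 5% VAF:     {somatic_low:.4f}")
```

Under these assumptions the germline variant is seen in at least three reads almost every time, while the 5% somatic variant clears the same bar in fewer than one site in five. With so few supporting reads, a single read-level error carries much more weight.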

For researchers who want to end cancer as we know it, accuracy matters. PacBio HiFi long reads have been shown to detect more variants more accurately in both whole-genome sequencing and RNA sequencing in cancer applications, compared to other long-read sequencing technologies that are noisy and error-prone.

Short-read RNA-seq is limited: you only see parts, not the whole. It breaks transcripts into many small cDNA fragments, typically 100 to 200 base pairs long, which makes it hard to work out how the fragments fit together, especially for isoforms, the different transcript variants of a single gene. Without seeing the entire sequence from start to finish, it’s like trying to guess the plot of a movie from a handful of out-of-order scenes: you can miss critical details, such as alternative splicing or subtle variations that can have big impacts on health.

Short-read data therefore requires additional computational methods after sequencing to infer the original transcript isoforms, and accurate inference is often impossible. Long-read sequencing instead reads full-length sequences, giving a more complete narrative from the outset without complex post-sequencing reconstruction. The PacBio Kinnex full-length RNA kits provide this more complete story from the start.

Kinnex eliminates the need for transcript assembly by sequencing full-length cDNAs. By sequencing full-length RNA from both the 5’ and 3’ ends, Kinnex accurately characterizes splice sites, can detect novel genes and isoforms, and obtains isoform read count information that can be used to characterize cancer-driving genomic alterations.

Two images showing how with short-read sequencing transcript assembly is required vs. long-read sequencing where no assembly is required
Long-read RNA sequencing eliminates the need for transcript assembly, which cannot accurately resolve the isoform structure. Long-read RNA-seq using PacBio (the Iso-Seq method) sequences the entire full-length cDNA to provide an unambiguous view of the transcriptome.
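The ambiguity behind transcript assembly can be shown with a toy example. The gene, exon names, and isoform sets below are hypothetical, each isoform is assumed to be equally abundant, and a short read is assumed to cover at most one exon-exon junction (the two optional exons X and Y sit too far apart for one short read to span both).

```python
from collections import Counter

def junctions(isoform):
    """Return the exon-exon junctions of one isoform as adjacent exon pairs."""
    return list(zip(isoform, isoform[1:]))

def junction_counts(isoforms):
    """Pool junction evidence across isoforms, as single-junction short reads would."""
    counts = Counter()
    for iso in isoforms:
        counts.update(junctions(iso))
    return counts

# Hypothetical gene: exons X and Y are optional and separated by a long exon E2.
set_a = [("E1", "X", "E2", "Y", "E3"), ("E1", "E2", "E3")]
set_b = [("E1", "X", "E2", "E3"), ("E1", "E2", "Y", "E3")]

same_short_read_evidence = junction_counts(set_a) == junction_counts(set_b)
same_full_length_isoforms = set(set_a) == set(set_b)

print(same_short_read_evidence, same_full_length_isoforms)  # prints: True False
```

Both isoform sets produce identical junction counts, so junction-level evidence alone cannot say which pair of transcripts is actually present. A read that spans E1 through E3 settles the question directly.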

Myth #2:

PacBio is more expensive than nanopore sequencing for cancer genomics research applications.

Fact:

This statement is false.


PacBio HiFi long reads outperform nanopore long reads, enabling you to get more from your sequencing data while doing less sequencing overall.

For whole genome sequencing in cancer research, HiFi sequencing on the Revio system can achieve better small variant detection – SNVs and indels – compared to nanopore sequencing, even at 2.5x less sequencing depth. Lower read depth means spending less time and money on sequencing, while still getting to answers faster.

chart showing HiFi accuracy at lower depth for tumor sequencing
In paired tumor/normal sequencing of the breast cancer cell line HCC1395 [1], HiFi sequencing on the Revio system yielded greater somatic variant calling accuracy with ClairS for SNVs and indels than Oxford Nanopore Technologies, even at 2.5x less sequencing depth.

 
And it’s not just read depth that saves time and money. Sequencing data needs to be accurate to make sure you’re getting the full picture. With nanopore long-read sequencing, there’s a bit of a hiccup because it can be error-prone and faces challenges in the detection of small variants, such as single nucleotide variants (SNVs) [2]. In contrast, the high accuracy of HiFi sequencing delivers a more complete interrogation of cancer genomes, including the detection of both small and complex structural variants. This is particularly important in cancers where allele frequencies are much lower.

The same holds for transcriptome analysis: PacBio Kinnex full-length RNA sequencing offers long read lengths and high accuracy for more successful whole transcriptome analysis. This stands out over nanopore, giving you the ability to:

• Uncover both rare and extended isoforms that might otherwise remain undetected.
• Accurately interrogate all RNA variants including isoforms, fusions, and expressed mutations, which are crucial for understanding transcriptomic dysregulation in cancer.
• Produce a greater yield of usable, high-quality reads, essential for reliable downstream analysis and interpretation.


Myth #3:

HiFi sequencing instruments are too low-throughput for somatic variant detection.

Fact:

This statement is outdated.


The increased throughput of the Revio system, combined with the accuracy of HiFi reads, now makes somatic variant detection possible at scale, without the need to compare data to traditional short-read sequencing.

Because somatic variants are present at lower frequencies than germline mutations, detecting them has long been a challenge. Increasingly, the exceptional accuracy and long read lengths of PacBio HiFi sequencing are being used to detect complex variants in cancer that are often missed by short reads or less accurate long reads [3, 4], now with a throughput of 360 Gb/day to support the depth needed to detect rare variants. Each SMRT Cell can deliver 30X whole-genome coverage, 8 full-length RNA samples at 5M reads each, or 1 single-cell RNA sample.

You can sequence paired tumor/normal whole genomes using highly accurate HiFi long reads to detect somatic small variants, structural variants, and methylation, all in a single sequencing run. See the full workflow in this application note.
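As a back-of-the-envelope check on what those throughput figures mean for a tumor/normal study, the sketch below combines the numbers quoted above with two assumptions that are not from this post: a human genome of roughly 3.1 Gb, and 30X coverage for both the tumor and the normal sample.

```python
# Figures quoted in the text above
THROUGHPUT_GB_PER_DAY = 360
TARGET_COVERAGE = 30
RNA_SAMPLES_PER_CELL = 8
READS_PER_RNA_SAMPLE = 5_000_000

# Assumption (not from the text): a human genome of roughly 3.1 Gb
HUMAN_GENOME_GB = 3.1

# Data needed for one genome at the target coverage
gb_per_genome = TARGET_COVERAGE * HUMAN_GENOME_GB

# How many such genomes the quoted daily throughput could cover
genomes_per_day = THROUGHPUT_GB_PER_DAY / gb_per_genome

# Assumption: tumor and normal are each sequenced to the same coverage
tumor_normal_pairs_per_day = genomes_per_day / 2

# Total full-length RNA reads on one SMRT Cell at the quoted multiplexing
rna_reads_per_cell = RNA_SAMPLES_PER_CELL * READS_PER_RNA_SAMPLE

print(f"{gb_per_genome:.0f} Gb per 30X genome")
print(f"{genomes_per_day:.1f} genomes/day, {tumor_normal_pairs_per_day:.1f} pairs/day")
print(f"{rna_reads_per_cell:,} RNA reads per SMRT Cell")
```

Actual yields vary by library and run, so treat the result as an order-of-magnitude planning estimate rather than a specification.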


Myth #4:

HiFi sequencing has complicated bioinformatics and lacks clear protocols.

Fact:

This statement is inaccurate.


Streamlined, validated workflows that span sample prep through analysis are now available for cancer whole genome sequencing and RNA sequencing research applications. The Revio system provides on-board 5mC methylation calling, and third-party bioinformatic callers are available for detecting somatic small and structural variants.

For DNA/WGS bioinformatics:

The HiFi somatic WDL is a tumor-normal variant calling pipeline that consolidates variant callers for small and large variants into a single workflow. Together these tools identify and phase single nucleotide variants, small insertions and deletions, copy number variants, structural variants, and methylation.

For RNA bioinformatics:

The SMRT Link Read Segmentation and Single-cell Iso-Seq workflows process the HiFi reads generated from the Kinnex full-length RNA library to produce classified isoforms with read counts that are compatible with tertiary analysis tools. Isoforms are classified against a genome annotation using the Pigeon tool to identify known and novel genes/isoforms.
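To give a feel for what tertiary analysis on classified isoforms with read counts can look like, here is a minimal sketch that turns per-isoform counts into per-gene isoform fractions and lists the novel isoforms. The record layout, gene and isoform names, and the "known"/"novel" labels are all hypothetical simplifications for illustration; they are not the actual SMRT Link or Pigeon output format.

```python
from collections import defaultdict

# Hypothetical, simplified records: (gene, isoform_id, category, read_count)
records = [
    ("GENE_A", "A.1", "known", 120),
    ("GENE_A", "A.2", "novel", 40),
    ("GENE_B", "B.1", "known", 75),
    ("GENE_B", "B.2", "known", 25),
]

def isoform_fractions(rows):
    """Return {gene: {isoform_id: fraction of that gene's reads}}."""
    totals = defaultdict(int)
    for gene, _, _, count in rows:
        totals[gene] += count
    fractions = defaultdict(dict)
    for gene, isoform_id, _, count in rows:
        fractions[gene][isoform_id] = count / totals[gene]
    return {gene: dict(per_iso) for gene, per_iso in fractions.items()}

def novel_isoforms(rows):
    """Return the IDs of isoforms labeled 'novel' in the hypothetical records."""
    return [iso_id for _, iso_id, category, _ in rows if category == "novel"]

fractions = isoform_fractions(records)
novel = novel_isoforms(records)

for gene, per_iso in fractions.items():
    print(gene, {iso: round(frac, 2) for iso, frac in per_iso.items()})
print("novel isoforms:", novel)
```

Comparing such fractions between tumor and normal samples is one simple way to spot isoform switching, though a real analysis would use a dedicated differential-usage tool with proper statistics.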

Once you’ve generated SMRT sequencing data, you can easily map your HiFi reads to a reference genome in SMRT Link.

Our analysis partners offer additional SMRT-compatible solutions for variant calling, including:

DeepSomatic — a high-accuracy variant caller for both SNVs and indels at 30x tumor/30x normal coverage


Myth #5:

PacBio long-read sequencing is only useful for DNA sequencing applications and doesn’t meet the needs of modern cancer genomics research.

Fact:

This statement is outdated.


A growing body of literature demonstrates that complex structural variants, methylation patterns, transcriptome changes, and more are frequently at play in cancer. That reality means that a multi-omic view is no longer just “nice to have,” but a necessity. Multi-application flexibility with HiFi sequencing on the Revio system means you can get DNA, RNA, and methylation all from a single run, offering a complete multi-omic view of cancer biology.

Whether you are looking for complex structural variants or transcriptomic changes, HiFi sequencing is tailored for modern cancer genomics research. HiFi genomes excel in detecting a diverse array of somatic mutations with high accuracy, requiring significantly less sequencing than comparable technologies like nanopore sequencing. This includes single nucleotide variants, structural variations, and methylation patterns, all critical for a nuanced understanding of tumor DNA.

Similarly, for transcriptome profiling, the Kinnex kits and Iso-Seq method bring precision to RNA analysis, spanning entire transcripts to reveal RNA isoforms, fusions, and expressed mutations with exceptional accuracy. This provides an expansive view of the transcriptome, vital for identifying cancer progression markers and potential therapeutic targets. Together, these methods form a robust, end-to-end workflow for somatic variant detection and full transcriptome analysis, showcasing the PacBio capacity to address the multifaceted challenges of cancer genomics beyond just genome sequencing.


Tomorrow’s genomic discoveries start today

 
In the genomic landscape of today, accuracy matters. The need to better decipher cancer genomics and somatic variation, including in specific scenarios such as circulating tumor DNA detection, highlights why access to exceptionally accurate sequencing technology is essential. Now, researchers around the globe are choosing ultra-accurate HiFi sequencing to break through the noise of variation in cancer to get to the truth. With the insights from HiFi long-read sequencing, researchers at the forefront of this battle are driving towards a day when we can stop cancer in its tracks.

Stay tuned for part four in our six-part series, where we disprove common myths about long reads in plant and animal genomics. Let the myth-busting continue!
 

 

References


  1. ONT data sequenced on an R10.4.1 PromethION flow cell, as presented in Zheng et al. (2023). ClairS: a deep-learning method for long-read somatic small variant calling. bioRxiv. https://doi.org/10.1101/2023.08.17.553778
  2. Olson, N. D., et al. (2022). PrecisionFDA Truth Challenge V2: Calling variants from short and long reads in difficult-to-map regions. Cell Genomics, 2(5). https://doi.org/10.1016/j.xgen.2022.100129
  3. Garg, S. (2023). Towards routine chromosome-scale haplotype-resolved reconstruction in cancer genomics. Nature Communications, 14(1). https://doi.org/10.1038/s41467-023-36689-5
  4. Keskus, A., et al. (2024). Severus: accurate detection and characterization of somatic structural variation in tumor genomes using long reads. medRxiv. https://doi.org/10.1101/2024.03.22.24304756
