Scientific posters

AACR 2025  |  2025

Comprehensive, multi-omic detection of somatic variants from the GIAB HG008 matched tumor-normal pair

Alex Sockell1, Khi Pin Chua1, Christine Lambert1, Matt Boitano1, Melanie Wescott1, Ian J McLaughlin1, Primo Baybayan1, Jennifer McDaniel2, Justin Zook2, Aaron M Wenger1 1 PacBio, 1305 O’Brien Drive, Menlo Park, CA 94025, USA, 2 Material Measurement Laboratory, National Institute of Standards and Technology, 100 Bureau Dr., Gaithersburg, MD 20899, USA

Here we apply PacBio HiFi whole-genome sequencing to the newly described HG008 matched tumor-normal pair from the Genome in a Bottle (GIAB) consortium.1 This reference sample includes an adherent, epithelial-like pancreatic adenocarcinoma (PDAC) cell line as the tumor material, with the matched normal obtained from adjacent duodenal and pancreatic tissue. We sequence the tumor cell line and the matched pancreatic normal tissue, yielding a more robust and comprehensive picture of somatic variation in this reference sample and contributing to the development of this novel benchmark.
AACR 2025  |  2025

Precise characterization of complex repeat regions in cancer genomes

Khi Pin Chua1,2, Egor Dolzhenko1, Tom Mokveld1, Zev N Kronenberg1, Seiya Imoto2, Seiichi Mori3, Michael A Eberle1 1. Pacific Biosciences of California, Menlo Park, CA, USA 2. Institute of Medical Sciences, University of Tokyo, Japan 3. Japan Foundation for Cancer Research, Japan

The characterization of somatic variation, especially in complex genomic regions, is crucial for understanding the molecular drivers of cancer progression. Accurate PacBio long-read sequencing (HiFi) enables detection of all variant classes, from simple SNVs and INDELs up to complex structural variation, tandem repeats, and changes in epigenetic signatures. Complex and repetitive regions, while fully sequenced by HiFi reads, remain bioinformatically challenging, requiring tailored solutions. Here, we describe new tools to genotype understudied repetitive regions in cancer genomes, a task that has historically posed significant challenges for short-read sequencing.
ACMG 2025  |  2025

Extracting HMW DNA from saliva for HiFi sequencing applications

Sarah Kingan, Deborah Moine, Nina Gonzaludo, Shreyasee Chakraborty, Heather Ferrao, Kristina Weber, Christina Dillane*, Mike Tayeb*, Duncan Kilburn, PacBio, 1305 O’Brien Drive, Menlo Park, CA 94025, *DNA Genotek Inc., Ottawa, ON, Canada

In this proof-of-concept study, we demonstrate that high-quality PacBio HiFi sequencing results can be obtained from saliva collected in DNA Genotek Oragene devices, with DNA extracted using the Nanobind PanDNA or CBB kits.
ACMG 2025  |  2025

StarPhase: Leveraging Long-Read Sequencing to Update Pharmacogenomic Benchmarks

J. Matthew Holt, John Harting, Xiao Chen, Daniel Baker, Nina Gonzaludo, Zev Kronenberg, Christopher T. Saunders, Michael A. Eberle PacBio, 1305 O’Brien Drive, Menlo Park, CA 94025

StarPhase is a long-read pharmacogenomic diplotyper that provides highly accurate diplotype results from long-read observations, yields refined PGx diplotypes for GeT-RM benchmark samples, and generates full-length haplotype sequences and visualizations for complex pharmacogenes.
ACMG 2025  |  2025

Targeted long-read sequencing of native DNA for genetic disease diagnostic and screening research

Jocelyne Bruand, Sarah B. Kingan, Jeff Zhou, Davy Lee, Heather Ferrao, Ian McLaughlin, Sijie Wei, Richa Pathak, Ravi Dalal, Tom Mokveld, Guilherme De Sena Brandine, Egor Dolzhenko, Nat Echols, Michael A. Eberle, Duncan Kilburn PacBio, 1305 O’Brien Drive, Menlo Park, CA 94025

Short tandem repeats (STRs) are DNA sequences composed of repetitions of 2–6 bp motifs. Expansions of STRs are the cause of over 60 monogenic diseases, including Huntington’s disease, fragile X syndrome, and amyotrophic lateral sclerosis1. In addition to their length, the pathogenicity of these STRs can be affected by sequence composition, methylation status, and mosaicism. One example is the FMR1 repeat, whose CGG repeat expansions are typically hypermethylated and where AGG interruption sequences can stabilize the repeat. Detecting all the characteristics associated with pathogenic repeat expansions has traditionally required multiple assays; however, high-accuracy long-read sequencing of unamplified DNA can resolve all these features in a single assay.
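
As a rough illustration of why sequence composition matters alongside repeat length, the following Python sketch splits a CGG repeat tract into 3 bp units and reports where AGG interruptions fall. The function name and the toy sequence are hypothetical; this is not taken from the poster or from any PacBio tool.

```python
def summarize_cgg_tract(tract: str) -> dict:
    """Split a repeat tract into 3 bp units and summarize CGG/AGG composition."""
    if len(tract) % 3 != 0:
        raise ValueError("tract length must be a multiple of 3")
    units = [tract[i:i + 3] for i in range(0, len(tract), 3)]

    # Zero-based unit indices at which an AGG interruption occurs
    agg_positions = [i for i, unit in enumerate(units) if unit == "AGG"]

    # Longest run of consecutive pure CGG units
    longest = current = 0
    for unit in units:
        current = current + 1 if unit == "CGG" else 0
        longest = max(longest, current)

    return {
        "total_units": len(units),
        "agg_positions": agg_positions,
        "longest_pure_cgg_run": longest,
    }


# Toy tract (hypothetical): 30 units with two AGG interruptions
toy_tract = "CGG" * 9 + "AGG" + "CGG" * 9 + "AGG" + "CGG" * 10
summary = summarize_cgg_tract(toy_tract)
print(summary)
```

Two tracts of equal length can therefore differ in their longest uninterrupted CGG run, which a length-only assay would not reveal.
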
PAG 2025  |  2025

High-resolution microbiome species profiling at scale with the Kinnex kit for full-length 16S rRNA sequencing

Jeremy E Wilkinson1, Jocelyne Bruand1, Khi Pin Chua1, Heather Ferrao1, Davy Lee1, Jeff Zhou1, Kristopher Locken2, Shuiquan Tang2, Ethan Thai2, John Sherman2, Brett Farthing2, Elizabeth Tseng1 1 PacBio, 1305 O’Brien Drive, Menlo Park, CA 94025, USA, 2 Zymo Research Corporation, 17062 Murphy Avenue, Irvine, CA 92614, USA

Targeted 16S sequencing is a cost-effective approach for assessing the bacterial composition of microbial communities. This is especially true for low bacterial biomass samples, where amplicon sequencing is the best option. However, the high similarity between the 16S rRNA genes of related bacteria means that sequencing the entirety of the 16S gene (~1.5 kb) with high accuracy is essential for species- or strain-level characterization. Many recent comparative studies have shown that PacBio full-length (FL) 16S sequencing outperforms other sequencing methods for taxonomic resolution and data accuracy.
PAG 2025  |  2025

Long-read metagenome assembly produces hundreds of high-quality MAGs from different soil types

Daniel M. Portik1, Luis E. Valentin-Alvarado2,3, Jillian F. Banfield2, Boyke Bunk4, Jorg Overmann4, and Jeremy E. Wilkinson1 1. PacBio, 1305 O’Brien Dr, Menlo Park, California 94025 USA; 2. Innovative Genomics Institute, University of California, Berkeley, California 94720 USA; 3. Department of Plant and Microbial Ecology, University of California, Berkeley, California USA; 4. Leibniz Institute, DSMZ-German Collection of Microorganisms and Cell Cultures GmbH, Inhoffenstraße 7B, 38124 Braunschweig, Germany

Metagenome assembly of soil has been historically difficult using short reads. The combination of high species diversity and ultra-low relative abundances poses a challenge and requires a higher sequencing depth to achieve success. Here, we demonstrate that the amount of HiFi data from the high-throughput Revio system is sufficient to assemble high-quality MAGs in complex microbiomes such as wetland soil.
PAG 2025  |  2025

Optimised Workflow for HMW DNA Extraction, Sample Preparation and Sequencing for Marine Vertebrates on the PacBio Revio System

Liam Anstiss1, Adrianne Doran1, Lara Parata1, Emma de Jong1, Deborah Moine2, James Miller3, Paul Gooding4, Shannon Corrigan5 and OceanOmics Centre, (1)University of Western Australia, Perth, Western Australia, Australia, (2)PacBio, Menlo Park, CA, (3)PacBio, Adelaide, SA, Australia, (4)Millennium Science, Adelaide, SA, Australia, (5)Minderoo Foundation, Perth, Western Australia, Australia

Here we describe our ocean genome laboratory workflow developed for processing marine vertebrates for HiFi sequencing on the PacBio Revio System. We outline methods optimised for High Molecular Weight (HMW) DNA extraction using the Nanobind PanDNA kit from various tissue types, our strategy for size selection, and the implementation of an automated library preparation method using the Beckman Biomek i7 system.
AMP 2024  |  2024

Improved detection of low frequency mutations in ovarian and endometrial cancers by utilizing a highly accurate sequencing platform

Timothée Revil1, Nairi Pezeshkian2, Dan Nasko2, Lucy Gilbert1, Alexandra Sockell2, Jiannis Ragoussis1 1) McGill University, Quebec, QC, Canada, 2) Pacific Biosciences, Menlo Park, CA

Ovarian and endometrial cancers are the 4th highest (combined) cancer killer of Canadian women. In 2020, over 3000 women were diagnosed with ovarian cancer, of which 75% were in the later stages. The goal of the DOvEEgene (Detecting Ovarian and Endometrial cancer Early using Genomics) project is to detect these cancers as early as the first stage through a low-cost, minimally invasive and widely available test, similar to what the Pap test has done for cervical cancers. In this assay, for each subject, an intra-uterine brush sample is collected along with a saliva sample. The genomic DNA is extracted from both these samples, captured using probes with a total size of 146.46 kb using SureSelect XT HS (see target design), sequenced at 20 million reads to a median DNA fragment depth of at least 80% at 1000x, and deduplicated using UMIs. In parallel, uncaptured libraries are also used for low-pass whole genome sequencing (LP-WGS). Somatic and copy number variants are called, as well as germline variants for 10 genes, and microsatellite instability (MSI) status is determined for known microsatellite loci within the target region. Separately, clinical MSI testing is performed on each sample using a PCR-based assay. Because the ability to detect early-stage cancers relies on high sensitivity and specificity, we were interested in testing the PacBio Onso sequencing by binding (SBB) technology, which promises much higher sequencing quality and better performance in homopolymer regions and could therefore improve variant detection and MSI calling performance.
AMP 2024  |  2024

Improved liquid biopsy assay performance using sequencing by binding (SBB) on the PacBio Onso system

Dan Nasko1, Phillip Pham1, Stuti Joshi1, Kristi Kim1, Nairi Pezeshkian1, Young Kim1, Alexandra Sockell1, and Jonas Korlach1 1) Pacific Biosciences, Menlo Park, CA

Liquid biopsy is revolutionizing the field of early cancer detection research through non-invasive detection of tumor DNA in the blood. However, existing liquid biopsy assays are limited in their sensitivity for ctDNA detection at low variant allele frequencies (VAFs). Here we describe the application of the PacBio Onso short-read sequencing system to help enable detecti
AMP 2024  |  2024

Targeted long-read sequencing of native DNA for comprehensive characterization of repeat expansions

Sarah B Kingan1, Guilherme De Sena Brandine1, Jocelyne Bruand1, Jeff Zhou1, Valeriya Gaysinskaya1, Janet Aiyedun1, Julian Rocha1, Duncan Kilburn1, Egor Dolzhenko1, Zoi Kontogeorgiou2, Anita Szabo3, Christina Zarouchlioti3, Robert Thaenert4, Pilar Alvarez Jerez5, Kimberley Billingsley5, Sonia Lameiras6, Sylvain Baulande6, Alice Davidson3, Georgios Koutsis7, Georgia Karadima2, Stéphanie Tomé8, Michael A Eberle1 1. Pacific Biosciences (PacBio), Menlo Park, United States, 2. National and Kapodistrian University of Athens, 1st Department of Neurology, Athens, Greece, 3. University College London, Institute of Ophthalmology, United Kingdom, 4. Quest Diagnostics, Marlborough, United States, 5. National Institutes of Health, Center for Alzheimer's and Related Dementias, National Institute on Aging, Bethesda, United States, 6. Institut Curie, PSL Research University, ICGex Next-Generation Sequencing Platform, Paris, France, 7. National and Kapodistrian University of Athens, Neurogenetics Unit, 1st Department of Neurology, Eginition Hospital, School of Medicine, Athens, Greece 8. Sorbonne Université, Inserm, Institut de Myologie, Centre de Recherche en myologie, Paris, France

Short tandem repeats (STRs) are DNA sequences composed of repetitions of 1–6 bp motifs. Expansions of STRs are the cause of over 60 monogenic diseases, including Huntington’s disease, fragile X syndrome, and amyotrophic lateral sclerosis1. In addition to their length, the pathogenicity of these STRs can be affected by sequence composition, methylation status, and mosaicism. One example is the FMR1 repeat, whose CGG repeat expansions are typically hypermethylated and where AGG interruption sequences can stabilize the repeat. Detecting all the characteristics associated with pathogenic repeat expansions has traditionally required multiple assays; however, long-read sequencing of unamplified DNA holds the promise to resolve all these features in a single assay.
ASHG 2024  |  2024

Detection of repeat expansions with PureTarget

M. Eberle1, G. De Sena Brandine2, V. Gaysinskaya2, J. Aiyedun2, J. Rocha3, D. Kilburn2, S. Kingan4, E. Dolzhenko2, Z. Kontogeorgiou5, A. Szabo6, C. Zarouchlioti6, R. Thaenert7, P. Alvarez Jerez8, K. Billingsley8, S. Lameiras9, S. Baulande9, A. Davidson10, G. Koutsis5, G. Karadima5, S. Tome11; 1) PacBio, Oceanside, CA, 2) PacBio, Menlo Park, CA, 3) PacBio, Bel Air, MD, 4) PacBio, San Mateo, CA, 5) Natl. and Kapodistrian Univ. of Athens, Athens, Greece, 6) Univ. Coll. London, London, United Kingdom, 7) Quest Diagnostics, Marlborough, MA, 8) NIH, Bethesda, MD, 9) Inst. Curie, Paris, France, 10) UCL, London, United Kingdom, 11) INSERM, Paris, France

Abstract: Short tandem repeats (STRs) are DNA sequences composed of repetitions of 1–6 bp motifs. Expansions of STRs are the cause of over 60 monogenic diseases, including Huntington’s disease, fragile X syndrome, and amyotrophic lateral sclerosis. In addition to their length, the pathogenicity of these STRs is affected by sequence composition, methylation status, and mosaicism. One example is a repeat in an intron of the RFC1 gene, whose reference sequence consists of a short stretch of AAAAGs, while expansions that span hundreds of AAGGGs cause cerebellar ataxia with neuropathy and vestibular areflexia syndrome. Another example is the FMR1 repeat, whose expansions are typically hypermethylated. Detecting all the characteristics associated with pathogenic repeat expansions has traditionally required multiple assays; however, long-read sequencing of unamplified DNA holds the promise to resolve all of the required features in a single assay.

We describe a robust amplification-free protocol to generate long-read HiFi sequencing libraries containing a panel of loci associated with 20 pathogenic STR expansions. The protocol can be multiplexed to sequence 48 samples at up to 1000x coverage per locus in one sequencing run. To assess the accuracy of this protocol, we sequenced 129 samples with validated pathogenic expansions at 20 loci including CNBP, DMPK, RFC1 and C9orf72.

Combined, we tested 2580 sample-expansion combinations, including technical replicates, for expansions between 66 bp and >10 kb. Our assay correctly categorized all (129/129) expansions, including the detection of hypermethylation in the FMR1 expansion and differentiating the pathogenic AAGGG motif in RFC1. We identified additional expansions in FXN, RFC1 and TCF4, consistent with these loci having carrier frequencies between 1:50 and 1:20. Excluding these three genes, we found no unexpected expansions (0/2064) in any sample-locus combination.

We will also present a detailed characterization of lengths, sequence composition, mosaicism, and methylation of normal and expanded alleles in 150 genomes. Most repeats we profiled exhibit high genetic or epigenetic polymorphism and also mosaicism at the expanded size ranges. Motivated by these results, we describe a novel computational approach that will capture all these modalities to robustly differentiate between normal and abnormal variation at known pathogenic or any other repeats in the human genome. In summary, we will present a protocol and a set of computational methods for accurately assessing tissue-level molecular landscapes of various pathogenic STRs, which can be further adapted to other loci in the human genome.

ASHG 2024  |  2024

Sawfish: Improving long-read structural variant discovery and genotyping with local haplotype modeling

Christopher T. Saunders, James M. Holt, Daniel N. Baker, Juniper A. Lake, Jonathan R. Belyeu, Zev Kronenberg, William J. Rowell, Michael A. Eberle

We describe sawfish, a structural variant (SV) caller for mapped high-quality long reads. This method emphasizes assembly of local SV haplotypes and their utilization in downstream sample merging and genotyping steps, improving accuracy compared to variant-focused approaches in both individual and joint-genotyping contexts.

Assessing sawfish against the GIAB draft SV benchmark based on the T2T-HG002-Q100 diploid assembly shows substantial accuracy gains compared to pbsv and Sniffles2 on HiFi WGS 33x input, with a sawfish F1 score of 0.971 compared to 0.930 and 0.935 for pbsv and Sniffles2, respectively. This accuracy gain persists at lower depth, for example at 10x depth the sawfish F1 score is 0.937, compared to 0.857 and 0.882 for pbsv and Sniffles2. For SVs in the GIAB Challenging Medically Relevant Genes benchmark, sawfish has a combined false positive and false negative count of 4, compared to 19 and 15 with pbsv and Sniffles2, respectively.
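
For readers less familiar with the metric, the sketch below shows the standard F1 definition (the harmonic mean of precision and recall) and tabulates the gaps between the F1 scores quoted above. The true/false positive counts in the first example are hypothetical, chosen only to exercise the formula; the F1 values in the dictionary are the ones reported in the abstract.

```python
def f1_score(tp: int, fp: int, fn: int) -> float:
    """Harmonic mean of precision and recall from raw call counts."""
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical counts, for illustration only
example_f1 = f1_score(tp=950, fp=30, fn=50)
print(f"example F1: {example_f1:.4f}")

# F1 scores as reported in the abstract, keyed by sequencing depth
reported_f1 = {
    "33x": {"sawfish": 0.971, "pbsv": 0.930, "Sniffles2": 0.935},
    "10x": {"sawfish": 0.937, "pbsv": 0.857, "Sniffles2": 0.882},
}

# Absolute F1 gap between sawfish and each comparator at each depth
f1_gaps = {
    depth: {
        caller: round(scores["sawfish"] - value, 3)
        for caller, value in scores.items()
        if caller != "sawfish"
    }
    for depth, scores in reported_f1.items()
}
print(f1_gaps)
```

Laid out this way, the gap to both comparators is larger at 10x than at 33x, which matches the abstract's statement that the accuracy gain persists at lower depth.
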

Sawfish also has higher genotype concordance in the Platinum Pedigree (CEPH-1463). Joint-genotyping accuracy was assessed on 10 HiFi WGS samples comprising the 2nd and 3rd pedigree generations, where the known inheritance pattern enables genotype accuracy assessment. From high genotype-quality calls, sawfish yields 27,811 concordant and 4,414 discordant SV alleles (86.3% concordance), where concordant alleles respectively represent 7.8 Mb and 11.9 Mb of deleted and inserted sequence. This substantially improves concordant allele count, length and percent concordance compared to the next most concordant method, Sniffles2, with 20,519 concordant and 7,645 discordant alleles (72.9% concordance), where concordant alleles represent 4.2 Mb and 5.6 Mb of deleted and inserted sequence.
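
The concordance percentages above are consistent with a simple ratio of concordant alleles to all assessed alleles. The short sketch below recomputes them from the reported counts; the assumption that the denominator is concordant plus discordant alleles is inferred from the numbers rather than stated in the abstract.

```python
def percent_concordance(concordant: int, discordant: int) -> float:
    """Concordant alleles as a percentage of all assessed alleles."""
    return 100 * concordant / (concordant + discordant)


# Allele counts as reported in the abstract
sawfish_pct = percent_concordance(concordant=27_811, discordant=4_414)
sniffles2_pct = percent_concordance(concordant=20_519, discordant=7_645)

print(f"sawfish:   {sawfish_pct:.1f}%")
print(f"Sniffles2: {sniffles2_pct:.1f}%")
```
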

As additional improvements, our assembly-focused approach allows all calls to be made with single-base precision, enabling breakpoint insertion and homology annotation for all SV types. Sawfish also assesses depth of large deletions and duplications to evaluate their consistency with its own expected GC-corrected depth model, improving precision for these large SV types. Through the combination of high genotyping accuracy, detailed breakpoint modeling, and joint assessment of breakpoint evidence with read depth, sawfish offers improved options for WGS sample analysis with high-quality long reads.

ASHG 2024  |  2024

StarPhase: Comprehensive Phase-Aware Pharmacogenomic Diplotyper for Long-Read Sequencing Data

James M. Holt, John Harting, Xiao Chen, Daniel Baker, Nina Gonzaludo, Zev Kronenberg, Christopher T. Saunders, Michael A. Eberle

Introduction: Pharmacogenomics (PGx) is critically important to precision medicine, informing the use of medications at an individual level, improving both safety and efficacy. PGx diplotyping relies on the ability to both accurately detect genomic variation and phase that variation onto distinct haplotypes, commonly referred to as “star (*) alleles”. PacBio HiFi sequencing provides long reads with highly accurate base-calling, enabling variant calling and phasing for both targeted and whole-genome sequencing approaches.

Methods: We developed StarPhase, a phase-aware tool for generating comprehensive PGx diplotype calls from PacBio HiFi sequencing datasets. StarPhase accepts both phased and unphased variant calls from a HiFi sequencing pipeline (e.g., DeepVariant followed by HiPhase) as well as an aligned BAM file to produce PGx diplotypes for 21 genes, including the complex genes HLA-A, HLA-B, and CYP2D6. In contrast to existing tools, StarPhase correctly handles genes that are fully phased as well as ambiguity from unphased variants.

Results: We compared StarPhase diplotype calls to those from other PGx diplotyping tools (PharmCAT, HiFiHLA, Pangu, and Cyrius) and to known diplotypes from GeT-RM. For simple PGx genes, StarPhase has a 98.24% concordance with PharmCAT, and all discrepancies were explained via manual inspection as either differences in reporting or corrections that StarPhase made relative to PharmCAT. For HLA-A and HLA-B, all StarPhase results were 100% concordant with the assembly-based results of HiFiHLA. Finally, StarPhase calls for CYP2D6 were 100% concordant for whole genome sequencing datasets, and 96% concordant for targeted sequencing. All discrepancies in the targeted sequencing were explained through either ambiguity in GeT-RM, errors in the comparator tool, or low coverage of hybrid alleles in the raw data (likely due to reduced capture rate). We further demonstrated the utility of StarPhase by applying it to CEPH pedigree 1463, consisting of 27 whole genome sequencing PacBio HiFi datasets from four generations. In this analysis, all diplotype calls across all 21 genes were inherited consistent with the pedigree.
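
The pedigree check described above can be pictured as a per-trio test: a child's diplotype is consistent if one haplotype can be assigned to each parent. The sketch below is a minimal version of that idea under simplifying assumptions (exact string matching of star alleles, no de novo events, no allele dropout). The function name and the example diplotypes are hypothetical and do not come from StarPhase or the CEPH-1463 data.

```python
Diplotype = tuple[str, str]


def is_inheritance_consistent(child: Diplotype, mother: Diplotype, father: Diplotype) -> bool:
    """True if the child's two haplotypes can be split one per parent."""
    first, second = child
    return (first in mother and second in father) or (
        second in mother and first in father
    )


# Hypothetical star-allele diplotypes for a single gene
mother = ("*1", "*4")
father = ("*2", "*41")

consistent_child = ("*4", "*2")    # *4 from mother, *2 from father
inconsistent_child = ("*4", "*4")  # father carries no *4 haplotype

print(is_inheritance_consistent(consistent_child, mother, father))
print(is_inheritance_consistent(inconsistent_child, mother, father))
```

Applying a test like this to every trio and every gene is one way to arrive at a statement such as "all diplotype calls were inherited consistent with the pedigree".
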

ASHG 2024  |  2024

Visualize complex structural variants in HiFi data with SVTopo

Jonathan R Belyeu, William J Rowell, Juniper Lake, James M Holt, Zev N Kronenberg, Christopher T Saunders, Michael A Eberle

Structural variants (SVs) are alleles that differ from the reference genome by at least 50 nucleotides. SVs are common in the human genome and play a major role in both phenotypic diversity and human health. Many are deletions or duplications of genomic material resulting from a single non-reference end-joining event. These are easily identified and visualized from high-quality long reads by existing software tools. Other SVs, which we define as complex SVs, are less easily categorized and remain difficult to interpret.
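
The 50-nucleotide threshold in the definition above can be expressed as a simple size test on reference and alternate alleles. The sketch below is illustrative only: the function name is hypothetical, and a pure length comparison does not capture balanced events such as inversions, which change sequence order without changing length.

```python
SV_MIN_SIZE = 50  # threshold from the definition in the text


def classify_by_size(ref_allele: str, alt_allele: str) -> str:
    """Label an allele as an SV or a small variant by its length difference."""
    size_difference = abs(len(alt_allele) - len(ref_allele))
    if size_difference >= SV_MIN_SIZE:
        return "structural variant"
    return "small variant"


# A 50 bp insertion meets the threshold; a 49 bp insertion does not
insertion_50 = classify_by_size("A", "A" + "T" * 50)
insertion_49 = classify_by_size("A", "A" + "T" * 49)
deletion_120 = classify_by_size("A" + "C" * 120, "A")

print(insertion_50, insertion_49, deletion_120, sep="\n")
```
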

Complex SVs often lead to convoluted signals in both coverage and break-ends. Complex SVs may combine multiple copy-number alterations in tandem with duplicated inversions, creating genomic rearrangements that may visually appear as several nearby changes in coverage. Inversion events, often appearing with inconsistent coverage and mapping abnormalities, can be difficult to visualize with popular genome browsers in the best of cases but are even more challenging when appearing in tandem with CNVs.

SVTopo addresses the challenge with a dedicated complex variant plotting approach. It uses haplotagged HiFi reads to identify genome alignment breakpoints relative to a reference genome, connects these into multi-locus SVs via shared chimeric alignments, and presents the supporting evidence in easily understood images.
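
One generic way to picture "connecting breakpoints via shared chimeric alignments" is as a connected-components problem: breakpoints spanned by the same read belong to the same event. The sketch below uses a small union-find structure for this. It is not SVTopo's implementation, which the abstract does not describe in detail; the read names, breakpoint labels, and function name are all hypothetical.

```python
from collections import defaultdict


def group_breakpoints(read_to_breakpoints: dict[str, list[str]]) -> list[set[str]]:
    """Group breakpoints that are linked, directly or transitively, by shared reads."""
    parent: dict[str, str] = {}

    def find(node: str) -> str:
        parent.setdefault(node, node)
        while parent[node] != node:
            parent[node] = parent[parent[node]]  # path halving
            node = parent[node]
        return node

    def union(a: str, b: str) -> None:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    # Every breakpoint on the same chimeric read joins the same group
    for breakpoints in read_to_breakpoints.values():
        for breakpoint_id in breakpoints:
            find(breakpoint_id)
        for other in breakpoints[1:]:
            union(breakpoints[0], other)

    groups: dict[str, set[str]] = defaultdict(set)
    for breakpoint_id in list(parent):
        groups[find(breakpoint_id)].add(breakpoint_id)
    return sorted(groups.values(), key=sorted)


# Hypothetical chimeric reads and the breakpoints each one spans
chimeric_reads = {
    "read1": ["chr1:1000", "chr1:5000"],
    "read2": ["chr1:5000", "chr1:9000"],
    "read3": ["chr7:200", "chr7:800"],
}
linked = group_breakpoints(chimeric_reads)
print(linked)
```

Here read1 and read2 share the breakpoint at chr1:5000, so all three chr1 breakpoints collapse into one multi-locus event, while the chr7 pair stays separate.
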

In an analysis of complex SVs within the Platinum Pedigree (CEPH-1463, a family of 28 samples), SVTopo characterized 469 distinct complex SVs. 34 were solitary inversion events, and 112 were triplications. An additional 87 inversions were found with a flanking deletion on one or both sides, and 112 inversions were found with other complexities such as duplications of nearby sequences, the inverted sequence, or both. SVTopo also found 124 other SVs, such as non-tandem duplications, deletions followed by insertion of a non-tandem duplication, paired deletion/duplication events, and re-ordering of multiple genomic blocks. In many of these cases, SV callers using high quality long reads are able to identify individual SV components, but variant interpretation with SVTopo clarifies multiple calls into a single complex rearrangement.
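
As a quick sanity check, the category counts quoted above add up to the stated total of 469 complex SVs. The dictionary keys below are shorthand labels introduced here for readability, not category names used by SVTopo.

```python
# Counts as reported in the abstract; labels are shorthand
complex_sv_counts = {
    "solitary inversions": 34,
    "triplications": 112,
    "inversions with flanking deletion": 87,
    "inversions with other complexities": 112,
    "other complex SVs": 124,
}

total_complex_svs = sum(complex_sv_counts.values())
print(total_complex_svs)

# Share of each category, as a percentage of the total
shares = {
    label: round(100 * count / total_complex_svs, 1)
    for label, count in complex_sv_counts.items()
}
print(shares)
```
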

The prevalence and complexity of these variants in many samples without genetic disorders make them an important but challenging target of human genomic research, highlighting both the importance of long reads for their identification and the role of targeted software tools like SVTopo in better understanding the signals they produce.
