April 5, 2024  |  General

Powered by PacBio: Selected publications from March 2024

PacBio HiFi sequencing technology continues to be the tool of choice for genomics professionals working at the forefront of discovery, enabling them to pursue new avenues of exploration across diverse domains of biology.

In this edition of our Powered by PacBio blog series, we feature scientific papers from March 2024. These publications highlight the power of PacBio sequencing for detecting gene fusions, studying immunoglobulin gene regulation, examining divergent and recurrently mutated regions of primate genomes, and understanding how stimulated saliva interacts with wine.

Gene fusions

CTAT-LR-fusion: accurate fusion transcript identification from long and short read isoform sequencing at bulk or single cell resolution

In this paper, researchers at the Broad, ETH, & SIB Switzerland present their work on the development of a new tool as part of the Trinity Cancer Transcriptome Analysis Toolkit (CTAT) to “detect fusion transcripts from long read RNA-seq with or without companion short reads, with applications to bulk or single cell transcriptomes”.

Key takeaways:

  • The team used PacBio Kinnex kits to sequence a reference fusion control sample. “All 16 control fusions were detected by CTAT-LR-fusion [using Kinnex]”, while “relatively few control fusion supporting reads were detected and not all fusions were detected across three replicates based on the Illumina short reads”.
  • Across nine diverse cancer cell lines, PacBio full-length RNA sequencing led to the discovery of 133 high-confidence long-read-detected fusions, of which only 79 (59%) were also identified with short reads.
  • The researchers were also able to use PacBio sequencing to identify fusion transcripts in tumor single-cell transcriptomes. In a T-cell infiltrated melanoma sample, “60% of the NUTM2A-AS1::RP11-203L2.4 containing tumor cells were solely identified by long read evidence”, and “The short read alignments provide evidence for five alternatively spliced isoforms but because of the short read length only the partial isoform structure around the fusion transcript breakpoints were resolved as opposed to the complete isoform structures clearly evident from the long reads”.




Immunology

Addressing the technical pitfalls in pursuit of molecular factors that mediate immunoglobulin gene regulation

In this preprint, a team from the University of Louisville asserts that “…standard approaches [for functional genomic data, such as ChIP-Seq] using short reads have limited utility for characterizing regulatory regions in IGH at haplotype-resolution”.

Key takeaways:

  • Defining “features of immunoglobulin heavy chain (IGH) that limit use of short reads and a single reference genome, namely 1) the highly duplicated nature of DNA sequence in IGH and 2) structural polymorphisms that are frequent in the population.”
  • Demonstrating that “personalized diploid references enhance performance of short-read data for characterizing mappable portions of the locus, while also showing that long-read profiling tools will ultimately be needed to fully resolve functional impacts of IGH germline variation on expressed antibody repertoires.”
  • Also noting “Critically, these points should not be overlooked in non-human animal models. The mouse IG loci, for example, are also enriched for large duplications and repeats, and thus issues with read mappability would be expected.”
  • Conclusion: “Moving forward, we promote the adoption of new approaches that leverage the combined use of personalized reference sequences and long-read molecular assays.” The authors cite DiMeLo-seq (ref. 61) as one such assay, which that reference describes as follows: “DiMeLo-seq provides a versatile approach for characterizing protein-DNA interactions on individual molecules spanning difficult-to-interrogate genomic regions.”


Plant and animal sciences

Structurally divergent and recurrently mutated regions of primate genomes 

In this preprint, a team of scientists from UW, China, UCSC, the Broad, MIT, U of CO, MD Anderson, NC State, U of MO, OHSU, and the OR National Primate Research Center used PacBio long-read sequencing to generate high-quality genome assemblies of eight nonhuman primate (NHP) species, including New World monkeys (owl monkey and marmoset), Old World monkey (macaque), Asian apes (orangutan and gibbon), and African ape lineages (gorilla, bonobo, and chimpanzee).

Key takeaways:

  • The authors identified >1.3 million lineage-specific fixed structural variants (SVs) disrupting >1,500 protein-coding genes and >135,000 regulatory elements (representing the most complete set of human-specific fixed differences). Notably, 820 Mb (~27%) of the genome was affected by SVs across 50 million years of primate evolution.
  • In addition, they identified >1,600 structurally divergent regions (SDRs), where recurrent structural variation creates SV hotspots in which genes are recurrently lost and new lineage-specific genes are generated, and which have become targets of rapid chromosomal diversification and positive selection.
  • In interpreting their results, the team highlighted some key advantages of HiFi sequencing. These advantages were in comparison to previous CLR assemblies, with ONT only used for orthogonal validation of HiFi calls:

    “we wanted to leverage the higher accuracy and assembly contiguity of HiFi sequencing data by sequence and assembly of all NHP genomes where haplotypic differences could be distinguished.”

    “HiFi assemblies are estimated to be more accurate (QV=42 to 58 or 99.9937% to 99.9998% accuracy) and significantly more contiguous (contig N50=19 to 104 Mbp) when compared to the CLR draft genome assemblies.”

    “In addition to increased accuracy and haplotype resolution, another major advantage of HiFi-based assemblies is their 4- to 6-fold increase in sequence contiguity.”

    “The use of HiFi data and inclusion of additional NHP species as well as genotyping in population samples significantly improves earlier surveys of fixed SV events and extends the analysis deeper within the primate phylogeny.”

    “The greater accuracy afforded by HiFi sequencing allowed more complex regions of genetic variation to be assembled contiguously across the primates (e.g., MHC).”

  • Conclusion: “High-fidelity long-read sequencing has made these dynamic regions of the genome accessible for sequence-level analyses within and between primate species for the first time.”
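
The QV range quoted above (QV=42 to 58) lines up with the accuracy percentages given alongside it if QV is read as a standard Phred-scaled quality value, where accuracy = 1 - 10^(-QV/10). The sketch below is only an illustration of that arithmetic under that assumption; the function name `qv_to_accuracy` is ours and does not come from the preprint or from any PacBio tool.

```python
def qv_to_accuracy(qv: float) -> float:
    """Convert a Phred-scaled quality value (QV) to accuracy.

    Assumes the conventional Phred definition: error rate = 10 ** (-QV / 10).
    """
    error_rate = 10 ** (-qv / 10)
    return 1.0 - error_rate


# Reproduce the two endpoints quoted in the bullet above.
for qv in (42, 58):
    print(f"QV {qv}: {qv_to_accuracy(qv) * 100:.4f}% accuracy")
# QV 42: 99.9937% accuracy
# QV 58: 99.9998% accuracy
```

On this scale each additional 10 QV points corresponds to a tenfold drop in error rate, which is why a move from QV 42 to QV 58 is a larger improvement than the percentages alone suggest.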



Metagenomics

Stimulated saliva has a distinct composition that influences the release of volatiles from wine

In this preprint, researchers from Australia present a study of the microbiomes (profiled by full-length 16S sequencing) of 15 Australian and 15 Chinese adults “before, during, and after salivary stimulation” in the context of wine tasting. They also mixed saliva with wine and analyzed the mixtures by mass spectrometry.

Key takeaways:

  • “Differences in salivary composition and specific wine volatiles were found between Australian and Chinese participants, and amongst the three stimulation stages.”
  • Differential species were identified and significant correlations between the relative abundance of 3 bacterial species and 10 wine volatiles were observed.
  • Conclusion: “Understanding the interactions of salivary components, especially salivary bacteria, on the release of aroma during wine tasting allows nuanced appreciation of the variability of flavour perception in wine consumers.”

Our thoughts?
You can thank your microbes for your enological sophistication (or blame them for lack thereof).

Ready to kickstart breakthroughs of your own?

These recent publications exemplify the versatility and power of PacBio sequencing. From fusion detection to making dynamic regions of the genome accessible within and between primate species, PacBio technology is enabling scientific pioneers to make innovative breakthroughs like never before.

PacBio sequencing is now more accessible for research teams of all sizes, thanks to new options for instrument financing or collaboration with certified service providers. To learn how to incorporate PacBio data into your next project:
Connect with a PacBio scientist
