There is so much to discover with HiFi reads. Beyond SMRT Link, which offers a graphical interface for running fully developed analysis workflows, PacBio researchers are constantly building new tools that uncover more from HiFi sequencing.
This page highlights some of the popular tools and workflows for HiFi applications. To explore the complete list of tools developed by PacBio, visit the PacBio GitHub page or install them through Bioconda. Note that these command-line tools are considered in development, may be updated frequently, and are not supported as part of the PacBio software product suite.
Explore example datasets with the applications and tools
Variant analysis tools
These advanced tools and workflows provide a comprehensive view of an organism's full genome, or of a specific region of interest, with exceptional variant-calling accuracy.
De novo assembly
The unique combination of length and accuracy enables HiFi reads to resolve difficult repeats, segmental duplications, and centromeres to produce more contiguous and phased assemblies.
RNA sequencing tools
Characterize isoforms from PacBio full-length RNA sequencing of bulk and single-cell transcriptomes. Identify alternative splicing events, and discover new isoforms and fusion transcripts.
Microbiome sequencing tools
See microbiomes in high resolution with species- or strain-level identification, complete genes and operons, and reference-quality metagenome-assembled genomes.
Variant analysis tools
PacBio WGS Variant Pipeline
The recently developed PacBio WGS Variant Pipeline for human genomes consolidates 11 robust secondary analysis tools into a best-practice workflow for alignment, variant calling, joint calling, and optional annotation.
This workflow is also accessible through our partner bioinformatics platforms:
Paraphase
Paraphase is a computational tool for variant calling in homologous genes, for either WGS or targeted sequencing. Paraphase was introduced for gene profiling of spinal muscular atrophy (SMA), for which it successfully identified full-length SMN1 and SMN2 haplotypes.
This tool is also accessible through our partner platform DNAnexus.
Explore the SMA case study and publication [1].
TRGT
TRGT is a computational tool for targeted genotyping of tandem repeat variation from HiFi sequencing data. TRGT profiles sequence length and composition, mosaicism, and CpG methylation of each repeat. TRGT is accompanied by a visualization tool, TRVZ, and a database of allele sequences and methylation levels for nearly 1 million tandem repeats. TRGT is the main tool in the PureTarget repeat expansion library analysis.
Explore the TRGT flier and publication [2].
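To make "sequence length and composition" concrete, here is a minimal Python sketch that summarizes a single repeat allele. It is not TRGT's algorithm or output format; the function name `profile_repeat`, the returned keys, and the example allele are all hypothetical and chosen only for illustration.

```python
def profile_repeat(allele: str, motif: str) -> dict:
    """Summarize a tandem-repeat allele (illustrative only, not TRGT's method).

    Scans left to right, counting non-overlapping perfect copies of `motif`.
    Purity is the fraction of allele bases covered by those perfect copies.
    """
    if not motif:
        raise ValueError("motif must be a non-empty string")
    count = 0
    i = 0
    while i < len(allele):
        if allele.startswith(motif, i):
            count += 1
            i += len(motif)  # jump past the matched motif copy
        else:
            i += 1  # interruption: advance one base and keep scanning
    length = len(allele)
    purity = (count * len(motif)) / length if length else 0.0
    return {"length_bp": length, "motif_count": count, "purity": purity}


# Hypothetical CAG repeat with a single CAA interruption.
example = profile_repeat("CAG" * 5 + "CAA" + "CAG" * 2, "CAG")
```

An interrupted allele like this one has the same length as a pure repeat of eight units but a lower purity, which is the kind of composition difference that length alone would hide.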
HiFiCNV
HiFiCNV is a copy number variant caller optimized for PacBio HiFi reads. The tool also includes utilities for depth visualization.
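The general idea behind depth-based copy number calling can be sketched in a few lines of Python. This is a generic depth-ratio heuristic under simplifying assumptions (fixed bins, a diploid baseline taken as the median depth, no noise modeling), not a description of how HiFiCNV works; the function name and the depth values are hypothetical.

```python
from statistics import median


def estimate_copy_number(bin_depths, ploidy=2):
    """Estimate integer copy number per bin from read depth (toy heuristic).

    Assumes most bins are at the baseline ploidy, so the median depth
    approximates the depth expected for `ploidy` copies.
    """
    baseline = median(bin_depths)
    if baseline <= 0:
        raise ValueError("baseline depth must be positive")
    return [round(ploidy * depth / baseline) for depth in bin_depths]


# Hypothetical mean depths for nine consecutive bins.
depths = [30, 31, 29, 15, 14, 30, 46, 45, 30]
copy_numbers = estimate_copy_number(depths)
```

In this toy input, the bins at roughly half the baseline depth come out as one copy (a heterozygous deletion) and the bins at roughly 1.5 times the baseline as three copies (a duplication).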
pbsv
pbsv is a suite of tools to call and analyze structural variants in diploid genomes from HiFi reads. This tool is applicable for both WGS and targeted sequencing. Selected versions of pbsv are also available in the Structural Variant Calling analysis workflow of SMRT Link.
HiPhase
HiPhase is a tool that phases both small variants and structural variants called from HiFi sequencing data. HiPhase benefits from highly accurate long reads that allow phase blocks to span breaks from reference gaps and homozygous deletions.
Explore the HiPhase publication.
pbaa
PacBio Amplicon Analysis (pbaa) is designed to cluster and generate high-quality consensus sequences from HiFi reads. The tool can separate complex mixtures of amplicon targets like HLA and genes with pseudogenes or paralogs.
HiFiHLA
HiFiHLA generates high-resolution (4-field) HLA allele calls from PacBio HiFi data. It identifies the closest matching allele(s) and any differences between a sample and the reference IPD-IMGT/HLA Database, a specialist database for sequences of the human major histocompatibility complex that includes the official sequences named by the WHO Nomenclature Committee for Factors of the HLA System. Compatible data types are aligned HiFi reads, assembly contigs, and amplicon consensus.
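For readers unfamiliar with the term, "4-field" refers to the colon-separated fields in a standard HLA allele name such as `HLA-A*02:01:01:01`: the allele group, the specific protein, synonymous coding changes, and non-coding differences. The short Python sketch below parses such names and reports their resolution. It is only an illustration of the nomenclature, not part of HiFiHLA; the names `HlaAllele` and `parse_hla` are hypothetical.

```python
import re
from typing import NamedTuple, Tuple


class HlaAllele(NamedTuple):
    gene: str                # e.g. "A", "B", "DRB1"
    fields: Tuple[str, ...]  # one to four colon-separated fields
    suffix: str              # optional expression suffix such as "N", else ""

    @property
    def resolution(self) -> int:
        """Number of fields present (4 means full 4-field resolution)."""
        return len(self.fields)


_HLA_NAME = re.compile(
    r"^(?:HLA-)?(?P<gene>[A-Z0-9]+)\*"
    r"(?P<fields>\d+(?::\d+){0,3})(?P<suffix>[A-Z]?)$"
)


def parse_hla(name: str) -> HlaAllele:
    """Split an HLA allele name into gene, fields, and optional suffix."""
    match = _HLA_NAME.match(name)
    if match is None:
        raise ValueError(f"not a recognizable HLA allele name: {name!r}")
    return HlaAllele(
        gene=match["gene"],
        fields=tuple(match["fields"].split(":")),
        suffix=match["suffix"],
    )


full = parse_hla("HLA-A*02:01:01:01")
two_field = parse_hla("HLA-B*15:01")
```

Here `full.resolution` is 4 and `two_field.resolution` is 2, which is the difference between a call that distinguishes non-coding variation and one that only identifies the protein.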
pb-StarPhase
pb-StarPhase determines diplotypes (phased, diploid haplotypes) for pharmacogenomic (PGx) genes in human samples. It leverages the latest Clinical Pharmacogenetics Implementation Consortium (CPIC), IPD-IMGT/HLA, and PharmVar databases to label identified alleles. It supports 21 PGx genes including the complex genes HLA-A, HLA-B, and CYP2D6. This tool is compatible with targeted and whole genome HiFi data.
De novo assembly
As more researchers embrace the benefits of PacBio long-read sequencing technology for de novo genome assembly, an expanding set of community-developed tools has taken shape.
Hifiasm
Developed by researchers at Dana-Farber Cancer Institute and Harvard Medical School, hifiasm is fast, easy to use, and delivers high-quality telomere-to-telomere assemblies. It can be run on HiFi reads alone or combined with Hi-C reads or short-read parental data to enhance haplotype-resolved assemblies.
Explore the hifiasm publication.
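The three input modes described above map onto different hifiasm command lines. The Python helper below is a minimal sketch that only builds the argument list; it does not run anything. The flags (`-o`, `-t`, `--h1`/`--h2` for Hi-C reads, `-1`/`-2` for parental k-mer databases built with yak) are given as recalled from the hifiasm documentation and should be checked against `hifiasm -h` for your installed version. The helper name and all file names are hypothetical.

```python
from typing import List, Optional, Tuple


def build_hifiasm_command(
    hifi_reads: str,
    prefix: str,
    threads: int = 16,
    hic_reads: Optional[Tuple[str, str]] = None,
    parental_yak: Optional[Tuple[str, str]] = None,
) -> List[str]:
    """Build (but do not execute) a hifiasm command line.

    hic_reads:    paired Hi-C FASTQ files (read 1, read 2)
    parental_yak: paternal and maternal yak k-mer databases,
                  derived beforehand from parental short reads
    """
    if hic_reads is not None and parental_yak is not None:
        raise ValueError("choose either Hi-C or parental data, not both")
    cmd = ["hifiasm", "-o", prefix, "-t", str(threads)]
    if hic_reads is not None:
        cmd += ["--h1", hic_reads[0], "--h2", hic_reads[1]]
    if parental_yak is not None:
        cmd += ["-1", parental_yak[0], "-2", parental_yak[1]]
    cmd.append(hifi_reads)
    return cmd


hifi_only = build_hifiasm_command("sample.hifi.fq.gz", "sample.asm")
with_hic = build_hifiasm_command(
    "sample.hifi.fq.gz", "sample.asm",
    hic_reads=("hic_R1.fq.gz", "hic_R2.fq.gz"),
)
```

The resulting list can be passed to `subprocess.run(cmd, check=True)` once the flags have been verified.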
Verkko
Developed by researchers at the National Human Genome Research Institute, verkko is an improved and automated version of the de novo assembly strategy used to produce a gapless human genome by the Telomere-to-Telomere consortium. Verkko uses a graph-based pipeline for assembling complete, diploid genomes.
Explore the verkko publication.
PacBio Human Assembly Pipeline
A best-practice workflow for single-sample and trio-binned de novo assembly of HiFi human whole genome sequencing data.
RNA sequencing tools
For more about how RNA sequencing is done with PacBio, visit Kinnex or RNA sequencing.
IsoSeq
IsoSeq contains multiple sub-tools for full-length read identification, isoform clustering, and collapsing redundant transcripts. IsoSeq can be used with or without a reference genome.
pigeon
Pigeon is a long-read transcript classification and filtering tool based on SQANTI3. Full-length transcripts from the isoseq tool output can be classified against a known annotation (e.g., Gencode) to identify novel genes and isoforms. Additional sub-tools can be used to generate single-cell gene and isoform count matrices and saturation curves.
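As a rough illustration of what classifying against a known annotation means, the Python sketch below assigns a multi-exon transcript to one of four SQANTI-style structural categories by comparing its splice junctions with reference transcripts. It is a heavily simplified toy, not pigeon's or SQANTI3's implementation: it ignores strand, transcript ends, mono-exon transcripts, and the remaining categories (genic, antisense, fusion, intergenic). The function name, coordinates, and transcript ID are hypothetical.

```python
def classify_transcript(query, reference):
    """Classify a transcript by its splice-junction chain (toy sketch).

    query:     tuple of (donor, acceptor) junction coordinates, 5' to 3'
    reference: dict mapping transcript ID -> tuple of junctions
    """
    if not query:
        raise ValueError("mono-exon transcripts are not handled in this sketch")
    chains = list(reference.values())

    # Full-splice match: every junction matches one reference transcript.
    if any(query == chain for chain in chains):
        return "full-splice match"

    # Incomplete-splice match: a consecutive subset of a reference chain.
    n = len(query)
    for chain in chains:
        if any(chain[i:i + n] == query for i in range(len(chain) - n + 1)):
            return "incomplete-splice match"

    # Novel in catalog: only known donor and acceptor sites, new combination.
    known_donors = {donor for chain in chains for donor, _ in chain}
    known_acceptors = {acceptor for chain in chains for _, acceptor in chain}
    if all(d in known_donors and a in known_acceptors for d, a in query):
        return "novel in catalog"

    # Novel not in catalog: at least one previously unseen splice site.
    return "novel not in catalog"


# Hypothetical annotation with one three-junction transcript.
annotation = {"tx1": ((100, 200), (300, 400), (500, 600))}
labels = [
    classify_transcript(((100, 200), (300, 400), (500, 600)), annotation),
    classify_transcript(((300, 400), (500, 600)), annotation),
    classify_transcript(((100, 400), (500, 600)), annotation),
    classify_transcript(((100, 250), (300, 400)), annotation),
]
```

The last two queries show where novel isoforms come from: one skips an exon using known splice sites, the other introduces an acceptor site that the annotation has never seen.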
pbfusion
pbfusion is a fusion gene detection tool designed for identifying fusion transcripts from Iso-Seq data.
Microbiome sequencing tools
See microbiomes in high resolution with species-level identification, complete genes and operons, and reference-quality metagenome-assembled genomes.
pb-metagenomics-tools
pb-metagenomics-tools is a collection of recommended tools and workflows for metagenomic profiling and assembly using HiFi reads.
Explore a list of metagenomics analyses using HiFi data.
HiFi-16S-workflow
The HiFi-16S-workflow pipeline is designed to process HiFi full-length 16S data into high-quality amplicon sequence variants (ASVs) using QIIME 2 and DADA2. This pipeline provides a set of visualizations through the QIIME 2 framework for interactive plotting and generates an HTML report for the important statistics and top taxonomies.
Computational tools FAQs
HiFi targeted enrichment data (such as HiFi Targeted Enrichment with Twist probes) can be analyzed using the HiFi Target Enrichment workflow in SMRT Link version 13.0 or later. For PureTarget repeat expansion libraries, users can use the PureTarget repeat expansion analysis workflow in SMRT Link 13.1 or later.
The Revio system provides on-board 5mC methylation calling, and third-party bioinformatic callers are available for detecting somatic small and structural variants. ClairS is a deep-learning method for long-read somatic small variant calling, and Severus and Sniffles2 are recommended for detecting structural variants. Check out our Application Note for additional details.
[1] Chen X, Harting J, Farrow E, et al. Comprehensive SMN1 and SMN2 profiling for spinal muscular atrophy analysis using long-read PacBio HiFi sequencing. The American Journal of Human Genetics. 2023. doi:10.1016/j.ajhg.2023.01.001
[2] Dolzhenko E, English A, Dashnow H, et al. Characterization and visualization of tandem repeats at genome scale. Nature Biotechnology. 2024. doi:10.1038/s41587-023-02057-3