Computational Tools

Command line workflows and tools for advanced users

There is so much to discover with HiFi reads. Beyond SMRT Link, which offers a graphical interface for running fully developed analysis workflows, PacBio research is constantly releasing new tools that uncover more from HiFi sequencing.

This page highlights some of the popular tools and workflows for HiFi applications. To explore the complete list of tools developed by PacBio, visit the PacBio GitHub page or install them through Bioconda. Note that these command line tools are considered in development, may be updated frequently, and are not supported as part of the PacBio software product suite.

Explore example datasets with the applications and tools

Variant analysis tools

These advanced tools and workflows provide a comprehensive view of an organism's full genome or a specific region of interest, with exceptional variant calling accuracy.

De novo assembly

The unique combination of length and accuracy enables HiFi reads to resolve difficult repeats, segmental duplications, and centromeres to produce more contiguous and phased assemblies.

RNA sequencing tools

Characterize isoforms from PacBio full-length RNA sequencing of bulk and single-cell transcriptomes. Identify alternative splicing events, and discover new isoforms and fusion transcripts.

Microbiome sequencing tools

See microbiomes in high resolution with species- or strain-level identification, complete genes and operons, and reference-quality metagenome-assembled genomes.

Variant analysis tools

These advanced tools and workflows provide a comprehensive view of an organism's full genome or a specific region of interest, with exceptional variant calling accuracy.

PacBio WGS Variant Pipeline

The recently developed PacBio WGS Variant Pipeline for human genomes consolidates 11 robust secondary analysis tools into a best-practice workflow for alignment, variant calling, joint calling, and optional annotation.

This workflow is also accessible through our partner bioinformatics platforms:

Form Bio  |  DNAnexus   |  Terra   |  DNAStack

Access WGS variant pipeline via GitHub

Paraphase

Paraphase is a computational tool for variant calling in homologous genes, for either WGS or targeted sequencing. Paraphase was introduced for gene profiling of spinal muscular atrophy (SMA), for which it successfully identified full-length SMN1 and SMN2 haplotypes.

This tool is also accessible through our partner platform DNAnexus.

Explore the SMA case study and publication [1]

Access Paraphase via GitHub

TRGT

TRGT is a computational tool for targeted genotyping of tandem repeat variation from HiFi sequencing data. TRGT profiles the sequence length and composition, mosaicism, and CpG methylation of each repeat. TRGT is accompanied by a visualization tool, TRVZ, and a database of allele sequences and methylation levels for nearly 1 million tandem repeats. TRGT is the main tool in the PureTarget repeat expansion library analysis.

Explore the TRGT flier and publication [2]

Access TRGT tools via GitHub
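
To make "sequence length and composition" concrete, here is a minimal Python sketch that summarizes a single repeat allele for a given motif. It is an illustration only and is not TRGT's algorithm; the function name `summarize_repeat_allele`, the returned field names, and the example allele are all hypothetical.

```python
def summarize_repeat_allele(allele: str, motif: str) -> dict:
    """Summarize a tandem repeat allele with respect to one motif.

    Illustrative only: scans left to right, counting non-overlapping
    exact copies of `motif` and tracking the longest uninterrupted run.
    """
    allele = allele.upper()
    motif = motif.upper()
    if not motif:
        raise ValueError("motif must be a non-empty string")

    k = len(motif)
    total_copies = 0   # all exact motif copies found
    current_run = 0    # copies in the run currently being extended
    longest_run = 0    # longest uninterrupted run seen so far
    i = 0
    while i <= len(allele) - k:
        if allele[i:i + k] == motif:
            total_copies += 1
            current_run += 1
            longest_run = max(longest_run, current_run)
            i += k
        else:
            current_run = 0  # an interruption ends the current run
            i += 1

    # Fraction of allele bases covered by exact motif copies.
    purity = (total_copies * k / len(allele)) if allele else 0.0
    return {
        "length": len(allele),
        "motif_copies": total_copies,
        "longest_run": longest_run,
        "purity": purity,
    }


# Hypothetical CAG repeat allele with one CAA interruption.
summary = summarize_repeat_allele("CAGCAGCAGCAACAGCAG", "CAG")
print(summary)
```

A real genotyper also has to locate the repeat within reads, handle sequencing errors, and separate the two alleles, none of which this sketch attempts.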

HiFiCNV

HiFiCNV is a copy number variant caller optimized for PacBio HiFi reads. It also provides utilities for depth visualization.

Access HiFiCNV via GitHub
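
The link between read depth and copy number can be sketched in a few lines of Python. This is a deliberately simplified depth-ratio illustration, not HiFiCNV's actual method; the function name `estimate_copy_number`, the use of the median as the baseline, and the example depths are assumptions made for the example.

```python
from statistics import median


def estimate_copy_number(bin_depths, ploidy=2):
    """Estimate an integer copy number per bin from read depth.

    Simplified model: the median bin depth is assumed to represent
    the normal `ploidy` state, and each bin's copy number is its
    depth relative to that baseline, rounded to the nearest integer.
    """
    baseline = median(bin_depths)
    if baseline <= 0:
        raise ValueError("baseline depth must be positive")
    return [round(ploidy * depth / baseline) for depth in bin_depths]


# Hypothetical per-bin depths: a heterozygous deletion (about half
# depth) followed by a duplication (about 1.5x depth).
depths = [30, 31, 29, 15, 14, 30, 45, 46, 30]
print(estimate_copy_number(depths))
```

Production callers additionally normalize for biases such as GC content and smooth calls across neighboring bins, which is why a dedicated tool is preferable to a raw ratio.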

pbsv

pbsv is a suite of tools to call and analyze structural variants in diploid genomes from HiFi reads. This tool is applicable for both WGS and targeted sequencing. Selected versions of pbsv are also available in the Structural Variant Calling analysis workflow of SMRT Link.

Access pbsv via GitHub
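
As a rough orientation, pbsv is typically run in two stages: `pbsv discover` collects structural variant signatures from an aligned BAM, and `pbsv call` turns those signatures into a VCF. The Python sketch below only assembles the two command lines, based on the basic usage in the pbsv README at the time of writing; the helper name `pbsv_commands` and all file names are hypothetical, and `pbsv --help` should be checked for the options in your installed version.

```python
import shlex


def pbsv_commands(aligned_bam, reference_fasta, sample):
    """Build the two pbsv command lines for one sample.

    Output names are derived from `sample`; they are illustrative
    choices, not names that pbsv requires.
    """
    svsig = f"{sample}.svsig.gz"   # intermediate signature file
    vcf = f"{sample}.sv.vcf"       # final structural variant calls
    discover = ["pbsv", "discover", aligned_bam, svsig]
    call = ["pbsv", "call", reference_fasta, svsig, vcf]
    return [discover, call]


# Print the commands instead of running them, so the sketch works
# without pbsv installed.
for cmd in pbsv_commands("sample1.aligned.bam", "ref.fa", "sample1"):
    print(shlex.join(cmd))
```

To execute the steps, each list can be passed to `subprocess.run(cmd, check=True)` on a machine where pbsv is installed.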

HiPhase

HiPhase is a tool that phases both small variants and structural variants called from HiFi sequencing data. HiPhase benefits from highly accurate long reads that allow phase blocks to span breaks from reference gaps and homozygous deletions.

Explore the HiPhase publication.

Access HiPhase via GitHub

pbaa

PacBio Amplicon Analysis (pbaa) is designed to cluster HiFi reads and generate high-quality consensus sequences from them. The tool can separate complex mixtures of amplicon targets, such as HLA and genes with pseudogenes or paralogs.

Access pbaa via GitHub

HiFiHLA

HiFiHLA generates high-resolution (4-field) HLA allele calls from PacBio HiFi data. It identifies the closest matching allele(s) and any differences between a sample and the reference IPD-IMGT/HLA Database, a specialist database for sequences of the human major histocompatibility complex that includes the official sequences named by the WHO Nomenclature Committee for Factors of the HLA System. Compatible data types are aligned HiFi reads, assembly contigs, and amplicon consensus sequences.

Access HiFiHLA via GitHub

pb-StarPhase

pb-StarPhase determines diplotypes (phased, diploid haplotypes) for pharmacogenomic (PGx) genes in human samples. It leverages the latest Clinical Pharmacogenetics Implementation Consortium (CPIC), IPD-IMGT/HLA, and PharmVar databases to label identified alleles. It supports 21 PGx genes including the complex genes HLA-A, HLA-B, and CYP2D6. This tool is compatible with targeted and whole genome HiFi data.

Access pb-StarPhase via GitHub

De novo assembly

As more researchers embrace the benefits of PacBio long-read sequencing technology for de novo genome assembly, an expanding set of community-developed tools has taken shape.

Hifiasm 

Developed by researchers at Dana-Farber Cancer Institute and Harvard Medical School, hifiasm is fast, easy to use, and delivers high-quality telomere-to-telomere assemblies. It can be run on HiFi reads alone or combined with Hi-C reads or short-read parental data to enhance haplotype-resolved assemblies.

Explore the hifiasm publication.

Access hifiasm via GitHub

Verkko 

Developed by researchers at the National Human Genome Research Institute, verkko is an improved and automated version of the de novo assembly strategy used to produce a gapless human genome by the Telomere-to-Telomere consortium. Verkko uses a graph-based pipeline for assembling complete, diploid genomes.

Explore the verkko publication.

Access verkko via GitHub

PacBio Human Assembly Pipeline

A best-practice workflow for single-sample and trio-binned de novo assembly of HiFi human whole genome sequencing data.

Access Human assembly pipeline via GitHub

RNA sequencing tools

Characterize isoforms from PacBio full-length RNA sequencing of bulk and single-cell transcriptomes. Identify alternative splicing events, and discover new isoforms and fusion transcripts.

For more about how RNA sequencing is done with PacBio, visit the Kinnex or RNA sequencing pages.

Connect with the community  |  View application brief  |  Watch Kinnex playlist

IsoSeq

IsoSeq contains multiple sub-tools for full-length read identification, isoform clustering, and collapsing redundant transcripts. IsoSeq can be used with or without a reference genome.

pigeon

Pigeon is a long-read transcript classification and filtering tool based on SQANTI3. Full-length transcripts from the isoseq tool output can be classified against a known annotation (e.g., GENCODE) to identify novel genes and isoforms. Additional sub-tools can be used to generate single-cell gene and isoform count matrices and saturation curves.

pbfusion

pbfusion is a fusion gene detection tool designed for identifying fusion transcripts from Iso-Seq data.

Microbiome sequencing tools

See microbiomes in high resolution with species-level identification, complete genes and operons, and reference-quality metagenome-assembled genomes.

pb-metagenomics-tools

pb-metagenomics-tools is a collection of recommended tools and common-practice workflows for metagenomic profiling and assembly using HiFi reads.

Explore a list of metagenomics analyses using HiFi data.

Access pb-metagenomics-tools via GitHub

HiFi-16S-workflow

The HiFi-16S-workflow pipeline is designed to process HiFi full-length 16S data into high-quality amplicon sequence variants (ASVs) using QIIME 2 and DADA2. This pipeline provides a set of visualizations through the QIIME 2 framework for interactive plotting and generates an HTML report for the important statistics and top taxonomies.

Access HiFi-16S-workflow via GitHub

Computational tools FAQs

HiFi targeted enrichment data (such as HiFi Targeted Enrichment with Twist probes) can be analyzed using the HiFi Target Enrichment workflow in SMRT Link version 13.0 or later. For PureTarget repeat expansion libraries, use the PureTarget repeat expansion analysis workflow in SMRT Link 13.1 or later.

The Revio system provides on-board 5mC methylation calling, and third-party bioinformatic callers are available for detecting somatic small and structural variants. ClairS is a deep learning method for long-read somatic small variant calling, and Severus and Sniffles2 are recommended for detecting structural variants. Check out our Application Note for additional details.

  1. Publication: Chen X, Harting J, Farrow E, et al. Comprehensive SMN1 and SMN2 profiling for spinal muscular atrophy analysis using long-read PacBio HiFi sequencing. The American Journal of Human Genetics. 2023;0(0). doi:10.1016/j.ajhg.2023.01.001
  2. Publication: Dolzhenko, E., English, A., Dashnow, H. et al. Characterization and visualization of tandem repeats at genome scale. Nat Biotechnol (2024). https://doi.org/10.1038/s41587-023-02057-3
