Computational Tools

Command line workflows and tools for advanced users

There is so much to discover with HiFi reads. Beyond SMRT Link, which offers a graphical interface for running fully developed analysis workflows, PacBio researchers continually release new tools that uncover more from HiFi sequencing.

This page highlights some of the popular tools and workflows for HiFi applications. To explore the complete list of tools developed by PacBio, visit the PacBio GitHub page or install them through Bioconda. Note that these command line tools are considered in-development tools, may be updated frequently, and are not supported as part of the PacBio software product suite.

Whole genome sequencing tools

These advanced variant callers and workflows provide a more comprehensive view of the full genome of an organism and exceptional accuracy for variant calling and complete genome assemblies.

RNA sequencing tools

Achieve isoform characterization of PacBio full-length RNA sequencing for bulk and single-cell transcriptome data. Characterize alternative splicing events, and discover new isoforms and fusion transcripts.

Microbiome sequencing tools

See microbiomes in high resolution with species or strain-level identification, complete genes and operons, and reference quality metagenome-assembled genomes.

Whole genome sequencing tools

These advanced variant callers and workflows provide a comprehensive view of the full genome of an organism and exceptional accuracy for variant calling and complete genome assemblies.

Variant calling

PacBio WGS Variant Pipeline

The recently developed PacBio WGS Variant Pipeline for human genomes consolidates 11 robust secondary analysis tools into a best-practice workflow for alignment, variant calling, joint calling, and optional annotation.

This workflow is also accessible through our partner bioinformatics platforms:

Form Bio  |  DNAnexus   |  Terra   |  DNAStack

Access WGS variant pipeline via GitHub

Paraphase

Paraphase is a computational tool for variant calling in homologous genes, for either WGS or targeted sequencing. Paraphase was introduced for gene profiling of spinal muscular atrophy (SMA), for which it successfully identified full-length SMN1 and SMN2 haplotypes.

This tool is also accessible through our partner platform DNAnexus.

Explore the SMA case study and publication [1].

Access Paraphase via GitHub

TRGT

TRGT is a computational tool for targeted genotyping of tandem repeat variation from HiFi sequencing data. TRGT profiles sequence length and composition, mosaicism, and CpG methylation of each repeat. TRGT is accompanied by a visualization tool, TRVZ, and a database of allele sequences and methylation levels for nearly 1 million tandem repeats.
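To make "sequence length and composition" concrete, here is a minimal Python sketch that summarizes a single repeat allele. It is only an illustration and not TRGT's algorithm; the function name `profile_repeat` and the returned field names are hypothetical.

```python
import re


def profile_repeat(allele: str, motif: str) -> dict:
    """Summarize one tandem-repeat allele (illustrative only, not TRGT)."""
    allele = allele.upper()
    motif = motif.upper()
    # Every maximal uninterrupted run of the motif within the allele.
    pattern = f"(?:{re.escape(motif)})+"
    runs = [m.group(0) for m in re.finditer(pattern, allele)]
    motif_bases = sum(len(run) for run in runs)
    longest = max(runs, key=len, default="")
    return {
        "allele_length": len(allele),                    # total bases
        "motif_copies": motif_bases // len(motif),       # copies across all runs
        "longest_pure_run": len(longest) // len(motif),  # copies in longest run
        "purity": motif_bases / len(allele) if allele else 0.0,
    }


# A CAG repeat interrupted by a single CAA unit.
print(profile_repeat("CAGCAGCAGCAACAGCAG", "CAG"))
```

The interruption splits the allele into two pure runs, so the longest pure run is shorter than the total motif count. Summaries of this kind separate alleles of equal length but different composition.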

Explore the TRGT flier and publication [2].

Access TRGT tools via GitHub

HiFiCNV

HiFiCNV is a copy number variant caller optimized for PacBio HiFi reads. This tool also provides utility for depth visualization.
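As a rough illustration of how read depth relates to copy number, the Python sketch below normalizes per-bin depth against the median bin. This is a generic depth-ratio approach under the assumption that most bins are at normal ploidy; it is not a description of HiFiCNV's actual model, and `estimate_copy_number` is a hypothetical name.

```python
from statistics import median


def estimate_copy_number(bin_depths, ploidy=2):
    """Round per-bin depth, scaled by the median depth, to integer copy numbers."""
    # Assumption: the median bin represents the normal (diploid) state.
    baseline = median(bin_depths)
    if baseline == 0:
        raise ValueError("baseline depth is zero; cannot normalize")
    return [round(ploidy * depth / baseline) for depth in bin_depths]


# Toy depths: a one-copy loss (bins 3-4) and a one-copy gain (bins 6-7).
depths = [30, 31, 29, 15, 14, 30, 46, 44, 30]
print(estimate_copy_number(depths))  # [2, 2, 2, 1, 1, 2, 3, 3, 2]
```

Production callers typically also correct for biases such as GC content and segment adjacent bins jointly; this sketch omits both.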

Access HiFiCNV via GitHub

pbsv

pbsv is a suite of tools to call and analyze structural variants in diploid genomes from HiFi reads. This tool is applicable for both WGS and targeted sequencing. Selected versions of pbsv are also available in the Structural Variant Calling analysis workflow of SMRT Link.

Access pbsv via GitHub

HiPhase

HiPhase is a tool that phases both small variants and structural variants called from HiFi sequencing data. HiPhase benefits from highly accurate long reads that allow phase blocks to span breaks from reference gaps and homozygous deletions.
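The following Python sketch illustrates why longer reads give longer phase blocks. It uses a deliberately simplified rule: two adjacent heterozygous variants stay in the same block only if at least one read covers both. This is not HiPhase's algorithm, and the function name and the toy coordinates are hypothetical.

```python
def phase_blocks(variant_positions, reads):
    """Group heterozygous variant positions into blocks linked by reads.

    reads is a list of (start, end) alignment intervals, inclusive.
    """
    positions = sorted(variant_positions)
    if not positions:
        return []
    blocks = [[positions[0]]]
    for prev, cur in zip(positions, positions[1:]):
        # A read links two variants only if it covers both positions.
        linked = any(start <= prev and cur <= end for start, end in reads)
        if linked:
            blocks[-1].append(cur)
        else:
            blocks.append([cur])
    return blocks


variants = [100, 900, 5000, 5200, 20000]
short_reads = [(50, 200), (850, 1000), (4950, 5250)]
long_reads = [(0, 6000), (4000, 21000)]

print(phase_blocks(variants, short_reads))  # [[100], [900], [5000, 5200], [20000]]
print(phase_blocks(variants, long_reads))   # [[100, 900, 5000, 5200, 20000]]
```

With the short toy reads the variants fall into four blocks, while two long reads join all five variants into one block, including across the wide stretch between positions 5200 and 20000 that contains no heterozygous variants.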

Explore the HiPhase publication.

Access HiPhase via GitHub

de novo assembly

As more researchers embrace the benefits of PacBio long-read sequencing technology for de novo genome assembly, an expanding set of community-developed tools has taken shape.

Hifiasm 

Developed by researchers at Dana-Farber Cancer Institute and Harvard Medical School, hifiasm is fast, easy to use, and delivers high-quality telomere-to-telomere assemblies. It can be run on HiFi reads alone or combined with Hi-C reads or short-read parental data to enhance haplotype-resolved assemblies.

Explore the hifiasm publication.

Access hifiasm via GitHub

Verkko 

Developed by researchers at the National Human Genome Research Institute, verkko is an improved and automated version of the de novo assembly strategy used to produce a gapless human genome by the Telomere-to-Telomere consortium. Verkko uses a graph-based pipeline for assembling complete, diploid genomes.

Explore the verkko publication.

Access verkko via GitHub

Whole genome datasets

Application | Dataset | Download literature | Technology | Sequencing system
Variant detection, assembly, epigenetics | Homo sapiens – GIAB trio HG002-4 | N/A | HiFi long read | Revio system
Tumor/normal | COLO829 melanoma | N/A | HiFi long read | Revio system
Tumor/normal | HCC1395 | N/A | HiFi long read | Revio system
Whole genome sequencing | Various plants & animals – mouse, ladybug, oak, mistletoe, and maize | Plant + animal biology | HiFi long read | Revio system
Whole genome sequencing | Homo sapiens – GIAB trio HG002-4 | N/A | SBB short read | Onso system
Assembly (ultra-low DNA input) | Phlebotomus papatasi, Homo sapiens, Drosophila melanogaster | Considerations for using ultra-low DNA input workflows for WGS | HiFi long read | Sequel II system
Assembly | Food safety & infectious microbes – 96 plex | Microbial WGS | HiFi long read | Sequel IIe system

RNA sequencing tools

Achieve isoform characterization of PacBio full-length RNA sequencing for bulk and single-cell transcriptome data. Characterize alternative splicing events, and discover new isoforms and fusion transcripts. For more details and tutorials on Iso-Seq tools, please visit https://isoseq.how/.

Connect with the community

IsoSeq

IsoSeq contains multiple sub-tools for full-length read identification, isoform clustering, and collapsing redundant transcripts. IsoSeq can be used with or without a reference genome.

pigeon

Pigeon is a long-read transcript classification and filtering tool based on SQANTI3. Full-length transcripts output by the isoseq tool can be classified against a known annotation (e.g., GENCODE) to identify novel genes and isoforms. Additional sub-tools can be used to generate single-cell gene and isoform count matrices and saturation curves.
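To illustrate what classifying a transcript against a known annotation can look like, the Python sketch below assigns SQANTI-style structural categories by comparing splice junctions. It is a simplified illustration rather than pigeon's or SQANTI3's implementation: it handles only multi-exon transcripts on a single gene, assumes plus-strand coordinates, and the function name `classify_transcript` is hypothetical.

```python
def classify_transcript(query_junctions, reference_transcripts):
    """Assign a SQANTI-style category from splice junctions.

    Each junction is a (donor, acceptor) coordinate pair; each transcript
    is an ordered sequence of junctions. Plus strand is assumed.
    """
    query = tuple(query_junctions)
    if not query:
        raise ValueError("mono-exon transcripts are not handled in this sketch")
    refs = [tuple(ref) for ref in reference_transcripts]

    # All junctions match one reference transcript exactly.
    if query in refs:
        return "full-splice_match"

    # Junctions form a consecutive subset of a reference transcript.
    n = len(query)
    for ref in refs:
        if any(ref[i:i + n] == query for i in range(len(ref) - n + 1)):
            return "incomplete-splice_match"

    # New combination built only from annotated donor and acceptor sites.
    donors = {d for ref in refs for d, _ in ref}
    acceptors = {a for ref in refs for _, a in ref}
    if all(d in donors and a in acceptors for d, a in query):
        return "novel_in_catalog"

    # At least one donor or acceptor site is absent from the annotation.
    return "novel_not_in_catalog"


annotation = [[(100, 200), (300, 400), (500, 600)]]
print(classify_transcript([(100, 200), (300, 400), (500, 600)], annotation))
print(classify_transcript([(300, 400), (500, 600)], annotation))
print(classify_transcript([(100, 400), (500, 600)], annotation))
print(classify_transcript([(100, 250)], annotation))
```

The four calls exercise the four categories in order: an exact match, a partial match, an exon-skipping isoform built from annotated splice sites, and an isoform with an unannotated acceptor.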

pbfusion

pbfusion is a fusion gene detection tool designed for identifying fusion transcripts from Iso-Seq data.

RNA sequencing datasets

Application | Dataset | Download literature | Technology | Sequencing system
MAS-Seq single-cell | Homo sapiens – PBMC 10x Chromium Single Cell 3' libraries | MAS-Seq for single-cell isoform sequencing | HiFi long read | Sequel II and Revio systems
Whole transcriptome | Homo sapiens – brain with Alzheimer’s disease | Bulk and single-cell isoform sequencing for human disease research | HiFi long read | Sequel II system
Whole transcriptome | Homo sapiens – universal human reference RNA (UHRR) | Bulk and single-cell isoform sequencing for human disease research | HiFi long read | Sequel II system

Microbiome sequencing

See microbiomes in high resolution with species- or strain-level identification, complete genes and operons, and reference-quality metagenome-assembled genomes.

pb-metagenomics-tools

pb-metagenomics-tools is a collection of recommended tools and common-practice workflows for metagenomic profiling and assembly using HiFi reads.

Explore a list of metagenomics analyses using HiFi data.

Access pb-metagenomics-tools via GitHub

pb-16S-nf

The pb-16S-nf Nextflow pipeline is designed to process HiFi full-length 16S data into high-quality amplicon sequence variants (ASVs) using QIIME 2 and DADA2. The pipeline provides a set of visualizations through the QIIME 2 framework for interactive plotting and generates an HTML report of key statistics and top taxonomies.

Access pb-16S-nf via GitHub

Metagenomics datasets

Application | Dataset | Download literature | Technology | Sequencing system
Metagenomic profiling and assembly | ZymoBIOMICS Fecal Reference with TruMatrix Technology (human) | Microbiome and metagenome sequencing with HiFi reads | HiFi long read | Revio system
Metagenomic profiling and assembly | ZymoBIOMICS Fecal Reference with TruMatrix Technology (human) | Microbiome and metagenome sequencing with HiFi reads | HiFi long read | Sequel IIe system
Metagenomic profiling and assembly | 20 strain mock microbial community – ATCC MSA-1003 – shotgun | Microbiome and metagenome sequencing with HiFi reads | HiFi long read | Sequel II system
Metagenomic profiling and assembly | Human gut microbiome pooled standards | Microbiome and metagenome sequencing with HiFi reads | HiFi long read | Sequel IIe system
Full-length 16S sequencing | 20 strain mock microbial community – ATCC MSA-1003 – 16S | Microbiome and metagenome sequencing with HiFi reads | HiFi long read | Sequel II system

  1. Publication: Chen X, Harting J, Farrow E, et al. Comprehensive SMN1 and SMN2 profiling for spinal muscular atrophy analysis using long-read PacBio HiFi sequencing. The American Journal of Human Genetics. 2023;0(0). doi:10.1016/j.ajhg.2023.01.001
  2. Publication: Dolzhenko, E., et al. (2023). Resolving the unsolved: Comprehensive assessment of tandem repeats at scale. bioRxiv 2023.05.12.540470.
