
HiFi Reads Add Unparalleled Accuracy to the Long-Read Sequencing Arsenal

Monday, July 29, 2019

To enable better understanding of biology, sequencing data must be accurate and complete. This is especially true when seeking out variants and determining their implications.

Luckily, technical and software improvements for SMRT Sequencing are making it easier to efficiently generate genome assemblies with unparalleled accuracy.

As presented in a webinar by PacBio Staff Scientist Sarah Kingan (@drsarahdoom) and GoogleAI Genomics Project Lead Andrew Carroll (@acarroll_ATG), HiFi reads enabled by circular consensus sequencing (CCS) on the new Sequel II System challenge the notion that sequencing technologies require a tradeoff between length and accuracy.

Highly accurate long reads (HiFi reads) offer the benefits of long-read sequencing and the accuracy of short reads.


Kingan highlighted several benefits to using HiFi data for genome assembly: 

  • Higher accuracy of assemblies due to the high inherent base quality of HiFi reads
  • Dramatic time-savings in generating a genome assembly
  • Algorithmic improvements in the FALCON assembler that enhance the performance of HiFi assemblies

HiFi reads generated with CCS use single-molecule consensus, which increases their accuracy over traditional multi-molecule consensus.

HiFi reads are extremely accurate because they utilize single-molecule consensus, rather than multiple-molecule consensus, which is required for traditional long-read assembly methods. The resulting HiFi assemblies have higher base accuracy than assemblies produced by continuous long reads.
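The intuition behind single-molecule consensus can be illustrated with a toy example. The sketch below is not the CCS algorithm; it assumes the repeated passes over one molecule have already been aligned to the same length, and it simply takes a per-position majority vote. The function name `majority_consensus` and the example sequences are hypothetical.

```python
from collections import Counter


def majority_consensus(passes):
    """Return the per-column majority base across pre-aligned passes.

    Toy illustration only: real consensus calling aligns the passes
    and uses a probabilistic model rather than a simple vote.
    """
    if not passes:
        raise ValueError("need at least one pass")
    length = len(passes[0])
    if any(len(p) != length for p in passes):
        raise ValueError("passes must be pre-aligned to equal length")

    consensus = []
    for column in zip(*passes):
        # Pick the base seen most often at this position.
        base, _count = Counter(column).most_common(1)[0]
        consensus.append(base)
    return "".join(consensus)


# Five hypothetical passes over the same molecule, two with one error each.
passes = [
    "ACGTTAGC",
    "ACGTAAGC",  # error at index 4
    "ACGTTAGC",
    "ACCTTAGC",  # error at index 2
    "ACGTTAGC",
]
print(majority_consensus(passes))  # ACGTTAGC
```

Because the errors in individual passes fall at different positions, each one is outvoted by the other passes of the same molecule.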

HiFi reads are also produced more efficiently by CCS thanks to algorithmic enhancements that reduce compute time. With the upcoming software release, CCS for a single SMRT Cell 8M run on the Sequel II System can be completed in 3.5 hours.

Because the HiFi reads are already error corrected, the genome assembly process is simplified and streamlined, requiring only 20% of the compute time for a human genome compared to a continuous long read assembly.


HiFi reads reduce compute time and simplify the genome assembly process.


HiFi data needed HiFi-ready assembly tools

In order to make the most of these improvements, some assembly and analytical programs have also been modified.

Modifications to FALCON-Unzip have improved the contiguity of complex genome assemblies generated with HiFi reads.

While testing the system on several human and animal genomes, Kingan said the PacBio team achieved equivalent or higher contiguity in multiple species, such as the fruit fly and bluefin tuna. But in a complex plant genome such as rice, with its multitude of repeat-induced overlaps, the results weren’t as robust. 

So Kingan and colleagues modified the FALCON-Unzip assembler to make the most of the higher-accuracy HiFi reads. By ignoring indel differences, they were able to better assemble the plant genomes. These latest features will soon be added to the already-incorporated improvements of faster read tracking and polishing.
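As a rough illustration of what ignoring indel differences could mean when comparing two reads, the sketch below counts differences between two gapped, pre-aligned sequences and optionally skips columns that contain a gap. This is an assumption-laden toy, not FALCON-Unzip's implementation; the function name `count_differences` and the `-` gap convention are chosen here for illustration.

```python
def count_differences(aligned_a, aligned_b, ignore_indels=True):
    """Count differences between two gapped, pre-aligned sequences.

    Columns containing a gap ('-') are indels; other unequal columns
    are substitutions. With ignore_indels=True only substitutions count.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")

    substitutions = 0
    indels = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == "-" or b == "-":
            indels += 1
        elif a != b:
            substitutions += 1
    return substitutions if ignore_indels else substitutions + indels


# Two hypothetical reads that differ only by one base in a homopolymer run.
read_a = "ACGTAAAA-CGT"
read_b = "ACGTAAAAACGT"
print(count_differences(read_a, read_b))                       # 0
print(count_differences(read_a, read_b, ignore_indels=False))  # 1
```

Under this simplified view, two reads that differ only in the length of a repeat are treated as matching, while true substitution differences still count.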


Deep learning digs deeper

When it comes to assessing the “unknown unknowns,” artificial intelligence and machine learning are better than even the most robust human-designed algorithms, said GoogleAI’s Carroll.

His team has developed DeepVariant, a germline variant caller distinguished by its best-in-class accuracy. The open source program is also extensible: it can be re-trained for new technologies without writing new software, and this is exactly what his team did in order to better handle HiFi data.

HiFi read errors are different from short-read errors, Carroll explained. Short-read data can lead to mapping complexity and coverage variability. HiFi reads are much more mappable and uniform, but can have noisier indel lengths in homopolymers, he said.
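For readers unfamiliar with the term, a homopolymer is a run of the same base repeated consecutively. The short sketch below locates such runs in a sequence; the function name `homopolymer_runs`, the `min_length` threshold of 3, and the example sequence are assumptions made for illustration and are not part of DeepVariant or any PacBio tool.

```python
from itertools import groupby


def homopolymer_runs(sequence, min_length=3):
    """Return (start, base, length) for each run of identical bases
    that is at least min_length long."""
    runs = []
    position = 0
    for base, group in groupby(sequence):
        run_length = len(list(group))
        if run_length >= min_length:
            runs.append((position, base, run_length))
        position += run_length
    return runs


print(homopolymer_runs("ACGTTTTTACGGGA"))
# [(3, 'T', 5), (10, 'G', 3)]
```

An indel error inside one of these runs changes the reported run length (for example, five T's read as four), which is the kind of noise Carroll described.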

Carroll’s team fed the DeepVariant program millions of examples and labels from Genome in a Bottle to update weights in the model. Considering the range of uses and needs of PacBio users, they included data collected on both SMRT Cells 1M and 8M, featuring a variety of insert sizes and coverage levels.


Re-training of the DeepVariant machine learning algorithm led to increased accuracy for detecting insertion and deletion (indel) events.


Better training yields better results

The team saw somewhat improved SNP accuracy, which was already very high; substantially improved indel accuracy; and robust, more uniform coverage titrations. The developers were surprised to see that DeepVariant was also able to call some structural variants without specifically being trained to do so. And the improved DeepVariant v0.8 was able to confidently call regions that were previously deemed “difficult,” “non-confident,” or “non-callable.”

“AI-based programs actually benefit from more data and more difficult – and different – data,” Carroll said. “There are thousands more variants we can now call confidently.” Improved haplotype phasing and the ability to call variants in other HiFi data types are also on the horizon, Carroll said.


Watch the complete webinar and visit www.pacb.com/HiFi to learn more.