PacBio HiFi sequencing for the de novo genome assembly of microbial, fungal, plant, animal and human genomes has been described in hundreds of publications, and its best-in-class performance has been validated by the scientific community in many benchmark comparisons. In another recent technology comparison study entitled “Benchmarking of long-read sequencing, assemblers and polishers for yeast genome” by Dr. Chen-Guang Liu, Dr. Zhuo Wang and collaborators, the researchers performed a very detailed evaluation of PacBio HiFi and Oxford Nanopore Technologies sequencing and assemblers for the widely used Saccharomyces cerevisiae (SC) strain S288C, as well as two additional industrial yeast strains.
The researchers concluded that the recommended optimal sequencing depth was significantly different for the two technologies:
“For high-quality genome construction of fungi such as yeast, the sequencing depth should not be less than 80X for ONT … and 20X for HiFi data.”
Despite this fourfold lower coverage of the HiFi datasets, the authors observed that the HiFi-based assemblies still outperformed the higher-depth ONT assemblies, stating:
- “using HiFi reads constructed the best assemblies”
- “the contig numbers of assemblies obtained by the HiFi data are lower than that of the ONT data”
- “the N50 and complete gene number from BUSCO of assemblies obtained by HiFi were slightly higher than that of ONT, indicating that the genome quality from HiFi reads has an overall improvement compared to ONT”
- “genome construction based on HiFi data consumed less computational resources compared to ONT,” and “the CPU time and memory usage of HiFi datasets are smaller than that of ONT datasets at the same depth”
The researchers observed very similar results for the assemblies of the two additional yeast strains, writing: “HiFi data obtained genome with the highest N50 in both strains. And both HiFi pipelines have significantly higher complete gene numbers on SC assemblies than that of ONT pipelines.”
All of this was achieved with four times less data to begin with. It is therefore important to keep in mind that, because the underlying sequencing technologies differ in quality, the amount of data required for a given project can differ drastically. PacBio HiFi reads provide the most accurate and complete sequence information (now including methylation) and often require far less data and fewer computational resources than other technologies, resulting in the lowest overall project costs while consistently producing best-in-class results and biological insights.
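To make the difference in required data concrete, the recommended depths quoted above (80X for ONT, 20X for HiFi) can be converted into total sequenced bases. The following is a minimal sketch, not taken from the study: the helper name `required_bases` is hypothetical, and the genome size of roughly 12.1 Mb for S288C is an approximate value assumed here for illustration.

```python
def required_bases(genome_size_bp: int, depth: float) -> int:
    """Total sequenced bases needed to reach a target mean depth."""
    return round(genome_size_bp * depth)


# Assumption: approximate S. cerevisiae S288C genome size (~12.1 Mb).
GENOME_SIZE_BP = 12_100_000

# Minimum depths recommended in the study quoted above.
ONT_DEPTH = 80
HIFI_DEPTH = 20

ont_bases = required_bases(GENOME_SIZE_BP, ONT_DEPTH)
hifi_bases = required_bases(GENOME_SIZE_BP, HIFI_DEPTH)
ratio = ont_bases / hifi_bases

print(f"ONT  at {ONT_DEPTH}X: {ont_bases / 1e9:.2f} Gb")
print(f"HiFi at {HIFI_DEPTH}X: {hifi_bases / 1e9:.2f} Gb")
print(f"ONT requires {ratio:.0f}x the data of HiFi")
```

The ratio does not depend on the assumed genome size, so the same fourfold difference in required data applies to any genome for which these depth recommendations hold.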