May 6, 2020  |  Data analysis

Data Release: HiFi Sequencing Results for Plants, Animals, and Microbes

UPDATE — November 17, 2020: This paper is now published in Scientific Data.
ORIGINAL POST
It’s been more than a year since we introduced HiFi sequencing to generate highly accurate long reads. In that time, we’ve seen many PacBio users make HiFi sequencing their go-to setting because it’s simple, reliable, and cost-effective. For scientists who have yet to generate their own HiFi data, we thought it might be helpful to publish a few data sets for exploration and analysis.
In a new preprint, we have released HiFi data sets for five samples: mouse, frog, maize, strawberry, and a mock metagenome community. We like to think there’s a data set for everyone here, whatever your research area! Working with any of these HiFi read collections should offer a great introduction to this sequencing mode and show why users so often tell us that HiFi data is easier to analyze than traditional long reads.
Consistent with previous reports, the HiFi data generated for these five samples yielded excellent accuracy, with average read qualities ranging from 99.84% to 99.97% (Q28 to Q35). With that kind of accuracy, we look forward to seeing what interesting biology our collaborators find within these data.
 

| Organism | SRA | HiFi Data Yield (Gb) | Average Read Length (kb) | Average Read Quality (Phred Q) |
| --- | --- | --- | --- | --- |
| Mus musculus | SRR11606870 | 66.5 | 17.1 | 31 |
| Zea mays | SRR11606869 | 48.1 | 15.6 | 30 |
| Fragaria ananassa | SRR11606867 | 29.7 | 21.7 | 28 |
| Rana muscosa | SRR11606868 | 180.1 | 15.7 | 31 |
| MSA-1003 | SRR11606871 | 59.1 | 10.4 | 35 |

 
The five HiFi data sets generated on the Sequel II System
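
For readers wondering how the read quality values in the table relate to the accuracy percentages quoted earlier, here is a minimal Python sketch of the conversion. It assumes the Average Read Quality column is on the Phred scale (accuracy = 1 − 10^(−Q/10)), which is consistent with the 99.84% to 99.97% range mentioned in the post. The function names are illustrative and not part of any PacBio tool.

```python
import math


def phred_to_accuracy(q: float) -> float:
    """Convert a Phred quality score Q to expected accuracy: 1 - 10^(-Q/10)."""
    return 1.0 - 10.0 ** (-q / 10.0)


def accuracy_to_phred(accuracy: float) -> float:
    """Inverse conversion: accuracy (0 < accuracy < 1) back to a Phred score."""
    return -10.0 * math.log10(1.0 - accuracy)


# Average read quality per data set, copied from the table in this post.
# Interpreting these values as Phred scores is an assumption of this sketch.
read_quality = {
    "Mus musculus": 31,
    "Zea mays": 30,
    "Fragaria ananassa": 28,
    "Rana muscosa": 31,
    "MSA-1003": 35,
}

for organism, q in read_quality.items():
    accuracy_pct = phred_to_accuracy(q) * 100
    print(f"{organism}: Q{q} is about {accuracy_pct:.2f}% accurate")
```

Under this reading, the lowest value in the table (Q28) corresponds to roughly 99.84% accuracy and the highest (Q35) to roughly 99.97%, matching the range given above.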
 
In addition to letting scientists get a fresh look at HiFi data, we hope this release will encourage development of new applications and software for the benefit of the entire sequencing community. New and improved tools for assembling polyploid genomes or calling variants in non-model organisms are just a couple of areas we hope to see grow.
For those of you who want to use existing software to explore these data sets, here are some tools that we find useful for working with HiFi reads:

For this data release, we’d like to thank all of the collaborators who helped to generate and present these results: Jane Landolin (@jlandolin), Nicholas Maurer, David Kudrna, Michael Hardigan, Cynthia Steiner, Steven Knapp (@knapp1955), Doreen Ware, and Beth Shapiro (@bonesandbugs).
And congratulations to the PacBio team members who led the charge on this effort: Ting Hon, Kristin Mars, Greg Young (@PacbioGreg), Yu-Chih Tsai, Joseph Karalius (@JoeyKaralius), Paul Peluso, and David Rank.
Access all of the data sets in the preprint, ‘Highly accurate long-read HiFi sequencing data for five complex genomes’.

Interested in finding out more about HiFi data for sequencing your organism of interest? Get in touch with a PacBio scientist to scope out your project.
