Pacific Biosciences Highlights Increased Focus on Human Genome Research at AGBT 2014; Releasing 54x Coverage Human Genome Data for de novo Assembly
Monday, February 10, 2014
MENLO PARK, Calif., Feb. 10, 2014 (GLOBE NEWSWIRE) -- Pacific Biosciences of California, Inc. (Nasdaq:PACB), provider of the PacBio® RS II DNA Sequencing System, announced that its Single Molecule, Real-Time (SMRT®) Sequencing technology will be featured in nine podium presentations and 29 posters at this year’s Advances in Genome Biology and Technology (AGBT) meeting, with more than half presenting research on human and other complex genomes. In addition, on Wednesday the company will publicly release a long-read dataset for generating the first de novo human genome assembly from PacBio-only sequence reads.
“Recent performance increases on the PacBio RS II — notably substantial improvements in throughput — are allowing scientists to approach much larger genome sequencing projects. The list of recent accomplishments using SMRT Sequencing includes agriculturally valuable plants such as spinach, model organisms such as Drosophila and Arabidopsis, and now, very high-quality work on human genomes,” said Michael Hunkapiller, Chief Executive Officer of Pacific Biosciences. “We are delighted with the diversity of applications that will be showcased by our customers this year at AGBT, including unprecedented work on the human genome.”
Pacific Biosciences used its P5-C3 sequencing chemistry to generate ~54x coverage on a well-studied human haploid cell line (CHM1htert), which is being utilized as part of a National Institutes of Health project to sequence and assemble an alternate reference genome (the so-called “platinum genome”), an effort led by Rick Wilson from Washington University in St. Louis and Evan Eichler from the University of Washington, in collaboration with investigators from the National Center for Biotechnology Information (NCBI).
The human genome dataset is being deposited in the public domain to offer the bioinformatics and scientific communities an additional dataset to accelerate the understanding of genome-wide variation at all genome size scales, and to improve assembly techniques. To demonstrate the value of using PacBio long-read data to create de novo assemblies of human genomes following the general Hierarchical Genome Assembly Process (HGAP), Pacific Biosciences collaborated with Google to leverage the Google Cloud Platform for the most computationally intensive part of the assembly pipeline. In a single day, the pipeline executed 405,000 CPU hours to align the long reads to each other. These data were transferred back to the company to complete the assembly process, which resulted in a 3.25 Gb assembly with a contig N50 of 4.38 Mb, and with the longest contig being 44 Mb. This represents over an order of magnitude better N50 than the most recent reference-guided assembly using Illumina® sequencing and BAC-clone finishing on the same sample, which had a total assembly size of 2.83 Gb and a contig N50 of 144 kb.
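For readers unfamiliar with the contig N50 metric, the short Python sketch below illustrates how it is commonly computed and reproduces the arithmetic behind the figures quoted above. It is an illustration only, not part of the company's assembly pipeline: the function name `n50` and the toy contig lengths are hypothetical, and the definition used (the largest contig length at which contigs of that length or longer account for at least half of the total assembly) is the conventional one, assumed here rather than stated in this release.

```python
def n50(lengths):
    """Return the N50 of a list of contig lengths.

    Conventional definition (assumed here): walk the contigs from longest
    to shortest and return the length of the contig at which the running
    total first reaches at least half of the total assembly size.
    """
    if not lengths:
        raise ValueError("need at least one contig length")
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length


# Hypothetical toy assembly (arbitrary units), not real data.
toy_contigs = [80, 70, 50, 40, 30, 20, 10]
toy_n50 = n50(toy_contigs)

# Figures quoted in the release, expressed in kilobases.
pacbio_n50_kb = 4380.0      # 4.38 Mb contig N50 (PacBio-only assembly)
reference_n50_kb = 144.0    # 144 kb contig N50 (reference-guided assembly)
n50_ratio = pacbio_n50_kb / reference_n50_kb

# 405,000 CPU hours executed within a single 24-hour day.
avg_concurrent_cpus = 405_000 / 24

print(f"toy N50: {toy_n50}")
print(f"N50 improvement: about {n50_ratio:.1f}x")
print(f"average concurrent CPUs: {avg_concurrent_cpus:,.0f}")
```

Under these assumptions, the quoted N50 values differ by roughly a factor of 30, consistent with "over an order of magnitude," and 405,000 CPU hours in one day corresponds to an average of about 16,875 CPUs running concurrently.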
Deanna Church, currently of Personalis, Inc. and previously at NCBI as a founding member of the Genome Reference Consortium, commented: “The human reference assembly is central to all modern sequence analysis; therefore it is critical that the assembly is of the highest possible quality. The Genome Reference Consortium has been working towards this goal for years, and the CHM1 resource was invaluable for some of the improvements in GRCh38 (the latest human reference version released by NCBI). However, even the latest GRCh38 assembly has regions that need improvement, and additional sequencing technologies like SMRT Sequencing will clearly be necessary to continue improving the reference assembly. Personalis has also worked to develop advanced human reference versions and applauds new accomplishments by others in this area.”
PacBio’s 54x data initiative was a follow-on project from the October 2013 release of a 10x coverage dataset of a human genome for detecting structural variation relative to the human reference genome. Human genomes harbor many potentially medically relevant structural variations, which are often difficult or impossible to resolve using short-read technologies. To date, most studies on human variation consist of resequencing and comparing to a human reference genome. However, in order to comprehensively assess genetic variation between humans, de novo assemblies and subsequent comparison between genomes are desirable.
The unique value of this dataset will be described in several presentations at AGBT, including a talk by Gene Myers of the Max Planck Institute for Molecular Cell Biology and Genetics titled “A De Novo Whole Genome Shotgun Assembler for Noisy Long Read Data.” The data will also be discussed in a presentation by PacBio’s Senior Director of Bioinformatics Jason Chin titled “String Graph Assembly For Diploid Genomes With Long Reads,” and in a company workshop on Friday hosted by Chief Scientific Officer Jonas Korlach.
The human genome dataset will be summarized and accessible via the PacBio blog on the morning of Wednesday, February 12. More information about PacBio-related activities at AGBT 2014 is available at www.pacb.com/agbt.
About the PacBio RS II and SMRT Sequencing
Pacific Biosciences’ Single Molecule, Real-Time (SMRT) Sequencing technology achieves the industry’s longest read lengths and highest consensus accuracy (i, ii), along with the least degree of bias (iii). These characteristics, combined with its ability to detect many types of DNA base modifications (e.g., methylation) as part of the sequencing process, mean the PacBio RS II provides a window into critical biological processes and medically, agriculturally, and industrially relevant genetic and genomic variation that can only be revealed with SMRT Sequencing technology.
About Pacific Biosciences
Pacific Biosciences of California, Inc. (Nasdaq:PACB) offers the PacBio® RS II DNA Sequencing System to help scientists solve genetically complex problems. Based on its novel Single Molecule, Real-Time (SMRT®) technology, the company’s products enable: targeted sequencing to more comprehensively characterize genetic variations; de novo genome assembly to more fully identify, annotate and decipher genomic structures; and DNA base modification identification to help characterize epigenetic regulation and DNA damage. By providing access to information that was previously inaccessible, Pacific Biosciences enables scientists to increase their understanding of biological systems.
More information is available at: www.pacb.com.
i Koren et al., “Reducing assembly complexity of microbial genomes with single-molecule sequencing.” Genome Biology, 14:R101 (2013).
ii Chin et al., “Nonhybrid, finished microbial genome assemblies from long-read SMRT sequencing data.” Nature Methods, 10:563-569 (2013).
iii Ross et al., “Characterizing and measuring bias in sequence data.” Genome Biology, 14:R51 (2013).
Source: Pacific Biosciences of California, Inc.
News Provided by Acquire Media