
Identifying Structural Variants in NA12878 from Low-Fold Coverage Sequencing on the PacBio Sequel System

Wednesday, October 19, 2016

Recent de novo assemblies of individual human genomes have uncovered thousands of structural variants, many of which are accessible only with PacBio long reads [1-3].

Personal Genome | PacBio Coverage | Deletions ≥50 bp | Insertions ≥50 bp
CHM1 [1]        | 41-fold         | 6,111            | 9,638
HX1 [2]         | 103-fold        | 9,891            | 10,284
AK1 [3]         | 101-fold        | 7,358            | 10,077

A similar increase in structural variant sensitivity relative to short-read methods has been demonstrated with low-fold coverage PacBio sequencing interpreted against the reference genome [4].  To demonstrate and evaluate the low-fold coverage approach on the PacBio Sequel System, we generated approximately 10-fold coverage of the well-studied human sample NA12878.

Methods

Purified NA12878 DNA was obtained from Coriell, sheared to an average size of 25 kb, converted to SMRTbell templates, and size-selected to 15 kb on the BluePippin system (Sage Science). The resulting library was loaded on 10 SMRT Cells, and each SMRT Cell was run for 6 hours on the Sequel System with chemistry v1.2. This chemistry is older than the v1.2.1 chemistry used for the recently released Arabidopsis data, which yields about 5 Gb per SMRT Cell with a read length N50 of 16.4 kb. In total, the runs generated 32.8 Gb of data in 3.4 million reads, with half of the bases in reads longer than 11.8 kb.
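The yield figures above translate into fold coverage and read length N50 with simple arithmetic. The following is a minimal sketch of that arithmetic, not the pipeline used for this post; the 3.1 Gb genome size is an assumed approximation, and the function names and the toy read lengths are hypothetical.

```python
def read_length_n50(lengths):
    """Return the length L such that reads of length >= L hold at least half of all bases."""
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length
    raise ValueError("no reads supplied")


def fold_coverage(total_bases, genome_size):
    """Average number of times each reference base is covered."""
    return total_bases / genome_size


# Figures reported in the post.
TOTAL_BASES = 32.8e9
SMRT_CELLS = 10
NUM_READS = 3.4e6

# Assumption: approximate haploid human genome size; not stated in the post.
GENOME_SIZE = 3.1e9

coverage = fold_coverage(TOTAL_BASES, GENOME_SIZE)
yield_per_cell_gb = TOTAL_BASES / SMRT_CELLS / 1e9
mean_read_length = TOTAL_BASES / NUM_READS

# Toy read lengths, invented only to exercise the N50 function.
toy_n50 = read_length_n50([2000, 4000, 6000, 8000, 10000])

print(f"coverage: {coverage:.1f}-fold")
print(f"yield per SMRT Cell: {yield_per_cell_gb:.2f} Gb")
print(f"mean read length: {mean_read_length:.0f} bp")
print(f"toy N50: {toy_n50} bp")
```

Under the assumed genome size, 32.8 Gb works out to a little over 10-fold coverage, consistent with the "approximately 10-fold" description.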

Sequencing Metrics

SMRT Cells      | 10
Run Time        | 60 hrs
Number of Bases | 32.8 Gb
Number of Reads | 3.4 M
Read Length N50 | 11,823 bp

Reads were mapped to the GRCh37 human reference genome with NGM-LR [5], and structural variants were called with PBHoney [6].  A total of 7,386 deletions and 7,445 insertions of at least 50 bp were identified and comprise the “10-fold SV call set.”
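The ≥50 bp threshold that defines the call set can be illustrated with a short filter-and-count step. This is a hedged sketch only: the `Variant` record, the `count_structural_variants` helper, and the toy calls are hypothetical and do not reflect the PBHoney output format.

```python
from collections import Counter
from typing import NamedTuple

MIN_SV_LENGTH = 50  # size threshold stated in the post


class Variant(NamedTuple):
    """Hypothetical minimal record for one call; not a real caller's format."""
    chrom: str
    start: int
    sv_type: str  # "DEL" or "INS"
    length: int   # variant length in bp


def count_structural_variants(variants, min_length=MIN_SV_LENGTH):
    """Count deletions and insertions whose length meets the threshold."""
    counts = Counter()
    for variant in variants:
        if variant.sv_type in ("DEL", "INS") and variant.length >= min_length:
            counts[variant.sv_type] += 1
    return counts


# Toy calls with invented positions, used only to exercise the filter.
toy_calls = [
    Variant("chrX", 116454350, "DEL", 315),
    Variant("chr10", 92215000, "INS", 328),
    Variant("chr1", 1000000, "DEL", 12),   # below threshold, excluded
    Variant("chr2", 2000000, "INS", 50),   # exactly at threshold, included
]

sv_counts = count_structural_variants(toy_calls)
print(dict(sv_counts))
```

Applied to a full call set, the same tally would produce the deletion and insertion totals reported above.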

Visualizing Structural Variants

Ongoing updates to the IGV browser [7] (available now in the development version) improve visualization of PacBio reads and structural variants. With these updates, IGV clearly represents deletions, insertions, and trinucleotide repeats, and shows how long reads span structural variants.

Heterozygous 315 bp deletion at chrX:116,454,160-116,454,859


Homozygous 328 bp insertion at chr10:92,213,800-92,216,245


Small expansion of the FMR1 trinucleotide repeat at chrX:146,993,200-146,993,950


Evaluation of 10-fold Call Set

To quantify sensitivity, the 10-fold SV call set was compared to a merged NA12878 “truth” set from the 1000 Genomes Project [8] and Genome in a Bottle [9].

Set                               | Platform      | Deletions ≥50 bp | Insertions ≥50 bp
truth: 1000 Genomes + GIAB [8,9]  | Illumina      | 3,021            | 1,090
10-fold SV call set               | PacBio Sequel | 7,386            | 7,445

The 10-fold SV call set recalls 86% of the deletions and 81% of the insertions in the truth set. It also includes thousands of deletions and insertions that are absent from the truth set, most of which are directly confirmed by a FALCON-Unzip de novo assembly from 60-fold PacBio RS II coverage.
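Recall here is the fraction of truth-set variants that have a matching call. The post does not state the matching criterion, so the sketch below assumes a 50% reciprocal-overlap rule for deletions; that rule, the function names, and the toy intervals are all assumptions for illustration.

```python
def reciprocal_overlap(a_start, a_end, b_start, b_end):
    """Smaller of the two overlap fractions between intervals a and b."""
    overlap = min(a_end, b_end) - max(a_start, b_start)
    if overlap <= 0:
        return 0.0
    return min(overlap / (a_end - a_start), overlap / (b_end - b_start))


def recall(truth, calls, min_overlap=0.5):
    """Fraction of truth intervals matched by at least one call.

    Assumption: a match needs the same chromosome and a reciprocal
    overlap of at least min_overlap; the post does not specify its rule.
    """
    if not truth:
        raise ValueError("truth set is empty")
    found = 0
    for chrom, start, end in truth:
        if any(
            call_chrom == chrom
            and reciprocal_overlap(start, end, call_start, call_end) >= min_overlap
            for call_chrom, call_start, call_end in calls
        ):
            found += 1
    return found / len(truth)


# Toy deletion intervals (chrom, start, end), invented for illustration.
toy_truth = [("chr1", 1000, 1400), ("chr1", 5000, 5100), ("chr2", 300, 900)]
toy_calls = [("chr1", 1010, 1390), ("chr2", 2000, 2600), ("chr1", 5080, 5300)]

toy_recall = recall(toy_truth, toy_calls)
print(f"toy recall: {toy_recall:.2f}")
```

Because the reported percentages are rounded, they correspond to roughly 2,600 of the 3,021 truth-set deletions and roughly 880 of the 1,090 truth-set insertions. Insertions have no reference span, so a real comparison would likely match them by breakpoint distance and size instead of overlap.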

In summary, this 10-fold SV call set demonstrates that low-fold coverage sequencing on the PacBio Sequel System is an affordable, effective approach for identifying structural variants and provides much improved sensitivity compared to short-read approaches.  We are excited to see how this approach will be extended and applied to study genetic variation in disease cohorts, in human populations, and in other organisms.

 

Data Availability

The NA12878 Sequel data are available for analysis on DNAnexus as an illustration of the low-fold coverage structural variant calling workflow.

[1] Chaisson MJ, et al. (2015). Nature, 517(7536):608-11.

[2] Shi L, et al. (2016). Nat Commun, 7:12065.

[3] Seo JS, et al. (2016). Nature, 538(7624):243-7.

[4] English AC, et al. (2015). BMC Genomics, 16:286.

[5] https://github.com/philres/nextgenmap-lr

[6] English AC, et al. (2014). BMC Bioinformatics, 15:180.

[7] Robinson JT, et al. (2011). Nat Biotechnol, 29(1):24-6.

[8] Sudmant PH, et al. (2015). Nature, 526(7571):75-81.

[9] Parikh H, et al. (2016). BMC Genomics, 17:64.
