Broad Institute Scientists Use Sequel II System for Trios, Structural Variant Detection
Monday, June 17, 2019
As we geared up for the launch of our new Sequel II System, we had the good fortune of working closely with several expert customers in an early access program. Recently, three of those customers reported on their experience with the new sequencing system in a webinar. In this blog series, we’ll be summarizing each speaker’s presentation, and the full recording is available to view.
First up was Kiran Garimella (@KiranGarimella), a senior computational scientist at the Broad Institute who focused on the use of HiFi reads, which are long (>10 kb) and accurate (>99%) sequences produced by the Sequel II System with circular consensus sequencing (CCS). Garimella and the team at the Broad Institute used the early access program to sequence trios from the Human Genome Structural Variation Consortium (HGSVC), clinical samples, and tumor/normal pairs.
Garimella reported average raw yields of 300 Gb per SMRT Cell 8M across 32 runs. Using a cloud-based pipeline he developed, the Broad Institute processed raw reads into HiFi reads and variant calls in 1-2 days. The HiFi reads, which averaged 10 subread passes, achieved quality scores from Q23 to Q25, which is comparable to the Q24 to Q25 of recent short-read data from Platinum Genomes. Garimella called the level of accuracy “remarkable” for long reads. “We’re very impressed by the PacBio Sequel II data,” Garimella added.
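For readers less familiar with quality scores, here is a minimal sketch of how a Q value maps to per-base accuracy. It assumes the scores quoted above are on the standard Phred scale (Q = -10 · log10 of the error probability), which the presentation summary does not state explicitly. The function names are ours, chosen for illustration.

```python
import math


def phred_to_error_prob(q: float) -> float:
    """Convert a Phred-scaled quality score to an error probability."""
    return 10 ** (-q / 10)


def phred_to_accuracy(q: float) -> float:
    """Convert a Phred-scaled quality score to per-base accuracy."""
    return 1.0 - phred_to_error_prob(q)


def error_prob_to_phred(p: float) -> float:
    """Convert an error probability back to a Phred-scaled quality score."""
    return -10 * math.log10(p)


if __name__ == "__main__":
    # Quality scores mentioned in the post (assumed to be Phred-scaled).
    for q in (23, 24, 25):
        print(f"Q{q}: roughly {phred_to_accuracy(q):.2%} accuracy")
```

Under that assumption, Q23 to Q25 works out to roughly 99.5% to 99.7% accuracy, or about one error in every 200 to 300 bases.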
Garimella used the HiFi reads to examine structural variation and haplotype phasing, both of which have been difficult to resolve with short reads. He showed an example of a heterozygous structural variant in the well-characterized NA12878 sample that is clear in HiFi reads but difficult to detect with short reads. He also showed an example of variant calling in complex loci such as the HLA genes. This is “why the Broad is so excited about long-read sequencing,” he said.
The NA12878 HiFi dataset, and others from the HGSVC, will be released publicly to help with establishing ground truth benchmarks for structural variation.
For more details, watch Garimella’s full presentation: