June 1, 2021

Structural variant calling combining Illumina and low-coverage PacBio

Author(s): Carroll, A.

Detection of large genomic variation (structural variants) has proven challenging with short-read methods. Long-read approaches, which can span these large events, promise to dramatically expand the ability to call structural variants accurately. Although sequencing with Pacific Biosciences (PacBio) long-read technology has become increasingly high-throughput, generating high coverage can still be limiting, and investigators often want to know what PacBio coverage is adequate for calling structural variants. Here, we present a method that identifies a substantially higher fraction of structural variants in the human genome from low-coverage PacBio data, using multiple strategies for ensembling data types and algorithms. Algorithmically, we combine three structural variant callers: PBHoney by Adam English, Sniffles by Fritz Sedlazeck, and Parliament by Adam English (which we have modified to improve its speed). Parliament itself uses a combination of PacBio and Illumina data with a number of short-read callers (Breakdancer, Pindel, Crest, CNVnator, Delly, and Lumpy). We show that the outputs of these three programs are largely complementary, with each able to uniquely access different sets of structural variants at different coverages. Combining them can more than double the recall of true structural variants from a truth set relative to sequencing with Illumina alone, with substantial improvements even at low PacBio coverages (3x to 7x). This allows us to present, for the first time, cost-benefit tradeoffs to investigators: how much PacBio sequencing yields what improvement in SV calling. This work also builds on the foundational work of Genome in a Bottle, led by Justin Zook, in establishing a truth set for structural variants in the recently released Ashkenazi Jewish trio data. This work demonstrates the power of this benchmark set, one of the first of its kind for structural variation data, to help understand and refine the accuracy of structural variant calling across a number of approaches.
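To make the idea of ensembling complementary callers concrete, here is a minimal sketch of how recall against a truth set might be computed for one caller and for the union of several. It is not the merging logic of PBHoney, Sniffles, or Parliament. The `SV` record, the 50% reciprocal-overlap match rule, the function names, and the toy coordinates are all assumptions made for illustration, and the toy data is not taken from the poster.

```python
from typing import Iterable, List, NamedTuple


class SV(NamedTuple):
    """Hypothetical minimal structural-variant record."""
    chrom: str
    start: int
    end: int
    svtype: str  # e.g. "DEL"


def matches(call: SV, truth: SV, min_overlap: float = 0.5) -> bool:
    """Assumed match rule: same chromosome and type, and the overlap covers
    at least `min_overlap` of both intervals (reciprocal overlap).
    Zero-length events such as insertions would need a different rule."""
    if call.chrom != truth.chrom or call.svtype != truth.svtype:
        return False
    overlap = min(call.end, truth.end) - max(call.start, truth.start)
    if overlap <= 0:
        return False
    return (overlap >= min_overlap * (call.end - call.start)
            and overlap >= min_overlap * (truth.end - truth.start))


def recall(calls: Iterable[SV], truth_set: List[SV]) -> float:
    """Fraction of truth-set events matched by at least one call."""
    calls = list(calls)
    if not truth_set:
        return 0.0
    found = sum(1 for t in truth_set if any(matches(c, t) for c in calls))
    return found / len(truth_set)


def ensemble(*call_sets: Iterable[SV]) -> List[SV]:
    """Simplest possible ensemble: the union of all call sets."""
    merged: List[SV] = []
    for call_set in call_sets:
        merged.extend(call_set)
    return merged


# Toy truth set and three toy call sets (all coordinates invented).
truth = [
    SV("chr1", 1000, 2000, "DEL"),
    SV("chr1", 5000, 5600, "DEL"),
    SV("chr2", 300, 900, "DEL"),
    SV("chr3", 100, 700, "DEL"),
]
caller_a = [SV("chr1", 1010, 1990, "DEL")]
caller_b = [SV("chr1", 5050, 5580, "DEL")]
caller_c = [SV("chr2", 280, 910, "DEL"), SV("chr1", 1000, 2000, "DEL")]

print("caller_a recall:", recall(caller_a, truth))
print("ensemble recall:", recall(ensemble(caller_a, caller_b, caller_c), truth))
```

In this toy setup each caller recovers a different truth event, so the union recovers more of the truth set than any single caller. A plain union also accumulates every caller's false positives, so a real evaluation would track precision alongside recall.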

Organization: DNAnexus
Year: 2016

