Two Review Articles Assess Structural Variation in Human Genomes
Thursday, December 5, 2019
Two recent review articles discuss the idea that structural variants (SVs) — genetic differences that involve at least 50 base pairs — are numerous, important to human biology, and best detected with long reads. The authors review years of studies that have applied PacBio SMRT Sequencing to identify around 20,000 SVs per human genome. The reviews also report on cases in which SMRT Sequencing has helped scientists discover pathogenic variants that explain diseases for which there had previously been no clear genetic cause.
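To make the 50 base pair threshold concrete, here is a minimal Python sketch that labels a variant by comparing the lengths of its reference and alternate alleles. The function name `classify_variant` and the labels are hypothetical, chosen only for illustration. The sketch covers only simple insertions and deletions written as explicit sequences, not inversions, translocations, or the symbolic alleles that SV callers often emit.

```python
SV_MIN_LENGTH_BP = 50  # size threshold from the definition above


def classify_variant(ref: str, alt: str) -> str:
    """Label a variant by the length difference between its alleles.

    Assumption: both alleles are given as explicit sequences, so the
    event size is the absolute difference in allele length.
    """
    if len(ref) == len(alt):
        # Same-length substitution: a single base (SNV) or several (MNV).
        return "SNV" if len(ref) == 1 else "MNV"
    size = abs(len(alt) - len(ref))
    return "SV" if size >= SV_MIN_LENGTH_BP else "small indel"


if __name__ == "__main__":
    print(classify_variant("A", "G"))             # single-base substitution
    print(classify_variant("A", "A" + "T" * 12))  # 12 bp insertion
    print(classify_variant("A" + "C" * 300, "A")) # 300 bp deletion
```

Under this convention, a 49 bp insertion counts as a small indel while a 50 bp insertion counts as an SV. This matches the definition used here, although individual tools may apply the cutoff differently.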
In Nature Reviews Genetics, Steve Ho, Alexander Urban, and Ryan Mills from the University of Michigan and Stanford University consider the algorithms and detection platforms that have enabled a wave of new discovery related to SVs. SVs cannot be reliably detected using short reads since many of the variants are significantly longer than those reads. “Because of this, the degree to which contemporary genomics has studied SNVs compared with SVs is significantly skewed,” Ho et al. write. “A recent analysis found that PacBio long reads were approximately three times more sensitive than a short-read ensemble maximized for sensitivity, implying that a large subset of SVs, many 50–2,000 bp in length, are unresolvable without long reads.”
Ho et al. also discuss the software tools that are useful for calling SVs in long reads, including Sniffles and pbsv. They summarize important projects that have used SMRT Sequencing to look for these variants — including Euan Ashley’s publication on Carney complex and Naomichi Matsumoto’s report on a large deletion that causes epilepsy.
The other review appears in Genome Biology, contributed by lead authors Medhat Mahmoud and Nastassia Gobet, senior author Fritz Sedlazeck, and collaborators at the University of Lausanne, Baylor College of Medicine, and other institutions. “Recent research into structural variants (SVs) has established their importance to medicine and molecular biology, elucidating their role in various diseases, regulation of gene expression, ethnic diversity, and large-scale chromosome evolution,” the authors state. “SVs are increasingly being recognized as an important class of variants, which need to be considered in evolutionary, population, and clinical genomics.”
The team reviews the value of long reads for finding SVs, noting that they “are advantageous for SV calling because they can span repetitive or other problematic regions.” The scientists also walk through the pros and cons of various alignment and SV-calling tools developed for long reads, including NGMLR, minimap2, Sniffles, and pbsv.
SMRT Sequencing provides high precision and recall for SVs in a human genome with just one SMRT Cell on the Sequel II System. Multiplexing two samples per SMRT Cell 8M brings the approximate reagent cost of structural variant detection to $670 per sample.
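The arithmetic behind the per-sample figure can be sketched in a few lines of Python. This assumes the quoted reagent cost is split evenly across the multiplexed samples and ignores any per-sample extras such as barcoding. The function name is hypothetical, and the per-cell figure is inferred from the numbers above rather than stated in the source.

```python
def implied_reagent_cost_per_cell(cost_per_sample: int, samples_per_cell: int) -> int:
    """Back out the per-cell reagent cost, assuming an even split across samples."""
    return cost_per_sample * samples_per_cell


if __name__ == "__main__":
    # $670 per sample with two samples multiplexed on one SMRT Cell 8M
    print(implied_reagent_cost_per_cell(670, 2))  # prints 1340
```

Under that assumption, the two-sample figure implies roughly $1,340 in reagents per SMRT Cell 8M.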