April 21, 2020

Fast and accurate genomic analyses using genome graphs.

Authors: Rakocevic, Goran and Semenyuk, Vladimir and Lee, Wan-Ping and Spencer, James and Browning, John and Johnson, Ivan J and Arsenijevic, Vladan and Nadj, Jelena and Ghose, Kaushik and Suciu, Maria C and Ji, Sun-Gou and Demir, Gülfem and Li, Lizao and Toptas, Berke Ç and Dolgoborodov, Alexey and Pollex, Björn and Spulber, Iosif and Glotova, Irina and Kómár, Péter and Stachyra, Andrew L and Li, Yilong and Popovic, Milos and Källberg, Morten and Jain, Amit and Kural, Deniz

The human reference genome serves as the foundation for genomics by providing a scaffold for alignment of sequencing reads, but currently only reflects a single consensus haplotype, thus impairing analysis accuracy. Here we present a graph reference genome implementation that enables read alignment across 2,800 diploid genomes encompassing 12.6 million SNPs and 4.0 million insertions and deletions (indels). The pipeline processes one whole-genome sequencing sample in 6.5 h using a system with 36 CPU cores. We show that using a graph genome reference improves read mapping sensitivity and produces a 0.5% increase in variant calling recall, with unaffected specificity. Structural variations incorporated into a graph genome can be genotyped accurately under a unified framework. Finally, we show that iterative augmentation of graph genomes yields incremental gains in variant calling accuracy. Our implementation is an important advance toward fulfilling the promise of graph genomes to radically enhance the scalability and accuracy of genomic analyses.

Journal: Nature genetics
DOI: 10.1038/s41588-018-0316-4
Year: 2019

