April 21, 2020

Multi-platform discovery of haplotype-resolved structural variation in human genomes.

Authors: Chaisson, Mark J P and Sanders, Ashley D and Zhao, Xuefang and Malhotra, Ankit and Porubsky, David and Rausch, Tobias and Gardner, Eugene J and Rodriguez, Oscar L and Guo, Li and Collins, Ryan L and Fan, Xian and Wen, Jia and Handsaker, Robert E and Fairley, Susan and Kronenberg, Zev N and Kong, Xiangmeng and Hormozdiari, Fereydoun and Lee, Dillon and Wenger, Aaron M and Hastie, Alex R and Antaki, Danny and Anantharaman, Thomas and Audano, Peter A and Brand, Harrison and Cantsilieris, Stuart and Cao, Han and Cerveira, Eliza and Chen, Chong and Chen, Xintong and Chin, Chen-Shan and Chong, Zechen and Chuang, Nelson T and Lambert, Christine C and Church, Deanna M and Clarke, Laura and Farrell, Andrew and Flores, Joey and Galeev, Timur and Gorkin, David U and Gujral, Madhusudan and Guryev, Victor and Heaton, William Haynes and Korlach, Jonas and Kumar, Sushant and Kwon, Jee Young and Lam, Ernest T and Lee, Jong Eun and Lee, Joyce and Lee, Wan-Ping and Lee, Sau Peng and Li, Shantao and Marks, Patrick and Viaud-Martinez, Karine and Meiers, Sascha and Munson, Katherine M and Navarro, Fabio C P and Nelson, Bradley J and Nodzak, Conor and Noor, Amina and Kyriazopoulou-Panagiotopoulou, Sofia and Pang, Andy W C and Qiu, Yunjiang and Rosanio, Gabriel and Ryan, Mallory and Stütz, Adrian and Spierings, Diana C J and Ward, Alistair and Welch, AnneMarie E and Xiao, Ming and Xu, Wei and Zhang, Chengsheng and Zhu, Qihui and Zheng-Bradley, Xiangqun and Lowy, Ernesto and Yakneen, Sergei and McCarroll, Steven and Jun, Goo and Ding, Li and Koh, Chong Lek and Ren, Bing and Flicek, Paul and Chen, Ken and Gerstein, Mark B and Kwok, Pui-Yan and Lansdorp, Peter M and Marth, Gabor T and Sebat, Jonathan and Shi, Xinghua and Bashir, Ali and Ye, Kai and Devine, Scott E and Talkowski, Michael E and Mills, Ryan E and Marschall, Tobias and Korbel, Jan O and Eichler, Evan E and Lee, Charles

The incomplete identification of structural variants (SVs) from whole-genome sequencing data limits studies of human genetic diversity and disease association. Here, we apply a suite of long-read, short-read, strand-specific sequencing technologies, optical mapping, and variant discovery algorithms to comprehensively analyze three trios to define the full spectrum of human genetic variation in a haplotype-resolved manner. We identify 818,054 indel variants (<50 bp) and 27,622 SVs (≥50 bp) per genome. We also discover 156 inversions per genome and 58 of the inversions intersect with the critical regions of recurrent microdeletion and microduplication syndromes. Taken together, our SV callsets represent a three to sevenfold increase in SV detection compared to most standard high-throughput sequencing studies, including those from the 1000 Genomes Project. The methods and the dataset presented serve as a gold standard for the scientific community allowing us to make recommendations for maximizing structural variation sensitivity for future genome sequencing studies.

Journal: Nature communications
DOI: 10.1038/s41467-018-08148-z
Year: 2019

