July 7, 2019

Extensive sequencing of seven human genomes to characterize benchmark reference materials.

Authors: Zook, Justin M and Catoe, David and McDaniel, Jennifer and Vang, Lindsay and Spies, Noah and Sidow, Arend and Weng, Ziming and Liu, Yuling and Mason, Christopher E and Alexander, Noah and Henaff, Elizabeth and McIntyre, Alexa B R and Chandramohan, Dhruva and Chen, Feng and Jaeger, Erich and Moshrefi, Ali and Pham, Khoa and Stedman, William and Liang, Tiffany and Saghbini, Michael and Dzakula, Zeljko and Hastie, Alex and Cao, Han and Deikus, Gintaras and Schadt, Eric and Sebra, Robert and Bashir, Ali and Truty, Rebecca M and Chang, Christopher C and Gulbahce, Natali and Zhao, Keyan and Ghosh, Srinka and Hyland, Fiona and Fu, Yutao and Chaisson, Mark and Xiao, Chunlin and Trow, Jonathan and Sherry, Stephen T and Zaranek, Alexander W and Ball, Madeleine and Bobe, Jason and Estep, Preston and Church, George M and Marks, Patrick and Kyriazopoulou-Panagiotopoulou, Sofia and Zheng, Grace X Y and Schnall-Levin, Michael and Ordonez, Heather S and Mudivarti, Patrice A and Giorda, Kristina and Sheng, Ying and Rypdal, Karoline Bjarnesdatter and Salit, Marc

The Genome in a Bottle Consortium, hosted by the National Institute of Standards and Technology (NIST), is creating reference materials and data for human genome sequencing, as well as methods for genome comparison and benchmarking. Here, we describe a large, diverse set of sequencing data for seven human genomes; five are current or candidate NIST Reference Materials. The pilot genome, NA12878, has been released as NIST RM 8398. We also describe data from two Personal Genome Project trios, one of Ashkenazim Jewish ancestry and one of Chinese ancestry. The data come from 12 technologies: BioNano Genomics, Complete Genomics paired-end and LFR, Ion Proton exome, Oxford Nanopore, Pacific Biosciences, SOLiD, 10X Genomics GemCode WGS, and Illumina exome and WGS paired-end, mate-pair, and synthetic long reads. Cell lines, DNA, and data from these individuals are publicly available. Therefore, we expect these data to be useful for revealing novel information about the human genome and improving sequencing technologies, SNP, indel, and structural variant calling, and de novo assembly.

Journal: Scientific data
DOI: 10.1038/sdata.2016.25
Year: 2016
