April 21, 2020

Telomere-to-telomere assembly of a complete human X chromosome

Authors: Miga, Karen H and Koren, Sergey and Rhie, Arang and Vollger, Mitchell R and Gershman, Ariel and Bzikadze, Andrey and Brooks, Shelise and Howe, Edmund and Porubsky, David and Logsdon, Glennis A and Schneider, Valerie A and Potapova, Tamara and Wood, Jonathan and Chow, William and Armstrong, Joel and Fredrickson, Jeanne and Pak, Evgenia and Tigyi, Kristof and Kremitzki, Milinn and Markovic, Christopher and Maduro, Valerie and Dutra, Amalia and Bouffard, Gerard G and Chang, Alexander M and Hansen, Nancy F and Thibaud-Nissen, Françoise and Schmitt, Anthony D and Belton, Jon-Matthew and Selvaraj, Siddarth and Dennis, Megan Y and Soto, Daniela C and Sahasrabudhe, Ruta and Kaya, Gulhan and Quick, Josh and Loman, Nicholas J and Holmes, Nadine and Loose, Matthew and Surti, Urvashi and Risques, Rosa Ana and Lindsay, Tina A. Graves and Fulton, Robert and Hall, Ira and Paten, Benedict and Howe, Kerstin and Timp, Winston and Young, Alice and Mullikin, James C and Pevzner, Pavel A and Sullivan, Beth A and Eichler, Evan E and Phillippy, Adam M.

After nearly two decades of improvements, the current human reference genome (GRCh38) is the most accurate and complete vertebrate genome ever produced. However, no one chromosome has been finished end to end, and hundreds of unresolved gaps persist. The remaining gaps include ribosomal rDNA arrays, large near-identical segmental duplications, and satellite DNA arrays. These regions harbor largely unexplored variation of unknown consequence, and their absence from the current reference genome can lead to experimental artifacts and hide true variants when re-sequencing additional human genomes. Here we present a de novo human genome assembly that surpasses the continuity of GRCh38, along with the first gapless, telomere-to-telomere assembly of a human chromosome. This was enabled by high-coverage, ultra-long-read nanopore sequencing of the complete hydatidiform mole CHM13 genome, combined with complementary technologies for quality improvement and validation. Focusing our efforts on the human X chromosome, we reconstructed the ~2.8 megabase centromeric satellite DNA array and closed all 29 remaining gaps in the current reference, including new sequence from the human pseudoautosomal regions and cancer-testis ampliconic gene families (CT-X and GAGE). This complete chromosome X, combined with the ultra-long nanopore data, also allowed us to map methylation patterns across complex tandem repeats and satellite arrays for the first time. These results demonstrate that finishing the human genome is now within reach and will enable ongoing efforts to complete the remaining human chromosomes.

Journal: bioRxiv
DOI: 10.1101/735928
Year: 2019

