September 22, 2019

Long-read sequencing and de novo assembly of a Chinese genome.

Authors: Shi, Lingling and Guo, Yunfei and Dong, Chengliang and Huddleston, John and Yang, Hui and Han, Xiaolu and Fu, Aisi and Li, Quan and Li, Na and Gong, Siyi and Lintner, Katherine E and Ding, Qiong and Wang, Zou and Hu, Jiang and Wang, Depeng and Wang, Feng and Wang, Lin and Lyon, Gholson J and Guan, Yongtao and Shen, Yufeng and Evgrafov, Oleg V and Knowles, James A and Thibaud-Nissen, Francoise and Schneider, Valerie and Yu, Chack-Yung and Zhou, Libing and Eichler, Evan E and So, Kwok-Fai and Wang, Kai

Short-read sequencing has enabled the de novo assembly of several individual human genomes, but with inherent limitations in characterizing repeat elements. Here we sequence a Chinese individual HX1 by single-molecule real-time (SMRT) long-read sequencing, construct a physical map by NanoChannel arrays and generate a de novo assembly of 2.93 Gb (contig N50: 8.3 Mb, scaffold N50: 22.0 Mb, including 39.3 Mb N-bases), together with 206 Mb of alternative haplotypes. The assembly fully or partially fills 274 (28.4%) N-gaps in the reference genome GRCh38. Comparison to GRCh38 reveals 12.8 Mb of HX1-specific sequences, including 4.1 Mb that are not present in previously reported Asian genomes. Furthermore, long-read sequencing of the transcriptome reveals novel spliced genes that are not annotated in GENCODE and are missed by short-read RNA-Seq. Our results imply that improved characterization of genome functional variation may require the use of a range of genomic technologies on diverse human populations.

Journal: Nature Communications
DOI: 10.1038/ncomms12065
Year: 2016
