July 7, 2019

Building two indica rice reference genomes with PacBio long-read and Illumina paired-end sequencing data.

Authors: Zhang, Jianwei; Chen, Ling-Ling; Sun, Shuai; Kudrna, Dave; Copetti, Dario; Li, Weiming; Mu, Ting; Jiao, Wen-Biao; Xing, Feng; Lee, Seunghee; Talag, Jayson; Song, Jia-Ming; Du, Bogu; Xie, Weibo; Luo, Meizhong; Maldonado, Carlos Ernesto; Goicoechea, Jose Luis; Xiong, Lizhong; Wu, Changyin; Xing, Yongzhong; Zhou, Dao-Xiu; Yu, Sibin; Zhao, Yu; Wang, Gongwei; Yu, Yeisoo; Luo, Yijie; Hurtado, Beatriz Elena Padilla; Danowitz, Ann; Wing, Rod A; Zhang, Qifa

Over the past 30 years, we have performed many fundamental studies on two Oryza sativa subsp. indica varieties, Zhenshan 97 (ZS97) and Minghui 63 (MH63). To improve the resolution of many of these investigations, we generated two reference-quality reference genome assemblies using the most advanced sequencing technologies. Using PacBio SMRT technology, we produced over 108 (ZS97) and 174 (MH63) Gb of raw sequence data from 166 (ZS97) and 209 (MH63) pools of BAC clones, and generated ~97 (ZS97) and ~74 (MH63) Gb of paired-end whole-genome shotgun (WGS) sequence data with Illumina sequencing technology. With these data, we successfully assembled two platinum standard reference genomes that have been publicly released. Here we provide the full sets of raw data used to generate these two reference genome assemblies. These data sets can be used to test new programs for better genome assembly and annotation, aid in the discovery of new insights into genome structure, function, and evolution, and help to provide essential support to biological research in general.

Journal: Scientific Data
DOI: 10.1038/sdata.2016.76
Year: 2016
