July 19, 2019

Extensive sequence divergence between the reference genomes of two elite indica rice varieties Zhenshan 97 and Minghui 63.

Asian cultivated rice consists of two subspecies: Oryza sativa subsp. indica and O. sativa subsp. japonica. Despite the fact that indica rice accounts for over 70% of total rice production worldwide and is genetically much more diverse, a high-quality reference genome for indica rice has yet to be published. We conducted map-based sequencing of two indica rice lines, Zhenshan 97 (ZS97) and Minghui 63 (MH63), which represent the two major varietal groups of the indica subspecies and are the parents of an elite Chinese hybrid. The genome sequences were assembled into 237 (ZS97) and 181 (MH63) contigs, with an accuracy >99.99%, and covered 90.6% and 93.2% of their estimated genome sizes. Comparative analyses of these two indica genomes uncovered surprising structural differences, especially with respect to inversions, translocations, presence/absence variations, and segmental duplications. Approximately 42% of nontransposable element related genes were identical between the two genomes. Transcriptome analysis of three tissues showed that 1,059-2,217 more genes were expressed in the hybrid than in the parents and that the expressed genes in the hybrid were much more diverse due to their divergence between the parental genomes. The public availability of two high-quality reference genomes for the indica subspecies of rice will have large-ranging implications for plant biology and crop genetic improvement.

July 7, 2019

Sequencing and de novo assembly of a near complete indica rice genome.

A high-quality reference genome is critical for understanding genome structure, genetic variation and evolution of an organism. Here we report the de novo assembly of an indica rice genome Shuhui498 (R498) through the integration of single-molecule sequencing and mapping data, genetic map and fosmid sequence tags. The 390.3 Mb assembly is estimated to cover more than 99% of the R498 genome and is more continuous than the current reference genomes of japonica rice Nipponbare (MSU7) and Arabidopsis thaliana (TAIR10). We annotate high-quality protein-coding genes in R498 and identify genetic variations between R498 and Nipponbare and presence/absence variations by comparing them to 17 draft genomes in cultivated rice and its closest wild relatives. Our results demonstrate how to de novo assemble a highly contiguous and near-complete plant genome through an integrative strategy. The R498 genome will serve as a reference for the discovery of genes and structural variations in rice.

July 7, 2019

Indica rice genome assembly, annotation and mining of blast disease resistance genes.

Rice is a major staple food crop in the world. Over 80% of the rice cultivation area is under indica rice. Currently, genomic resources are lacking for indica as compared to japonica rice. In this study, we generated deep-sequencing data (Illumina and Pacific Biosciences sequencing) for one of the indica rice cultivars, HR-12 from India. We assembled over 86% (389 Mb) of the rice genome and annotated 56,284 protein-coding genes from the HR-12 genome using Illumina and PacBio sequencing. Comprehensive comparative analyses between indica and japonica subspecies genomes revealed a large number of indica-specific variants including SSRs, SNPs and InDels. To mine disease resistance genes, we sequenced a few indica rice cultivars that are reported to be highly resistant (Tetep and Tadukan) and susceptible (HR-12 and Co-39) against blast fungal isolates in many countries including India. Whole genome sequencing of rice genotypes revealed a high rate of mutations in defense-related genes (NB-ARC, LRR and PK domains) in resistant cultivars as compared to susceptible ones. This study has identified the R-genes Pi-ta and Pi54 from the durable resistant indica cultivars Tetep and Tadukan, which can be used in marker-assisted selection in rice breeding programs. This is the first report of a whole genome sequencing approach to characterize Indian rice germplasm. The genomic resources from our work will have a greater impact in understanding global rice diversity, genetics and molecular breeding.

July 7, 2019

Genome puzzle master (GPM): an integrated pipeline for building and editing pseudomolecules from fragmented sequences.

Next generation sequencing technologies have revolutionized our ability to rapidly and affordably generate vast quantities of sequence data. Once generated, raw sequences are assembled into contigs or scaffolds. However, these assemblies are mostly fragmented and inaccurate at the whole genome scale, largely due to the inability to integrate additional informative datasets (e.g. physical, optical and genetic maps). To address this problem, we developed a semi-automated software tool, Genome Puzzle Master (GPM), that enables the integration of additional genomic signposts to edit and build ‘new-gen-assemblies’ that result in high-quality ‘annotation-ready’ pseudomolecules. With GPM, loaded datasets can be connected to each other via their logical relationships, which accomplishes tasks to ‘group,’ ‘merge,’ ‘order and orient’ sequences in a draft assembly. Manual editing can also be performed with a user-friendly graphical interface. Final pseudomolecules reflect a user’s total data package and are available for long-term project management. GPM is a web-based pipeline and an important part of a Laboratory Information Management System (LIMS) which can be easily deployed on local servers for any genome research laboratory. The GPM (with LIMS) package is available at https://github.com/Jianwei-Zhang/LIMS. Contacts: jzhang@mail.hzau.edu.cn or rwing@mail.arizona.edu. Supplementary information: Supplementary data are available at Bioinformatics online. © The Author 2016. Published by Oxford University Press.

July 7, 2019

Building two indica rice reference genomes with PacBio long-read and Illumina paired-end sequencing data.

Over the past 30 years, we have performed many fundamental studies on two Oryza sativa subsp. indica varieties, Zhenshan 97 (ZS97) and Minghui 63 (MH63). To improve the resolution of many of these investigations, we generated two reference-quality genome assemblies using the most advanced sequencing technologies. Using PacBio SMRT technology, we produced over 108 (ZS97) and 174 (MH63) Gb of raw sequence data from 166 (ZS97) and 209 (MH63) pools of BAC clones, and generated ~97 (ZS97) and ~74 (MH63) Gb of paired-end whole-genome shotgun (WGS) sequence data with Illumina sequencing technology. With these data, we successfully assembled two platinum-standard reference genomes that have been publicly released. Here we provide the full sets of raw data used to generate these two reference genome assemblies. These data sets can be used to test new programs for better genome assembly and annotation, aid in the discovery of new insights into genome structure, function, and evolution, and help to provide essential support to biological research in general.
