Plant genomes, and eukaryotic genomes in general, are typically repetitive, polyploid and heterozygous, which complicates genome assembly. The short read lengths of early Sanger and current next-generation sequencing platforms hinder assembly through complex repeat regions, and many draft and reference genomes are fragmented, lacking skewed GC and repetitive intergenic sequences, which are gaining importance due to projects like the Encyclopedia of DNA Elements (ENCODE). Here we report the whole-genome sequencing and assembly of the desiccation-tolerant grass Oropetium thomaeum. Using only single-molecule real-time sequencing, which generates long (>16 kilobases) reads with random errors, we assembled 99% (244 megabases) of the Oropetium genome into 625 contigs with an N50 length of 2.4 megabases. Oropetium is an example of a ‘near-complete’ draft genome which includes gapless coverage over gene space as well as intergenic sequences such as centromeres, telomeres, transposable elements and rRNA clusters that are typically unassembled in draft genomes. Oropetium has 28,466 protein-coding genes and 43% repeat sequences, yet with 30% more compact euchromatic regions it is the smallest known grass genome. The Oropetium genome demonstrates the utility of single-molecule real-time sequencing for assembling high-quality plant and other eukaryotic genomes, and serves as a valuable resource for the plant comparative genomics community.
The genome of a tiny resurrection plant has been sequenced using PacBio’s long-read single-molecule real-time sequencing technology, aiding the understanding of extreme desiccation tolerance. The genome contiguity is comparable to that of genomes sequenced using far more laborious approaches.
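The contiguity figure quoted for the Oropetium assembly is an N50 length. As a minimal sketch of how that statistic is conventionally computed, the Python below sorts contig lengths from longest to shortest and returns the length at which the running total first reaches half of the assembled bases. The function name `n50` and the toy lengths are illustrative assumptions, not data or code from the study.

```python
def n50(contig_lengths):
    """Return the N50 of a collection of contig lengths.

    N50 is the length L such that contigs of length >= L together
    contain at least half of all assembled bases.
    """
    lengths = sorted(contig_lengths, reverse=True)
    if not lengths:
        raise ValueError("at least one contig length is required")

    total = sum(lengths)
    running = 0
    for length in lengths:
        running += length
        # Compare 2 * running with total to stay in integer arithmetic.
        if 2 * running >= total:
            return length


# Toy example (hypothetical lengths, not from the Oropetium assembly):
# the total is 54, and 10 + 9 + 8 = 27 reaches half of it.
example_lengths = [2, 3, 4, 5, 6, 7, 8, 9, 10]
print(n50(example_lengths))  # prints 8
```

Because a few long contigs can dominate the running total, N50 is usually read together with the contig count and the total assembled length, as the abstract does.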
The genus Brachypodium contains annual and perennial species with both diploid and polyploid genomes. Like the annual species B. distachyon, some of the perennial and polyploid species have traits compatible with use as a model system (e.g. small genomes, rapid generation time, self-fertility and ease of growth). Thus, there is an opportunity to leverage the resources and knowledge developed for B. distachyon to use other Brachypodium species as models for perenniality and the regulation and evolution of polyploid genomes. There are two factors driving an increased interest in perenniality. First, several perennial grasses are being developed as biomass crops for the sustainable production of biofuel, and it would be useful to have a perennial model system to rapidly test biotechnological crop improvement strategies for undesirable impacts on perenniality and winter hardiness. In addition, a deeper understanding of the molecular mechanisms underlying perenniality could be used to design strategies for improving energy crops, for example, by changing resource allocation during growth or by altering the onset of dormancy. The second factor driving increased interest in perenniality is the potential environmental benefits of developing perennial grain crops. B. sylvaticum is a perennial with attributes suitable for use as a perennial model system. A high-efficiency transformation system has been developed and a genome sequencing project is underway. Since many important crops, including emerging biomass crops, are polyploid, there is a pressing need to understand the rules governing the evolution and regulation of polyploid genomes. Unfortunately, it is difficult to study polyploid crop genomes because of their size and the difficulty of manipulating those plants in the laboratory. By contrast, B. hybridum has a small polyploid genome and is easy to work with in the laboratory. In addition, analysis of the B. hybridum genome will be greatly aided by the genome sequences of the two extant diploid species (B. distachyon and B. stacei) that apparently gave rise to B. hybridum. Availability of high-quality reference genomes for these three species will be a powerful resource for the study of polyploidy.
Grasses provide the bulk of human calories, but improvement in grass yields is hindered by the characteristically large and complex genomes of these species; the genomes of wheat, maize, and sugar cane are 17,000, 2,300, and 10,000 Mb, respectively. Brachypodium distachyon has one of the smallest genomes of all grasses at 272 Mb, and a number of key traits that make it a good model grass. Brachypodium was the fourth sequenced grass genome, after rice, Sorghum, and maize, and was the first sequenced in the Pooideae subfamily, a diverse group that includes wheat, barley, oat, and rye. The Brachypodium genome was sequenced using a whole genome shotgun approach with Sanger sequencing and is nearly complete, with 99.6% of the sequences anchored to five chromosomes. Sequencing of Brachypodium enabled comparative genomic analysis of grass genomes and shed light on processes involved in chromosome fusions and maintenance of a small genome. The high-quality Brachypodium genome sequence provides a framework for gene expression atlases, resequencing, quantitative trait loci (QTL) mapping, GWAS, and ENCODE datasets. The wealth of Brachypodium genomic resources has cemented its utility as a model organism and will facilitate translational work for improving the grasses that feed the world.
Complete genome sequence of Leifsonia xyli subsp. cynodontis strain DSM46306, a gram-positive bacterial pathogen of grasses.
We announce the complete genome sequence of Leifsonia xyli subsp. cynodontis, a vascular pathogen of Bermuda grass. The species also comprises Leifsonia xyli subsp. xyli, a sugarcane pathogen. Since these two subspecies have genome sequences available, a comparative analysis will contribute to our understanding of the differences in their biology and host specificity.
New insights into structural organization and gene duplication in a 1.75-Mb genomic region harboring the α-gliadin gene family in Aegilops tauschii, the source of wheat D genome.
Among the wheat prolamins important for its end-use traits, α-gliadins are the most abundant, and are also a major cause of food-related allergies and intolerances. Previous studies of various wheat species estimated that between 25 and 150 α-gliadin genes reside in the Gli-2 locus regions. To better understand the evolution of this complex gene family, the DNA sequence of a 1.75-Mb genomic region spanning the Gli-2 locus was analyzed in the diploid grass Aegilops tauschii, the ancestral source of the D genome in hexaploid bread wheat. Comparison with orthologous regions from rice, sorghum, and Brachypodium revealed rapid and dynamic changes occurring only in the Ae. tauschii Gli-2 region, including insertions of high numbers of non-syntenic genes and a high rate of tandem gene duplications, the latter of which have given rise to 12 copies of α-gliadin genes clustered within a 550-kb region. Among them, five copies have undergone pseudogenization by various mutation events. Insights into the evolutionary relationship of the duplicated α-gliadin genes were obtained from their genomic organization, transcription patterns, transposable element insertions and phylogenetic analyses. An ancestral glutamate-like receptor (GLR) gene encoding a putative amino acid sensor in all four grass species has duplicated only in Ae. tauschii and generated three more copies that are interspersed with the α-gliadin genes. Phylogenetic inference and different gene expression patterns support functional divergence of the Ae. tauschii GLR copies after duplication. Our results suggest that the duplicates of α-gliadin and GLR genes have likely taken different evolutionary paths: conservation for the former and neofunctionalization for the latter. © 2017 The Authors. The Plant Journal © 2017 John Wiley & Sons Ltd.