The availability of plant reference genomes has ushered in a new era of crop genomics. More than 100 plant genomes have been sequenced since 2000, 63% of which are crop species. These genome sequences provide insight into architecture, evolution and novel aspects of crop genomes such as the retention of key agronomic traits after whole genome duplication events. Some crops have very large, polyploid, repeat-rich genomes, which require innovative strategies for sequencing, assembly and analysis. Even low quality reference genomes have the potential to improve crop germplasm through genome-wide molecular markers, which decrease expensive phenotyping and breeding cycles. The next stage of plant genomics will require draft genome refinement, building resources for crop wild relatives, resequencing broad diversity panels, and plant ENCODE projects to better understand the complexities of these highly diverse genomes. Copyright © 2015 Elsevier Ltd. All rights reserved.
Construction of a reference genetic map of Raphanus sativus based on genotyping by whole-genome resequencing.
This manuscript provides a genetic map of Raphanus sativus that has been used as a reference genetic map for an ongoing genome sequencing project. The map was constructed based on genotyping by whole-genome resequencing of the mapping parents and an F2 population. Raphanus sativus is an annual vegetable crop species of the Brassicaceae family and is one of the key plants in the seed industry, especially in East Asia. Assessment of the R. sativus genome provides fundamental resources for crop improvement as well as the study of crop genome structure and evolution. With the goal of anchoring genome sequence assemblies of R. sativus cv. WK10039, whose genome has been sequenced, onto the chromosomes, we developed a reference genetic map based on genotyping of two parents (maternal WK10039 and paternal WK10024) and 93 individuals of the F2 mapping population by whole-genome resequencing. To develop high-confidence genetic markers, ~83 Gb of parental line data and ~591 Gb of mapping population data were generated as Illumina 100 bp paired-end reads. Highly stringent sequence analysis of the reads mapped to the 344 Mb of genome sequence scaffolds identified a total of 16,282 SNPs and 150 PCR-based markers. Using a subset of the markers, a high-density genetic map was constructed from the analysis of 2,637 markers spanning 1,538 cM with 1,000 unique framework loci. The genetic markers integrated 295 Mb of genome sequences to the cytogenetically defined chromosome arms. Comparative analysis of the chromosome-anchored sequences with Arabidopsis thaliana and Brassica rapa revealed that the R. sativus genome has evident triplicated sub-genome blocks and that the structure of its gene space is highly similar to that of B. rapa. The genetic map developed in this study will serve as a fundamental genomic resource for the study of R. sativus.
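The map statistics quoted in this abstract can be turned into two rough summary figures: the average map length per locus and the share of scaffold sequence anchored to chromosomes. The sketch below is only an illustration using the numbers reported above; the function names are hypothetical, and dividing map length by the number of loci is a crude density measure that ignores the number of linkage groups.

```python
# Figures taken from the abstract above.
MAP_LENGTH_CM = 1538      # total genetic map length in centimorgans
TOTAL_MARKERS = 2637      # markers placed on the map
FRAMEWORK_LOCI = 1000     # unique framework loci
SCAFFOLD_MB = 344         # scaffold sequence used for read mapping
ANCHORED_MB = 295         # sequence integrated onto chromosome arms


def cm_per_locus(map_length_cm, n_loci):
    """Crude map density: map length divided by the number of loci."""
    return map_length_cm / n_loci


def anchored_fraction(anchored_mb, total_mb):
    """Share of the scaffold sequence that was anchored by the map."""
    return anchored_mb / total_mb


if __name__ == "__main__":
    print(f"cM per marker:         {cm_per_locus(MAP_LENGTH_CM, TOTAL_MARKERS):.2f}")
    print(f"cM per framework locus: {cm_per_locus(MAP_LENGTH_CM, FRAMEWORK_LOCI):.2f}")
    print(f"anchored fraction:      {anchored_fraction(ANCHORED_MB, SCAFFOLD_MB):.1%}")
```

On these figures the framework loci are spaced at roughly 1.5 cM on average, and a little over 85% of the 344 Mb of scaffolds is anchored.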
A draft genome of field pennycress (Thlaspi arvense) provides tools for the domestication of a new winter biofuel crop.
Field pennycress (Thlaspi arvense L.) is being domesticated as a new winter cover crop and biofuel species for the Midwestern United States that can be double-cropped between corn and soybeans. A genome sequence will enable the use of new technologies to make improvements in pennycress. To generate a draft genome, a hybrid sequencing approach was used to generate 47 Gb of DNA sequencing reads from both the Illumina and PacBio platforms. These reads were used to assemble 6,768 genomic scaffolds. The draft genome was annotated using the MAKER pipeline, which identified 27,390 predicted protein-coding genes, with almost all of these predicted peptides having significant sequence similarity to Arabidopsis proteins. A comprehensive analysis of pennycress gene homologues involved in glucosinolate biosynthesis, metabolism, and transport pathways revealed high sequence conservation compared with other Brassicaceae species, and helps validate the assembly of the pennycress gene space in this draft genome. Additional comparative genomic analyses indicate that the knowledge gained from years of basic Brassicaceae research will serve as a powerful tool for identifying gene targets whose manipulation can be predicted to result in improvements for pennycress. © The Author 2015. Published by Oxford University Press on behalf of Kazusa DNA Research Institute.
Repetitive sequences present a challenge for genome sequence assembly, and highly similar segmental duplications may disappear from assembled genome sequences. Having found a surprising lack of observable phenotypic deviations and non-Mendelian segregation in Arabidopsis thaliana mutants in SEC10, a gene encoding a core subunit of the exocyst tethering complex, we examined whether this could be explained by a hidden gene duplication. Re-sequencing and manual assembly of the Arabidopsis thaliana SEC10 (At5g12370) locus revealed that this locus, comprising a single gene in the reference genome assembly, indeed contains two paralogous genes in tandem, SEC10a and SEC10b, and that a sequence segment of 7 kb in length is missing from the reference genome sequence. Differences between the two paralogs are concentrated in non-coding regions, while the predicted protein sequences exhibit 99% identity, differing only by substitution of five amino acid residues and an indel of four residues. Both SEC10 genes are expressed, although varying transcript levels suggest differential regulation. Homozygous T-DNA insertion mutants in either paralog exhibit a wild-type phenotype, consistent with proposed extensive functional redundancy of the two genes. By these observations we demonstrate that recently duplicated genes may remain hidden even in well-characterized genomes, such as that of A. thaliana. Moreover, we show that the use of the existing A. thaliana reference genome sequence as a guide for sequence assembly of new Arabidopsis accessions or related species has at least in some cases led to error propagation.
Pseudomonas brassicacearum DF41, a Gram-negative soil bacterium, is able to suppress the fungal pathogen Sclerotinia sclerotiorum through a process known as biological control. Here, we present a 6.8-Mb assembly of its genome, which is the second fully assembled genome of a P. brassicacearum strain.
De novo assembly and characterization of the complete chloroplast genome of radish (Raphanus sativus L.).
Radish (Raphanus sativus L.) is an edible root vegetable crop that is cultivated worldwide and whose genome has been sequenced. Here we report the complete nucleotide sequence of the radish cultivar WK10039 chloroplast (cp) genome, along with a de novo assembly strategy using whole genome shotgun sequence reads obtained by next generation sequencing. The radish cp genome is 153,368 bp in length and has a typical quadripartite structure, composed of a pair of inverted repeat regions (26,217 bp each), a large single copy region (83,170 bp), and a small single copy region (17,764 bp). The radish cp genome contains 87 predicted protein-coding genes, 37 tRNA genes, and 8 rRNA genes. Sequence analysis revealed the presence of 91 simple sequence repeats (SSRs) in the radish cp genome. Phylogenetic analysis of 62 protein-coding gene sequences from the 17 cp genomes of the Brassicaceae family suggested that the radish cp genome is most closely related to the cp genomes of Brassica rapa and Brassica napus. Comparisons with the B. rapa and B. napus cp genomes revealed highly divergent intergenic sequences and introns that can potentially be developed as diagnostic cp markers. Synonymous and nonsynonymous substitutions of cp genes suggested that nucleotide substitutions have occurred at similar rates in most genes. The complete sequence of the radish cp genome will serve as a valuable resource for the development of new molecular markers and the study of the phylogenetic relationships of Raphanus species in the Brassicaceae family. Copyright © 2014 Elsevier B.V. All rights reserved.
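The quadripartite structure reported above can be checked arithmetically: the large single copy region, the small single copy region, and the two inverted repeats should sum to the total genome length. The following minimal sketch uses only the region sizes given in the abstract; the function and constant names are hypothetical.

```python
# Region sizes (bp) as reported in the abstract above.
LSC_BP = 83_170          # large single copy region
SSC_BP = 17_764          # small single copy region
IR_BP = 26_217           # one inverted repeat (the genome carries two copies)
REPORTED_TOTAL_BP = 153_368


def quadripartite_length(lsc_bp, ssc_bp, ir_bp):
    """Total length of a quadripartite plastome: LSC + SSC + 2 x IR."""
    return lsc_bp + ssc_bp + 2 * ir_bp


if __name__ == "__main__":
    total = quadripartite_length(LSC_BP, SSC_BP, IR_BP)
    # The regions add up to the reported genome length.
    print(total == REPORTED_TOTAL_BP)
    # Share of the genome occupied by the two inverted repeats.
    print(f"IR share: {2 * IR_BP / total:.1%}")
```

The region sizes are internally consistent with the stated 153,368 bp, with the two inverted repeats making up about a third of the genome.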
Root-associated fungal microbiota of nonmycorrhizal Arabis alpina and its contribution to plant phosphorus nutrition.
Most land plants live in association with arbuscular mycorrhizal (AM) fungi and rely on this symbiosis to scavenge phosphorus (P) from soil. The ability to establish this partnership has been lost in some plant lineages like the Brassicaceae, which raises the question of what alternative nutrition strategies such plants have to grow in P-impoverished soils. To understand the contribution of plant-microbiota interactions, we studied the root-associated fungal microbiome of Arabis alpina (Brassicaceae) with the hypothesis that some of its components can promote plant P acquisition. Using amplicon sequencing of the fungal internal transcribed spacer 2, we studied the root and rhizosphere fungal communities of A. alpina growing under natural and controlled conditions including low-P soils and identified a set of 15 fungal taxa consistently detected in its roots. This cohort included a Helotiales taxon exhibiting high abundance in roots of wild A. alpina growing in an extremely P-limited soil. Consequently, we isolated and subsequently reintroduced a specimen from this taxon into its native P-poor soil in which it improved plant growth and P uptake. The fungus exhibited mycorrhiza-like traits including colonization of the root endosphere and P transfer to the plant. Genome analysis revealed a link between its endophytic lifestyle and the expansion of its repertoire of carbohydrate-active enzymes. We report the discovery of a plant-fungus interaction facilitating the growth of a nonmycorrhizal plant under native P-limited conditions, thus uncovering a previously underestimated role of root fungal microbiota in P cycling.
The Leavenworthia self-incompatibility locus (S locus) consists of paralogs (Lal2, SCRL) of the canonical Brassicaceae S locus genes (SRK, SCR), and is situated in a genomic position that differs from the ancestral one in the Brassicaceae. Unexpectedly, in a small number of Leavenworthia alabamica plants examined, sequences closely resembling exon 1 of SRK have been found, but the function of these has remained unclear. BAC cloning and expression analyses were employed to characterize these SRK-like sequences. An SRK-positive Bacterial Artificial Chromosome clone was found to contain complete SRK and SCR sequences located close by one another in the derived genomic position of the Leavenworthia S locus, and in place of the more typical Lal2 and SCRL sequences. These sequences are expressed in stigmas and anthers, respectively, and crossing data show that the SRK/SCR haplotype is functional in self-incompatibility. Population surveys indicate that < 5% of Leavenworthia S loci possess such alleles. An ancestral translocation or recombination event involving SRK/SCR and Lal2/SCRL likely occurred, together with neofunctionalization of Lal2/SCRL, and both haplotype groups now function as Leavenworthia S locus alleles. These findings suggest that S locus alleles can have distinctly different evolutionary origins. © 2017 The Authors. New Phytologist © 2017 New Phytologist Trust.
Radish, Raphanus sativus L., is a member of the Brassicaceae, the family that also includes Arabidopsis thaliana, a model plant in plant biology, and Brassica species including important crops. However, genetic and genomic studies of radish have lagged behind those of Arabidopsis and Brassica. In this decade, much effort has been made to develop genetic resources for radish, e.g., DNA markers, genetic maps, and whole genome sequences. Studies using the obtained information have revealed the genome structure of radish in terms of ancestral karyotype and have also prompted the identification of genes for agronomically important traits in radish through a map-based cloning strategy and quantitative trait locus analysis. In this chapter, we review the development of radish genetic maps over the past 15 years and the current status of genome sequencing of radish. We also introduce the latest strategy for the construction of a high-density genetic map using next-generation sequencing technology and propose a prospective direction of genetics and genomics research in radish, which should help researchers and breeders promote radish breeding programs efficiently.
Elucidating the triplicated ancestral genome structure of radish based on chromosome-level comparison with the Brassica genomes.
This study presents a chromosome-scale draft genome sequence of radish that is assembled into nine chromosomal pseudomolecules. A comprehensive comparative genome analysis with the Brassica genomes provides genomic evidence on the evolution of the mesohexaploid radish genome. Radish (Raphanus sativus L.) is an agronomically important root vegetable crop, and its origin and phylogenetic position in the tribe Brassiceae are controversial. Here we present a comprehensive analysis of the radish genome based on the chromosome sequences of R. sativus cv. WK10039. The radish genome was sequenced and assembled into 426.2 Mb spanning >98% of the gene space, of which 344.0 Mb were integrated into nine chromosome pseudomolecules. Approximately 36% of the genome was repetitive sequences, and 46,514 protein-coding genes were predicted and annotated. Comparative mapping of the tPCK-like ancestral genome revealed that the radish genome has intermediate characteristics between the Brassica A/C and B genomes in the triplicated segments, suggesting an internal origin from the genus Brassica. The evolutionary characteristics shared between radish and other Brassica species provided genomic evidence that the current form of nine chromosomes in radish was rearranged from the chromosomes of a hexaploid progenitor. Overall, this study provides a chromosome-scale draft genome sequence of radish as well as novel insight into the evolution of the mesohexaploid genomes in the tribe Brassiceae.
Genome sequence and annotation of Colletotrichum higginsianum, a causal agent of crucifer anthracnose disease.
Colletotrichum higginsianum is an ascomycete fungus causing anthracnose disease on numerous cultivated plants in the family Brassicaceae, as well as the model plant Arabidopsis thaliana. We report an assembly of the nuclear genome and gene annotation of this pathogen, which was obtained using a combination of PacBio long-read sequencing and optical mapping. Copyright © 2016 Zampounis et al.
The genome sequence of allopolyploid Brassica juncea and analysis of differential homoeolog gene expression influencing selection.
The Brassica genus encompasses three diploid and three allopolyploid genomes, but a clear understanding of the evolution of agriculturally important traits via polyploidy is lacking. We assembled an allopolyploid Brassica juncea genome by shotgun and single-molecule reads integrated to genomic and genetic maps. We discovered that the A subgenomes of B. juncea and Brassica napus each had independent origins. Results suggested that A subgenomes of B. juncea were of monophyletic origin and evolved into vegetable-use and oil-use subvarieties. Homoeolog expression dominance occurs between subgenomes of allopolyploid B. juncea, in which differentially expressed genes display more selection potential than neutral genes. Homoeolog expression dominance in B. juncea has facilitated selection of glucosinolate and lipid metabolism genes in subvarieties used as vegetables and for oil production. These homoeolog expression dominance relationships among Brassicaceae genomes have contributed to selection response, predicting the directional effects of selection in a polyploid crop genome.