September 22, 2019  |  

Comparative genomics of smut pathogens: Insights from orphans and positively selected genes into host specialization.

Host specialization is a key evolutionary process for the diversification and emergence of new pathogens. However, the molecular determinants of host range are poorly understood. Smut fungi are biotrophic pathogens that have distinct and narrow host ranges based on largely unknown genetic determinants. Hence, we aimed to expand comparative genomics analyses of smut fungi by including more species infecting different hosts and to define orphans and positively selected genes to gain further insights into the genetic basis of host specialization. We analyzed nine lineages of smut fungi isolated from eight crop and non-crop hosts: maize, barley, sugarcane, wheat, oats, Zizania latifolia (Manchurian rice), Echinochloa colona (a wild grass), and Persicaria sp. (a wild dicot plant). We assembled two new genomes: Ustilago hordei (strain Uhor01) isolated from oats and U. tritici (strain CBS 119.19) isolated from wheat. The smut genomes were small, ranging from 18.38 to 24.63 Mb. U. hordei experienced genome expansions due to the proliferation of transposable elements, and the amount of these elements varied between the two strains. Phylogenetic analysis confirmed that Ustilago is not a monophyletic genus and, furthermore, detected misclassification of the U. tritici specimen. The comparison between smut pathogens of crop and non-crop hosts did not reveal distinct signatures, suggesting that host domestication did not play a dominant role in shaping the evolution of smuts. We found that host specialization in smut fungi likely has a complex genetic basis: different functional categories were enriched in orphans and lineage-specific selected genes. The diversification and gain/loss of effector genes are probably the most important determinants of host specificity.


September 22, 2019  |  

Inpactor, integrated and parallel analyzer and classifier of LTR retrotransposons and its application for pineapple LTR retrotransposons diversity and dynamics.

One particular class of Transposable Elements (TEs), the Long Terminal Repeat (LTR) retrotransposons, comprises the most abundant mobile elements in plant genomes. Their copy number can vary from several hundred to a few million copies per genome, deeply affecting genome organization and function. The detailed classification of LTR retrotransposons is an essential step to precisely understand their effect at the genome level, but remains challenging in large genomes, requiring the use of optimized bioinformatics tools that can take advantage of supercomputers. Here, we propose a new tool: Inpactor, a parallel and scalable pipeline designed to classify LTR retrotransposons, to identify autonomous and non-autonomous elements, to build RT-based phylogenetic trees and to analyze their insertion times using High Performance Computing (HPC) techniques. Inpactor was tested on the classification and annotation of LTR retrotransposons in pineapple, a recently sequenced genome. The pineapple genome assembly comprises 44% of transposable elements, of which 23% were classified as LTR retrotransposons. Exceptionally, 16.4% of the pineapple genome assembly corresponded to only one lineage of the Gypsy superfamily: Del, suggesting that this particular lineage has undergone a significant increase in its copy numbers. As demonstrated for the pineapple genome, Inpactor provides comprehensive data on the classification and dynamics of LTR retrotransposons, allowing a fine understanding of their contribution to genome structure and evolution. Inpactor is available at https://github.com/simonorozcoarias/Inpactor.


September 22, 2019  |  

Evolutionary history of bacteriophages in the genus Paraburkholderia.

The genus Paraburkholderia encompasses mostly environmental isolates with diverse predicted lifestyles. Genome analyses have shown that bacteriophages form a considerable portion of some Paraburkholderia genomes. Here, we analyzed the evolutionary history of prophages across all Paraburkholderia spp. Specifically, we investigated to what extent the presence of prophages and their distribution affect the diversity/diversification of Paraburkholderia spp., as well as to what extent phages coevolved with their respective hosts. Particular attention was given to the presence of CRISPR-Cas arrays as a reflection of past interactions with phages. We thus analyzed 36 genomes of Paraburkholderia spp., including those of 11 new strains, next to those of three Burkholderia species. Most genomes were found to contain at least one full prophage sequence. The highest number was found in Paraburkholderia sp. strain MF2-27; the nine prophages found amount to up to 4% of its genome. Among all prophages, potential moron genes (e.g., DNA adenine methylase) were found that might be advantageous for host cell fitness. Co-phylogenetic analyses indicated the existence of complex evolutionary scenarios between the different Paraburkholderia hosts and their prophages, including short-term co-speciation, duplication, host-switching and phage loss events. Analysis of the CRISPR-Cas systems showed a record of diverse, potentially recent, phage infections. We conclude that, overall, different phages have interacted in diverse ways with their Paraburkholderia hosts over evolutionary time.


September 22, 2019  |  

Gene duplication and evolution dynamics in the homeologous regions harboring multiple prolamin and resistance gene families in hexaploid wheat.

Improving end-use quality and disease resistance are important goals in wheat breeding. The genetic loci controlling these traits are highly complex, consisting of large families of prolamin and resistance genes with members present in all three homeologous A, B, and D genomes in hexaploid bread wheat. Here, orthologous regions harboring both prolamin and resistance gene loci were reconstructed and compared to understand gene duplication and evolution in different wheat genomes. Comparison of the two orthologous D regions from the hexaploid wheat Chinese Spring and the diploid progenitor Aegilops tauschii revealed their considerable difference due to the presence of five large structural variations with sizes ranging from 100 kb to 2 Mb. As a result, 44% of the Ae. tauschii and 71% of the Chinese Spring sequences in the analyzed regions, including 79 genes, are not shared. Gene rearrangement events, including differential gene duplication and deletion in the A, B, and D regions, have resulted in considerable erosion of gene collinearity in the analyzed regions, suggesting rapid evolution of prolamin and resistance gene families after the separation of the three wheat genomes. We hypothesize that this fast evolution is attributed to the co-evolution of the two gene families dispersed within a high recombination region. The identification of a full set of prolamin genes facilitated transcriptome profiling and revealed that the A genome contributes the least to prolamin expression because of its smaller number of expressed intact genes and their low expression levels, while the B and D genomes contribute similarly.


September 22, 2019  |  

Footprints of parasitism in the genome of the parasitic flowering plant Cuscuta campestris.

A parasitic lifestyle, where plants procure some or all of their nutrients from other living plants, has evolved independently in many dicotyledonous plant families and is a major threat for agriculture globally. Nevertheless, no genome sequence of a parasitic plant has been reported to date. Here we describe the genome sequence of the parasitic field dodder, Cuscuta campestris. The genome contains signatures of a fairly recent whole-genome duplication and lacks genes for pathways superfluous to a parasitic lifestyle. Specifically, genes needed for high photosynthetic activity are lost, explaining the low photosynthesis rates displayed by the parasite. Moreover, several genes involved in nutrient uptake processes from the soil are lost. On the other hand, evidence for horizontal gene transfer by way of genomic DNA integration from the parasite’s hosts is found. We conclude that the parasitic lifestyle has left characteristic footprints in the C. campestris genome.


September 22, 2019  |  

A mosaic monoploid reference sequence for the highly complex genome of sugarcane.

Sugarcane (Saccharum spp.) is a major crop for sugar and bioenergy production. Its highly polyploid, aneuploid, heterozygous, and interspecific genome poses major challenges for producing a reference sequence. We exploited colinearity with sorghum to produce a BAC-based monoploid genome sequence of sugarcane. A minimum tiling path of 4660 sugarcane BAC that best covers the gene-rich part of the sorghum genome was selected based on whole-genome profiling, sequenced, and assembled in a 382-Mb single tiling path of a high-quality sequence. A total of 25,316 protein-coding gene models are predicted, 17% of which display no colinearity with their sorghum orthologs. We show that the two species, S. officinarum and S. spontaneum, involved in modern cultivars differ by their transposable elements and by a few large chromosomal rearrangements, explaining their distinct genome size and distinct basic chromosome numbers while also suggesting that polyploidization arose in both lineages after their divergence.


September 22, 2019  |  

Comparative genomics of Pseudomonas sp. strain SI-3 associated with macroalga Ulva prolifera, the causative species for green tide in the Yellow Sea.

Algae-bacteria associations occur widely in marine habitats; however, the contributions of bacteria to macroalgal blooming are almost unknown. In this study, a potential endophytic strain, SI-3, was isolated from Ulva prolifera, the causative species for the world’s largest green tide in the Yellow Sea, following a strict bleaching treatment to eliminate epiphytes. The genome sequence of SI-3 was determined to be 4.8 Mb in size, and SI-3 was found to be most closely related to Pseudomonas stutzeri. To evaluate the characteristics of SI-3 as a potential endophyte, the genomes of SI-3 and 20 other P. stutzeri strains were compared. We found that SI-3 had more strain-specific genes than most of the 20 P. stutzeri strains. Clusters of Orthologous Groups (COGs) analysis revealed that SI-3 had a higher proportion of genes assigned to transcriptional regulation and signal transduction compared with the 20 P. stutzeri strains, including four rhizosphere bacteria, indicating a complicated interaction network between SI-3 and its host. P. stutzeri is renowned for its metabolic versatility in the degradation of aromatic compounds. However, significant gene loss was observed in several aromatic compound degradation pathways in SI-3, which may be an evolutionary adaptation that developed upon association with its host. KEGG analysis revealed that dissimilatory nitrate reduction to ammonium (DNRA) and denitrification, two competing dissimilatory nitrate reduction pathways, co-occurred in the genome of SI-3, as in most of the other 20 P. stutzeri strains. We speculate that DNRA in SI-3 may give U. prolifera a competitive advantage in nitrogen acquisition by conserving nitrogen in the NH4+ form, as in the case of microalgal blooms. Collectively, these data suggest that Pseudomonas sp. strain SI-3 is a suitable candidate for investigating the algae-bacteria interaction with U. prolifera and its ecological impacts on algal blooming.


September 22, 2019  |  

Fusarium species complex causing Pokkah Boeng in China

Sugarcane is one of the most important crops for sugar production in sugarcane-growing areas. Many biotic and abiotic stresses affect sugarcane production and lead to severe losses. Pokkah boeng is now receiving considerable attention because of the economic threat it poses. The occurrence and severity of pokkah boeng disease have spread rapidly across the major sugarcane-growing countries. Pokkah boeng is a fungal disease that can cause serious yield losses in susceptible varieties, including in commercial plantings. Infection is caused by either spores or ascospores. However, there have been many reported outbreaks of the disease which have looked spectacular but have caused trade and industry loss. Fusarium species complex is the major causal agent of this disease around the world, but some researchers have documented the increased importance of Fusarium. Three Fusarium species have been identified as causing sugarcane pokkah boeng disease in China. Moreover, work on Fusarium in China has addressed its mycotoxin production, genome sequencing, and association with nitrogen application. Many studies on disease investigation, breeding of disease-resistant varieties, and disease control strategies have also been carried out in China.


September 22, 2019  |  

A chromosome scale assembly of the model desiccation tolerant grass Oropetium thomaeum

Oropetium thomaeum is an emerging model for desiccation tolerance and genome size evolution in grasses. A high-quality draft genome of Oropetium was recently sequenced, but the lack of a chromosome scale assembly has hindered comparative analyses and downstream functional genomics. Here, we reassembled Oropetium, and anchored the genome into ten chromosomes using Hi-C based chromatin interactions. A combination of high-resolution RNAseq data and homology-based gene prediction identified thousands of new, conserved gene models that were absent from the V1 assembly. This includes thousands of new genes with high expression across a desiccation timecourse. The sorghum and Oropetium genomes have a surprising degree of chromosome-level collinearity, and several chromosome pairs have near perfect synteny. Other chromosomes are collinear in the gene rich chromosome arms but have experienced pericentric translocations. Together, these resources will be useful for the grass comparative genomic community and further establish Oropetium as a model resurrection plant.


September 22, 2019  |  

Analysis of the Gli-D2 locus identifies a genetic target for simultaneously improving the breadmaking and health-related traits of common wheat.

Gliadins are a major component of wheat seed proteins. However, the complex homoeologous Gli-2 loci (Gli-A2, -B2 and -D2) that encode the α-gliadins in commercial wheat are still poorly understood. Here we analyzed the Gli-D2 locus of Xiaoyan 81 (Xy81), a winter wheat cultivar. A total of 421.091 kb of the Gli-D2 sequence was assembled from sequencing multiple bacterial artificial clones, and 10 α-gliadin genes were annotated. Comparative genomic analysis showed that Xy81 carried only eight of the α-gliadin genes of the D genome donor Aegilops tauschii, with two of them each experiencing a tandem duplication. A mutant line lacking Gli-D2 (DLGliD2) consistently exhibited better breadmaking quality and dough functionalities than its progenitor Xy81, but without penalties in other agronomic traits. It also had an elevated lysine content in the grains. Transcriptome analysis verified the lack of Gli-D2 α-gliadin gene expression in DLGliD2. Furthermore, the transcript and protein levels of protein disulfide isomerase were both upregulated in DLGliD2 grains. Consistent with this finding, DLGliD2 had increased disulfide content in the flour. Our work sheds light on the structure and function of Gli-D2 in commercial wheat, and suggests that the removal of Gli-D2 and the gliadins specified by it is likely to be useful for simultaneously enhancing the end-use and health-related traits of common wheat. Because gliadins and homologous proteins are widely present in grass species, the strategy and information reported here may be broadly useful for improving the quality traits of diverse cereal crops. © 2018 The Authors. The Plant Journal © 2018 John Wiley & Sons Ltd.


September 22, 2019  |  

Complete sequence of kenaf (Hibiscus cannabinus) mitochondrial genome and comparative analysis with the mitochondrial genomes of other plants.

Plant mitochondrial (mt) genomes are species specific due to the extensive migration of foreign DNA and frequent recombination of repeated sequences. Sequencing of the mt genome of kenaf (Hibiscus cannabinus) is essential for elucidating its evolutionary characteristics. In the present study, single-molecule real-time (SMRT) sequencing technology was used to sequence the complete mt genome of kenaf. Results showed that the complete kenaf mt genome was 569,915 bp long and consisted of 62 genes, including 36 protein-coding, 3 rRNA and 23 tRNA genes. Twenty-five introns were found among nine of the 36 protein-coding genes, and five introns were trans-spliced. A comparative analysis with other plant mt genomes showed that four syntenic gene clusters were conserved in all plant mtDNAs. Fifteen chloroplast-derived fragments were strongly associated with mt genes, including the intact sequences of the chloroplast genes psaA, ndhB and rps7. According to the plant mt genome evolution analysis, some ribosomal protein genes and succinate dehydrogenase genes were frequently lost during the evolution of angiosperms. Our data suggest that the kenaf mt genome retained evolutionarily conserved characteristics. Overall, the complete sequencing of the kenaf mt genome provides additional information and enhances our understanding of mt genomic evolution across angiosperms.


September 22, 2019  |  

The chromosome-level genome assemblies of two rattans (Calamus simplicifolius and Daemonorops jenkinsiana).

Calamus simplicifolius and Daemonorops jenkinsiana are two representative rattans, the most significant material sources for the rattan industry. However, the lack of reference genome sequences is a major obstacle for basic and applied biology on rattan. We produced two chromosome-level genome assemblies of C. simplicifolius and D. jenkinsiana using Illumina, Pacific Biosciences, and Hi-C sequencing data. A total of ~730 Gb and ~682 Gb of raw data covered the predicted genome lengths (~1.98 Gb of C. simplicifolius and ~1.61 Gb of D. jenkinsiana) to ~372× and ~426× read depths, respectively. The two de novo genome assemblies, ~1.94 Gb and ~1.58 Gb, were generated with scaffold N50s of ~160 Mb and ~119 Mb in C. simplicifolius and D. jenkinsiana, respectively. The C. simplicifolius and D. jenkinsiana genomes were predicted to harbor ~51,235 and ~53,342 intact protein-coding gene models, respectively. Benchmarking Universal Single-Copy Orthologs evaluation demonstrated that genome completeness reached 96.4% and 91.3% in the C. simplicifolius and D. jenkinsiana genomes, respectively. Genome evolution showed that four Arecaceae plants clustered together, and the divergence time between the two rattans was ~19.3 million years ago. Additionally, we identified 193 and 172 genes involved in the lignin biosynthesis pathway in the C. simplicifolius and D. jenkinsiana genomes, respectively. We present the first de novo assemblies of two rattan genomes (C. simplicifolius and D. jenkinsiana). These data will not only provide a fundamental resource for functional genomics, particularly in promoting germplasm utilization for breeding, but also serve as reference genomes for comparative studies between and among different species.
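For readers unfamiliar with the assembly metrics quoted in this abstract, the following is a minimal sketch of how scaffold N50 and nominal read depth are conventionally computed. The function names and the toy scaffold lengths are illustrative assumptions, not the authors' pipeline; the depth calculation uses the rounded figures from the abstract, so it only approximates the reported values.

```python
def n50(lengths):
    """Return the N50: the length L such that sequences of length >= L
    together contain at least half of the total assembled bases."""
    if not lengths:
        raise ValueError("need at least one sequence length")
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


def read_depth(raw_bases, genome_size):
    """Nominal sequencing depth: total raw bases divided by genome size."""
    if genome_size <= 0:
        raise ValueError("genome size must be positive")
    return raw_bases / genome_size


if __name__ == "__main__":
    # Toy scaffold lengths (hypothetical, arbitrary units).
    toy_scaffolds = [2, 2, 2, 3, 3, 4, 8, 8]
    print("toy N50:", n50(toy_scaffolds))

    # Rounded inputs from the abstract: ~730 Gb of raw data, ~1.98 Gb genome.
    depth = read_depth(730e9, 1.98e9)
    print(f"nominal depth: {depth:.0f}x")
```

Because the abstract gives rounded inputs, the depth computed this way lands close to, but not exactly on, the reported ~372×.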


September 22, 2019  |  

Genomic insights into host adaptation between the wheat stripe rust pathogen (Puccinia striiformis f. sp. tritici) and the barley stripe rust pathogen (Puccinia striiformis f. sp. hordei).

Plant fungal pathogens can rapidly evolve and adapt to new environmental conditions in response to sudden changes of host populations in agro-ecosystems. However, the genomic basis of their host adaptation, especially at the forma specialis level, remains unclear. We sequenced two isolates each representing Puccinia striiformis f. sp. tritici (Pst) and P. striiformis f. sp. hordei (Psh), different formae speciales of the stripe rust fungus P. striiformis highly adapted to wheat and barley, respectively. The divergence of Pst and Psh, estimated to start 8.12 million years ago, has been driven by high nucleotide mutation rates. The high genomic variation within dikaryotic urediniospores of P. striiformis has provided raw genetic materials for genome evolution. No specific gene families are enriched in either isolate, but extensive gene loss events have occurred in both Pst and Psh after the divergence from their most recent common ancestor. A large number of isolate-specific genes were identified, with unique genomic features compared to the conserved genes: they are 1) significantly shorter in length; 2) significantly less expressed; 3) significantly closer to transposable elements; and 4) redundant in pathways. The presence of specific genes in one isolate (or forma specialis) resulted from the loss of the homologues in the other isolate (or forma specialis) through replacement by transposable elements or losses of genomic fragments. In addition, different patterns and numbers of telomeric repeats were observed between the isolates. Host adaptation of P. striiformis at the forma specialis level is a complex pathogenic trait, involving not only virulence-related genes but also other genes. Gene loss, which might be adaptive and driven by transposable element activities, provides a genomic basis for host adaptation of different formae speciales of P. striiformis.


September 22, 2019  |  

Genomic approaches for studying crop evolution.

Understanding how crop plants evolved from their wild relatives and spread around the world can inform about the origins of agriculture. Here, we review how the rapid development of genomic resources and tools has made it possible to conduct genetic mapping and population genetic studies to unravel the molecular underpinnings of domestication and crop evolution in diverse crop species. We propose three future avenues for the study of crop evolution: establishment of high-quality reference genomes for crops and their wild relatives; genomic characterization of germplasm collections; and the adoption of novel methodologies such as archaeogenetics, epigenomics, and genome editing.


September 22, 2019  |  

Complete genome sequencing and analysis of endophytic Sphingomonas sp. LK11 and its potential in plant growth.

Our study aimed to elucidate the plant growth-promoting characteristics and the structure and composition of the Sphingomonas sp. LK11 genome using the single molecule real-time (SMRT) sequencing technology of Pacific Biosciences. The results revealed that LK11 produces different types of gibberellins (GAs) in pure culture and significantly improves soybean plant growth by influencing endogenous GAs compared with non-inoculated control plants. Detailed genomic analyses revealed that the Sphingomonas sp. LK11 genome consists of a circular chromosome (3.78 Mbp; 66.2% G+C content) and two circular plasmids (122,975 bp and 34,160 bp; 63 and 65% G+C content, respectively). Annotation showed that the LK11 genome contains 3656 protein-coding genes, 59 tRNAs, and 4 complete rRNA operons. Functional analyses predicted that LK11 encodes genes for phosphate solubilization and nitrate/nitrite ammonification, which are beneficial for promoting plant growth. Genes for production of catalases, superoxide dismutase, and peroxidases that confer resistance to oxidative stress in plants were also identified in LK11. Moreover, genes for trehalose and glycine betaine biosynthesis were also found in the LK11 genome. Similarly, analysis of Sphingomonas spp. revealed an open pan-genome: a total of 8507 genes were identified in the Sphingomonas spp. pan-genome, and about 1356 orthologous genes were found to comprise the core genome. However, the number of genomes analyzed was not enough to describe complete gene sets. Our findings indicated that the genetic makeup of Sphingomonas sp. LK11 can be utilized as an eco-friendly bioresource for cleaning contaminated sites and promoting growth of plants confronted with environmental perturbations.
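The pan-genome and core-genome counts reported above rest on a simple set relationship, which the sketch below illustrates. It assumes orthologous gene families have already been assigned per genome; the genome names and family identifiers are hypothetical toy data, and this is not the clustering workflow used in the study.

```python
def pan_and_core(genomes):
    """Given a mapping of genome name -> iterable of gene-family IDs,
    return (pan, core): the union and the intersection of all families."""
    gene_sets = [set(families) for families in genomes.values()]
    if not gene_sets:
        raise ValueError("need at least one genome")
    pan = set().union(*gene_sets)          # families seen in any genome
    core = set.intersection(*gene_sets)    # families shared by every genome
    return pan, core


if __name__ == "__main__":
    # Hypothetical toy input: three genomes and their gene families.
    toy = {
        "genome_A": {"fam1", "fam2", "fam3"},
        "genome_B": {"fam1", "fam2", "fam4"},
        "genome_C": {"fam1", "fam5"},
    }
    pan, core = pan_and_core(toy)
    print("pan-genome size:", len(pan))
    print("core-genome size:", len(core))
```

In an open pan-genome, the union keeps growing as further genomes are added while the intersection shrinks or plateaus, which is why the abstract notes that the number of genomes analyzed was not enough to describe complete gene sets.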

