July 7, 2019

Genome sequencing reveals the origin of the allotetraploid Arabidopsis suecica.

Polyploidy is an example of instantaneous speciation when it involves the formation of a new cytotype that is incompatible with the parental species. Because new polyploid individuals are likely to be rare, establishment of a new species is unlikely unless polyploids are able to reproduce through self-fertilization (selfing), or asexually. Conversely, selfing (or asexuality) makes it possible for polyploid species to originate from a single individual, a bona fide speciation event. The extent to which this happens is not known. Here, we consider the origin of Arabidopsis suecica, a selfing allopolyploid between Arabidopsis thaliana and Arabidopsis arenosa, which has hitherto been considered to be an example of a unique origin. Based on whole-genome re-sequencing of 15 natural A. suecica accessions, we identify ubiquitous shared polymorphism with the parental species, and hence conclusively reject a unique origin in favor of multiple founding individuals. We further estimate that the species originated after the last glacial maximum in Eastern Europe or central Eurasia (rather than Sweden, as the name might suggest). Finally, annotation of the self-incompatibility loci in A. suecica revealed that both loci carry non-functional alleles. The locus inherited from the selfing A. thaliana is fixed for an ancestral non-functional allele, whereas the locus inherited from the outcrossing A. arenosa is fixed for a novel loss-of-function allele. Furthermore, the allele inherited from A. thaliana is predicted to transcriptionally silence the allele inherited from A. arenosa, suggesting that loss of self-incompatibility may have been instantaneous. © The Author 2017. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.


July 7, 2019

A novel inversion in the chloroplast genome of marama (Tylosema esculentum).

Tylosema esculentum (marama bean) is being developed as a possible crop for resource-poor farmers in arid regions of Southern Africa. As part of the molecular characterization of this species, the chloroplast genome has been assembled from next-generation sequencing using both Illumina and PacBio data. The genome is of typical organization, with a large single-copy region and a small single-copy region separated by a pair of inverted repeats, and covers 161,537 bp. It contains a unique inversion not present in any other legumes, even in the closest relatives for which the complete chloroplast genome is available, and two complete copies of the ycf1 gene. These data extend the range of variability of legume chloroplast genomes. The sequencing of multiple individuals has identified two different chloroplast genomes which were geographically separated. The current sampling is limited, so the extent of the intraspecific variation is still to be determined, leaving open the question of legume chloroplast genomes adapted to particular arid environments. © The Author 2017. Published by Oxford University Press on behalf of the Society for Experimental Biology.


July 7, 2019

Whole-genome restriction mapping by “subhaploid”-based RAD sequencing: An efficient and flexible approach for physical mapping and genome scaffolding.

Assembly of complex genomes using short reads remains a major challenge, which usually yields highly fragmented assemblies. Generation of ultradense linkage maps is promising for anchoring such assemblies, but traditional linkage mapping methods are hindered by the infrequency and unevenness of meiotic recombination that limit attainable map resolution. Here we develop a sequencing-based “in vitro” linkage mapping approach (called RadMap), where chromosome breakage and segregation are realized by generating hundreds of “subhaploid” fosmid/bacterial-artificial-chromosome clone pools, and by restriction site-associated DNA sequencing of these clone pools to produce an ultradense whole-genome restriction map to facilitate genome scaffolding. A bootstrap-based minimum spanning tree algorithm is developed for grouping and ordering of genome-wide markers and is implemented in a user-friendly, integrated software package (AMMO). We perform extensive analyses to validate the power and accuracy of our approach in the model plant Arabidopsis thaliana and human. We also demonstrate the utility of RadMap for enhancing the contiguity of a variety of whole-genome shotgun assemblies generated using either short Illumina reads (300 bp) or long PacBio reads (6-14 kb), with up to 15-fold improvement of N50 (~816 kb-3.7 Mb) and high scaffolding accuracy (98.1-98.5%). RadMap outperforms BioNano and Hi-C when input assembly is highly fragmented (contig N50 = 54 kb). RadMap can capture wide-range contiguity information and provide an efficient and flexible tool for high-resolution physical mapping and scaffolding of highly fragmented assemblies. Copyright © 2017 Dou et al.
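The grouping-and-ordering idea in this abstract can be illustrated with a small sketch. The code below is not the AMMO implementation and leaves out the bootstrap step entirely; it assumes, purely for illustration, that each marker is described by the set of clone pools in which it was detected, that marker similarity is measured with a Jaccard distance, and that an order is read off the longest path of a minimum spanning tree built with Prim's algorithm. All function names and the toy data are hypothetical.

```python
from collections import deque


def jaccard_distance(a, b):
    """Distance between two markers, each given as a set of pool ids."""
    union = a | b
    if not union:
        return 1.0
    return 1.0 - len(a & b) / len(union)


def minimum_spanning_tree(markers):
    """Prim's algorithm; markers maps name -> set of pools. Returns adjacency lists."""
    names = list(markers)
    adj = {name: [] for name in names}
    if not names:
        return adj
    root = names[0]
    # best[n] = (distance, tree node) of the cheapest link from n into the tree
    best = {n: (jaccard_distance(markers[n], markers[root]), root) for n in names[1:]}
    while best:
        nxt = min(best, key=lambda n: best[n][0])
        _, parent = best.pop(nxt)
        adj[nxt].append(parent)
        adj[parent].append(nxt)
        for n in best:
            d = jaccard_distance(markers[n], markers[nxt])
            if d < best[n][0]:
                best[n] = (d, nxt)
    return adj


def path_to_farthest(adj, start):
    """Breadth-first search; returns the tree path from the farthest node back to start."""
    parent = {start: None}
    queue = deque([start])
    last = start
    while queue:
        last = queue.popleft()
        for neighbour in adj[last]:
            if neighbour not in parent:
                parent[neighbour] = last
                queue.append(neighbour)
    path = []
    node = last
    while node is not None:
        path.append(node)
        node = parent[node]
    return path


def backbone_order(markers):
    """Order markers along the longest path (in edges) of the spanning tree."""
    adj = minimum_spanning_tree(markers)
    if not adj:
        return []
    one_end = path_to_farthest(adj, next(iter(adj)))[0]
    return path_to_farthest(adj, one_end)


# Hypothetical toy data: five markers on one chromosome, four overlapping clone pools.
toy_markers = {
    "m1": {"p1"},
    "m2": {"p1", "p2"},
    "m3": {"p2", "p3"},
    "m4": {"p3", "p4"},
    "m5": {"p4"},
}
order = backbone_order(toy_markers)
```

On real data many markers would hang off the backbone as side branches and would need to be placed separately, which this sketch does not attempt.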


July 7, 2019

Draft nuclear genome sequence of the liquid hydrocarbon–accumulating green microalga Botryococcus braunii race B (Showa).

Botryococcus braunii has long been known as a prodigious producer of liquid hydrocarbon oils that can be converted into combustion engine fuels. This draft genome for the B race of B. braunii will allow researchers to unravel important hydrocarbon biosynthetic pathways and identify possible regulatory networks controlling this unusual metabolism. Copyright © 2017 Browne et al.


July 7, 2019

Genome-wide analysis of WOX genes in upland cotton and their expression pattern under different stresses.

WUSCHEL-related homeobox (WOX) family members play significant roles in plant growth and development, such as in embryo patterning, stem-cell maintenance, and lateral organ formation. The recently published cotton genome sequences allow us to perform comprehensive genome-wide analysis and characterization of WOX genes in cotton. In this study, we identified 21, 20, and 38 WOX genes in Gossypium arboreum (2n = 26, A2), G. raimondii (2n = 26, D5), and G. hirsutum (2n = 4x = 52, (AD)t), respectively. Sequence logos showed that homeobox domains were significantly conserved among the WOX genes in cotton, Arabidopsis, and rice. A total of 168 genes from three typical monocots and six dicots were naturally divided into three clades, which were further classified into nine sub-clades. A good collinearity was observed in the synteny analysis of the orthologs from At and Dt (t represents tetraploid) sub-genomes. Whole genome duplication (WGD) and segmental duplication within At and Dt sub-genomes played significant roles in the expansion of WOX genes, and segmental duplication mainly generated the WUS clade. Copia and Gypsy were the two major types of transposable elements distributed upstream or downstream of WOX genes. Furthermore, through comparison, we found that the exon/intron pattern was highly conserved between Arabidopsis and cotton, and the homeobox domain loci were also conserved between them. In addition, the expression pattern in different tissues indicated that the duplicated genes in cotton might have acquired new functions as a result of sub-functionalization or neo-functionalization. The expression pattern of WOX genes under different stress treatments showed that the different genes were induced by different stresses. In the present work, WOX genes, classified into three clades, were identified in the upland cotton genome. Whole genome and segmental duplication were determined to be the two major impetuses for the expansion of gene numbers during evolution. Moreover, the expression patterns suggested that the duplicated genes might have experienced a functional divergence. Together, these results shed light on the evolution of the WOX gene family, and would be helpful in future research.


July 7, 2019

Hybrid assembly with long and short reads improves discovery of gene family expansions.

Long-read and short-read sequencing technologies offer competing advantages for eukaryotic genome sequencing projects. Combinations of both may be appropriate for surveys of within-species genomic variation. We developed a hybrid assembly pipeline called “Alpaca” that can operate on 20X long-read coverage plus about 50X short-insert and 50X long-insert short-read coverage. To preclude collapse of tandem repeats, Alpaca relies on base-call-corrected long reads for contig formation. Compared to two other assembly protocols, Alpaca demonstrated the most reference agreement and repeat capture on the rice genome. On three accessions of the model legume Medicago truncatula, Alpaca generated the most agreement to a conspecific reference and predicted tandemly repeated genes absent from the other assemblies. Our results suggest Alpaca is a useful tool for investigating structural and copy number variation within de novo assemblies of sampled populations.


July 7, 2019

The origin, diversification and adaptation of a major mangrove clade (Rhizophoreae) revealed by whole-genome sequencing

Mangroves invade some very marginal habitats for woody plants—at the interface between land and sea. Since mangroves anchor tropical coastal communities globally, their origin, diversification and adaptation are of scientific significance, particularly at a time of global climate change. In this study, a combination of single-molecule long reads and the more conventional short reads are generated from Rhizophora apiculata for the de novo assembly of its genome to a near chromosome level. The longest scaffold, N50 and N90 for the R. apiculata genome, are 13.3 Mb, 5.4 Mb and 1.0 Mb, respectively. Short reads for the genomes and transcriptomes of eight related species are also generated. We find that the ancestor of Rhizophoreae experienced a whole-genome duplication ~70 Myrs ago, which is followed rather quickly by colonization and species diversification. Mangroves exhibit pan-exome modifications of amino acid (AA) usage as well as unusual AA substitutions among closely related species. The usage and substitution of AAs, unique among plants surveyed, is correlated with the rapid evolution of proteins in mangroves. A small subset of these substitutions is associated with mangroves’ highly specialized traits (vivipary and red bark) thought to be adaptive in the intertidal habitats. Despite the many adaptive features, mangroves are among the least genetically diverse plants, likely the result of continual habitat turnovers caused by repeated rises and falls of sea level in the geologically recent past. Mangrove genomes thus inform about their past evolutionary success as well as portend a possibly difficult future.
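The N50 and N90 figures quoted above are instances of the general Nx statistic: the length L such that scaffolds of length L or longer together contain at least x% of the assembled bases. The sketch below shows how such values are commonly computed; the function name and the toy scaffold lengths are illustrative assumptions, not data from the R. apiculata assembly.

```python
def nx(lengths, x=50):
    """Return the Nx statistic for a list of scaffold (or contig) lengths in bp."""
    if not lengths:
        raise ValueError("need at least one length")
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        # Integer comparison avoids floating-point rounding at the threshold.
        if running * 100 >= total * x:
            return length


# Hypothetical toy assembly, lengths in bp.
toy_scaffolds = [40, 30, 20, 10]
toy_n50 = nx(toy_scaffolds, 50)
toy_n90 = nx(toy_scaffolds, 90)
```

Because N90 requires accumulating more of the assembly than N50, it is never larger than N50 for the same set of scaffolds.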


July 7, 2019

Chromosome-level genome assembly and transcriptome of the green alga Chromochloris zofingiensis illuminates astaxanthin production.

Microalgae have potential to help meet energy and food demands without exacerbating environmental problems. There is interest in the unicellular green alga Chromochloris zofingiensis, because it produces lipids for biofuels and a highly valuable carotenoid nutraceutical, astaxanthin. To advance understanding of its biology and facilitate commercial development, we present a C. zofingiensis chromosome-level nuclear genome, organelle genomes, and transcriptome from diverse growth conditions. The assembly, derived from a combination of short- and long-read sequencing in conjunction with optical mapping, revealed a compact genome of ~58 Mbp distributed over 19 chromosomes containing 15,274 predicted protein-coding genes. The genome has uniform gene density over chromosomes, low repetitive sequence content (~6%), and a high fraction of protein-coding sequence (~39%) with relatively long coding exons and few coding introns. Functional annotation of gene models identified orthologous families for the majority (~73%) of genes. Synteny analysis uncovered localized but scrambled blocks of genes in putative orthologous relationships with other green algae. Two genes encoding beta-ketolase (BKT), the key enzyme synthesizing astaxanthin, were found in the genome, and both were up-regulated by high light. Isolation and molecular analysis of astaxanthin-deficient mutants showed that BKT1 is required for the production of astaxanthin. Moreover, the transcriptome under high light exposure revealed candidate genes that could be involved in critical yet missing steps of astaxanthin biosynthesis, including ABC transporters, cytochrome P450 enzymes, and an acyltransferase. The high-quality genome and transcriptome provide insight into the green algal lineage and carotenoid production.


July 7, 2019

N-glycan maturation mutants in Lotus japonicus for basic and applied glycoprotein research.

Studies of protein N-glycosylation are important for answering fundamental questions on the diverse functions of glycoproteins in plant growth and development. Here we generated and characterised a comprehensive collection of Lotus japonicus LORE1 insertion mutants, each lacking the activity of one of the 12 enzymes required for normal N-glycan maturation in the glycosylation machinery. The inactivation of the individual genes resulted in altered N-glycan patterns as documented using mass spectrometry and glycan-recognising antibodies, indicating successful identification of null mutations in the target glyco-genes. For example, both mass spectrometry and immunoblotting experiments suggest that proteins derived from the α1,3-fucosyltransferase (Lj3fuct) mutant completely lacked α1,3-core fucosylation. Mass spectrometry also suggested that the Lotus japonicus convicilin 2 was one of the main glycoproteins undergoing differential expression/N-glycosylation in the mutants. Demonstrating the functional importance of glycosylation, reduced growth and seed production phenotypes were observed for the mutant plants lacking functional mannosidase I, N-acetylglucosaminyltransferase I, and α1,3-fucosyltransferase, even though the relative protein composition and abundance appeared unaffected. The strength of our N-glycosylation mutant platform is the broad spectrum of resulting glycoprotein profiles and altered physiological phenotypes that can be produced from single, double, triple and quadruple mutants. This platform will serve as a valuable tool for elucidating the functional role of protein N-glycosylation in plants. Furthermore, this technology can be used to generate stable plant mutant lines for biopharmaceutical production of glycoproteins displaying relatively homogeneous and mammalian-like N-glycosylation features. © 2017 The Authors. The Plant Journal © 2017 John Wiley & Sons Ltd.


July 7, 2019

ALUMINUM RESISTANCE TRANSCRIPTION FACTOR 1 (ART1) contributes to natural variation in aluminum resistance in diverse genetic backgrounds of rice (O. sativa)

Transcription factors (TFs) regulate the expression of other genes to indirectly mediate stress resistance mechanisms. Therefore, when studying TF-mediated stress resistance, it is important to understand how TFs interact with genes in the genetic background. Here, we fine-mapped the aluminum (Al) resistance QTL Alt12.1 to a 44-kb region containing six genes. Among them is ART1, which encodes a C2H2-type zinc finger TF required for Al resistance in rice. The mapping parents, Al-resistant cv Azucena (tropical japonica) and Al-sensitive cv IR64 (indica), have extensive sequence polymorphism within the ART1 coding region, but similar ART1 expression levels. Using reciprocal near-isogenic lines (NILs) we examined how allele-swapping the Alt12.1 locus would affect plant responses to Al. Analysis of global transcriptional responses to Al stress in roots of the NILs alongside their recurrent parents demonstrated that the presence of the Alt12.1 from Al-resistant Azucena led to greater changes in gene expression in response to Al when compared to the Alt12.1 from IR64 in both genetic backgrounds. The presence of the ART1 allele from the opposite parent affected the expression of several genes not previously implicated in rice Al tolerance. We highlight examples where putatively functional variation in cis-regulatory regions of ART1-regulated genes interacts with ART1 to determine gene expression in response to Al. This ART1–promoter interaction may be associated with transgressive variation for Al resistance in the Azucena × IR64 population. These results illustrate how ART1 interacts with the genetic background to contribute to quantitative phenotypic variation in rice Al resistance.


July 7, 2019

Genetic control of plasticity of oil yield for combined abiotic stresses using a joint approach of crop modelling and genome-wide association.

Understanding the genetic basis of phenotypic plasticity is crucial for predicting and managing climate change effects on wild plants and crops. Here, we combined crop modelling and quantitative genetics to study the genetic control of oil yield plasticity for multiple abiotic stresses in sunflower. First, we developed stress indicators to characterize 14 environments for three abiotic stresses (cold, drought and nitrogen) using the SUNFLO crop model and phenotypic variations of three commercial varieties. The computed plant stress indicators better explain yield variation than descriptors at the climatic or crop levels. In those environments, we observed oil yield of 317 sunflower hybrids and regressed it on three selected stress indicators. The slopes of the cold stress reaction norms were used as plasticity phenotypes in the following genome-wide association study. Among the 65 534 tested single nucleotide polymorphisms (SNPs), we identified nine quantitative trait loci controlling oil yield plasticity to cold stress. Associated SNPs are localized in genes previously shown to be involved in cold stress responses: oligopeptide transporters, lipid transfer protein, cystatin, alternative oxidase or root development. This novel approach opens new perspectives to identify genomic regions involved in genotype-by-environment interaction of complex traits to multiple stresses in realistic natural or agronomical conditions. © 2017 John Wiley & Sons Ltd.
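The plasticity phenotype described above is the slope of a genotype's yield regressed on a stress indicator across environments. A minimal sketch of that step follows, assuming an ordinary least-squares fit with a single indicator per genotype; the study's actual regression model may differ, and the function name and toy values are hypothetical.

```python
def reaction_norm_slope(stress, yields):
    """Least-squares slope of yield against a stress indicator for one genotype.

    stress and yields hold one value per environment, in the same order.
    """
    n = len(stress)
    if n != len(yields) or n < 2:
        raise ValueError("need two equal-length series with at least two environments")
    mean_x = sum(stress) / n
    mean_y = sum(yields) / n
    sxx = sum((x - mean_x) ** 2 for x in stress)
    if sxx == 0:
        raise ValueError("stress indicator does not vary across environments")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(stress, yields))
    return sxy / sxx


# Hypothetical toy data: four environments with increasing cold stress.
toy_stress = [0.0, 1.0, 2.0, 3.0]
toy_yield = [10.0, 8.0, 6.0, 4.0]
toy_slope = reaction_norm_slope(toy_stress, toy_yield)
```

A steeper negative slope means yield falls faster as stress increases, so the per-genotype slopes can serve as the trait values passed to an association test.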


July 7, 2019

Euglena gracilis genome and transcriptome: organelles, nuclear genome assembly strategies and initial features.

Euglena gracilis is a major component of the aquatic ecosystem and, together with closely related species, is ubiquitous worldwide. Euglenoids are an important group of protists, possessing a secondarily acquired plastid, and are relatives of the Kinetoplastidae, which themselves have global impact as disease agents. To understand the biology of E. gracilis, as well as to provide further insight into the evolution and origins of the Kinetoplastidae, we embarked on sequencing the nuclear genome; the plastid and mitochondrial genomes are already in the public domain. Earlier studies suggested an extensive nuclear DNA content, with likely a high degree of repetitive sequence, together with significant extrachromosomal elements. To produce a list of coding sequences we have combined transcriptome data from both published and new sources, as well as embarked on de novo sequencing using a combination of 454, Illumina paired end libraries and long PacBio reads. Preliminary analysis suggests a surprisingly large genome approaching 2 Gbp, with a highly fragmented architecture and extensive repeat composition. Over 80% of the RNAseq reads from E. gracilis map to the assembled genome sequence, which is comparable with the well assembled genomes of T. brucei and T. cruzi. In order to achieve this level of assembly we employed multiple informatics pipelines, which are discussed here. Finally, as a preliminary view of the genome architecture, we discuss the tubulin and calmodulin genes, which highlight potential novel splicing mechanisms.


July 7, 2019

Strategies for optimizing BioNano and Dovetail explored through a second reference quality assembly for the legume model, Medicago truncatula.

Third generation sequencing technologies, with sequencing reads in the tens of kilobases, facilitate genome assembly by spanning ambiguous regions and improving continuity. This has been critical for plant genomes, which are difficult to assemble due to high repeat content, gene family expansions, segmental and tandem duplications, and polyploidy. Recently, high-throughput mapping and scaffolding strategies have further improved continuity. Together, these long-range technologies enable quality draft assemblies of complex genomes in a cost-effective and timely manner. Here, we present high quality genome assemblies of the model legume plant, Medicago truncatula (R108) using PacBio, Dovetail Chicago (hereafter, Dovetail) and BioNano technologies. To test these technologies for plant genome assembly, we generated five assemblies using all possible combinations and ordering of these three technologies in the R108 assembly. While the BioNano and Dovetail joins overlapped, they also showed complementary gains in continuity and join numbers. Both technologies spanned repetitive regions that PacBio alone was unable to bridge. Combining technologies, particularly Dovetail followed by BioNano, resulted in notable improvements compared to Dovetail or BioNano alone. A combination of PacBio, Dovetail, and BioNano was used to generate a high quality draft assembly of R108, a M. truncatula accession widely used in studies of functional genomics. As a test for the usefulness of the resulting genome sequence, the new R108 assembly was used to pinpoint breakpoints and characterize flanking sequence of a previously identified translocation between chromosomes 4 and 8, identifying more than 22.7 Mb of novel sequence not present in the earlier A17 reference assembly. Adding Dovetail followed by BioNano data yielded complementary improvements in continuity over the original PacBio assembly. This strategy proved efficient and cost-effective for developing a quality draft assembly compared to traditional reference assemblies.


July 7, 2019

MECAT: fast mapping, error correction, and de novo assembly for single-molecule sequencing reads.

We present a tool that combines fast mapping, error correction, and de novo assembly (MECAT; accessible at https://github.com/xiaochuanle/MECAT) for processing single-molecule sequencing (SMS) reads. MECAT’s computing efficiency is superior to that of current tools, while the results MECAT produces are comparable or improved. MECAT enables reference mapping or de novo assembly of large genomes using SMS reads on a single computer.

