Menu
September 22, 2019

Sea cucumber genome provides insights into saponin biosynthesis and aestivation regulation.

Echinoderms exhibit several fascinating evolutionary innovations that are rarely seen in the animal kingdom, but how these animals attained such features is not well understood. Here we report the sequencing and analysis of the genome and extensive transcriptomes of the sea cucumber Apostichopus japonicus, a species from a special echinoderm group with extraordinary potential for saponin synthesis, aestivation and organ regeneration. The sea cucumber does not possess a reorganized Hox cluster as previously assumed for all echinoderms, and the spatial expression of Hox7 and Hox11/13b potentially guides the embryo-to-larva axial transformation. Contrary to the typical production of lanosterol in animal cholesterol synthesis, the oxidosqualene cyclase of sea cucumber produces parkeol for saponin synthesis and has “plant-like” motifs suggestive of convergent evolution. The transcriptional factors Klf2 and Egr1 are identified as key regulators of aestivation, probably exerting their effects through a clock gene-controlled process. Intestinal hypometabolism during aestivation is driven by the DNA hypermethylation of various metabolic gene pathways, whereas the transcriptional network of intestine regeneration involves diverse signaling pathways, including Wnt, Hippo and FGF. Decoding the sea cucumber genome provides a new avenue for an in-depth understanding of the extraordinary features of sea cucumbers and other echinoderms.


September 22, 2019

Homogenization of sub-genome secretome gene expression patterns in the allodiploid fungus Verticillium longisporum

Allopolyploidization, genome duplication through interspecific hybridization, is an important evolutionary mechanism that can enable organisms to adapt to environmental changes or stresses. The increased adaptive potential of allopolyploids can be particularly relevant for plant pathogens in their ongoing quest for host immune response evasion. To this end, plant pathogens secrete a plethora of molecules that enable host colonization. Allodiploidization has resulted in the new plant pathogen Verticillium longisporum that infects different hosts than haploid Verticillium species. To reveal the impact of allodiploidization on plant pathogen evolution, we studied the genome and transcriptome dynamics of V. longisporum using next-generation sequencing. V. longisporum genome evolution is characterized by extensive chromosomal rearrangements, between as well as within parental chromosome sets, leading to a mosaic genome structure. In comparison to haploid Verticillium species, V. longisporum genes display stronger signs of positive selection. The expression patterns of the two sub-genomes show remarkable resemblance, suggesting that the parental gene expression patterns homogenized upon hybridization. Moreover, whereas V. longisporum genes encoding secreted proteins frequently display differential expression between the parental sub-genomes in culture medium, expression patterns homogenize upon plant colonization. Collectively, our results illustrate of the adaptive potential of allodiploidy mediated by the interplay of two sub-genomes. Author summary Hybridization followed by whole-genome duplication, so-called allopolyploidization, provides genomic flexibility that is beneficial for survival under stressful conditions or invasiveness into new habitats. Allopolyploidization has mainly been studied in plants, but also occurs in other organisms, including fungi. Verticillium longisporum, an emerging fungal pathogen on brassicaceous plants, arose by allodiploidization between two Verticillium spp. We used comparative genomics to reveal the plastic nature of the V. longisporum genomes, showing that parental chromosome sets recombined extensively, resulting in a mosaic genome pattern. Furthermore, we show that non-synonymous substitutions frequently occurred in V. longisporum. Moreover, we reveal that expression patterns of genes encoding secreted proteins homogenized between the V. longisporum sub-genomes upon plant colonization. In conclusion, our results illustrate the large adaptive potential upon genome hybridization for fungi mediated by genomic plasticity and interaction between sub-genomes.


September 22, 2019

Improved de novo genome assembly and analysis of the Chinese cucurbit Siraitia grosvenorii, also known as monk fruit or luo-han-guo.

Luo-han-guo (Siraitia grosvenorii), also called monk fruit, is a member of the Cucurbitaceae family. Monk fruit has become an important area for research because of the pharmacological and economic potential of its noncaloric, extremely sweet components (mogrosides). It is also commonly used in traditional Chinese medicine for the treatment of lung congestion, sore throat, and constipation. Recently, a single reference genome became available for monk fruit, assembled from 36.9x genome coverage reads via Illumina sequencing platforms. This genome assembly has a relatively short (34.2 kb) contig N50 length and lacks integrated annotations. These drawbacks make it difficult to use as a reference in assembling transcriptomes and discovering novel functional genes.Here, we offer a new high-quality draft of the S. grosvenorii genome assembled using 31 Gb (~73.8x) long single molecule real time sequencing reads and polished with ~50 Gb Illumina paired-end reads. The final genome assembly is approximately 469.5 Mb, with a contig N50 length of 432,384 bp, representing a 12.6-fold improvement. We further annotated 237.3 Mb of repetitive sequence and 30,565 consensus protein coding genes with combined evidence. Phylogenetic analysis showed that S. grosvenorii diverged from members of the Cucurbitaceae family approximately 40.9 million years ago. With comprehensive transcriptomic analysis and differential expression testing, we identified 4,606 up-regulated genes in the early fruit compared to the leaf, a number of which were linked to metabolic pathways regulating fruit development and ripening.The availability of this new monk fruit genome assembly, as well as the annotations, will facilitate the discovery of new functional genes and the genetic improvement of monk fruit.


September 22, 2019

De novo genome assembly of Oryza granulata reveals rapid genome expansion and adaptive evolution

The wild relatives of rice have adapted to different ecological environments and constitute a useful reservoir of agronomic traits for genetic improvement. Here we present the ~777?Mb de novo assembled genome sequence of Oryza granulata. Recent bursts of long-terminal repeat retrotransposons, especially RIRE2, led to a rapid twofold increase in genome size after O. granulata speciation. Universal centromeric tandem repeats are absent within its centromeres, while gypsy-type LTRs constitute the main centromere-specific repetitive elements. A total of 40,116 protein-coding genes were predicted in O. granulata, which is close to that of Oryza sativa. Both the copy number and function of genes involved in photosynthesis and energy production have undergone positive selection during the evolution of O. granulata, which might have facilitated its adaptation to the low light habitats. Together, our findings reveal the rapid genome expansion, distinctive centromere organization, and adaptive evolution of O. granulata.


September 22, 2019

HapCHAT: adaptive haplotype assembly for efficiently leveraging high coverage in long reads.

Haplotype assembly is the process of assigning the different alleles of the variants covered by mapped sequencing reads to the two haplotypes of the genome of a human individual. Long reads, which are nowadays cheaper to produce and more widely available than ever before, have been used to reduce the fragmentation of the assembled haplotypes since their ability to span several variants along the genome. These long reads are also characterized by a high error rate, an issue which may be mitigated, however, with larger sets of reads, when this error rate is uniform across genome positions. Unfortunately, current state-of-the-art dynamic programming approaches designed for long reads deal only with limited coverages.Here, we propose a new method for assembling haplotypes which combines and extends the features of previous approaches to deal with long reads and higher coverages. In particular, our algorithm is able to dynamically adapt the estimated number of errors at each variant site, while minimizing the total number of error corrections necessary for finding a feasible solution. This allows our method to significantly reduce the required computational resources, allowing to consider datasets composed of higher coverages. The algorithm has been implemented in a freely available tool, HapCHAT: Haplotype Assembly Coverage Handling by Adapting Thresholds. An experimental analysis on sequencing reads with up to 60 × coverage reveals improvements in accuracy and recall achieved by considering a higher coverage with lower runtimes.Our method leverages the long-range information of sequencing reads that allows to obtain assembled haplotypes fragmented in a lower number of unphased haplotype blocks. At the same time, our method is also able to deal with higher coverages to better correct the errors in the original reads and to obtain more accurate haplotypes as a result.HapCHAT is available at http://hapchat.algolab.eu under the GNU Public License (GPL).


September 22, 2019

RAD sequencing and a hybrid Antarctic fur seal genome assembly reveal rapidly decaying linkage disequilibrium, global population structure and evidence for inbreeding.

Recent advances in high throughput sequencing have transformed the study of wild organisms by facilitating the generation of high quality genome assemblies and dense genetic marker datasets. These resources have the potential to significantly advance our understanding of diverse phenomena at the level of species, populations and individuals, ranging from patterns of synteny through rates of linkage disequilibrium (LD) decay and population structure to individual inbreeding. Consequently, we used PacBio sequencing to refine an existing Antarctic fur seal (Arctocephalus gazella) genome assembly and genotyped 83 individuals from six populations using restriction site associated DNA (RAD) sequencing. The resulting hybrid genome comprised 6,169 scaffolds with an N50 of 6.21 Mb and provided clear evidence for the conservation of large chromosomal segments between the fur seal and dog (Canis lupus familiaris). Focusing on the most extensively sampled population of South Georgia, we found that LD decayed rapidly, reaching the background level by around 400 kb, consistent with other vertebrates but at odds with the notion that fur seals experienced a strong historical bottleneck. We also found evidence for population structuring, with four main Antarctic island groups being resolved. Finally, appreciable variance in individual inbreeding could be detected, reflecting the strong polygyny and site fidelity of the species. Overall, our study contributes important resources for future genomic studies of fur seals and other pinnipeds while also providing a clear example of how high throughput sequencing can generate diverse biological insights at multiple levels of organization. Copyright © 2018 Humble et al.


September 22, 2019

High-quality assembly of the reference genome for scarlet sage, Salvia splendens, an economically important ornamental plant.

Salvia splendens Ker-Gawler, scarlet or tropical sage, is a tender herbaceous perennial widely introduced and seen in public gardens all over the world. With few molecular resources, breeding is still restricted to traditional phenotypic selection, and the genetic mechanisms underlying phenotypic variation remain unknown. Hence, a high-quality reference genome will be very valuable for marker-assisted breeding, genome editing, and molecular genetics.We generated 66 Gb and 37 Gb of raw DNA sequences, respectively, from whole-genome sequencing of a largely homozygous scarlet sage inbred line using Pacific Biosciences (PacBio) single-molecule real-time and Illumina HiSeq sequencing platforms. The PacBio de novo assembly yielded a final genome with a scaffold N50 size of 3.12 Mb and a total length of 808 Mb. The repetitive sequences identified accounted for 57.52% of the genome sequence, and ?54,008 protein-coding genes were predicted collectively with ab initio and homology-based gene prediction from the masked genome. The divergence time between S. splendens and Salvia miltiorrhiza was estimated at 28.21 million years ago (Mya). Moreover, 3,797 species-specific genes and 1,187 expanded gene families were identified for the scarlet sage genome.We provide the first genome sequence and gene annotation for the scarlet sage. The availability of these resources will be of great importance for further breeding strategies, genome editing, and comparative genomics among related species.


September 22, 2019

A graph-based approach to diploid genome assembly.

Constructing high-quality haplotype-resolved de novo assemblies of diploid genomes is important for revealing the full extent of structural variation and its role in health and disease. Current assembly approaches often collapse the two sequences into one haploid consensus sequence and, therefore, fail to capture the diploid nature of the organism under study. Thus, building an assembler capable of producing accurate and complete diploid assemblies, while being resource-efficient with respect to sequencing costs, is a key challenge to be addressed by the bioinformatics community.We present a novel graph-based approach to diploid assembly, which combines accurate Illumina data and long-read Pacific Biosciences (PacBio) data. We demonstrate the effectiveness of our method on a pseudo-diploid yeast genome and show that we require as little as 50× coverage Illumina data and 10× PacBio data to generate accurate and complete assemblies. Additionally, we show that our approach has the ability to detect and phase structural variants.https://github.com/whatshap/whatshap.Supplementary data are available at Bioinformatics online.


September 22, 2019

Large-scale gene losses underlie the genome evolution of parasitic plant Cuscuta australis.

Dodders (Cuscuta spp., Convolvulaceae) are root- and leafless parasitic plants. The physiology, ecology, and evolution of these obligate parasites are poorly understood. A high-quality reference genome of Cuscuta australis was assembled. Our analyses reveal that Cuscuta experienced accelerated molecular evolution, and Cuscuta and the convolvulaceous morning glory (Ipomoea) shared a common whole-genome triplication event before their divergence. C. australis genome harbors 19,671 protein-coding genes, and importantly, 11.7% of the conserved orthologs in autotrophic plants are lost in C. australis. Many of these gene loss events likely result from its parasitic lifestyle and the massive changes of its body plan. Moreover, comparison of the gene expression patterns in Cuscuta prehaustoria/haustoria and various tissues of closely related autotrophic plants suggests that Cuscuta haustorium formation requires mostly genes normally involved in root development. The C. australis genome provides important resources for studying the evolution of parasitism, regressive evolution, and evo-devo in plant parasites.


September 22, 2019

A mosaic monoploid reference sequence for the highly complex genome of sugarcane.

Sugarcane (Saccharum spp.) is a major crop for sugar and bioenergy production. Its highly polyploid, aneuploid, heterozygous, and interspecific genome poses major challenges for producing a reference sequence. We exploited colinearity with sorghum to produce a BAC-based monoploid genome sequence of sugarcane. A minimum tiling path of 4660 sugarcane BAC that best covers the gene-rich part of the sorghum genome was selected based on whole-genome profiling, sequenced, and assembled in a 382-Mb single tiling path of a high-quality sequence. A total of 25,316 protein-coding gene models are predicted, 17% of which display no colinearity with their sorghum orthologs. We show that the two species, S. officinarum and S. spontaneum, involved in modern cultivars differ by their transposable elements and by a few large chromosomal rearrangements, explaining their distinct genome size and distinct basic chromosome numbers while also suggesting that polyploidization arose in both lineages after their divergence.


September 22, 2019

Genome analysis of the ancient tracheophyte Selaginella tamariscina reveals evolutionary features relevant to the acquisition of desiccation tolerance.

Resurrection plants, which are the “gifts” of natural evolution, are ideal models for studying the genetic basis of plant desiccation tolerance. Here, we report a high-quality genome assembly of 301 Mb for the diploid spike moss Selaginella tamariscina, a primitive vascular resurrection plant. We predicated 27 761 protein-coding genes from the assembled S. tamariscina genome, 11.38% (2363) of which showed significant expression changes in response to desiccation. Approximately 60.58% of the S. tamariscina genome was annotated as repetitive DNA, which is an almost 2-fold increase of that in the genome of desiccation-sensitive Selaginella moellendorffii. Genomic and transcriptomic analyses highlight the unique evolution and complex regulations of the desiccation response in S. tamariscina, including species-specific expansion of the oleosin and pentatricopeptide repeat gene families, unique genes and pathways for reactive oxygen species generation and scavenging, and enhanced abscisic acid (ABA) biosynthesis and potentially distinct regulation of ABA signaling and response. Comparative analysis of chloroplast genomes of several Selaginella species revealed a unique structural rearrangement and the complete loss of chloroplast NAD(P)H dehydrogenase (NDH) genes in S. tamariscina, suggesting a link between the absence of the NDH complex and desiccation tolerance. Taken together, our comparative genomic and transcriptomic analyses reveal common and species-specific desiccation tolerance strategies in S. tamariscina, providing significant insights into the desiccation tolerance mechanism and the evolution of resurrection plants. Copyright © 2018 The Author. Published by Elsevier Inc. All rights reserved.


September 22, 2019

Genomic variation among and within six Juglans species.

Genomic analysis in Juglans (walnuts) is expected to transform the breeding and agricultural production of both nuts and lumber. To that end, we report here the determination of reference sequences for six additional relatives of Juglans regia: Juglans sigillata (also from section Dioscaryon), Juglans nigra, Juglans microcarpa, Juglans hindsii (from section Rhysocaryon), Juglans cathayensis (from section Cardiocaryon), and the closely related Pterocarya stenoptera While these are ‘draft’ genomes, ranging in size between 640Mbp and 990Mbp, their contiguities and accuracies can support powerful annotations of genomic variation that are often the foundation of new avenues of research and breeding. We annotated nucleotide divergence and synteny by creating complete pairwise alignments of each reference genome to the remaining six. In addition, we have re-sequenced a sample of accessions from four Juglans species (including regia). The variation discovered in these surveys comprises a critical resource for experimentation and breeding, as well as a solid complementary annotation. To demonstrate the potential of these resources the structural and sequence variation in and around the polyphenol oxidase loci, PPO1 and PPO2 were investigated. As reported for other seed crops variation in this gene is implicated in the domestication of walnuts. The apparently Juglandaceae specific PPO1 duplicate shows accelerated divergence and an excess of amino acid replacement on the lineage leading to accessions of the domesticated nut crop species, Juglans regia and sigillata. Copyright © 2018 Stevens et al.


September 22, 2019

Genome survey of the freshwater mussel Venustaconcha ellipsiformis (Bivalvia: Unionida) using a hybrid de novo assembly approach.

Freshwater mussels (Bivalvia: Unionida) serve an important role as aquatic ecosystem engineers but are one of the most critically imperilled groups of animals. Here, we used a combination of sequencing strategies to assemble and annotate a draft genome of Venustaconcha ellipsiformis, which will serve as a valuable genomic resource given the ecological value and unique “doubly uniparental inheritance” mode of mitochondrial DNA transmission of freshwater mussels. The genome described here was obtained by combining high-coverage short reads (65× genome coverage of Illumina paired-end and 11× genome coverage of mate-pairs sequences) with low-coverage Pacific Biosciences long reads (0.3× genome coverage). Briefly, the final scaffold assembly accounted for a total size of 1.54?Gb (366,926 scaffolds, N50?=?6.5 kb, with 2.3% of “N” nucleotides), representing 86% of the predicted genome size of 1.80?Gb, while over one third of the genome (37.5%) consisted of repeated elements and >85% of the core eukaryotic genes were recovered. Given the repeated genetic bottlenecks of V. ellipsiformis populations as a result of glaciations events, heterozygosity was also found to be remarkably low (0.6%), in contrast to most other sequenced bivalve species. Finally, we reassembled the full mitochondrial genome and found six polymorphic sites with respect to the previously published reference. This resource opens the way to comparative genomics studies to identify genes related to the unique adaptations of freshwater mussels and their distinctive mitochondrial inheritance mechanism.


September 22, 2019

A chromosome scale assembly of the model desiccation tolerant grass Oropetium thomaeum

Oropetium thomaeum is an emerging model for desiccation tolerance and genome size evolution in grasses. A high-quality draft genome of Oropetium was recently sequenced, but the lack of a chromosome scale assembly has hindered comparative analyses and downstream functional genomics. Here, we reassembled Oropetium, and anchored the genome into ten chromosomes using Hi-C based chromatin interactions. A combination of high-resolution RNAseq data and homology-based gene prediction identified thousands of new, conserved gene models that were absent from the V1 assembly. This includes thousands of new genes with high expression across a desiccation timecourse. The sorghum and Oropetium genomes have a surprising degree of chromosome-level collinearity, and several chromosome pairs have near perfect synteny. Other chromosomes are collinear in the gene rich chromosome arms but have experienced pericentric translocations. Together, these resources will be useful for the grass comparative genomic community and further establish Oropetium as a model resurrection plant.


September 22, 2019

A synthetic-diploid benchmark for accurate variant-calling evaluation.

Existing benchmark datasets for use in evaluating variant-calling accuracy are constructed from a consensus of known short-variant callers, and they are thus biased toward easy regions that are accessible by these algorithms. We derived a new benchmark dataset from the de novo PacBio assemblies of two fully homozygous human cell lines, which provides a relatively more accurate and less biased estimate of small-variant-calling error rates in a realistic context.


Talk with an expert

If you have a question, need to check the status of an order, or are interested in purchasing an instrument, we're here to help.