September 22, 2019

Genome survey of the freshwater mussel Venustaconcha ellipsiformis (Bivalvia: Unionida) using a hybrid de novo assembly approach.

Freshwater mussels (Bivalvia: Unionida) serve an important role as aquatic ecosystem engineers but are one of the most critically imperilled groups of animals. Here, we used a combination of sequencing strategies to assemble and annotate a draft genome of Venustaconcha ellipsiformis, which will serve as a valuable genomic resource given the ecological value and unique “doubly uniparental inheritance” mode of mitochondrial DNA transmission of freshwater mussels. The genome described here was obtained by combining high-coverage short reads (65× genome coverage of Illumina paired-end and 11× genome coverage of mate-pair sequences) with low-coverage Pacific Biosciences long reads (0.3× genome coverage). Briefly, the final scaffold assembly accounted for a total size of 1.54 Gb (366,926 scaffolds, N50 = 6.5 kb, with 2.3% of “N” nucleotides), representing 86% of the predicted genome size of 1.80 Gb, while over one third of the genome (37.5%) consisted of repeated elements and >85% of the core eukaryotic genes were recovered. Given the repeated genetic bottlenecks of V. ellipsiformis populations as a result of glaciation events, heterozygosity was also found to be remarkably low (0.6%), in contrast to most other sequenced bivalve species. Finally, we reassembled the full mitochondrial genome and found six polymorphic sites with respect to the previously published reference. This resource opens the way to comparative genomics studies to identify genes related to the unique adaptations of freshwater mussels and their distinctive mitochondrial inheritance mechanism.
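
For readers unfamiliar with the assembly statistics quoted in this abstract, the following is a minimal Python sketch of how an N50 and an assembled-fraction figure are conventionally computed. It is not the authors' pipeline: the function names `n50` and `assembled_fraction` and the toy scaffold lengths are illustrative assumptions, and only the 1.54 Gb and 1.80 Gb figures come from the abstract.

```python
def n50(lengths):
    """Return the N50 of a list of scaffold lengths.

    N50 is the length L such that scaffolds of length >= L together
    account for at least half of the total assembly size.
    """
    if not lengths:
        raise ValueError("need at least one scaffold length")
    total = sum(lengths)
    running = 0
    # Walk scaffolds from longest to shortest until half the assembly is covered.
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length


def assembled_fraction(assembly_size, predicted_genome_size):
    """Fraction of the predicted genome size represented by the assembly."""
    return assembly_size / predicted_genome_size


# Toy scaffold lengths (hypothetical, not from the paper).
toy_scaffolds = [2, 3, 4, 5, 6, 7, 8, 9, 10]
toy_n50 = n50(toy_scaffolds)

# Sizes quoted in the abstract, in gigabases.
percent_assembled = round(100 * assembled_fraction(1.54, 1.80))
```

With the sizes from the abstract, the rounded percentage agrees with the reported 86%.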


September 22, 2019

Sequencing of Panax notoginseng genome reveals genes involved in disease resistance and ginsenoside biosynthesis

Background: Panax notoginseng is a traditional Chinese herb with high medicinal and economic value. There has been considerable research on the pharmacological activities of ginsenosides contained in Panax spp.; however, very little is known about the ginsenoside biosynthetic pathway. Results: We reported the first de novo genome of 2.36 Gb of sequences from P. notoginseng with 35,451 protein-encoding genes. Compared to other plants, we found notable gene family contraction of disease-resistance genes in P. notoginseng, but notable expansion for several ATP-binding cassette (ABC) transporter subfamilies, such as the Gpdr subfamily, indicating that ABCs might be an additional mechanism for the plant to cope with biotic stress. Combining eight transcriptomes of roots and aerial parts, we identified several key genes, their transcription factor binding sites and all their family members involved in the synthesis pathway of ginsenosides in P. notoginseng, including dammarenediol synthase, CYP716 and UGT71. Conclusions: The complete genome analysis of P. notoginseng, the first in genus Panax, will serve as an important reference sequence for improving breeding and cultivation of this important nutraceutical and medicinal but vulnerable plant species.


September 22, 2019

A chromosome scale assembly of the model desiccation tolerant grass Oropetium thomaeum

Oropetium thomaeum is an emerging model for desiccation tolerance and genome size evolution in grasses. A high-quality draft genome of Oropetium was recently sequenced, but the lack of a chromosome scale assembly has hindered comparative analyses and downstream functional genomics. Here, we reassembled Oropetium, and anchored the genome into ten chromosomes using Hi-C based chromatin interactions. A combination of high-resolution RNAseq data and homology-based gene prediction identified thousands of new, conserved gene models that were absent from the V1 assembly. This includes thousands of new genes with high expression across a desiccation timecourse. The sorghum and Oropetium genomes have a surprising degree of chromosome-level collinearity, and several chromosome pairs have near perfect synteny. Other chromosomes are collinear in the gene rich chromosome arms but have experienced pericentric translocations. Together, these resources will be useful for the grass comparative genomic community and further establish Oropetium as a model resurrection plant.


September 22, 2019

A reference genome of the Chinese hamster based on a hybrid assembly strategy.

Accurate and complete genome sequences are essential in biotechnology to facilitate genome-based cell engineering efforts. The current genome assemblies for Cricetulus griseus, the Chinese hamster, are fragmented and replete with gap sequences and misassemblies, consistent with most short-read-based assemblies. Here, we completely resequenced C. griseus using single molecule real time sequencing and merged this with Illumina-based assemblies. This generated a more contiguous and complete genome assembly than either technology alone, reducing the number of scaffolds by >28-fold, with 90% of the sequence in the 122 longest scaffolds. Most genes are now found in single scaffolds, including up- and downstream regulatory elements, enabling improved study of noncoding regions. With >95% of the gap sequence filled, important Chinese hamster ovary cell mutations have been detected in draft assembly gaps. This new assembly will be an invaluable resource for continued basic and pharmaceutical research. © 2018 The Authors. Biotechnology and Bioengineering Published by Wiley Periodicals, Inc.


September 22, 2019

Analysis of the draft genome of the red seaweed Gracilariopsis chorda provides insights into genome size evolution in Rhodophyta.

Red algae (Rhodophyta) underwent two phases of large-scale genome reduction during their early evolution. The red seaweeds did not attain genome sizes or gene inventories typical of other multicellular eukaryotes. We generated a high-quality 92.1 Mb draft genome assembly from the red seaweed Gracilariopsis chorda, including methylation and small (s)RNA data. We analyzed these and other Archaeplastida genomes to address three questions: 1) What is the role of repeats and transposable elements (TEs) in explaining Rhodophyta genome size variation, 2) what is the history of genome duplication and gene family expansion/reduction in these taxa, and 3) is there evidence for TE suppression in red algae? We find that the number of predicted genes in red algae is relatively small (4,803-13,125 genes), particularly when compared with land plants, with no evidence of polyploidization. Genome size variation is primarily explained by TE expansion with the red seaweeds having the largest genomes. Long terminal repeat elements and DNA repeats are the major contributors to genome size growth. About 8.3% of the G. chorda genome undergoes cytosine methylation among gene bodies, promoters, and TEs, and 71.5% of TEs contain methylated-DNA with 57% of these regions associated with sRNAs. These latter results suggest a role for TE-associated sRNAs in RNA-dependent DNA methylation to facilitate silencing. We postulate that the evolution of genome size in red algae is the result of the combined action of TE spread and the concomitant emergence of its epigenetic suppression, together with other important factors such as changes in population size.


September 22, 2019

Human copy number variants are enriched in regions of low mappability.

Copy number variants (CNVs) are known to affect a large portion of the human genome and have been implicated in many diseases. Although whole-genome sequencing (WGS) can help identify CNVs, most analytical methods suffer from limited sensitivity and specificity, especially in regions of low mappability. To address this, we use PopSV, a CNV caller that relies on multiple samples to control for technical variation. We demonstrate that our calls are stable across different types of repeat-rich regions and validate the accuracy of our predictions using orthogonal approaches. Applying PopSV to 640 human genomes, we find that low-mappability regions are approximately 5 times more likely to harbor germline CNVs, in stark contrast to the nearly uniform distribution observed for somatic CNVs in 95 cancer genomes. In addition to known enrichments in segmental duplication and near centromeres and telomeres, we also report that CNVs are enriched in specific types of satellite and in some of the most recent families of transposable elements. Finally, using this comprehensive approach, we identify 3455 regions with recurrent CNVs that were missing from existing catalogs. In particular, we identify 347 genes with a novel exonic CNV in low-mappability regions, including 29 genes previously associated with disease.


September 22, 2019

Genomic signatures of mitonuclear coevolution across populations of Tigriopus californicus.

The copepod Tigriopus californicus shows extensive population divergence and is becoming a model for understanding allopatric differentiation and the early stages of speciation. Here, we report a high-quality reference genome for one population (~190 megabases across 12 scaffolds, and ~15,500 protein-coding genes). Comparison with other arthropods reveals 2,526 genes presumed to be specific to T. californicus, with an apparent proliferation of genes involved in ion transport and receptor activity. Beyond the reference population, we report re-sequenced genomes of seven additional populations, spanning the continuum of reproductive isolation. Populations show extreme mitochondrial DNA divergence, with higher levels of amino acid differentiation than observed in other taxa. Across the nuclear genome, we find elevated protein evolutionary rates and positive selection in genes predicted to interact with mitochondrial DNA and the proteins and RNA it encodes in multiple pathways. Together, these results support the hypothesis that rapid mitochondrial evolution drives compensatory nuclear evolution within isolated populations, thereby providing a potentially important mechanism for causing intrinsic reproductive isolation.


September 22, 2019

A synthetic-diploid benchmark for accurate variant-calling evaluation.

Existing benchmark datasets for use in evaluating variant-calling accuracy are constructed from a consensus of known short-variant callers, and they are thus biased toward easy regions that are accessible by these algorithms. We derived a new benchmark dataset from the de novo PacBio assemblies of two fully homozygous human cell lines, which provides a relatively more accurate and less biased estimate of small-variant-calling error rates in a realistic context.


September 22, 2019

Biology and genome of a newly discovered sibling species of Caenorhabditis elegans.

A ‘sibling’ species of the model organism Caenorhabditis elegans has long been sought for use in comparative analyses that would enable deep evolutionary interpretations of biological phenomena. Here, we describe the first sibling species of C. elegans, C. inopinata n. sp., isolated from fig syconia in Okinawa, Japan. We investigate the morphology, developmental processes and behaviour of C. inopinata, which differ significantly from those of C. elegans. The 123-Mb C. inopinata genome was sequenced and assembled into six nuclear chromosomes, allowing delineation of Caenorhabditis genome evolution and revealing unique characteristics, such as highly expanded transposable elements that might have contributed to the genome evolution of C. inopinata. In addition, C. inopinata exhibits massive gene losses in chemoreceptor gene families, which could be correlated with its limited habitat area. We have developed genetic and molecular techniques for C. inopinata; thus C. inopinata provides an exciting new platform for comparative evolutionary studies.


September 22, 2019

Conservation genomics of the declining North American bumblebee Bombus terricola reveals inbreeding and selection on immune genes.

The yellow-banded bumblebee Bombus terricola was common in North America but has recently declined and is now on the IUCN Red List of threatened species. The causes of B. terricola’s decline are not well understood. Our objectives were to create a partial genome and then use this to estimate population data of conservation interest, and to determine whether genes showing signs of recent selection suggest a specific cause of decline. First, we generated a draft partial genome (contig set) for B. terricola, sequenced using Pacific Biosciences RS II at an average depth of 35×. Second, we sequenced the individual genomes of 22 bumblebee gynes from Ontario and Quebec using Illumina HiSeq 2500, each at an average depth of 20×, which were used to improve the PacBio genome calls and for population genetic analyses. The latter revealed that several samples had long runs of homozygosity, and individuals had high inbreeding coefficient F, consistent with low effective population size. Our data suggest that B. terricola’s effective population size has decreased orders of magnitude from pre-Holocene levels. We carried out tests of selection to identify genes that may have played a role in ameliorating environmental stressors underlying B. terricola’s decline. Several immune-related genes have signatures of recent positive selection, which is consistent with the pathogen-spillover hypothesis for B. terricola’s decline. The new B. terricola contig set can help solve the mystery of bumblebee decline by enabling functional genomics research to directly assess the health of pollinators and identify the stressors causing declines.


September 22, 2019

Complete sequence of kenaf (Hibiscus cannabinus) mitochondrial genome and comparative analysis with the mitochondrial genomes of other plants.

Plant mitochondrial (mt) genomes are species specific due to extensive foreign DNA migration and frequent recombination of repeated sequences. Sequencing of the mt genome of kenaf (Hibiscus cannabinus) is essential for elucidating its evolutionary characteristics. In the present study, single-molecule real-time (SMRT) sequencing technology was used to sequence the complete mt genome of kenaf. Results showed that the complete kenaf mt genome was 569,915 bp long and consisted of 62 genes, including 36 protein-coding, 3 rRNA and 23 tRNA genes. Twenty-five introns were found among nine of the 36 protein-coding genes, and five introns were trans-spliced. A comparative analysis with other plant mt genomes showed that four syntenic gene clusters were conserved in all plant mtDNAs. Fifteen chloroplast-derived fragments were strongly associated with mt genes, including the intact sequences of the chloroplast genes psaA, ndhB and rps7. According to the plant mt genome evolution analysis, some ribosomal protein genes and succinate dehydrogenase genes were frequently lost during the evolution of angiosperms. Our data suggest that the kenaf mt genome retained evolutionarily conserved characteristics. Overall, the complete sequencing of the kenaf mt genome provides additional information and enhances our understanding of mt genomic evolution across angiosperms.


September 22, 2019

Protocol: a versatile, inexpensive, high-throughput plant genomic DNA extraction method suitable for genotyping-by-sequencing.

The recent development of next-generation sequencing DNA marker technologies, such as genotyping-by-sequencing (GBS), generates thousands of informative single nucleotide polymorphism markers in almost any species, regardless of genomic resources. This enables poorly resourced or “orphan” crops/species access to high-density, high-throughput marker platforms which have revolutionised population genetics studies and plant breeding. DNA quality underpins success of GBS methods as the DNA must be amenable to restriction enzyme digestion and sequencing. A barrier to implementing GBS technologies is access to inexpensive, high-throughput extraction methods that yield sequencing-quality genomic DNA (gDNA) from plants. Several high-throughput DNA extraction methods are available, but typically provide low yield or poor quality gDNA, or are costly (US$6-$9/sample) for consumables. We modified a non-organic solvent protocol to extract microgram quantities (1-13 µg) of sequencing-quality high molecular weight gDNA inexpensively in 96-well plates from either fresh, freeze-dried or silica gel-dried plant tissue. The protocol was effective for several easy and difficult-to-extract forage, crop, horticultural and common model species including Trifolium, Medicago, Lolium, Secale, Festuca, Malus, Oryza, and Arabidopsis. The extracted DNA was of high molecular weight and digested readily with restriction enzymes. Contrasting with other extraction protocols we assessed, Illumina-based sequencing of GBS libraries developed from this gDNA had very uniform high quality base-calls to the end of sequence reads. Furthermore, DNA extracted using this method has been sequenced successfully with the PacBio long-read platform.
The protocol is scalable, readily automated without requirement for fume hoods, requires approximately three hours to process 192 samples (384-576 samples/day), and is inexpensive at US$0.62/sample for consumables. This versatile, scalable and simple protocol yields high molecular weight genomic DNA suitable for restriction enzyme digestion and next-generation sequencing applications including GBS and long-read sequencing platforms such as PacBio. The low cost, high throughput, and extraction of high quality gDNA from a range of fresh and dried source plant material make this method suitable for many sequencing and genotyping applications including large-scale sample screening underpinning breeding programmes.


September 22, 2019

Complete genome of streamlined marine actinobacterium Pontimonas salivibrio strain CL-TW6T adapted to coastal planktonic lifestyle.

Pontimonas salivibrio strain CL-TW6T (=KCCM 90105 = JCM18206) was characterized as the type strain of a new genus within the Actinobacterial family Microbacteriaceae. It was isolated from a coastal marine environment in which members of Microbacteriaceae have not been previously characterized. The genome of P. salivibrio CL-TW6T was a single chromosome of 1,760,810 bp. Genomes of this small size are typically found in bacteria growing slowly in oligotrophic zones and said to be streamlined. Phylogenetic analysis showed it to represent a lineage originating in the Microbacteriaceae radiation occurring before the snowball Earth glaciations, and to have a closer relationship with some streamlined bacteria known through metagenomic data. Several genomic characteristics typical of streamlined bacteria are found: %G + C is lower than in non-streamlined members of the phylum; there are a minimal number of rRNA and tRNA genes, fewer paralogs in most gene families, and only two sigma factors; there is a noticeable absence of some nonessential metabolic pathways, including polyketide synthesis and catabolism of some amino acids. There was no indication of any phage genes or plasmids; however, a system of active insertion elements was present. P. salivibrio appears to be unusual in having polyrhamnose-based cell wall oligosaccharides instead of mycolic acid or teichoic acid-based oligosaccharides. Oddly, it conducts sulfate assimilation apparently for sulfating cell wall components, but not for synthesizing amino acids. One gene family that is expanded rather than reduced is the toxin/antitoxin systems, which are thought to down-regulate growth during nutrient deprivation or other stressful conditions. Because of the relatively small number of paralogs and its relationship to the heavily characterized Mycobacterium tuberculosis, we were able to heavily annotate the genome of P. salivibrio CL-TW6T.
Its streamlined status and relationship to streamlined metagenomic constructs make it an important reference genome for study of the streamlining concept. The final evolutionary trajectory of CL-TW6T was to adapt to growth in a non-oligotrophic coastal zone. To understand that adaptive process, we give a thorough accounting of gene content, contrasting with both oligotrophic streamlined bacteria and large genome bacteria, and distinguishing between genes derived by vertical and horizontal descent.


September 22, 2019

The complete mitochondrial genome of the early flowering plant Nymphaea colorata is highly repetitive with low recombination.

Mitochondrial genomes of flowering plants (angiosperms) are highly dynamic in genome structure. The mitogenome of the earliest angiosperm Amborella is remarkable in carrying rampant foreign DNAs, in contrast to Liriodendron, the only other known early angiosperm mitogenome, which is described as ‘fossilized’. The distinctive features observed in the two early flowering plant mitogenomes add to the current confusion over what early flowering plants look like. Expanded sampling would provide more details in understanding the mitogenomic evolution of early angiosperms. Here we report the complete mitochondrial genome of water lily Nymphaea colorata from Nymphaeales, one of the three orders of the earliest angiosperms. Assembly of data from PacBio long-read sequencing yielded a circular mitochondrial chromosome of 617,195 bp with an average depth of 601×. The genome encoded 41 protein coding genes, 20 tRNA and three rRNA genes with 25 group II introns disrupting 10 protein coding genes. Nearly half of the genome is composed of repeated sequences, which contributed substantially to the intron size expansion, making the gross intron length of the Nymphaea mitochondrial genome one of the longest among angiosperms, including an 11.4-kb intron in cox2, which is the longest organellar intron reported to date in plants. Nevertheless, repeat-mediated homologous recombination is unexpectedly low in Nymphaea, as evidenced by 74 recombined reads detected from ten recombinationally active repeat pairs among 886,982 repeat pairs examined. Extensive gene order changes were detected in the three early angiosperm mitogenomes, i.e. 38 or 44 events of inversions and translocations are needed to reconcile the mitogenome of Nymphaea with Amborella or Liriodendron, respectively.
In contrast to Amborella with six genome equivalents of foreign mitochondrial DNA, not a single horizontal gene transfer event was observed in the Nymphaea mitogenome. The Nymphaea mitogenome resembles the other available early angiosperm mitogenomes in its similarly rich 64-coding gene set and many conserved gene clusters, but stands out for its highly repetitive nature and resultant remarkable intron expansions. The low recombination level in Nymphaea provides evidence for the predominant master conformation in vivo with a highly substoichiometric set of rearranged molecules.


September 22, 2019

Orphan legumes growing in dry environments: Marama bean as a case study.

Plants have developed morphological, physiological, biochemical, cellular, and molecular mechanisms to survive in drought-stricken environments with little or no water caused by below-average precipitation. In this mini-review, we highlight the characteristics that allow marama bean [Tylosema esculentum (Burchell) Schreiber], an example of an orphan legume native to arid regions of southwestern Southern Africa, to flourish under an inhospitable climate and dry soil conditions where no other agricultural crop competes in this agro-ecological zone. Orphan legumes are often better suited to withstand such harsh growth environments due to development of survival strategies using a combination of different traits and responses. Recent findings on marama bean speciation, hybridization, population dynamics, and evolutionary history, on the mechanisms by which the bean is able to extract and conserve water and nutrients from its environment, and on aspects of morphological and physiological adaptation will be reviewed. The importance of the soil microbiome and the genetic diversity in this species, and their interplay, as a reservoir for improvement will also be considered. In particular, the application of the newly established marama bean genome sequence will facilitate both the identification of important genes involved in the interaction with the soil microbiome and the identification of the diversity within the wild germplasm for genes involved in drought tolerance. Since predicted future changes in climatic conditions, with less water availability for plant growth, will severely affect agricultural productivity, an understanding of the mechanisms of unique adaptations in marama bean to such conditions may also provide insights as to how to improve the performance of the major crops.

