September 22, 2019  |  

Sea cucumber genome provides insights into saponin biosynthesis and aestivation regulation.

Echinoderms exhibit several fascinating evolutionary innovations that are rarely seen in the animal kingdom, but how these animals attained such features is not well understood. Here we report the sequencing and analysis of the genome and extensive transcriptomes of the sea cucumber Apostichopus japonicus, a species from a special echinoderm group with extraordinary potential for saponin synthesis, aestivation and organ regeneration. The sea cucumber does not possess a reorganized Hox cluster as previously assumed for all echinoderms, and the spatial expression of Hox7 and Hox11/13b potentially guides the embryo-to-larva axial transformation. Contrary to the typical production of lanosterol in animal cholesterol synthesis, the oxidosqualene cyclase of sea cucumber produces parkeol for saponin synthesis and has “plant-like” motifs suggestive of convergent evolution. The transcriptional factors Klf2 and Egr1 are identified as key regulators of aestivation, probably exerting their effects through a clock gene-controlled process. Intestinal hypometabolism during aestivation is driven by the DNA hypermethylation of various metabolic gene pathways, whereas the transcriptional network of intestine regeneration involves diverse signaling pathways, including Wnt, Hippo and FGF. Decoding the sea cucumber genome provides a new avenue for an in-depth understanding of the extraordinary features of sea cucumbers and other echinoderms.


September 22, 2019  |  

Genome Assembly.

Genome assembly uses sequence similarity to go from sequencing reads to longer contiguous sequences (contigs). Scaffolds are contigs linked together by gaps where the order and orientation of the contigs are known but the exact sequence connecting two contigs is unknown; each gap is represented by a run of Ns whose length estimates the gap length. Here we describe recommendations for genome assembly for different sequencing technologies, describe organelle assembly, and review how to perform assembly quality control.
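As an illustration of the contig/scaffold distinction and of N50, a common quality-control metric reported in several of the abstracts on this page, the following is a minimal sketch. The helper names (`split_scaffold`, `n50`) and the toy sequences are hypothetical and are not taken from the chapter; it assumes gaps are encoded purely as runs of `N`.

```python
import re


def split_scaffold(scaffold):
    """Split a scaffold into its contigs at runs of N (gap placeholders)."""
    return [contig for contig in re.split(r"[Nn]+", scaffold) if contig]


def n50(lengths):
    """Return the length L such that sequences of length >= L
    together contain at least half of all bases (0 for empty input)."""
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length
    return 0


# Toy scaffolds (hypothetical data): Ns mark gaps of estimated length.
scaffolds = [
    "ACGTACGTAC" + "N" * 5 + "ACGT",
    "ACGTAC",
    "AC" + "NNN" + "ACGTACGT",
]

scaffold_lengths = [len(s) for s in scaffolds]
contig_lengths = [len(c) for s in scaffolds for c in split_scaffold(s)]

print("scaffold N50:", n50(scaffold_lengths))
print("contig N50:", n50(contig_lengths))
```

Because scaffold lengths include the gap Ns and contig lengths do not, the scaffold N50 of an assembly is never smaller than its contig N50, which is why the two figures are usually reported separately.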


September 22, 2019  |  

Improved de novo genome assembly and analysis of the Chinese cucurbit Siraitia grosvenorii, also known as monk fruit or luo-han-guo.

Luo-han-guo (Siraitia grosvenorii), also called monk fruit, is a member of the Cucurbitaceae family. Monk fruit has become an important area for research because of the pharmacological and economic potential of its noncaloric, extremely sweet components (mogrosides). It is also commonly used in traditional Chinese medicine for the treatment of lung congestion, sore throat, and constipation. Recently, a single reference genome became available for monk fruit, assembled from 36.9x genome coverage reads via Illumina sequencing platforms. This genome assembly has a relatively short (34.2 kb) contig N50 length and lacks integrated annotations. These drawbacks make it difficult to use as a reference in assembling transcriptomes and discovering novel functional genes. Here, we offer a new high-quality draft of the S. grosvenorii genome assembled using 31 Gb (~73.8x) long single molecule real time sequencing reads and polished with ~50 Gb Illumina paired-end reads. The final genome assembly is approximately 469.5 Mb, with a contig N50 length of 432,384 bp, representing a 12.6-fold improvement. We further annotated 237.3 Mb of repetitive sequence and 30,565 consensus protein coding genes with combined evidence. Phylogenetic analysis showed that S. grosvenorii diverged from members of the Cucurbitaceae family approximately 40.9 million years ago. With comprehensive transcriptomic analysis and differential expression testing, we identified 4,606 up-regulated genes in the early fruit compared to the leaf, a number of which were linked to metabolic pathways regulating fruit development and ripening. The availability of this new monk fruit genome assembly, as well as the annotations, will facilitate the discovery of new functional genes and the genetic improvement of monk fruit.


September 22, 2019  |  

High-quality assembly of the reference genome for scarlet sage, Salvia splendens, an economically important ornamental plant.

Salvia splendens Ker-Gawler, scarlet or tropical sage, is a tender herbaceous perennial widely introduced and seen in public gardens all over the world. With few molecular resources, breeding is still restricted to traditional phenotypic selection, and the genetic mechanisms underlying phenotypic variation remain unknown. Hence, a high-quality reference genome will be very valuable for marker-assisted breeding, genome editing, and molecular genetics. We generated 66 Gb and 37 Gb of raw DNA sequences, respectively, from whole-genome sequencing of a largely homozygous scarlet sage inbred line using Pacific Biosciences (PacBio) single-molecule real-time and Illumina HiSeq sequencing platforms. The PacBio de novo assembly yielded a final genome with a scaffold N50 size of 3.12 Mb and a total length of 808 Mb. The repetitive sequences identified accounted for 57.52% of the genome sequence, and ~54,008 protein-coding genes were predicted collectively with ab initio and homology-based gene prediction from the masked genome. The divergence time between S. splendens and Salvia miltiorrhiza was estimated at 28.21 million years ago (Mya). Moreover, 3,797 species-specific genes and 1,187 expanded gene families were identified for the scarlet sage genome. We provide the first genome sequence and gene annotation for the scarlet sage. The availability of these resources will be of great importance for further breeding strategies, genome editing, and comparative genomics among related species.


September 22, 2019  |  

Hotspots of independent and multiple rounds of LTR-retrotransposon bursts in Brassica species

Long terminal repeat retrotransposons (LTR-RTs) are a predominant group of plant transposable elements (TEs) that are an important component of plant genomes. A large number of LTR-RTs have been annotated in the genomes of the agronomically important oil and vegetable crops of the genus Brassica. Herein, full-length LTR-RTs in the genomes of Brassica and other closely related species were systematically analyzed. The full-length LTR-RT content varied greatly (from 0.43% to 23.4%) between different species, with Gypsy-like LTR-RTs constituting a primary group across these genomes. More importantly, many annotated LTR-RTs (from 10.03% to 33.25% of all detected LTR-RTs) were found to be enriched in localized hotspot regions. Furthermore, all of the analyzed species showed evidence of having experienced at least one round of an LTR-RT burst, with Raphanus sativus experiencing three or more. Moreover, these relatively ancient LTR-RT amplifications exhibited a clear expansion at specific time points. To gain a further understanding of this timing, Brassica rapa, B. oleracea, and R. sativus were examined for the presence of syntenic regions, but none were present. These findings indicate that these LTR-RT burst events were not inherited from a common ancestor, but instead were species-specific bursts that occurred after the divergence of Brassica species. This study further exemplifies the complexities of TE amplifications during the evolution of plant genomes and suggests that these LTR-RT bursts play an important role in genome expansion and divergence in Brassica species.


September 22, 2019  |  

Comparative genomic analysis of Bacillus thuringiensis reveals molecular adaptation to copper tolerance

Bacillus thuringiensis is a Gram-positive, rod-shaped bacterium that is found in a wide range of habitats. Despite the intensive studies conducted on this bacterium, most of the available information relates to its pathogenic characteristics, with only a limited number of publications mentioning its ability to survive in extreme environments. Recently, B. thuringiensis strain MCMY1 was isolated from a copper-contaminated site in Mamut Copper Mine, Sabah. This study aimed to conduct a comparative genomic analysis using the genome sequence of strain MCMY1 published in GenBank (PRJNA374601) as the target genome for comparison with other B. thuringiensis genomes available in GenBank. Whole genome alignment, Fragment all-against-all comparison analysis, phylogenetic reconstruction and specific copper genes comparison were applied to all forty-five B. thuringiensis genomes to reveal the molecular adaptation to copper tolerance. The comparative results indicated that B. thuringiensis strain MCMY1 is closely related to strain Bt407 and strain IS5056. This strain harbors almost all available copper genes annotated from the forty-five B. thuringiensis genomes, except for the gene for Magnesium and cobalt efflux protein (CorC), which plays an indirect role in reducing the oxidative stress that is caused by copper and other metal ions. Furthermore, the findings also showed that the Copper resistance gene family, CopABCDZ, and its repressor (CsoR) are conserved in almost all sequenced genomes, but the presence of the genes for Cytoplasmic copper homeostasis protein (CutC) and CorC across the sample genomes is highly inconsistent. The variation of these genes across the B. thuringiensis genomes suggests that each strain may have adapted to its specific ecological niche. However, further investigations will be needed to support this preliminary hypothesis.


September 22, 2019  |  

Genome survey of the freshwater mussel Venustaconcha ellipsiformis (Bivalvia: Unionida) using a hybrid de novo assembly approach.

Freshwater mussels (Bivalvia: Unionida) serve an important role as aquatic ecosystem engineers but are one of the most critically imperilled groups of animals. Here, we used a combination of sequencing strategies to assemble and annotate a draft genome of Venustaconcha ellipsiformis, which will serve as a valuable genomic resource given the ecological value and unique “doubly uniparental inheritance” mode of mitochondrial DNA transmission of freshwater mussels. The genome described here was obtained by combining high-coverage short reads (65× genome coverage of Illumina paired-end and 11× genome coverage of mate-pairs sequences) with low-coverage Pacific Biosciences long reads (0.3× genome coverage). Briefly, the final scaffold assembly accounted for a total size of 1.54 Gb (366,926 scaffolds, N50 = 6.5 kb, with 2.3% of “N” nucleotides), representing 86% of the predicted genome size of 1.80 Gb, while over one third of the genome (37.5%) consisted of repeated elements and >85% of the core eukaryotic genes were recovered. Given the repeated genetic bottlenecks of V. ellipsiformis populations as a result of glaciation events, heterozygosity was also found to be remarkably low (0.6%), in contrast to most other sequenced bivalve species. Finally, we reassembled the full mitochondrial genome and found six polymorphic sites with respect to the previously published reference. This resource opens the way to comparative genomics studies to identify genes related to the unique adaptations of freshwater mussels and their distinctive mitochondrial inheritance mechanism.


September 22, 2019  |  

Genome-based population structure analysis of the strawberry plant pathogen Xanthomonas fragariae reveals two distinct groups that evolved independently before its species description.

Xanthomonas fragariae is a quarantine organism in Europe, causing angular leaf spots on strawberry plants. It is spreading worldwide in strawberry-producing regions due to import of plant material through trade and human activities. In order to resolve the population structure at the strain level, we have employed high-resolution molecular typing tools on a comprehensive strain collection representing global and temporal distribution of the pathogen. Clustered regularly interspaced short palindromic repeat regions (CRISPRs) and variable number of tandem repeats (VNTRs) were identified within the reference genome of X. fragariae LMG 25863 as a potential source of variation. Strains from our collection were whole-genome sequenced and used to identify variable spacers and repeats for discriminative purposes. CRISPR spacer analysis and multiple-locus VNTR analysis (MLVA) displayed a congruent population structure, in which two major groups and a total of four subgroups were revealed. The two main groups were genetically separated before the first X. fragariae isolate was described and are potentially responsible for the worldwide expansion of the bacterial disease. Three primer sets were designed for discriminating CRISPR-associated markers in order to streamline group determination of novel isolates. Overall, this study describes typing methods to discriminate strains and monitor the pathogen population structure, particularly in view of a new outbreak of the pathogen.


September 22, 2019  |  

A chromosome scale assembly of the model desiccation tolerant grass Oropetium thomaeum

Oropetium thomaeum is an emerging model for desiccation tolerance and genome size evolution in grasses. A high-quality draft genome of Oropetium was recently sequenced, but the lack of a chromosome scale assembly has hindered comparative analyses and downstream functional genomics. Here, we reassembled Oropetium, and anchored the genome into ten chromosomes using Hi-C based chromatin interactions. A combination of high-resolution RNAseq data and homology-based gene prediction identified thousands of new, conserved gene models that were absent from the V1 assembly. This includes thousands of new genes with high expression across a desiccation timecourse. The sorghum and Oropetium genomes have a surprising degree of chromosome-level collinearity, and several chromosome pairs have near perfect synteny. Other chromosomes are collinear in the gene rich chromosome arms but have experienced pericentric translocations. Together, these resources will be useful for the grass comparative genomic community and further establish Oropetium as a model resurrection plant.


September 22, 2019  |  

Optical and physical mapping with local finishing enables megabase-scale resolution of agronomically important regions in the wheat genome.

Numerous scaffold-level sequences for wheat are now being released and, in this context, we report on a strategy for improving the overall assembly to a level comparable to that of the human genome. Using chromosome 7A of wheat as a model, sequence-finished megabase-scale sections of this chromosome were established by combining a new independent assembly using a bacterial artificial chromosome (BAC)-based physical map, BAC pool paired-end sequencing, chromosome-arm-specific mate-pair sequencing and Bionano optical mapping with the International Wheat Genome Sequencing Consortium RefSeq v1.0 sequence and its underlying raw data. The combined assembly results in 18 super-scaffolds across the chromosome. The value of finished genome regions is demonstrated for two approximately 2.5 Mb regions associated with yield and the grain quality phenotype of fructan carbohydrate grain levels. In addition, the 50 Mb centromere region analysis incorporates cytological data highlighting the importance of non-sequence data in the assembly of this complex genome region. Sufficient genome sequence information is shown to now be available for the wheat community to produce sequence-finished releases of each chromosome of the reference genome. The high-level completion identified that an array of seven fructosyl transferase genes underpins grain quality and that yield attributes are affected by five F-box-only-protein-ubiquitin ligase domain and four root-specific lipid transfer domain genes. The completed sequence also includes the centromere.


September 22, 2019  |  

Evolution of the U.S. biological select agent Rathayibacter toxicus.

Rathayibacter toxicus is a species of Gram-positive, corynetoxin-producing bacteria that causes annual ryegrass toxicity, a disease often fatal to grazing animals. A phylogenomic approach was employed to model the evolution of R. toxicus to explain the low genetic diversity observed among isolates collected during a 30-year period of sampling in three regions of Australia, gain insight into the taxonomy of Rathayibacter, and provide a framework for studying these bacteria. Analyses of a data set of more than 100 sequenced Rathayibacter genomes indicated that Rathayibacter forms nine species-level groups. R. toxicus is the most genetically distant, and evidence suggested that this species experienced a dramatic event in its evolution. Its genome is significantly reduced in size but is colinear to those of sister species. Moreover, R. toxicus has low intergroup genomic diversity and almost no intragroup genomic diversity between ecologically separated isolates. R. toxicus is the only species of the genus that encodes a clustered regularly interspaced short palindromic repeat (CRISPR) locus and that is known to host a bacteriophage parasite. The spacers, which represent a chronological history of infections, were characterized for information on past events. We propose a three-stage process that emphasizes the importance of the bacteriophage and CRISPR in the genome reduction and low genetic diversity of the R. toxicus species.
IMPORTANCE: Rathayibacter toxicus is a toxin-producing species found in Australia and is often fatal to grazing animals. The threat of introduction of the species into the United States led to its inclusion in the Federal Select Agent Program, which makes R. toxicus a highly regulated species. This work provides novel insights into the evolution of R. toxicus. R. toxicus is the only species in the genus to have acquired a CRISPR adaptive immune system to protect against bacteriophages. Results suggest that coexistence with the bacteriophage NCPPB3778 led to the massive shrinkage of the R. toxicus genome, species divergence, and the maintenance of low genetic diversity in extant bacterial groups. This work contributes to an understanding of the evolution and ecology of an agriculturally important species of bacteria. Copyright © 2018 Davis et al.


September 22, 2019  |  

Draft genome of Glyptosternon maculatum, an endemic fish from Tibet Plateau.

Mechanisms for high-altitude adaption have attracted widespread interest among evolutionary biologists. Several genome-wide studies have been carried out for endemic vertebrates in Tibet, including mammals, birds, and amphibians. However, little information is available about the adaptive evolution of highland fishes. Glyptosternon maculatum (Regan 1905), also known as Regan or barkley and endemic to the Tibetan Plateau, belongs to the Sisoridae family, order Siluriformes (catfishes). This species lives at an elevation ranging from roughly 2,800 m to 4,200 m. Hence, a high-quality reference genome of G. maculatum provides an opportunity to investigate high-altitude adaption mechanisms of fishes. To obtain a high-quality reference genome sequence of G. maculatum, we combined Pacific Biosciences single-molecule real-time sequencing, Illumina paired-end sequencing, 10X Genomics linked-reads, and BioNano optical map techniques. In total, 603.99 Gb sequencing data were generated. The assembled genome was about 662.34 Mb with scaffold and contig N50 sizes of 20.90 Mb and 993.67 kb, respectively, which captured 83% complete and 3.9% partial vertebrate Benchmarking Universal Single-Copy Orthologs. Repetitive elements account for 35.88% of the genome, and ~22,066 protein-coding genes were predicted from the genome, of which 91.7% have been functionally annotated. We present the first comprehensive de novo genome of G. maculatum. This genetic resource is fundamental for investigating the origin of G. maculatum and will improve our understanding of high-altitude adaption of fishes. The assembled genome can also be used as reference for future population genetic studies of G. maculatum.


September 22, 2019  |  

Draft genome sequence of wild Prunus yedoensis reveals massive inter-specific hybridization between sympatric flowering cherries.

Hybridization is an important evolutionary process that results in increased plant diversity. Flowering Prunus includes popular cherry species that are appreciated worldwide for their flowers. The ornamental characteristics were acquired both naturally and through artificially hybridizing species with heterozygous genomes. Therefore, the genome of hybrid flowering Prunus presents important challenges both in plant genomics and evolutionary biology. We use long reads to sequence and analyze the highly heterozygous genome of wild Prunus yedoensis. The genome assembly covers >93% of the gene space; annotation identified 41,294 protein-coding genes. Comparative analysis of the genome with 16 accessions of six related taxa shows that 41% of the genes were assigned into the maternal or paternal state. This indicates that wild P. yedoensis is an F1 hybrid originating from a cross between maternal P. pendula f. ascendens and paternal P. jamasakura, and it can be clearly distinguished from its confusing taxon, Yoshino cherry. A focused analysis of the S-locus haplotypes of closely related taxa distributed in a sympatric natural habitat suggests that reduced restriction of inter-specific hybridization due to strong gametophytic self-incompatibility is likely to promote complex hybridization of wild Prunus species and the development of a hybrid swarm. We report the draft genome assembly of a natural hybrid Prunus species using long-read sequencing and sequence phasing. Based on a comprehensive comparative genome analysis with related taxa, it appears that cross-species hybridization in sympatric habitats is an ongoing process that facilitates the diversification of flowering Prunus.


September 22, 2019  |  

Antagonistic pleiotropy in the bifunctional surface protein FadL (OmpP1) during adaptation of Haemophilus influenzae to chronic lung infection associated with chronic obstructive pulmonary disease.

Tracking bacterial evolution during chronic infection provides insights into how host selection pressures shape bacterial genomes. The human-restricted opportunistic pathogen nontypeable Haemophilus influenzae (NTHi) infects the lower airways of patients suffering chronic obstructive pulmonary disease (COPD) and contributes to disease progression. To identify bacterial genetic variation associated with bacterial adaptation to the COPD lung, we sequenced the genomes of 92 isolates collected from the sputum of 13 COPD patients over 1 to 9 years. Individuals were colonized by distinct clonal types (CTs) over time, but the same CT was often reisolated at a later time or found in different patients. Although genomes from the same CT were nearly identical, intra-CT variation due to mutation and recombination occurred. Recurrent mutations in several genes were likely involved in COPD lung adaptation. Notably, nearly a third of CTs were polymorphic for null alleles of ompP1 (also called fadL), which encodes a bifunctional membrane protein that both binds the human carcinoembryonic antigen-related cell adhesion molecule 1 (hCEACAM1) receptor and imports long-chain fatty acids (LCFAs). Our computational studies provide plausible three-dimensional models for FadL’s interaction with hCEACAM1 and LCFA binding. We show that recurrent fadL mutations are likely a case of antagonistic pleiotropy, since loss of FadL reduces NTHi’s ability to infect epithelia but also increases its resistance to bactericidal LCFAs enriched within the COPD lung. Supporting this interpretation, truncated fadL alleles are common in publicly available NTHi genomes isolated from the lower airway tract but rare in others. These results shed light on molecular mechanisms of bacterial pathoadaptation and guide future research toward developing novel COPD therapeutics.
IMPORTANCE: Nontypeable Haemophilus influenzae is an important pathogen in patients with chronic obstructive pulmonary disease (COPD). To elucidate the bacterial pathways undergoing in vivo evolutionary adaptation, we compared bacterial genomes collected over time from 13 COPD patients and identified recurrent genetic changes arising in independent bacterial lineages colonizing different patients. Besides finding changes in phase-variable genes, we found recurrent loss-of-function mutations in the ompP1 (fadL) gene. We show that loss of OmpP1/FadL function reduces this bacterium’s ability to infect cells via the hCEACAM1 epithelial receptor but also increases its resistance to bactericidal fatty acids enriched within the COPD lung, suggesting a case of antagonistic pleiotropy that restricts ΔfadL strains’ niche. These results show how H. influenzae adapts to host-generated inflammatory mediators in the COPD airways. Copyright © 2018 Moleres et al.

