September 22, 2019

Genome of an allotetraploid wild peanut Arachis monticola: a de novo assembly.

Arachis monticola (2n = 4x = 40) is the only allotetraploid wild peanut within the Arachis genus and section, with an AABB-type genome of ~2.7 Gb in size. The AA-type subgenome is derived from diploid wild peanut Arachis duranensis, and the BB-type subgenome is derived from diploid wild peanut Arachis ipaensis. A. monticola is regarded either as the direct progenitor of the cultivated peanut or as an introgressive derivative between the cultivated peanut and wild species. The large polyploid genome structure and enormous nearly identical regions of the genome make the assembly of chromosomal pseudomolecules very challenging. Here we report the first reference-quality assembly of the A. monticola genome, using a series of advanced technologies. The final assembly of the A. monticola genome is ~2.62 Gb and has a contig N50 and scaffold N50 of 106.66 kb and 124.92 Mb, respectively. The vast majority (91.83%) of the assembled sequence was anchored onto the 20 pseudo-chromosomes, and 96.07% of assemblies were accurately separated into AA and BB subgenomes. We demonstrated the efficiency of the current state-of-the-art strategy for de novo assembly of the highly complex allotetraploid species, wild peanut (A. monticola), based on whole-genome shotgun sequencing, single molecule real-time sequencing, high-throughput chromosome conformation capture technology, and BioNano optical genome maps. These combined technologies produced a reference-quality genome of the allotetraploid wild peanut, which is valuable for understanding peanut domestication and evolution within the Arachis genus and among legume crops.
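
Several abstracts on this page report contig and scaffold N50 values. As a reader aid, the following is a minimal sketch of how N50 is conventionally computed from a list of sequence lengths: it is the length of the shortest sequence in the smallest set of longest sequences that together cover at least half of the total assembly. The function name `n50` and the toy lengths are illustrative assumptions, not taken from any of the cited studies.

```python
def n50(lengths):
    """Return the N50 of a collection of contig or scaffold lengths.

    Illustrative helper (hypothetical name): sort lengths from longest to
    shortest and return the length at which the running sum first reaches
    at least half of the total assembled length.
    """
    if not lengths:
        raise ValueError("lengths must not be empty")
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        # Compare doubled running sum to avoid fractional halves.
        if running * 2 >= total:
            return length


# Toy example with made-up lengths (not real assembly data).
toy_contigs = [80, 70, 50, 40, 30, 20]
print(n50(toy_contigs))
```

With real data, the lengths would typically be read from the assembly FASTA or its index; that step is omitted here to keep the sketch self-contained.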


September 22, 2019

Sea cucumber genome provides insights into saponin biosynthesis and aestivation regulation.

Echinoderms exhibit several fascinating evolutionary innovations that are rarely seen in the animal kingdom, but how these animals attained such features is not well understood. Here we report the sequencing and analysis of the genome and extensive transcriptomes of the sea cucumber Apostichopus japonicus, a species from a special echinoderm group with extraordinary potential for saponin synthesis, aestivation and organ regeneration. The sea cucumber does not possess a reorganized Hox cluster as previously assumed for all echinoderms, and the spatial expression of Hox7 and Hox11/13b potentially guides the embryo-to-larva axial transformation. Contrary to the typical production of lanosterol in animal cholesterol synthesis, the oxidosqualene cyclase of sea cucumber produces parkeol for saponin synthesis and has “plant-like” motifs suggestive of convergent evolution. The transcriptional factors Klf2 and Egr1 are identified as key regulators of aestivation, probably exerting their effects through a clock gene-controlled process. Intestinal hypometabolism during aestivation is driven by the DNA hypermethylation of various metabolic gene pathways, whereas the transcriptional network of intestine regeneration involves diverse signaling pathways, including Wnt, Hippo and FGF. Decoding the sea cucumber genome provides a new avenue for an in-depth understanding of the extraordinary features of sea cucumbers and other echinoderms.


September 22, 2019

A comprehensive understanding of the biocontrol potential of Bacillus velezensis LM2303 against Fusarium head blight.

Fusarium head blight (FHB), mainly caused by F. graminearum, causes serious damage to wheat production worldwide. In this study, we found that strain LM2303 had strong antagonistic activity against F. graminearum and significantly reduced the disease severity of FHB, with a control efficiency of 72.3% under field conditions. To gain a comprehensive understanding of the biocontrol potential of strain LM2303 against FHB, an integrated approach of genome mining and chemical analysis was employed. The whole genome of strain LM2303 was obtained and analyzed, showing the largest number of genes/gene clusters associated with biocontrol functions as compared with the known biocontrol strains (FZB42, M75, CAU B946). Strain LM2303 was determined to be a member of the B. velezensis clade using phylogenomic analysis of single-copy core genes. Through genome mining, 13 biosynthetic gene clusters (BGCs) encoding secondary metabolites with biocontrol functions were identified, which were further confirmed through chemical analyses such as UHPLC-ESI-MS, including three antifungal metabolites (fengycin B, iturin A, and surfactin A), eight antibacterial metabolites (surfactin A, butirosin, plantazolicin and hydrolyzed plantazolicin, kijanimicin, bacilysin, difficidin, bacillaene A and bacillaene B, 7-o-malonyl macrolactin A and 7-o-succinyl macrolactin A), the siderophore bacillibactin, molybdenum cofactor and teichuronic acid. In addition, genes/gene clusters involved in plant colonization, plant growth promotion and induced systemic resistance were also found and analyzed, along with the corresponding metabolites. Finally, four different mechanisms of strain LM2303 involved in the biocontrol of FHB were putatively identified. This work provides better insights into a mechanistic understanding of strain LM2303 in control of FHB, reinforcing the higher potential of this strain as a powerful biocontrol agent (BCA) for FHB control. The results also provide scientific reference and comparison for other biocontrol strains.


September 22, 2019

Improved de novo genome assembly and analysis of the Chinese cucurbit Siraitia grosvenorii, also known as monk fruit or luo-han-guo.

Luo-han-guo (Siraitia grosvenorii), also called monk fruit, is a member of the Cucurbitaceae family. Monk fruit has become an important area for research because of the pharmacological and economic potential of its noncaloric, extremely sweet components (mogrosides). It is also commonly used in traditional Chinese medicine for the treatment of lung congestion, sore throat, and constipation. Recently, a single reference genome became available for monk fruit, assembled from 36.9x genome coverage reads via Illumina sequencing platforms. This genome assembly has a relatively short (34.2 kb) contig N50 length and lacks integrated annotations. These drawbacks make it difficult to use as a reference in assembling transcriptomes and discovering novel functional genes. Here, we offer a new high-quality draft of the S. grosvenorii genome assembled using 31 Gb (~73.8x) long single molecule real time sequencing reads and polished with ~50 Gb Illumina paired-end reads. The final genome assembly is approximately 469.5 Mb, with a contig N50 length of 432,384 bp, representing a 12.6-fold improvement. We further annotated 237.3 Mb of repetitive sequence and 30,565 consensus protein coding genes with combined evidence. Phylogenetic analysis showed that S. grosvenorii diverged from members of the Cucurbitaceae family approximately 40.9 million years ago. With comprehensive transcriptomic analysis and differential expression testing, we identified 4,606 up-regulated genes in the early fruit compared to the leaf, a number of which were linked to metabolic pathways regulating fruit development and ripening. The availability of this new monk fruit genome assembly, as well as the annotations, will facilitate the discovery of new functional genes and the genetic improvement of monk fruit.
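
The 12.6-fold figure in the abstract above can be checked directly from the two contig N50 values it quotes (432,384 bp for the new assembly, 34.2 kb for the earlier one). The short sketch below does that arithmetic; the function name `fold_improvement` is a hypothetical helper, and the calculation assumes 1 kb = 1,000 bp.

```python
def fold_improvement(new_n50_bp, old_n50_bp):
    """Ratio of a new contig N50 to an older one (hypothetical helper)."""
    if old_n50_bp <= 0:
        raise ValueError("old_n50_bp must be positive")
    return new_n50_bp / old_n50_bp


# Values quoted in the abstract; 34.2 kb is converted assuming 1 kb = 1,000 bp.
new_contig_n50_bp = 432_384
old_contig_n50_bp = 34_200

ratio = fold_improvement(new_contig_n50_bp, old_contig_n50_bp)
print(round(ratio, 1))
```

Rounded to one decimal place, the ratio agrees with the improvement stated in the abstract.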


September 22, 2019

RAD sequencing and a hybrid Antarctic fur seal genome assembly reveal rapidly decaying linkage disequilibrium, global population structure and evidence for inbreeding.

Recent advances in high throughput sequencing have transformed the study of wild organisms by facilitating the generation of high quality genome assemblies and dense genetic marker datasets. These resources have the potential to significantly advance our understanding of diverse phenomena at the level of species, populations and individuals, ranging from patterns of synteny through rates of linkage disequilibrium (LD) decay and population structure to individual inbreeding. Consequently, we used PacBio sequencing to refine an existing Antarctic fur seal (Arctocephalus gazella) genome assembly and genotyped 83 individuals from six populations using restriction site associated DNA (RAD) sequencing. The resulting hybrid genome comprised 6,169 scaffolds with an N50 of 6.21 Mb and provided clear evidence for the conservation of large chromosomal segments between the fur seal and dog (Canis lupus familiaris). Focusing on the most extensively sampled population of South Georgia, we found that LD decayed rapidly, reaching the background level by around 400 kb, consistent with other vertebrates but at odds with the notion that fur seals experienced a strong historical bottleneck. We also found evidence for population structuring, with four main Antarctic island groups being resolved. Finally, appreciable variance in individual inbreeding could be detected, reflecting the strong polygyny and site fidelity of the species. Overall, our study contributes important resources for future genomic studies of fur seals and other pinnipeds while also providing a clear example of how high throughput sequencing can generate diverse biological insights at multiple levels of organization. Copyright © 2018 Humble et al.


September 22, 2019

Two ancestral genes shaped the Xanthomonas campestris TAL effector gene repertoire.

Xanthomonas transcription activator-like effectors (TALEs) are injected inside plant cells to promote host susceptibility by enhancing transcription of host susceptibility genes. TALE-encoding (tal) genes were thought to be absent from Brassicaceae-infecting Xanthomonas campestris (Xc) genomes based on four reference genomic sequences. We discovered tal genes in 26 of 49 Xc strains isolated worldwide and used a combination of single molecule real time (SMRT) and tal amplicon sequencing to yield a near-complete description of the TALEs found in Xc (Xc TALome). The 53 sequenced tal genes encode 21 distinct DNA binding domains that sort into seven major DNA binding specificities. In silico analysis of the Brassica rapa promoterome identified a repertoire of predicted TALE targets, five of which were experimentally validated using quantitative reverse transcription polymerase chain reaction. The Xc TALome shows multiple signs of DNA rearrangements that probably drove its evolution from two ancestral tal genes. We discovered that Tal12a and Tal15a of Xcc strain Xca5 together contribute to the development of disease symptoms on susceptible B. oleracea var. botrytis cv Clovis. This large and polymorphic repertoire of TALEs opens novel perspectives for elucidating TALE-mediated susceptibility of Brassicaceae to black rot disease and for understanding the molecular processes underlying TALE evolution. © 2018 The Authors. New Phytologist © 2018 New Phytologist Trust.


September 22, 2019

Complete genome sequence provides insights into the biodrying-related microbial function of Bacillus thermoamylovorans isolated from sewage sludge biodrying material.

To enable the development of microbial agents and identify suitable candidates for biodrying, the existence and function of Bacillus thermoamylovorans during sewage sludge biodrying merit investigation. This study isolated a strain of B. thermoamylovorans during sludge biodrying, submitted it for complete genome sequencing and analyzed its potential microbial functions. After biodrying, the moisture content of the biodrying material decreased from 66.33% to 50.18%, and B. thermoamylovorans was the ecologically dominant Bacillus, with the primary annotations associated with amino acid transport and metabolism (9.53%) and carbohydrate transport and metabolism (8.14%). It contains 96 carbohydrate-active enzyme-encoding genes, mainly distributed in glycoside hydrolases (33.3%) and glycosyl transferases (27.1%). The virulence factors are mainly associated with biosynthesis of capsule and polysaccharide capsule. This work indicates that among the biodrying microorganisms, B. thermoamylovorans has good potential for degrading recalcitrant and readily degradable components, thus being a potential microbial agent used to improve biodrying. Copyright © 2018 Elsevier Ltd. All rights reserved.


September 22, 2019

A high-quality genome sequence of Rosa chinensis to elucidate ornamental traits.

Rose is the world’s most important ornamental plant, with economic, cultural and symbolic value. Roses are cultivated worldwide and sold as garden roses, cut flowers and potted plants. Roses are outbred and can have various ploidy levels. Our objectives were to develop a high-quality reference genome sequence for the genus Rosa by sequencing a doubled haploid, combining long and short reads, and anchoring to a high-density genetic map, and to study the genome structure and genetic basis of major ornamental traits. We produced a doubled haploid rose line (‘HapOB’) from Rosa chinensis ‘Old Blush’ and generated a rose genome assembly anchored to seven pseudo-chromosomes (512 Mb with N50 of 3.4 Mb and 564 contigs). The length of 512 Mb represents 90.1-96.1% of the estimated haploid genome size of rose. Of the assembly, 95% is contained in only 196 contigs. The anchoring was validated using high-density diploid and tetraploid genetic maps. We delineated hallmark chromosomal features, including the pericentromeric regions, through annotation of transposable element families and positioned centromeric repeats using fluorescent in situ hybridization. The rose genome displays extensive synteny with the Fragaria vesca genome, and we delineated only two major rearrangements. Genetic diversity was analysed using resequencing data of seven diploid and one tetraploid Rosa species selected from various sections of the genus. Combining genetic and genomic approaches, we identified potential genetic regulators of key ornamental traits, including prickle density and the number of flower petals. A rose APETALA2/TOE homologue is proposed to be the major regulator of petal number in rose. This reference sequence is an important resource for studying polyploidization, meiosis and developmental processes, as we demonstrated for flower and prickle development. It will also accelerate breeding through the development of molecular markers linked to traits, the identification of the genes underlying them and the exploitation of synteny across Rosaceae.


September 22, 2019

High-quality assembly of the reference genome for scarlet sage, Salvia splendens, an economically important ornamental plant.

Salvia splendens Ker-Gawler, scarlet or tropical sage, is a tender herbaceous perennial widely introduced and seen in public gardens all over the world. With few molecular resources, breeding is still restricted to traditional phenotypic selection, and the genetic mechanisms underlying phenotypic variation remain unknown. Hence, a high-quality reference genome will be very valuable for marker-assisted breeding, genome editing, and molecular genetics. We generated 66 Gb and 37 Gb of raw DNA sequences, respectively, from whole-genome sequencing of a largely homozygous scarlet sage inbred line using Pacific Biosciences (PacBio) single-molecule real-time and Illumina HiSeq sequencing platforms. The PacBio de novo assembly yielded a final genome with a scaffold N50 size of 3.12 Mb and a total length of 808 Mb. The repetitive sequences identified accounted for 57.52% of the genome sequence, and ~54,008 protein-coding genes were predicted collectively with ab initio and homology-based gene prediction from the masked genome. The divergence time between S. splendens and Salvia miltiorrhiza was estimated at 28.21 million years ago (Mya). Moreover, 3,797 species-specific genes and 1,187 expanded gene families were identified for the scarlet sage genome. We provide the first genome sequence and gene annotation for the scarlet sage. The availability of these resources will be of great importance for further breeding strategies, genome editing, and comparative genomics among related species.


September 22, 2019

N6-methyladenine DNA modification in the human genome.

DNA N6-methyladenine (6mA) modification is the most prevalent DNA modification in prokaryotes, but whether it exists in human cells and whether it plays a role in human diseases remain enigmatic. Here, we showed that 6mA is extensively present in the human genome, and we cataloged 881,240 6mA sites accounting for ~0.051% of the total adenines. [G/C]AGG[C/T] was the most significantly associated motif with 6mA modification. 6mA sites were enriched in the coding regions and mark actively transcribed genes in human cells. DNA 6mA and N6-demethyladenine modification in the human genome were mediated by methyltransferase N6AMT1 and demethylase ALKBH1, respectively. The abundance of 6mA was significantly lower in cancers, accompanied by decreased N6AMT1 and increased ALKBH1 levels, and downregulation of 6mA modification levels promoted tumorigenesis. Collectively, our results demonstrate that DNA 6mA modification is extensively present in human cells and the decrease of genomic DNA 6mA promotes human tumorigenesis. Copyright © 2018 Elsevier Inc. All rights reserved.


September 22, 2019

Genotype-Corrector: improved genotype calls for genetic mapping in F2 and RIL populations.

F2 and recombinant inbred line (RIL) populations are very commonly used in plant genetic mapping studies. Although genome-wide genetic markers like single nucleotide polymorphisms (SNPs) can be readily identified by a wide array of methods, accurate genotype calling remains challenging, especially for heterozygous loci and missing data due to low sequencing coverage per individual. Therefore, we developed Genotype-Corrector, a program that corrects genotype calls and imputes missing data to improve the accuracy of genetic mapping. Genotype-Corrector can be applied in a wide variety of genetic mapping studies that are based on low coverage whole genome sequencing (WGS) or Genotyping-by-Sequencing (GBS) related techniques. Our results show that Genotype-Corrector achieves high accuracy when applied to both synthetic and real genotype data. Compared with using raw or only imputed genotype calls, the linkage groups built by corrected genotype data show much less noise and significant distortions can be corrected. Additionally, Genotype-Corrector compares favorably to the popular imputation software LinkImpute and Beagle in both F2 and RIL populations. Genotype-Corrector is publicly available on GitHub at https://github.com/freemao/Genotype-Corrector.


September 22, 2019

A mosaic monoploid reference sequence for the highly complex genome of sugarcane.

Sugarcane (Saccharum spp.) is a major crop for sugar and bioenergy production. Its highly polyploid, aneuploid, heterozygous, and interspecific genome poses major challenges for producing a reference sequence. We exploited colinearity with sorghum to produce a BAC-based monoploid genome sequence of sugarcane. A minimum tiling path of 4660 sugarcane BACs that best covers the gene-rich part of the sorghum genome was selected based on whole-genome profiling, sequenced, and assembled in a 382-Mb single tiling path of high-quality sequence. A total of 25,316 protein-coding gene models are predicted, 17% of which display no colinearity with their sorghum orthologs. We show that the two species, S. officinarum and S. spontaneum, involved in modern cultivars differ by their transposable elements and by a few large chromosomal rearrangements, explaining their distinct genome size and distinct basic chromosome numbers while also suggesting that polyploidization arose in both lineages after their divergence.


September 22, 2019

Hotspots of independent and multiple rounds of LTR-retrotransposon bursts in Brassica species

Long terminal repeat retrotransposons (LTR-RTs) are a predominant group of plant transposable elements (TEs) that are an important component of plant genomes. A large number of LTR-RTs have been annotated in the genomes of the agronomically important oil and vegetable crops of the genus Brassica. Herein, full-length LTR-RTs in the genomes of Brassica and other closely related species were systematically analyzed. The full-length LTR-RT content varied greatly (from 0.43% to 23.4%) between different species, with Gypsy-like LTR-RTs constituting a primary group across these genomes. More importantly, many annotated LTR-RTs (from 10.03% to 33.25% of all detected LTR-RTs) were found to be enriched in localized hotspot regions. Furthermore, all of the analyzed species showed evidence of having experienced at least one round of an LTR-RT burst, with Raphanus sativus experiencing three or more. Moreover, these relatively ancient LTR-RT amplifications exhibited a clear expansion at specific time points. To gain a further understanding of this timing, Brassica rapa, B. oleracea, and R. sativus were examined for the presence of syntenic regions, but none were present. These findings indicate that these LTR-RT burst events were not inherited from a common ancestor, but instead were species-specific bursts that occurred after the divergence of Brassica species. This study further exemplifies the complexities of TE amplifications during the evolution of plant genomes and suggests that these LTR-RT bursts play an important role in genome expansion and divergence in Brassica species.


September 22, 2019

Genomic variation among and within six Juglans species.

Genomic analysis in Juglans (walnuts) is expected to transform the breeding and agricultural production of both nuts and lumber. To that end, we report here the determination of reference sequences for six additional relatives of Juglans regia: Juglans sigillata (also from section Dioscaryon), Juglans nigra, Juglans microcarpa, Juglans hindsii (from section Rhysocaryon), Juglans cathayensis (from section Cardiocaryon), and the closely related Pterocarya stenoptera. While these are ‘draft’ genomes, ranging in size between 640 Mbp and 990 Mbp, their contiguities and accuracies can support powerful annotations of genomic variation that are often the foundation of new avenues of research and breeding. We annotated nucleotide divergence and synteny by creating complete pairwise alignments of each reference genome to the remaining six. In addition, we have re-sequenced a sample of accessions from four Juglans species (including regia). The variation discovered in these surveys comprises a critical resource for experimentation and breeding, as well as a solid complementary annotation. To demonstrate the potential of these resources, the structural and sequence variation in and around the polyphenol oxidase loci, PPO1 and PPO2, were investigated. As reported for other seed crops, variation in this gene is implicated in the domestication of walnuts. The apparently Juglandaceae-specific PPO1 duplicate shows accelerated divergence and an excess of amino acid replacement on the lineage leading to accessions of the domesticated nut crop species, Juglans regia and sigillata. Copyright © 2018 Stevens et al.


September 22, 2019

Genome survey of the freshwater mussel Venustaconcha ellipsiformis (Bivalvia: Unionida) using a hybrid de novo assembly approach.

Freshwater mussels (Bivalvia: Unionida) serve an important role as aquatic ecosystem engineers but are one of the most critically imperilled groups of animals. Here, we used a combination of sequencing strategies to assemble and annotate a draft genome of Venustaconcha ellipsiformis, which will serve as a valuable genomic resource given the ecological value and unique “doubly uniparental inheritance” mode of mitochondrial DNA transmission of freshwater mussels. The genome described here was obtained by combining high-coverage short reads (65× genome coverage of Illumina paired-end and 11× genome coverage of mate-pair sequences) with low-coverage Pacific Biosciences long reads (0.3× genome coverage). Briefly, the final scaffold assembly accounted for a total size of 1.54 Gb (366,926 scaffolds, N50 = 6.5 kb, with 2.3% of “N” nucleotides), representing 86% of the predicted genome size of 1.80 Gb, while over one third of the genome (37.5%) consisted of repeated elements and >85% of the core eukaryotic genes were recovered. Given the repeated genetic bottlenecks of V. ellipsiformis populations as a result of glaciation events, heterozygosity was also found to be remarkably low (0.6%), in contrast to most other sequenced bivalve species. Finally, we reassembled the full mitochondrial genome and found six polymorphic sites with respect to the previously published reference. This resource opens the way to comparative genomics studies to identify genes related to the unique adaptations of freshwater mussels and their distinctive mitochondrial inheritance mechanism.

