Menu
April 21, 2020  |  

Whole-genome analysis of the colonization-resistant bacterium Phytobacter sp. SCO41T isolated from Bacillus nematocida B16-fed adult Caenorhabditis elegans.

Colonization resistance is an important attribute for bacterial interactions with hosts, but the mechanism is still not completely clear. In this study, we found that Phytobacter sp. SCO41T can effectively inhibit the in vivo colonization of Bacillus nematocida B16 in Caenorhabditis elegans, and we revealed the colonization resistance mechanism. Three strains of colonization-resistant bacteria, SCO41T, BX15, and BC7, were isolated from the intestines of the free-living nematode C. elegans derived from rotten fruit and soil. The primary characteristics and genome map of one of the three isolates was investigated to explore the underlying mechanism of colonization resistance in C. elegans. In addition, we performed exogenous iron supplementation and gene cluster knockout experiments to validate the sequencing results. The results showed that relationship was close among the three strains, which was identified as belonging to the genus Phytobacter. The type strain is SCO41T (=?CICC 24103T?=?KCTC 52362T). Whole genome analysis showed that csgA, csgB, csgC, csgE, csgF, and csgG were involved in the curli adhesive process and that fepA, fepB, fepC, fepD, and fepG played important roles in SCO41T against the colonization of B. nematocida B16 in C. elegans by competing for iron. Exogenous iron supplementation showed that exogenous iron can increase the colonization of B. nematocida B16, which was additionally confirmed by a deletion mutant strain. The csg gene family contributes to the colonization of SCO41T in C. elegans. Curli potentially contribute to the colonization of SCO41T in C. elegans, and enterobactin has a key role in SCO41T to resist the colonization of B. nematocida B16 by competing for iron.


April 21, 2020  |  

The role of genomic structural variation in the genetic improvement of polyploid crops

Many of our major crop species are polyploids, containing more than one genome or set of chromosomes. Polyploid crops present unique challenges, including difficulties in genome assembly, in discriminating between multiple gene and sequence copies, and in genetic mapping, hindering use of genomic data for genetics and breeding. Polyploid genomes may also be more prone to containing structural variation, such as loss of gene copies or sequences (presence–absence variation) and the presence of genes or sequences in multiple copies (copy-number variation). Although the two main types of genomic structural variation commonly identified are presence–absence variation and copy-number variation, we propose that homeologous exchanges constitute a third major form of genomic structural variation in polyploids. Homeologous exchanges involve the replacement of one genomic segment by a similar copy from another genome or ancestrally duplicated region, and are known to be extremely common in polyploids. Detecting all kinds of genomic structural variation is challenging, but recent advances such as optical mapping and long-read sequencing offer potential strategies to help identify structural variants even in complex polyploid genomes. All three major types of genomic structural variation (presence–absence, copy-number, and homeologous exchange) are now known to influence phenotypes in crop plants, with examples of flowering time, frost tolerance, and adaptive and agronomic traits. In this review, we summarize the challenges of genome analysis in polyploid crops, describe the various types of genomic structural variation and the genomics technologies and data that can be used to detect them, and collate information produced to date related to the impact of genomic structural variation on crop phenotypes. We highlight the importance of genomic structural variation for the future genetic improvement of polyploid crops.


April 21, 2020  |  

A Species-Wide Inventory of NLR Genes and Alleles in Arabidopsis thaliana.

Infectious disease is both a major force of selection in nature and a prime cause of yield loss in agriculture. In plants, disease resistance is often conferred by nucleotide-binding leucine-rich repeat (NLR) proteins, intracellular immune receptors that recognize pathogen proteins and their effects on the host. Consistent with extensive balancing and positive selection, NLRs are encoded by one of the most variable gene families in plants, but the true extent of intraspecific NLR diversity has been unclear. Here, we define a nearly complete species-wide pan-NLRome in Arabidopsis thaliana based on sequence enrichment and long-read sequencing. The pan-NLRome largely saturates with approximately 40 well-chosen wild strains, with half of the pan-NLRome being present in most accessions. We chart NLR architectural diversity, identify new architectures, and quantify selective forces that act on specific NLRs and NLR domains. Our study provides a blueprint for defining pan-NLRomes.Copyright © 2019 The Author(s). Published by Elsevier Inc. All rights reserved.


April 21, 2020  |  

Recompleting the Caenorhabditis elegans genome.

Caenorhabditis elegans was the first multicellular eukaryotic genome sequenced to apparent completion. Although this assembly employed a standard C. elegans strain (N2), it used sequence data from several laboratories, with DNA propagated in bacteria and yeast. Thus, the N2 assembly has many differences from any C. elegans available today. To provide a more accurate C. elegans genome, we performed long-read assembly of VC2010, a modern strain derived from N2. Our VC2010 assembly has 99.98% identity to N2 but with an additional 1.8 Mb including tandem repeat expansions and genome duplications. For 116 structural discrepancies between N2 and VC2010, 97 structures matching VC2010 (84%) were also found in two outgroup strains, implying deficiencies in N2. Over 98% of N2 genes encoded unchanged products in VC2010; moreover, we predicted =53 new genes in VC2010. The recompleted genome of C. elegans should be a valuable resource for genetics, genomics, and systems biology. © 2019 Yoshimura et al.; Published by Cold Spring Harbor Laboratory Press.


April 21, 2020  |  

Large Enriched Fragment Targeted Sequencing (LEFT-SEQ) Applied to Capture of Wolbachia Genomes.

Symbiosis is a major force of evolutionary change, influencing virtually all aspects of biology, from population ecology and evolution to genomics and molecular/biochemical mechanisms of development and reproduction. A remarkable example is Wolbachia endobacteria, present in some parasitic nematodes and many arthropod species. Acquisition of genomic data from diverse Wolbachia clades will aid in the elucidation of the different symbiotic mechanisms(s). However, challenges of de novo assembly of Wolbachia genomes include the presence in the sample of host DNA: nematode/vertebrate or insect. We designed biotinylated probes to capture large fragments of Wolbachia DNA for sequencing using PacBio technology (LEFT-SEQ: Large Enriched Fragment Targeted Sequencing). LEFT-SEQ was used to capture and sequence four Wolbachia genomes: the filarial nematode Brugia malayi, wBm, (21-fold enrichment), Drosophila mauritiana flies (2 isolates), wMau (11-fold enrichment), and Aedes albopictus mosquitoes, wAlbB (200-fold enrichment). LEFT-SEQ resulted in complete genomes for wBm and for wMau. For wBm, 18 single-nucleotide polymorphisms (SNPs), relative to the wBm reference, were identified and confirmed by PCR. A limit of LEFT-SEQ is illustrated by the wAlbB genome, characterized by a very high level of insertion sequences elements (ISs) and DNA repeats, for which only a 20-contig draft assembly was achieved.


April 21, 2020  |  

Platanus-allee is a de novo haplotype assembler enabling a comprehensive access to divergent heterozygous regions.

The ultimate goal for diploid genome determination is to completely decode homologous chromosomes independently, and several phasing programs from consensus sequences have been developed. These methods work well for lowly heterozygous genomes, but the manifold species have high heterozygosity. Additionally, there are highly divergent regions (HDRs), where the haplotype sequences differ considerably. Because HDRs are likely to direct various interesting biological phenomena, many genomic analysis targets fall within these regions. However, they cannot be accessed by existing phasing methods, and we have to adopt costly traditional methods. Here, we develop a de novo haplotype assembler, Platanus-allee ( http://platanus.bio.titech.ac.jp/platanus2 ), which initially constructs each haplotype sequence and then untangles the assembly graphs utilizing sequence links and synteny information. A comprehensive benchmark analysis reveals that Platanus-allee exhibits high recall and precision, particularly for HDRs. Using this approach, previously unknown HDRs are detected in the human genome, which may uncover novel aspects of genome variability.


April 21, 2020  |  

The genome of the soybean cyst nematode (Heterodera glycines) reveals complex patterns of duplications involved in the evolution of parasitism genes.

Heterodera glycines, commonly referred to as the soybean cyst nematode (SCN), is an obligatory and sedentary plant parasite that causes over a billion-dollar yield loss to soybean production annually. Although there are genetic determinants that render soybean plants resistant to certain nematode genotypes, resistant soybean cultivars are increasingly ineffective because their multi-year usage has selected for virulent H. glycines populations. The parasitic success of H. glycines relies on the comprehensive re-engineering of an infection site into a syncytium, as well as the long-term suppression of host defense to ensure syncytial viability. At the forefront of these complex molecular interactions are effectors, the proteins secreted by H. glycines into host root tissues. The mechanisms of effector acquisition, diversification, and selection need to be understood before effective control strategies can be developed, but the lack of an annotated genome has been a major roadblock.Here, we use PacBio long-read technology to assemble a H. glycines genome of 738 contigs into 123?Mb with annotations for 29,769 genes. The genome contains significant numbers of repeats (34%), tandem duplicates (18.7?Mb), and horizontal gene transfer events (151 genes). A large number of putative effectors (431 genes) were identified in the genome, many of which were found in transposons.This advance provides a glimpse into the host and parasite interplay by revealing a diversity of mechanisms that give rise to virulence genes in the soybean cyst nematode, including: tandem duplications containing over a fifth of the total gene count, virulence genes hitchhiking in transposons, and 107 horizontal gene transfers not reported in other plant parasitic nematodes thus far. Through extensive characterization of the H. glycines genome, we provide new insights into H. glycines biology and shed light onto the mystery underlying complex host-parasite interactions. This genome sequence is an important prerequisite to enable work towards generating new resistance or control measures against H. glycines.


April 21, 2020  |  

Horizontal transfer of a retrotransposon between parasitic nematodes and the common shrew.

As the genomes of more metazoan species are sequenced, reports of horizontal transposon transfers (HTT) have increased. Our understanding of the mechanisms of such events is at an early stage. The close physical relationship between a parasite and its host could facilitate horizontal transfer. To date, two studies have identified horizontal transfer of RTEs, a class of retrotransposable elements, involving parasites: ticks might act as vector for BovB between ruminants and squamates, and AviRTE was transferred between birds and parasitic nematodes.We searched for RTEs shared between nematode and mammalian genomes. Given their physical proximity, it was necessary to detect and remove sequence contamination from the genome datasets, which would otherwise distort the signal of horizontal transfer. We developed an approach that is based on reads instead of genomic sequences to reliably detect contamination. From comparison of 43 RTEs across 197 genomes, we identified a single putative case of horizontal transfer: we detected RTE1_Sar from Sorex araneus, the common shrew, in parasitic nematodes. From the taxonomic distribution and evolutionary analysis, we show that RTE1_Sar was horizontally transferred.We identified a new horizontal RTE transfer in host-parasite interactions, which suggests that it is not uncommon. Further, we present and provide the workflow a read-based method to distinguish between contamination and horizontal transfer.


April 21, 2020  |  

The golden death bacillus Chryseobacterium nematophagum is a novel matrix digesting pathogen of nematodes.

Nematodes represent important pathogens of humans and farmed animals and cause significant health and economic impacts. The control of nematodes is primarily carried out by applying a limited number of anthelmintic compounds, for which there is now widespread resistance being reported. There is a current unmet need to develop novel control measures including the identification and characterisation of natural pathogens of nematodes.Nematode killing bacilli were isolated from a rotten fruit in association with wild free-living nematodes. These bacteria belong to the Chryseobacterium genus (golden bacteria) and represent a new species named Chryseobacterium nematophagum. These bacilli are oxidase-positive, flexirubin-pigmented, gram-negative rods that exhibit gelatinase activity. Caenorhabditis elegans are attracted to and eat these bacteria. Within 3 h of ingestion, however, the bacilli have degraded the anterior pharyngeal chitinous lining and entered the body cavity, ultimately killing the host. Within 24?h, the internal contents of the worms are digested followed by the final digestion of the remaining cuticle over a 2-3-day period. These bacteria will also infect and kill bacterivorous free-living (L1-L3) stages of all tested parasitic nematodes including the important veterinary Trichostrongylids such as Haemonchus contortus and Ostertagia ostertagi. The bacteria exhibit potent collagen-digesting properties, and genome sequencing has identified novel metalloprotease, collagenase and chitinase enzymes representing potential virulence factors.Chryseobacterium nematophagum is a newly discovered pathogen of nematodes that rapidly kills environmental stages of a wide range of key nematode parasites. These bacilli exhibit a unique invasion process, entering the body via the anterior pharynx through the specific degradation of extracellular matrices. This bacterial pathogen represents a prospective biological control agent for important nematode parasites.


September 22, 2019  |  

The state of long non-coding RNA biology.

Transcriptomic studies have demonstrated that the vast majority of the genomes of mammals and other complex organisms is expressed in highly dynamic and cell-specific patterns to produce large numbers of intergenic, antisense and intronic long non-protein-coding RNAs (lncRNAs). Despite well characterized examples, their scaling with developmental complexity, and many demonstrations of their association with cellular processes, development and diseases, lncRNAs are still to be widely accepted as major players in gene regulation. This may reflect an underappreciation of the extent and precision of the epigenetic control of differentiation and development, where lncRNAs appear to have a central role, likely as organizational and guide molecules: most lncRNAs are nuclear-localized and chromatin-associated, with some involved in the formation of specialized subcellular domains. I suggest that a reassessment of the conceptual framework of genetic information and gene expression in the 4-dimensional ontogeny of spatially organized multicellular organisms is required. Together with this and further studies on their biology, the key challenges now are to determine the structure?function relationships of lncRNAs, which may be aided by emerging evidence of their modular structure, the role of RNA editing and modification in enabling epigenetic plasticity, and the role of RNA signaling in transgenerational inheritance of experience.


September 22, 2019  |  

A microbial clock provides an accurate estimate of the postmortem interval in a mouse model system.

Establishing the time since death is critical in every death investigation, yet existing techniques are susceptible to a range of errors and biases. For example, forensic entomology is widely used to assess the postmortem interval (PMI), but errors can range from days to months. Microbes may provide a novel method for estimating PMI that avoids many of these limitations. Here we show that postmortem microbial community changes are dramatic, measurable, and repeatable in a mouse model system, allowing PMI to be estimated within approximately 3 days over 48 days. Our results provide a detailed understanding of bacterial and microbial eukaryotic ecology within a decomposing corpse system and suggest that microbial community data can be developed into a forensic tool for estimating PMI. DOI:http://dx.doi.org/10.7554/eLife.01104.001.


September 22, 2019  |  

Exploiting single-molecule transcript sequencing for eukaryotic gene prediction.

We develop a method to predict and validate gene models using PacBio single-molecule, real-time (SMRT) cDNA reads. Ninety-eight percent of full-insert SMRT reads span complete open reading frames. Gene model validation using SMRT reads is developed as automated process. Optimized training and prediction settings and mRNA-seq noise reduction of assisting Illumina reads results in increased gene prediction sensitivity and precision. Additionally, we present an improved gene set for sugar beet (Beta vulgaris) and the first genome-wide gene set for spinach (Spinacia oleracea). The workflow and guidelines are a valuable resource to obtain comprehensive gene sets for newly sequenced genomes of non-model eukaryotes.


September 22, 2019  |  

P_RNA_scaffolder: a fast and accurate genome scaffolder using paired-end RNA-sequencing reads.

Obtaining complete gene structures is one major goal of genome assembly. Some gene regions are fragmented in low quality and high-quality assemblies. Therefore, new approaches are needed to recover gene regions. Genomes are widely transcribed, generating messenger and non-coding RNAs. These widespread transcripts can be used to scaffold genomes and complete transcribed regions.We present P_RNA_scaffolder, a fast and accurate tool using paired-end RNA-sequencing reads to scaffold genomes. This tool aims to improve the completeness of both protein-coding and non-coding genes. After this tool was applied to scaffolding human contigs, the structures of both protein-coding genes and circular RNAs were almost completely recovered and equivalent to those in a complete genome, especially for long proteins and long circular RNAs. Tested in various species, P_RNA_scaffolder exhibited higher speed and efficiency than the existing state-of-the-art scaffolders. This tool also improved the contiguity of genome assemblies generated by current mate-pair scaffolding and third-generation single-molecule sequencing assembly.The P_RNA_scaffolder can improve the contiguity of genome assembly and benefit gene prediction. This tool is available at http://www.fishbrowser.org/software/P_RNA_scaffolder .


September 22, 2019  |  

Revealing missing human protein isoforms based on Ab initio prediction, RNA-seq and proteomics.

Biological and biomedical research relies on comprehensive understanding of protein-coding transcripts. However, the total number of human proteins is still unknown due to the prevalence of alternative splicing. In this paper, we detected 31,566 novel transcripts with coding potential by filtering our ab initio predictions with 50 RNA-seq datasets from diverse tissues/cell lines. PCR followed by MiSeq sequencing showed that at least 84.1% of these predicted novel splice sites could be validated. In contrast to known transcripts, the expression of these novel transcripts were highly tissue-specific. Based on these novel transcripts, at least 36 novel proteins were detected from shotgun proteomics data of 41 breast samples. We also showed L1 retrotransposons have a more significant impact on the origin of new transcripts/genes than previously thought. Furthermore, we found that alternative splicing is extraordinarily widespread for genes involved in specific biological functions like protein binding, nucleoside binding, neuron projection, membrane organization and cell adhesion. In the end, the total number of human transcripts with protein-coding potential was estimated to be at least 204,950.


September 22, 2019  |  

Use of a draft genome of coffee (Coffea arabica) to identify SNPs associated with caffeine content.

Arabica coffee (Coffea arabica) has a small gene pool limiting genetic improvement. Selection for caffeine content within this gene pool would be assisted by identification of the genes controlling this important trait. Sequencing of DNA bulks from 18 genotypes with extreme high- or low-caffeine content from a population of 232 genotypes was used to identify linked polymorphisms. To obtain a reference genome, a whole genome assembly of arabica coffee (variety K7) was achieved by sequencing using short read (Illumina) and long-read (PacBio) technology. Assembly was performed using a range of assembly tools resulting in 76 409 scaffolds with a scaffold N50 of 54 544 bp and a total scaffold length of 1448 Mb. Validation of the genome assembly using different tools showed high completeness of the genome. More than 99% of transcriptome sequences mapped to the C. arabica draft genome, and 89% of BUSCOs were present. The assembled genome annotated using AUGUSTUS yielded 99 829 gene models. Using the draft arabica genome as reference in mapping and variant calling allowed the detection of 1444 nonsynonymous single nucleotide polymorphisms (SNPs) associated with caffeine content. Based on Kyoto Encyclopaedia of Genes and Genomes pathway-based analysis, 65 caffeine-associated SNPs were discovered, among which 11 SNPs were associated with genes encoding enzymes involved in the conversion of substrates, which participate in the caffeine biosynthesis pathways. This analysis demonstrated the complex genetic control of this key trait in coffee.© 2018 The Authors. Plant Biotechnology Journal published by Society for Experimental Biology and The Association of Applied Biologists and John Wiley & Sons Ltd.


Talk with an expert

If you have a question, need to check the status of an order, or are interested in purchasing an instrument, we're here to help.