September 22, 2019

MEGAN-LR: new algorithms allow accurate binning and easy interactive exploration of metagenomic long reads and contigs.

There are numerous computational tools for taxonomic or functional analysis of microbiome samples, optimized to run on hundreds of millions of short, high quality sequencing reads. Programs such as MEGAN allow the user to interactively navigate these large datasets. Long read sequencing technologies continue to improve and produce increasing numbers of longer reads (of varying lengths in the range of 10k-1M bps, say), but of low quality. There is an increasing interest in using long reads in microbiome sequencing, and there is a need to adapt short read tools to long read datasets. We describe a new LCA-based algorithm for taxonomic binning, and an interval-tree based algorithm for functional binning, that are explicitly designed for long reads and assembled contigs. We provide a new interactive tool for investigating the alignment of long reads against reference sequences. For taxonomic and functional binning, we propose to use LAST to compare long reads against the NCBI-nr protein reference database so as to obtain frame-shift aware alignments, and then to process the results using our new methods. All presented methods are implemented in the open source edition of MEGAN, and we refer to this new extension as MEGAN-LR (MEGAN long read). We evaluate the LAST+MEGAN-LR approach in a simulation study, and on a number of mock community datasets consisting of Nanopore reads, PacBio reads and assembled PacBio reads. We also illustrate the practical application on a Nanopore dataset that we sequenced from an anammox bioreactor community. This article was reviewed by Nicola Segata together with Moreno Zolfo, Pete James Lockhart and Serghei Mangul. This work extends the applicability of the widely-used metagenomic analysis software MEGAN to long reads. Our study suggests that the presented LAST+MEGAN-LR pipeline is sufficiently fast and accurate.


September 22, 2019

Multiscale patterns and drivers of arbuscular mycorrhizal fungal communities in the roots and root-associated soil of a wild perennial herb.

Arbuscular mycorrhizal (AM) fungi form diverse communities and are known to influence above-ground community dynamics and biodiversity. However, the multiscale patterns and drivers of AM fungal composition and diversity are still poorly understood. We sequenced DNA markers from roots and root-associated soil from Plantago lanceolata plants collected across multiple spatial scales to allow comparison of AM fungal communities among neighbouring plants, plant subpopulations, nearby plant populations, and regions. We also measured soil nutrients, temperature, humidity, and community composition of neighbouring plants and non-AM root-associated fungi. AM fungal communities were already highly dissimilar among neighbouring plants (c. 30 cm apart), albeit with a high variation in the degree of similarity at this small spatial scale. AM fungal communities were increasingly, and more consistently, dissimilar at larger spatial scales. Spatial structure and environmental drivers explained a similar percentage of the variation, from 7% to 25%. A large fraction of the variation remained unexplained, which may be a result of unmeasured environmental variables, species interactions and stochastic processes. We conclude that AM fungal communities are highly variable among nearby plants. AM fungi may therefore play a major role in maintaining small-scale variation in community dynamics and biodiversity. © 2018 The Authors. New Phytologist © 2018 New Phytologist Trust.


September 22, 2019

Next generation sequencing data of a defined microbial mock community.

Generating sequence data of a defined community composed of organisms with complete reference genomes is indispensable for the benchmarking of new genome sequence analysis methods, including assembly and binning tools. Moreover, the validation of new sequencing library protocols and platforms to assess critical components such as sequencing errors and biases relies on such datasets. We here report the next generation metagenomic sequence data of a defined mock community (Mock Bacteria ARchaea Community; MBARC-26), composed of 23 bacterial and 3 archaeal strains with finished genomes. These strains span 10 phyla and 14 classes, cover a range of GC contents, genome sizes and repeat contents, and encompass a diverse abundance profile. Short-read Illumina and long-read PacBio SMRT sequences of this mock community are described. These data represent a valuable resource for the scientific community, enabling extensive benchmarking and comparative evaluation of bioinformatics tools without the need to simulate data. As such, these data can aid in improving our current sequence data analysis toolkit and spur interest in the development of new tools.


September 22, 2019

Role of clinicogenomics in infectious disease diagnostics and public health microbiology.

Clinicogenomics is the exploitation of genome sequence data for diagnostic, therapeutic, and public health purposes. Central to this field is the high-throughput DNA sequencing of genomes and metagenomes. The role of clinicogenomics in infectious disease diagnostics and public health microbiology was the topic of discussion during a recent symposium (session 161) presented at the 115th general meeting of the American Society for Microbiology that was held in New Orleans, LA. What follows is a collection of the most salient and promising aspects from each presentation at the symposium. Copyright © 2016, American Society for Microbiology. All Rights Reserved.


September 22, 2019

Circular RNA architecture and differentiation during leaf bud to young leaf development in tea (Camellia sinensis).

Circular RNA (circRNA) discovery, expression patterns and experimental validation in developing tea leaves indicate its correlation with circRNA-parental genes and potential roles in the ceRNA interaction network. Circular RNAs (circRNAs) have recently emerged as a novel class of abundant, endogenous, stable RNAs produced by circularization, with regulatory potential. However, identification of circRNAs in plants, especially in non-model plants with large genomes, is challenging. In this study, we undertook a systematic identification of circRNAs from tissues at different stages of tea plant (Camellia sinensis) leaf development using rRNA-depleted circular RNA-seq. By combining two state-of-the-art detection tools, we characterized 3174 circRNAs, of which 342 were shared by both approaches and thus considered high-confidence circRNAs. A few predicted circRNAs were randomly chosen, and 20 out of 24 were experimentally confirmed by PCR and Sanger sequencing. As in other plants, tissue-specific expression was also observed for many C. sinensis circRNAs. In addition, we found that circRNA abundances were positively correlated with the mRNA transcript abundances of their parental genes. qRT-PCR validated the differential expression patterns of circRNAs between leaf bud and young leaf, and also indicated the low expression abundance of circRNAs compared to the standard mRNAs from the parental genes. We predicted the circRNA-microRNA interaction networks, and 54 of the differentially expressed circRNAs were found to have potential tea plant miRNA binding sites. The gene sets encoding circRNAs were significantly enriched in chloroplast-related GO terms and in KEGG pathways related to photosynthesis and metabolite biosynthesis, suggesting candidate roles of circRNAs in the photosynthetic machinery and metabolite biosynthesis during leaf development.


September 22, 2019

Interpreting microbial biosynthesis in the genomic age: Biological and practical considerations.

Genome mining has become an increasingly powerful, scalable, and economically accessible tool for the study of natural product biosynthesis and drug discovery. However, there remain important biological and practical problems that can complicate or obscure biosynthetic analysis in genomic and metagenomic sequencing projects. Here, we focus on limitations of available technology as well as computational and experimental strategies to overcome them. We review the unique challenges and approaches in the study of symbiotic and uncultured systems, as well as those associated with biosynthetic gene cluster (BGC) assembly and product prediction. Finally, to explore sequencing parameters that affect the recovery and contiguity of large and repetitive BGCs assembled de novo, we simulate Illumina and PacBio sequencing of the Salinispora tropica genome focusing on assembly of the salinilactam (slm) BGC.


September 22, 2019

First insights into the nature and evolution of antisense transcription in nematodes.

The development of multicellular organisms is coordinated by various gene regulatory mechanisms that ensure correct spatio-temporal patterns of gene expression. Recently, the role of antisense transcription in gene regulation has moved into the focus of research. To characterize genome-wide patterns of antisense transcription and to study their evolutionary conservation, we sequenced a strand-specific RNA-seq library of the nematode Pristionchus pacificus. We identified 1112 antisense configurations, of which the largest group represents 465 antisense transcripts (ASTs) that are fully embedded in introns of their host genes. We find that most ASTs show homology to protein-coding genes and are overrepresented in proteomic data. Together with the finding that expression levels of ASTs and host genes are uncorrelated, this indicates that most ASTs in P. pacificus do not represent non-coding RNAs and do not exhibit regulatory functions on their host genes. We studied the evolution of antisense gene pairs across 20 nematode genomes, showing that the majority of pairs are lineage-specific and that even the highly conserved vps-4, ddx-27, and sel-2 loci show abundant structural changes including duplications, deletions, intron gains and loss of antisense transcription. In contrast, host genes in general are remarkably conserved and encode exceptionally long introns, leading to unusually large blocks of conserved synteny. Our study has shown that in P. pacificus antisense transcription as such does not define non-coding RNAs but is rather a feature of highly conserved genes with long introns. We hypothesize that the presence of regulatory elements imposes evolutionary constraint on the intron length, but simultaneously, their large size makes them a likely target for translocation of genomic elements including protein-coding genes that eventually end up as ASTs.


September 22, 2019

Microsatellites from Fosterella christophii (Bromeliaceae) by de novo transcriptome sequencing on the Pacific Biosciences RS platform.

Microsatellite markers were developed in Fosterella christophii (Bromeliaceae) to investigate the genetic diversity and population structure within the F. micrantha group, comprising F. christophii, F. micrantha, and F. villosula. Full-length cDNAs were isolated from F. christophii and sequenced on a Pacific Biosciences RS platform. A total of 1590 high-quality consensus isoforms were assembled into 971 unigenes containing 421 perfect microsatellites. Thirty primer sets were designed, of which 13 revealed a high level of polymorphism in three populations of F. christophii, with four to nine alleles per locus. Each of these 13 loci cross-amplified in the closely related species F. micrantha and F. villosula, with one to six and one to 11 alleles per locus, respectively. The new markers are promising tools to study the population genetics of F. christophii and to discover species boundaries within the F. micrantha group.


September 22, 2019

Transcriptome profiling of two ornamental and medicinal papaver herbs.

The Papaver species Papaver rhoeas (corn poppy) and Papaver nudicaule (Iceland poppy) are ornamental and medicinal plants that are used for the isolation of alkaloid drugs. In this study, we generated 700 Mb of transcriptome sequences with the PacBio platform. They were assembled into 120,926 contigs, and 1185 (82.2%) of the benchmarking universal single-copy orthologs (BUSCO) core genes were completely present in our assembled transcriptome. Furthermore, using 128 Gb of Illumina sequences, the transcript expression was assessed at three stages of Papaver plant development (30, 60, and 90 days), from which we identified 137 differentially expressed transcripts. In addition, three co-occurrence heat maps were generated from 51 different plant genomes along with the Papaver transcriptome, i.e., secondary metabolite biosynthesis, isoquinoline alkaloid biosynthesis (BIA) pathway, and cytochrome. Sixty-nine transcripts in the BIA pathway along with 22 different alkaloids (quantified with LC-QTOF-MS/MS) were mapped into the BIA KEGG map (map00950). Finally, we identified 39 full-length cytochrome transcripts and compared them with other genomes. Collectively, these transcriptome data, along with the expression and quantitative metabolite profiles, provide an initial recording of secondary metabolites and their expression related to Papaver plant development. Moreover, these profiles could help to further detail the functional characterization of the various problems associated with secondary metabolite biosynthesis and Papaver plant development.


September 22, 2019

Quantitative metaproteomics highlight the metabolic contributions of uncultured phylotypes in a thermophilic anaerobic digester.

In this study, we used multiple meta-omic approaches to characterize the microbial community and the active metabolic pathways of a stable industrial biogas reactor with food waste as the dominant feedstock, operating at thermophilic temperatures (60°C) and elevated levels of free ammonia (367 mg/liter NH3-N). The microbial community was strongly dominated (76% of all 16S rRNA amplicon sequences) by populations closely related to the proteolytic bacterium Coprothermobacter proteolyticus. Multiple Coprothermobacter-affiliated strains were detected, introducing an additional level of complexity seldom explored in biogas studies. Genome reconstructions provided metabolic insight into the microbes that performed biomass deconstruction and fermentation, including the deeply branching phyla Dictyoglomi and Planctomycetes and the candidate phylum “Atribacteria.” These biomass degraders were complemented by a synergistic network of microorganisms that convert key fermentation intermediates (fatty acids) via syntrophic interactions with hydrogenotrophic methanogens to ultimately produce methane. Interpretation of the proteomics data also suggested activity of a Methanosaeta phylotype acclimatized to high ammonia levels. In particular, we report multiple novel phylotypes proposed as syntrophic acetate oxidizers, which also express enzymes needed for both the Wood-Ljungdahl pathway and β-oxidation of fatty acids to acetyl coenzyme A. Such an arrangement differs from known syntrophic oxidizing bacteria and presents an interesting hypothesis for future studies. Collectively, these findings provide increased insight into active metabolic roles of uncultured phylotypes and present new synergistic relationships, both of which may contribute to the stability of the biogas reactor.
Biogas production through anaerobic digestion of organic waste provides an attractive source of renewable energy and a sustainable waste management strategy. A comprehensive understanding of the microbial community that drives anaerobic digesters is essential to ensure stable and efficient energy production. Here, we characterize the intricate microbial networks and metabolic pathways in a thermophilic biogas reactor. We discuss the impact of frequently encountered microbial populations as well as the metabolism of newly discovered phylotypes that seem to play distinct roles within key microbial stages of anaerobic digestion in this stable high-temperature system. In particular, we draft a metabolic scenario whereby multiple uncultured syntrophic acetate-oxidizing bacteria are capable of syntrophically oxidizing acetate as well as longer-chain fatty acids (via the β-oxidation and Wood-Ljungdahl pathways) to hydrogen and carbon dioxide, which methanogens subsequently convert to methane. Copyright © 2016 American Society for Microbiology.


September 22, 2019

Enigmatic Diphyllatea eukaryotes: culturing and targeted PacBio RS amplicon sequencing reveals a higher order taxonomic diversity and global distribution.

The class Diphyllatea belongs to a group of enigmatic unicellular eukaryotes that play a key role in reconstructing the morphological innovation and diversification of early eukaryotic evolution. Despite its evolutionary significance, very little is known about the phylogeny and species diversity of Diphyllatea. Only three species have described morphology, being taxonomically divided by flagella number, two or four, and cell size. Currently, one 18S rRNA Diphyllatea sequence is available, with environmental sequencing surveys reporting only a single partial sequence from a Diphyllatea-like organism. Accordingly, geographical distribution of Diphyllatea based on molecular data is limited, despite morphological data suggesting the class has a global distribution. We here present a first attempt to understand species distribution, diversity and higher order structure of Diphyllatea. We cultured 11 new strains, characterised these morphologically and amplified their rRNA for a combined 18S-28S rRNA phylogeny. We sampled environmental DNA from multiple sites and designed new Diphyllatea-specific PCR primers for long-read PacBio RSII technology. Near full-length 18S rRNA sequences from environmental DNA, in addition to supplementary Diphyllatea sequence data mined from public databases, resolved the phylogeny into three deeply branching and distinct clades (Diphy I – III). Of these, the Diphy III clade is entirely novel, and in congruence with Diphy II, composed of species morphologically consistent with the earlier described Collodictyon triciliatum. The phylogenetic split between the Diphy I and Diphy II + III clades corresponds with a morphological division of Diphyllatea into bi- and quadriflagellate cell forms. This altered flagella composition must have occurred early in the diversification of Diphyllatea and may represent one of the earliest known morphological transitions among eukaryotes. Further, the substantial increase in molecular data presented here confirms Diphyllatea has a global distribution, seemingly restricted to freshwater habitats. Altogether, the results reveal the advantage of combining a group-specific PCR approach and long-read high-throughput amplicon sequencing in surveying enigmatic eukaryote lineages. Lastly, our study shows the capacity of PacBio RS when targeting a protist class for increasing phylogenetic resolution.


September 22, 2019

Metagenomic approaches to assess bacteriophages in various environmental niches.

Bacteriophages are ubiquitous and numerous parasites of bacteria and play a critical evolutionary role in virtually every ecosystem, yet our understanding of the extent of the diversity and role of phages remains inadequate for many ecological niches, particularly in cases in which the host is unculturable. During the past 15 years, the emergence of the field of viral metagenomics has drastically enhanced our ability to analyse the so-called viral ‘dark matter’ of the biosphere. Here, we review the evolution of viral metagenomic methodologies, as well as providing an overview of some of the most significant applications and findings in this field of research.


September 22, 2019

Crosstalk between gut microbiota and Sirtuin-3 in colonic inflammation and tumorigenesis.

Colorectal cancer (CRC) is a disease involving a variety of genetic and environmental factors. Sirtuin-3 (Sirt3) is expressed at a low level in cancer tissues of CRC, but it is unclear how Sirt3 modulates colonic tumorigenesis. In this study, we found that gut microbiota play a central role in the resistance to CRC tumor formation in wild-type (WT) mice through APC (Adenomatous Polyposis Coli)-mutant mouse microbiota transfer via Wnt signaling. We also found that Sirt3-deficient mice were hypersusceptible to colonic inflammation and tumor development through altered intestinal integrity and p38 signaling, respectively. Furthermore, susceptibility to colorectal tumorigenesis was aggravated by initial commensal microbiota deletion via Wnt signaling. Mice with Sirt3-deficient microbiota transfer followed by chemically induced colon tumorigenesis had low Sirt3 expression compared to WT control microbiome transfer, mainly due to a decrease in Escherichia/Shigella, as well as an increase in Lactobacillus reuteri and Lactobacillus taiwanensis. Collectively, our data revealed that Sirt3 is an anti-inflammatory and tumor-suppressing gene that interacts with the gut microbiota during colon tumorigenesis.


September 22, 2019

Great differences in performance and outcome of high-throughput sequencing data analysis platforms for fungal metabarcoding.

Along with recent developments in high-throughput sequencing (HTS) technologies and thus fast accumulation of HTS data, there has been a growing need and interest for developing tools for HTS data processing and communication. In particular, a number of bioinformatics tools have been designed for analysing metabarcoding data, each with specific features, assumptions and outputs. To evaluate the potential effect of the application of different bioinformatics workflows on the results, we compared the performance of different analysis platforms on two contrasting high-throughput sequencing data sets. Our analysis revealed that the computation time, the quality of error filtering and hence the output of a specific bioinformatics process largely depend on the platform used. Our results show that none of the bioinformatics workflows appears to perfectly filter out the accumulated errors and generate Operational Taxonomic Units (OTUs), although PipeCraft, LotuS and PIPITS perform better than QIIME2 and Galaxy for the tested fungal amplicon dataset. We conclude that the output of each platform requires manual validation of the OTUs by examining the taxonomy assignment values.


September 22, 2019

Soil microclimate changes affect soil fungal communities in a Mediterranean pine forest.

Soil microclimate is a potentially important regulator of the composition of plant-associated fungal communities in climates with significant drought periods. Here, we investigated the spatio-temporal dynamics of soil fungal communities in a Mediterranean Pinus pinaster forest in relation to soil moisture and temperature. Fungal communities in 336 soil samples collected monthly over 1 year from 28 long-term experimental plots were assessed by PacBio sequencing of ITS2 amplicons. Total fungal biomass was estimated by analysing ergosterol. Community changes were analysed in the context of functional traits. Soil fungal biomass was lowest during summer and late winter and highest during autumn, concurrent with a greater relative abundance of mycorrhizal species. Intra-annual spatio-temporal changes in community composition correlated significantly with soil moisture and temperature. Mycorrhizal fungi were less affected by summer drought than free-living fungi. In particular, mycorrhizal species of the short-distance exploration type increased in relative abundance under dry conditions, whereas species of the long-distance exploration type were more abundant under wetter conditions. Our observations demonstrate a potential for compositional and functional shifts in fungal communities in response to changing climatic conditions. Free-living fungi and mycorrhizal species with extensive mycelia may be negatively affected by increasing drought periods in Mediterranean forest ecosystems. © 2018 The Authors. New Phytologist © 2018 New Phytologist Trust.

