April 21, 2020

metaFlye: scalable long-read metagenome assembly using repeat graphs

Long-read sequencing technologies substantially improved assemblies of many isolate bacterial genomes as compared to fragmented assemblies produced with short-read technologies. However, assembling complex metagenomic datasets remains a challenge even for the state-of-the-art long-read assemblers. To address this gap, we present the metaFlye assembler and demonstrate that it generates highly contiguous and accurate metagenome assemblies. In contrast to short-read metagenomics assemblers that typically fail to reconstruct full-length 16S RNA genes, metaFlye captures many 16S RNA genes within long contigs, thus providing new opportunities for analyzing the microbial “dark matter of life”. We also demonstrate that long-read metagenome assemblers significantly improve full-length plasmid and virus reconstruction as compared to short-read assemblers and reveal many novel plasmids and viruses.


April 21, 2020

A microbial factory for defensive kahalalides in a tripartite marine symbiosis.

Chemical defense against predators is widespread in natural ecosystems. Occasionally, taxonomically distant organisms share the same defense chemical. Here, we describe an unusual tripartite marine symbiosis, in which an intracellular bacterial symbiont (“Candidatus Endobryopsis kahalalidefaciens”) uses a diverse array of biosynthetic enzymes to convert simple substrates into a library of complex molecules (the kahalalides) for chemical defense of the host, the alga Bryopsis sp., against predation. The kahalalides are subsequently hijacked by a third partner, the herbivorous mollusk Elysia rufescens, and employed similarly for defense. “Ca. E. kahalalidefaciens” has lost many essential traits for free living and acts as a factory for kahalalide production. This interaction between a bacterium, an alga, and an animal highlights the importance of chemical defense in the evolution of complex symbioses. Copyright © 2019 The Authors, some rights reserved; exclusive licensee American Association for the Advancement of Science. No claim to original U.S. Government Works.


April 21, 2020

Assembly of long, error-prone reads using repeat graphs.

Accurate genome assembly is hampered by repetitive regions. Although long single molecule sequencing reads are better able to resolve genomic repeats than short-read data, most long-read assembly algorithms do not provide the repeat characterization necessary for producing optimal assemblies. Here, we present Flye, a long-read assembly algorithm that generates arbitrary paths in an unknown repeat graph, called disjointigs, and constructs an accurate repeat graph from these error-riddled disjointigs. We benchmark Flye against five state-of-the-art assemblers and show that it generates better or comparable assemblies, while being an order of magnitude faster. Flye nearly doubled the contiguity of the human genome assembly (as measured by the NGA50 assembly quality metric) compared with existing assemblers.
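The contiguity metric mentioned above can be illustrated with a minimal sketch. NGA50 is computed from aligned blocks (contigs broken at misassemblies and aligned to a reference); the function below only shows the underlying "half of the genome" calculation on a plain list of block lengths, so it corresponds to NG50 unless the lengths passed in are already aligned-block lengths. The function name `ng50` and the toy numbers are assumptions for illustration, not part of Flye.

```python
def ng50(block_lengths, genome_size):
    """Largest length L such that blocks of length >= L cover at least
    half of genome_size. Returns 0 if the blocks never reach half."""
    covered = 0
    for length in sorted(block_lengths, reverse=True):
        covered += length
        if 2 * covered >= genome_size:
            return length
    return 0


# Toy example: five blocks against a hypothetical 150 bp genome.
print(ng50([50, 40, 30, 20, 10], 150))
```

Passing aligned-block lengths rather than raw contig lengths penalizes misassembled contigs, which is why the aligned variant is preferred for comparing assemblers.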


April 21, 2020

Phylogenetic barriers to horizontal transfer of antimicrobial peptide resistance genes in the human gut microbiota.

The human gut microbiota has adapted to the presence of antimicrobial peptides (AMPs), which are ancient components of immune defence. Despite its medical importance, it has remained unclear whether AMP resistance genes in the gut microbiome are available for genetic exchange between bacterial species. Here, we show that AMP resistance and antibiotic resistance genes differ in their mobilization patterns and functional compatibilities with new bacterial hosts. First, whereas AMP resistance genes are widespread in the gut microbiome, their rate of horizontal transfer is lower than that of antibiotic resistance genes. Second, gut microbiota culturing and functional metagenomics have revealed that AMP resistance genes originating from phylogenetically distant bacteria have only a limited potential to confer resistance in Escherichia coli, an intrinsically susceptible species. Taken together, functional compatibility with the new bacterial host emerges as a key factor limiting the genetic exchange of AMP resistance genes. Finally, our results suggest that AMPs induce highly specific changes in the composition of the human microbiota, with implications for disease risks.


April 21, 2020

MSC: a metagenomic sequence classification algorithm.

Metagenomics is the study of genetic materials directly sampled from natural habitats. It has the potential to reveal previously hidden diversity of microscopic life largely due to the existence of highly parallel and low-cost next-generation sequencing technology. Conventional approaches align metagenomic reads onto known reference genomes to identify microbes in the sample. Since such a collection of reference genomes is very large, the approach often needs high-end computing machines with large memory which is not often available to researchers. Alternative approaches follow an alignment-free methodology where the presence of a microbe is predicted using the information about the unique k-mers present in the microbial genomes. However, such approaches suffer from high false positives due to trading off the value of k with the computational resources. In this article, we propose a highly efficient metagenomic sequence classification (MSC) algorithm that is a hybrid of both approaches. Instead of aligning reads to the full genomes, MSC aligns reads onto a set of carefully chosen, shorter and highly discriminating model sequences built from the unique k-mers of each of the reference sequences. Microbiome researchers are generally interested in two objectives of a taxonomic classifier: (i) to detect prevalence, i.e. the taxa present in a sample, and (ii) to estimate their relative abundances. MSC is primarily designed to detect prevalence and experimental results show that MSC is indeed a more effective and efficient algorithm compared to the other state-of-the-art algorithms in terms of accuracy, memory and runtime. Moreover, MSC outputs an approximate estimate of the abundances. The implementations are freely available for non-commercial purposes. They can be downloaded from https://drive.google.com/open?id=1XirkAamkQ3ltWvI1W1igYQFusp9DHtVl. © The Author(s) 2019. Published by Oxford University Press. All rights reserved.
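The unique k-mer idea that alignment-free classifiers rely on can be sketched in a few lines. This is a toy illustration under stated assumptions, not the MSC algorithm: the helper names (`kmers`, `unique_kmers`, `classify`), the tiny reference strings, and the "most unique k-mer hits wins" rule are all hypothetical, and MSC itself aligns reads to model sequences built from such k-mers rather than counting hits.

```python
from collections import defaultdict


def kmers(seq, k):
    """Set of all length-k substrings of seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def unique_kmers(references, k):
    """Map each reference name to the k-mers found in that reference only."""
    owners = defaultdict(set)
    for name, seq in references.items():
        for kmer in kmers(seq, k):
            owners[kmer].add(name)
    unique = {name: set() for name in references}
    for kmer, names in owners.items():
        if len(names) == 1:
            unique[next(iter(names))].add(kmer)
    return unique


def classify(read, unique, k):
    """Return the reference sharing the most unique k-mers with the read,
    or None if the read contains no unique k-mer at all."""
    read_kmers = kmers(read, k)
    hits = {name: len(read_kmers & ks) for name, ks in unique.items()}
    best = max(hits, key=hits.get)
    return best if hits[best] > 0 else None


# Hypothetical toy references; real genomes are millions of bases long.
refs = {"a": "ACGTACGTAA", "b": "ACGTTTTTGG"}
index = unique_kmers(refs, k=4)
print(classify("TTTTG", index, k=4))
print(classify("ACGT", index, k=4))
```

The trade-off noted in the abstract shows up directly here: a small `k` leaves few k-mers unique to any one reference, while a large `k` increases the size of the index that must be held in memory.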


April 21, 2020

Metagenomic assembly through the lens of validation: recent advances in assessing and improving the quality of genomes assembled from metagenomes.

Metagenomic samples are snapshots of complex ecosystems at work. They comprise hundreds of known and unknown species, contain multiple strain variants and vary greatly within and across environments. Many microbes found in microbial communities are not easily grown in culture making their DNA sequence our only clue into their evolutionary history and biological function. Metagenomic assembly is a computational process aimed at reconstructing genes and genomes from metagenomic mixtures. Current methods have made significant strides in reconstructing DNA segments comprising operons, tandem gene arrays and syntenic blocks. Shorter, higher-throughput sequencing technologies have become the de facto standard in the field. Sequencers are now able to generate billions of short reads in only a few days. Multiple metagenomic assembly strategies, pipelines and assemblers have appeared in recent years. Owing to the inherent complexity of metagenome assembly, regardless of the assembly algorithm and sequencing method, metagenome assemblies contain errors. Recent developments in assembly validation tools have played a pivotal role in improving metagenomics assemblers. Here, we survey recent progress in the field of metagenomic assembly, provide an overview of key approaches for genomic and metagenomic assembly validation and demonstrate the insights that can be derived from assemblies through the use of assembly validation strategies. We also discuss the potential for impact of long-read technologies in metagenomics. We conclude with a discussion of future challenges and opportunities in the field of metagenomic assembly and validation. © The Author 2017. Published by Oxford University Press.


April 21, 2020

Differences in resource use lead to coexistence of seed-transmitted microbial populations.

Seeds are involved in the vertical transmission of microorganisms in plants and act as reservoirs for the plant microbiome. They could serve as carriers of pathogens, making the study of microbial interactions on seeds important in the emergence of plant diseases. We studied the influence of biological disturbances caused by seed transmission of two phytopathogenic agents, Alternaria brassicicola Abra43 (Abra43) and Xanthomonas campestris pv. campestris 8004 (Xcc8004), on the structure and function of radish seed microbial assemblages, as well as the nutritional overlap between Xcc8004 and the seed microbiome, to find seed microbial residents capable of outcompeting this pathogen. According to taxonomic and functional inference performed on metagenomics reads, no shift in structure and function of the seed microbiome was observed following Abra43 and Xcc8004 transmission. This lack of impact derives from a limited overlap in nutritional resources between Xcc8004 and the major bacterial populations of radish seeds. However, two native seed-associated bacterial strains belonging to Stenotrophomonas rhizophila displayed a high overlap with Xcc8004 regarding the use of resources; they might therefore limit its transmission. The strategy we used may serve as a foundation for the selection of seed indigenous bacterial strains that could limit seed transmission of pathogens.


April 21, 2020

Metatranscriptomic evidence for classical and RuBisCO-mediated CO2 reduction to methane facilitated by direct interspecies electron transfer in a methanogenic system.

In a staged anaerobic fluidized-bed ceramic membrane bioreactor, metagenomic and metatranscriptomic analyses were performed to decipher the microbial interactions on the granular activated carbon (GAC). Metagenome bins, representing the predominating microbes in the bioreactor: syntrophic propionate-oxidizing bacteria (SPOB), acetoclastic Methanothrix concilii, and exoelectrogenic Geobacter lovleyi, were successfully recovered for the reconstruction and analysis of metabolic pathways involved in the transformation of fatty acids to methane. In particular, SPOB degraded propionate into acetate, which was further converted into methane and CO2 by M. concilii via acetoclastic methanogenesis. Concurrently, G. lovleyi oxidized acetate into CO2, releasing electrons into the extracellular environment. By accepting these electrons through direct interspecies electron transfer (DIET), M. concilii was capable of performing CO2 reduction for further methane formation. Most notably, an alternative RuBisCO-mediated CO2 reduction (the reductive hexulose-phosphate (RHP) pathway) is transcriptionally active in M. concilii. This RHP pathway enables M. concilii dominance and energy gain by carbon fixation and methanogenesis, respectively, via a methyl-H4MPT intermediate, constituting the third methanogenesis route. The complete acetate reduction (2 mole methane formation/1 mole acetate consumption), which couples acetoclastic methanogenesis and two CO2 reduction pathways, is thermodynamically favorable even under very low substrate conditions (down to the 10⁻⁵ M level). Such tight interactions via both mediated and direct interspecies electron transfer (MIET and DIET), induced by the conductive GAC, promote the overall efficiency of bioenergy processes.


April 21, 2020

Strain-level metagenomic assignment and compositional estimation for long reads with MetaMaps.

Metagenomic sequence classification should be fast, accurate and information-rich. Emerging long-read sequencing technologies promise to improve the balance between these factors but most existing methods were designed for short reads. MetaMaps is a new method, specifically developed for long reads, capable of mapping a long-read metagenome to a comprehensive RefSeq database with >12,000 genomes in <16 GB of RAM on a laptop computer. Integrating approximate mapping with probabilistic scoring and EM-based estimation of sample composition, MetaMaps achieves >94% accuracy for species-level read assignment and r² > 0.97 for the estimation of sample composition on both simulated and real data when the sample genomes or close relatives are present in the classification database. To address novel species and genera, which are comparatively harder to predict, MetaMaps outputs mapping locations and qualities for all classified reads, enabling functional studies (e.g. gene presence/absence) and detection of incongruities between sample and reference genomes.
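EM-based estimation of sample composition can be sketched generically as follows. This is a textbook mixture-model EM under stated assumptions, not MetaMaps' actual model: the function name `em_composition`, the input format (one dict of hypothetical per-genome read likelihoods per read), and the fixed iteration count are all illustrative choices.

```python
def em_composition(read_likelihoods, iterations=50):
    """Estimate genome abundances from per-read likelihoods.

    read_likelihoods: list of dicts mapping genome name -> P(read | genome).
    Returns a dict mapping genome name -> estimated relative abundance.
    """
    genomes = sorted({g for read in read_likelihoods for g in read})
    abundance = {g: 1.0 / len(genomes) for g in genomes}
    for _ in range(iterations):
        # E-step: split each read across genomes in proportion to
        # current abundance times the read's likelihood under that genome.
        counts = {g: 0.0 for g in genomes}
        for read in read_likelihoods:
            weights = {g: abundance[g] * p for g, p in read.items()}
            total = sum(weights.values())
            if total == 0:
                continue  # read is uninformative under the current estimate
            for g, w in weights.items():
                counts[g] += w / total
        # M-step: abundances are the normalized expected read counts.
        assigned = sum(counts.values())
        if assigned == 0:
            break
        abundance = {g: counts[g] / assigned for g in genomes}
    return abundance


# Toy input: two reads map only to A, one only to B, one is ambiguous.
reads = [{"A": 1.0}, {"A": 1.0}, {"A": 0.5, "B": 0.5}, {"B": 1.0}]
print(em_composition(reads))
```

In this toy input the ambiguous read is shared according to the current abundance estimate, so the result settles near two thirds for A and one third for B rather than splitting the ambiguous read evenly.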

