Productivity of ruminant livestock depends on the rumen microbiota, which ferment indigestible plant polysaccharides into nutrients used for growth. Understanding the functions carried out by the rumen microbiota is important for reducing greenhouse gas production by ruminants and for developing biofuels from lignocellulose. We present 410 cultured bacteria and archaea, together with their reference genomes, representing every cultivated rumen-associated archaeal and bacterial family. We evaluate polysaccharide degradation, short-chain fatty acid production and methanogenesis pathways, and assign specific taxa to functions. A total of 336 organisms were present in available rumen metagenomic data sets, and 134 were present in human gut microbiome data sets. Comparison with the human microbiome revealed rumen-specific enrichment for genes encoding de novo synthesis of vitamin B12, ongoing evolution by gene loss and potential vertical inheritance of the rumen microbiome based on underrepresentation of markers of environmental stress. We estimate that our Hungate genome resource represents ~75% of the genus-level bacterial and archaeal taxa present in the rumen.
RNA-seq has revolutionised how scientists can interrogate gene expression. But after years of performing RNA-seq studies with short-read sequencers, many have realised that there is more to be discovered.
Generating sequence data of a defined community composed of organisms with complete reference genomes is indispensable for the benchmarking of new genome sequence analysis methods, including assembly and binning tools. Moreover, the validation of new sequencing library protocols and platforms to assess critical components such as sequencing errors and biases relies on such datasets. We here report the next-generation metagenomic sequence data of a defined mock community (Mock Bacteria ARchaea Community; MBARC-26), composed of 23 bacterial and 3 archaeal strains with finished genomes. These strains span 10 phyla and 14 classes, cover a range of GC contents, genome sizes and repeat content, and encompass a diverse abundance profile. Short-read Illumina and long-read PacBio SMRT sequences of this mock community are described. These data represent a valuable resource for the scientific community, enabling extensive benchmarking and comparative evaluation of bioinformatics tools without the need to simulate data. As such, these data can aid in improving our current sequence data analysis toolkit and spur interest in the development of new tools.
PCR and omics based techniques to study the diversity, ecology and biology of anaerobic fungi: Insights, challenges and opportunities.
Anaerobic fungi (phylum Neocallimastigomycota) are common inhabitants of the digestive tract of mammalian herbivores, and in the rumen, can account for up to 20% of the microbial biomass. Anaerobic fungi play a primary role in the degradation of lignocellulosic plant material. They also have a syntrophic interaction with methanogenic archaea, which increases their fiber degradation activity. To date, nine anaerobic fungal genera have been described, with further novel taxonomic groupings known to exist based on culture-independent molecular surveys. However, the true extent of their diversity may be underestimated even more severely, as anaerobic fungi continue to be discovered in as yet unexplored gut and non-gut environments. Additionally, many studies are now known to have used primers that provide incomplete coverage of the Neocallimastigomycota. For ecological studies the internal transcribed spacer 1 region (ITS1) has been the taxonomic marker of choice, but due to various limitations the large subunit rRNA (LSU) is now being increasingly used. How the continued expansion of our knowledge regarding anaerobic fungal diversity will impact on our understanding of their biology and ecological role remains unclear, particularly as it is becoming apparent that anaerobic fungi display niche differentiation. As a consequence, there is a need to move beyond the broad generalization of anaerobic fungi as fiber-degraders, and explore the fundamental differences that underpin their ability to exist in distinct ecological niches. Application of genomics, transcriptomics, proteomics and metabolomics to their study in pure/mixed cultures and environmental samples will be invaluable in this process. To date the genomes and transcriptomes of several characterized anaerobic fungal isolates have been successfully generated. In contrast, the application of proteomics and metabolomics to anaerobic fungal analysis is still in its infancy. A central problem for all analyses, however, is the limited functional annotation of anaerobic fungal sequence data. There is therefore an urgent need to expand information held within publicly available reference databases. Once this challenge is overcome, along with improved sample collection and extraction, the application of these techniques will be key in furthering our understanding of the ecological role and impact of anaerobic fungi in the wide range of environments they inhabit.
Microbial toluene biosynthesis was reported in anoxic lake sediments more than three decades ago, but the enzyme catalyzing this biochemically challenging reaction has never been identified. Here we report the toluene-producing enzyme PhdB, a glycyl radical enzyme of bacterial origin that catalyzes phenylacetate decarboxylation, and its cognate activating enzyme PhdA, a radical S-adenosylmethionine enzyme, discovered in two distinct anoxic microbial communities that produce toluene. The unconventional process of enzyme discovery from a complex microbial community (>300,000 genes), rather than from a microbial isolate, involved metagenomics- and metaproteomics-enabled biochemistry, as well as in vitro confirmation of activity with recombinant enzymes. This work expands the known catalytic range of glycyl radical enzymes (only seven reaction types had been characterized previously) and aromatic-hydrocarbon-producing enzymes, and will enable first-time biochemical synthesis of an aromatic fuel hydrocarbon from renewable resources, such as lignocellulosic biomass, rather than from petroleum.
Switchgrass (Panicum virgatum L.) is an important bioenergy crop widely used for lignocellulosic research. While extensive transcriptomic analyses have been conducted on this species using short read-based sequencing techniques, very little has been reliably derived regarding alternatively spliced (AS) transcripts. We present an analysis of transcriptomes of six switchgrass tissue types pooled together, sequenced using Pacific Biosciences (PacBio) single-molecule long-read technology. Our analysis identified 105,419 unique transcripts covering 43,570 known genes and 8795 previously unknown genes. Of these transcripts, 45,168 are novel transcripts of known genes. A total of 60,096 AS transcripts are identified, 45,628 being novel. We have also predicted 1549 transcripts of genes involved in cell wall construction and remodeling, 639 being novel transcripts of known cell wall genes. Most of the predicted transcripts are validated against Illumina-based short reads. Specifically, 96% of the splice junction sites in all the unique transcripts are validated by at least five Illumina reads. Comparisons between genes derived from our identified transcripts and the current genome annotation revealed that among the gene set predicted by both analyses, 16,640 have different exon-intron structures. Overall, a substantial amount of new information is derived from the PacBio RNA data regarding both the transcriptome and the genome of switchgrass.
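The validation criterion described above (a splice junction counts as validated when at least five Illumina reads support it) can be illustrated with a short Python sketch. This is not the authors' pipeline: the function name `validated_fraction`, the junction key layout and the toy read counts are all hypothetical, chosen only to show how such a fraction could be computed.

```python
from typing import Dict, Tuple

# A junction is keyed here by (chromosome, donor position, acceptor position).
# This layout is an assumption made for illustration only.
Junction = Tuple[str, int, int]


def validated_fraction(support: Dict[Junction, int], min_reads: int = 5) -> float:
    """Return the fraction of junctions supported by at least `min_reads` short reads."""
    if not support:
        return 0.0
    validated = sum(1 for count in support.values() if count >= min_reads)
    return validated / len(support)


# Toy short-read support counts (hypothetical data, not from the study).
toy_support = {
    ("chr1", 1000, 1500): 12,
    ("chr1", 2000, 2600): 5,
    ("chr2", 300, 900): 4,
    ("chr2", 1200, 1800): 0,
}

fraction = validated_fraction(toy_support)
print(f"{fraction:.0%} of junctions validated")
```

With the toy counts, two of the four junctions reach the five-read threshold. Making the threshold a parameter makes it easy to see how sensitive the validated fraction is to the chosen cutoff.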
Capturing single cell genomes of active polysaccharide degraders: an unexpected contribution of Verrucomicrobia.
Microbial hydrolysis of polysaccharides is critical to ecosystem functioning and is of great interest in diverse biotechnological applications, such as biofuel production and bioremediation. Here we demonstrate the use of a new, efficient approach to recover genomes of active polysaccharide degraders from natural, complex microbial assemblages, using a combination of fluorescently labeled substrates, fluorescence-activated cell sorting, and single cell genomics. We employed this approach to analyze freshwater and coastal bacterioplankton for degraders of laminarin and xylan, two of the most abundant storage and structural polysaccharides in nature. Our results suggest that a few phylotypes of Verrucomicrobia make a considerable contribution to polysaccharide degradation, although they constituted only a minor fraction of the total microbial community. Genomic sequencing of five cells, representing the most predominant, polysaccharide-active Verrucomicrobia phylotype, revealed significant enrichment in genes encoding a wide spectrum of glycoside hydrolases, sulfatases, peptidases, carbohydrate lyases and esterases, confirming that these organisms were well equipped for the hydrolysis of diverse polysaccharides. Remarkably, this enrichment was on average higher than in the sequenced representatives of Bacteroidetes, which are frequently regarded as highly efficient biopolymer degraders. These findings shed light on the ecological roles of uncultured Verrucomicrobia and suggest specific taxa as promising bioprospecting targets. The employed method offers a powerful tool to rapidly identify and recover discrete genomes of active players in polysaccharide degradation, without the need for cultivation.
Over the past decade, high-throughput short-read 16S rRNA gene amplicon sequencing has eclipsed clone-dependent long-read Sanger sequencing for microbial community profiling. The transition to new technologies has provided more quantitative information at the expense of taxonomic resolution with implications for inferring metabolic traits in various ecosystems. We applied single-molecule real-time sequencing for microbial community profiling, generating full-length 16S rRNA gene sequences at high throughput, which we propose to name PhyloTags. We benchmarked and validated this approach using a defined microbial community. When further applied to samples from the water column of meromictic Sakinaw Lake, we show that while community structures at the phylum level are comparable between PhyloTags and Illumina V4 16S rRNA gene sequences (iTags), variance increases with community complexity at greater water depths. PhyloTags moreover allowed less ambiguous classification. Last, a platform-independent comparison of PhyloTags and in silico generated partial 16S rRNA gene sequences demonstrated significant differences in community structure and phylogenetic resolution across multiple taxonomic levels, including a severe underestimation in the abundance of specific microbial genera involved in nitrogen and methane cycling across the Lake’s water column. Thus, PhyloTags provide a reliable adjunct or alternative to cost-effective iTags, enabling more accurate phylogenetic resolution of microbial communities and predictions on their metabolic potential.
Comparative genomic analysis of Sulfurospirillum cavolei MES reconstructed from the metagenome of an electrosynthetic microbiome.
Sulfurospirillum spp. play an important role in sulfur and nitrogen cycling, and possess metabolic versatility that enables reduction of a wide range of electron acceptors, including thiosulfate, tetrathionate, polysulfide, nitrate, and nitrite. Here we describe the assembly of a Sulfurospirillum genome obtained from the metagenome of an electrosynthetic microbiome. The ubiquity and persistence of this organism in microbial electrosynthesis systems suggest it plays an important role in reactor stability and performance. Understanding why this organism is present and elucidating its genetic repertoire provide a genomic and ecological foundation for future studies where Sulfurospirillum are found, especially in electrode-associated communities. Metabolic comparisons and in-depth analysis of unique genes revealed potential ecological niche-specific capabilities within the Sulfurospirillum genus. The functional similarities common to all genomes, i.e., the core genome, and unique gene clusters found only in a single genome were identified. Based upon 16S rRNA gene phylogenetic analysis and average nucleotide identity, the Sulfurospirillum draft genome was found to be most closely related to Sulfurospirillum cavolei. Characterization of the draft genome described herein provides pathway-specific details of the metabolic significance of the newly described Sulfurospirillum cavolei MES and, importantly, yields insight into the ecology of the genus as a whole. Comparison of eleven sequenced Sulfurospirillum genomes revealed a total of 6246 gene clusters in the pan-genome. Of the total gene clusters, 18.5% were shared among all eleven genomes and 50% were unique to a single genome. While most Sulfurospirillum spp. reduce nitrate to ammonium, five of the eleven Sulfurospirillum strains encode a nitrous oxide reductase (nos) cluster with an atypical nitrous-oxide reductase, suggesting a utility for this genus in the reduction of nitrous oxide and as a potential sink for this potent greenhouse gas.
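The core and unique fractions reported above follow from counting, for each gene cluster, how many genomes contain it. The Python sketch below illustrates that bookkeeping under stated assumptions: the function name `classify_clusters`, the genome labels and the cluster identifiers are hypothetical, and how the clusters themselves were built in the study is not addressed here.

```python
from collections import Counter
from typing import Dict, Set


def classify_clusters(genome_clusters: Dict[str, Set[str]]) -> Dict[str, Set[str]]:
    """Split pan-genome clusters into core, unique and accessory sets.

    core      = clusters present in every genome
    unique    = clusters present in exactly one genome
    accessory = everything in between
    """
    n_genomes = len(genome_clusters)
    # Count the number of genomes in which each cluster occurs.
    occurrences = Counter(
        cluster for clusters in genome_clusters.values() for cluster in clusters
    )
    pan = set(occurrences)
    core = {c for c, n in occurrences.items() if n == n_genomes}
    unique = {c for c, n in occurrences.items() if n == 1}
    return {"pan": pan, "core": core, "unique": unique, "accessory": pan - core - unique}


# Toy presence data for three genomes (hypothetical, not from the study).
toy = {
    "genome_A": {"c1", "c2", "c3"},
    "genome_B": {"c1", "c2", "c4"},
    "genome_C": {"c1", "c5"},
}

result = classify_clusters(toy)
core_pct = 100 * len(result["core"]) / len(result["pan"])
unique_pct = 100 * len(result["unique"]) / len(result["pan"])
print(f"pan={len(result['pan'])} core={core_pct:.1f}% unique={unique_pct:.1f}%")
```

Applied to the figures in the abstract, 18.5% and 50% of 6246 clusters correspond to approximately 1156 core clusters and 3123 single-genome clusters; these counts are back-calculated from rounded percentages and are therefore approximate.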
Agaves are succulent monocotyledonous plants native to xeric environments of North America. Because of their adaptations to their environment, including crassulacean acid metabolism (CAM, a water-efficient form of photosynthesis), and existing technologies for ethanol production, agaves have gained attention both as potential lignocellulosic bioenergy feedstocks and models for exploring plant responses to abiotic stress. However, the lack of comprehensive Agave sequence datasets limits the scope of investigations into the molecular-genetic basis of Agave traits. Here, we present comprehensive, high quality de novo transcriptome assemblies of two Agave species, A. tequilana and A. deserti, built from short-read RNA-seq data. Our analyses support completeness and accuracy of the de novo transcriptome assemblies, with each species having a minimum of approximately 35,000 protein-coding genes. Comparison of agave proteomes to those of additional plant species identifies biological functions of gene families displaying sequence divergence in agave species. Additionally, a focus on the transcriptomics of the A. deserti juvenile leaf confirms evolutionary conservation of monocotyledonous leaf physiology and development along the proximal-distal axis. Our work presents a comprehensive transcriptome resource for two Agave species and provides insight into their biology and physiology. These resources are a foundation for further investigation of agave biology and their improvement for bioenergy development.
Bacteria play many important roles in animal digestive systems, including the provision of enzymes critical to digestion. Typically, complex communities of bacteria reside in the gut lumen in direct contact with the ingested materials they help to digest. Here, we demonstrate a previously undescribed digestive strategy in the wood-eating marine bivalve Bankia setacea, wherein digestive bacteria are housed in a location remote from the gut. These bivalves, commonly known as shipworms, lack a resident microbiota in the gut compartment where wood is digested but harbor endosymbiotic bacteria within specialized cells in their gills. We show that this comparatively simple bacterial community produces wood-degrading enzymes that are selectively translocated from gill to gut. These enzymes, which include just a small subset of the predicted wood-degrading enzymes encoded in the endosymbiont genomes, accumulate in the gut to the near exclusion of other endosymbiont-made proteins. This strategy of remote enzyme production provides the shipworm with a mechanism to capture liberated sugars from wood without competition from an endogenous gut microbiota. Because only those proteins required for wood digestion are translocated to the gut, this newly described system reveals which of many possible enzymes and enzyme combinations are minimally required for wood degradation. Thus, although it has historically had negative impacts on human welfare, the shipworm digestive process now has the potential to have a positive impact on industries that convert wood and other plant biomass to renewable fuels, fine chemicals, food, feeds, textiles, and paper products.
Metagenomic binning of a marine sponge microbiome reveals unity in defense but metabolic specialization.
Marine sponges are ancient metazoans that are populated by distinct and highly diverse microbial communities. In order to obtain deeper insights into the functional gene repertoire of the Mediterranean sponge Aplysina aerophoba, we combined Illumina short-read and PacBio long-read sequencing followed by un-targeted metagenomic binning. We identified a total of 37 high-quality bins representing 11 bacterial phyla and two candidate phyla. Statistical comparison of symbiont genomes with selected reference genomes revealed a significant enrichment of genes related to bacterial defense (restriction-modification systems, toxin-antitoxin systems) as well as genes involved in host colonization and extracellular matrix utilization in sponge symbionts. A within-symbionts genome comparison revealed a nutritional specialization of at least two symbiont guilds, where one appears to metabolize carnitine and the other sulfated polysaccharides, both of which are abundant molecules in the sponge extracellular matrix. A third guild of symbionts may be viewed as nutritional generalists that perform largely the same metabolic pathways but lack such extraordinary numbers of the relevant genes. This study characterizes the genomic repertoire of sponge symbionts at an unprecedented resolution and it provides greater insights into the molecular mechanisms underlying microbial-sponge symbiosis.
Novel molecules lncRNAs, tRFs and circRNAs deciphered from next-generation sequencing/RNA sequencing: computational databases and tools.
Powerful next-generation sequencing (NGS) technologies, more specifically RNA sequencing (RNA-seq), have been pivotal to the detection and analysis of, and the generation of hypotheses about, novel biomolecules: long noncoding RNAs (lncRNAs), tRNA-derived fragments (tRFs) and circular RNAs (circRNAs). Experimental validation of the occurrence of these biomolecules inside the cell has been reported. Their differential expression and functionally important roles in several cancer types, as well as in other diseases such as Alzheimer's and cardiovascular diseases, have prompted interest in further studies in this research area. In this review, starting from a brief introduction to NGS and RNA-seq and to the expression and role of lncRNAs, tRFs and circRNAs in cancer, we comprehensively analyze the current landscape of databases and computational software used for analysis and visualization of these novel biomolecules. Our review will help end users and research investigators gain information on the existing databases and tools, as well as an understanding of the specific features these offer. This will help researchers use the tools properly, guiding them toward novel hypothesis generation and saving the time and costs involved in extensive experimental work on these three novel functional RNAs. © The Author 2017. Published by Oxford University Press.
Comparative genomics and transcriptomics depict ericoid mycorrhizal fungi as versatile saprotrophs and plant mutualists.
Some soil fungi in the Leotiomycetes form ericoid mycorrhizal (ERM) symbioses with Ericaceae. In the harsh habitats in which they occur, ERM plant survival relies on nutrient mobilization from soil organic matter (SOM) by their fungal partners. The characterization of the fungal genetic machinery underpinning both the symbiotic lifestyle and SOM degradation is needed to understand ERM symbiosis functioning and evolution, and its impact on soil carbon (C) turnover. We sequenced the genomes of the ERM fungi Meliniomyces bicolor, M. variabilis, Oidiodendron maius and Rhizoscyphus ericae, and compared their gene repertoires with those of fungi with different lifestyles (ecto- and orchid mycorrhiza, endophytes, saprotrophs, pathogens). We also identified fungal transcripts induced in symbiosis. The ERM fungal gene contents for polysaccharide-degrading enzymes, lipases, proteases and enzymes involved in secondary metabolism are closer to those of saprotrophs and pathogens than to those of ectomycorrhizal symbionts. The fungal genes most highly upregulated in symbiosis are those coding for fungal and plant cell wall-degrading enzymes (CWDEs), lipases, proteases, transporters and mycorrhiza-induced small secreted proteins (MiSSPs). The ERM fungal gene repertoire reveals a capacity for a dual saprotrophic and biotrophic lifestyle. This may reflect an incomplete transition from saprotrophy to the mycorrhizal habit, or a versatile life strategy similar to fungal endophytes. © 2018 The Authors. New Phytologist © 2018 New Phytologist Trust.
Rhizobium leguminosarum bv. viciae is a soil α-proteobacterium that establishes a diazotrophic symbiosis with different legumes of the Fabeae tribe. The number of genome sequences from rhizobial strains available in public databases is constantly increasing, although complete, fully annotated genome structures from rhizobial genomes are scarce. In this work, we report and analyse the complete genome of R. leguminosarum bv. viciae UPM791. Whole genome sequencing can provide new insights into the genetic features contributing to symbiotically relevant processes such as bacterial adaptation to the rhizosphere, mechanisms for efficient competition with other bacteria, and the ability to establish a complex signalling dialogue with legumes, to enter the root without triggering plant defenses, and, ultimately, to fix nitrogen within the host. Comparison of the complete genome sequences of two strains of R. leguminosarum bv. viciae, 3841 and UPM791, highlights the existence of different symbiotic plasmids and a common core chromosome. Specific genomic traits, such as plasmid content or a distinctive regulation, define differential physiological capabilities of these endosymbionts. Among them, strain UPM791 presents unique adaptations for recycling the hydrogen generated in the nitrogen fixation process.