Biogas reactors operating with protein-rich substrates have high methane potential and industrial value; however, they are highly susceptible to process failure because of the accumulation of ammonia. High ammonia levels cause a decline in acetate-utilizing methanogens and instead promote the conversion of acetate via a two-step mechanism involving syntrophic acetate oxidation (SAO) to H2 and CO2, followed by hydrogenotrophic methanogenesis. Despite the key role of syntrophic acetate-oxidizing bacteria (SAOB), only a few culturable representatives have been characterized. Here we show that the microbiome of a commercial, ammonia-tolerant biogas reactor harbors a deeply branched, uncultured phylotype (unFirm_1) accounting for approximately 5% of the 16S rRNA gene inventory and sharing 88% 16S rRNA gene identity with its closest characterized relative. Reconstructed genome and quantitative metaproteomic analyses imply unFirm_1’s metabolic dominance and SAO capabilities, whereby the key enzymes required for acetate oxidation are among the most highly detected in the reactor microbiome. While culturable SAOB were identified in genomic analyses of the reactor, their limited proteomic representation suggests that unFirm_1 plays an important role in channeling acetate toward methane. Notably, unFirm_1-like populations were found in other high-ammonia biogas installations, suggesting a broader importance for this novel clade of SAOB in anaerobic fermentations. IMPORTANCE The microbial production of methane or “biogas” is an attractive renewable energy technology that can recycle organic waste into biofuel. Biogas reactors operating with protein-rich substrates such as household municipal or agricultural wastes have significant industrial and societal value; however, they are highly unstable and frequently collapse due to the accumulation of ammonia.
We report the discovery of a novel uncultured phylotype (unFirm_1) that is highly detectable in metaproteomic data generated from an ammonia-tolerant commercial reactor. Importantly, unFirm_1 is proposed to perform a key metabolic step in biogas microbiomes, whereby it syntrophically oxidizes acetate to hydrogen and carbon dioxide, which methanogens then convert to methane. Only very few culturable syntrophic acetate-oxidizing bacteria have been described, and all were detected at low in situ levels compared to unFirm_1. Broader comparisons produced the hypothesis that unFirm_1 is a key mediator toward the successful long-term stable operation of biogas production using protein-rich substrates.
Quantitative metaproteomics highlight the metabolic contributions of uncultured phylotypes in a thermophilic anaerobic digester.
In this study, we used multiple meta-omic approaches to characterize the microbial community and the active metabolic pathways of a stable industrial biogas reactor with food waste as the dominant feedstock, operating at thermophilic temperatures (60°C) and elevated levels of free ammonia (367 mg/liter NH3-N). The microbial community was strongly dominated (76% of all 16S rRNA amplicon sequences) by populations closely related to the proteolytic bacterium Coprothermobacter proteolyticus. Multiple Coprothermobacter-affiliated strains were detected, introducing an additional level of complexity seldom explored in biogas studies. Genome reconstructions provided metabolic insight into the microbes that performed biomass deconstruction and fermentation, including the deeply branching phyla Dictyoglomi and Planctomycetes and the candidate phylum “Atribacteria.” These biomass degraders were complemented by a synergistic network of microorganisms that convert key fermentation intermediates (fatty acids) via syntrophic interactions with hydrogenotrophic methanogens to ultimately produce methane. Interpretation of the proteomics data also suggested activity of a Methanosaeta phylotype acclimatized to high ammonia levels. In particular, we report multiple novel phylotypes proposed as syntrophic acetate oxidizers, which also express the enzymes needed for both the Wood-Ljungdahl pathway and β-oxidation of fatty acids to acetyl coenzyme A. Such an arrangement differs from known syntrophic oxidizing bacteria and presents an interesting hypothesis for future studies. Collectively, these findings provide increased insight into active metabolic roles of uncultured phylotypes and present new synergistic relationships, both of which may contribute to the stability of the biogas reactor. Biogas production through anaerobic digestion of organic waste provides an attractive source of renewable energy and a sustainable waste management strategy.
A comprehensive understanding of the microbial community that drives anaerobic digesters is essential to ensure stable and efficient energy production. Here, we characterize the intricate microbial networks and metabolic pathways in a thermophilic biogas reactor. We discuss the impact of frequently encountered microbial populations as well as the metabolism of newly discovered phylotypes that seem to play distinct roles within key microbial stages of anaerobic digestion in this stable high-temperature system. In particular, we draft a metabolic scenario whereby multiple uncultured syntrophic acetate-oxidizing bacteria are capable of syntrophically oxidizing acetate as well as longer-chain fatty acids (via the β-oxidation and Wood-Ljungdahl pathways) to hydrogen and carbon dioxide, which methanogens subsequently convert to methane. Copyright © 2016 American Society for Microbiology.
Improved metagenome assemblies and taxonomic binning using long-read circular consensus sequence data.
DNA assembly is a core methodological step in metagenomic pipelines used to study the structure and function within microbial communities. Here we investigate the utility of Pacific Biosciences long, high-accuracy circular consensus sequencing (CCS) reads for metagenomic projects. We compared the application and performance of both PacBio CCS and Illumina HiSeq data with assembly and taxonomic binning algorithms using metagenomic samples representing a complex microbial community. Eight SMRT cells produced approximately 94 Mb of CCS reads from a biogas reactor microbiome sample that averaged 1319 nt in length and 99.7% accuracy. CCS data assembly generated a number of large contigs (greater than 1 kb) comparable to that assembled from a ~190x larger HiSeq dataset (~18 Gb) produced from the same sample (i.e., approximately 62% of total contigs). Hybrid assemblies using PacBio CCS and HiSeq contigs produced improvements in assembly statistics, including an increase in the average contig length and number of large contigs. The incorporation of CCS data produced significant enhancements in taxonomic binning and genome reconstruction of two dominant phylotypes, which assembled and binned poorly using HiSeq data alone. Collectively these results illustrate the value of PacBio CCS reads in certain metagenomics applications.
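As a rough check on the dataset sizes quoted in the abstract above, the following minimal sketch recomputes the HiSeq-to-CCS size ratio and the approximate number of CCS reads. It assumes that "Mb" and "Gb" mean 10^6 and 10^9 bases; the function and variable names are hypothetical and not taken from the study.

```python
# Back-of-the-envelope check of the dataset sizes quoted in the abstract.
# Assumption: "Mb" = 1e6 bases and "Gb" = 1e9 bases.

CCS_BASES = 94e6        # ~94 Mb of CCS reads (from the abstract)
HISEQ_BASES = 18e9      # ~18 Gb of HiSeq data (from the abstract)
CCS_MEAN_READ_NT = 1319 # mean CCS read length in nt (from the abstract)


def fold_difference(larger: float, smaller: float) -> float:
    """Return how many times larger one dataset is than the other."""
    return larger / smaller


def approx_read_count(total_bases: float, mean_read_length: float) -> float:
    """Estimate the number of reads from total bases and mean read length."""
    return total_bases / mean_read_length


hiseq_vs_ccs = fold_difference(HISEQ_BASES, CCS_BASES)
ccs_reads = approx_read_count(CCS_BASES, CCS_MEAN_READ_NT)

print(f"HiSeq dataset is ~{hiseq_vs_ccs:.0f}x larger than the CCS dataset")
print(f"Approximate number of CCS reads: {ccs_reads:,.0f}")
```

Under these assumptions the ratio comes out close to the "~190x" stated in the abstract. The read count is only an estimate, because the abstract gives rounded totals.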
The repeat structure of two paralogous genes, Yersinia ruckeri invasin (yrInv) and a “Y. ruckeri invasin-like molecule”, (yrIlm) sheds light on the evolution of adhesive capacities of a fish pathogen.
Inverse autotransporters comprise the recently identified type Ve secretion system and are exemplified by intimin from enterohaemorrhagic Escherichia coli and invasin from enteropathogenic Yersiniae. These proteins share a common domain architecture and promote bacterial adhesion to host cells. Here, we identified and characterized two putative inverse autotransporter genes in the fish pathogen Yersinia ruckeri NVH_3758, namely yrInv (for Y. ruckeri invasin) and yrIlm (for Y. ruckeri invasin-like molecule). When trying to clone the highly repetitive genes for structural and functional studies, we experienced problems in obtaining PCR products. PCR failures and the highly repetitive nature of inverse autotransporters prompted us to sequence the genome of Y. ruckeri NVH_3758 using PacBio sequencing, which produces some of the longest average read lengths available in the industry at this moment. According to our sequencing data, YrIlm is composed of 2603 amino acids (7812 bp) and has a molecular mass of 256.4 kDa. Based on the new genome information, we performed PCR analysis on four non-sequenced Y. ruckeri strains as well as the sequenced Y. ruckeri type strain. We found that the genes are variably present in the strains, and that the length of yrIlm, when present, also varies. In addition, the length of the gene product for all strains, including the type strain, was much longer than expected based on deposited sequences. The internal repeats of the yrInv gene product are highly diverged, but represent the same bacterial immunoglobulin-like domains as in yrIlm. Using qRT-PCR, we found that yrIlm and yrInv are differentially expressed under conditions relevant for pathogenesis. In addition, we compared the genomic context of both genes in the newly sequenced Y. ruckeri strain to all available PacBio-sequenced Y. ruckeri genomes, and found indications of recent events of horizontal gene transfer.
Taken together, this study demonstrates and highlights the power of Single Molecule Real-Time technology for sequencing highly repetitive proteins, and sheds light on the genetic events that gave rise to these highly repetitive genes in a commercially important fish pathogen. Copyright © 2017 Elsevier Inc. All rights reserved.
Development and validation of 58K SNP-array and high-density linkage map in Nile tilapia (O. niloticus).
Although tilapia is the second most important aquaculture species in the world, accounting for 7.4% of global production in 2015, tilapia aquaculture has lacked genomic tools such as SNP arrays and high-density linkage maps to improve selection accuracy and accelerate genetic progress. In this paper, we describe the development of a genotyping array containing more than 58,000 SNPs for Nile tilapia (Oreochromis niloticus). SNPs were identified from whole genome resequencing of 32 individuals from the commercial population of the Genomar strain, and were selected for the SNP-array based on polymorphic information content and physical distribution across the genome using the Orenil1.1 genome assembly as reference sequence. SNP-performance was evaluated by genotyping 4991 individuals, including 689 offspring belonging to 41 full-sib families, which revealed high-quality genotype data for 43,588 SNPs. A preliminary genetic linkage map was constructed using Lepmap2, which in turn was integrated with information from the O_niloticus_UMD1 genome assembly to produce an integrated physical and genetic linkage map comprising 40,186 SNPs distributed across 22 linkage groups (LGs). Around one-third of the LGs showed a different recombination rate between sexes, with the female map being longer than the male map by a factor of 1.2 (1632.9 versus 1359.6 cM, respectively), and most LGs displayed a sigmoid recombination profile. Finally, the sex-determining locus was mapped to position 40.53 cM on LG23, in the vicinity of the anti-Müllerian hormone (amh) gene. These new resources have the potential to greatly influence and improve the genetic gain when applying genomic selection and to overcome the difficulties of efficient selection for invasively measured traits in Nile tilapia.
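The map statistics reported in the abstract above can be reproduced with simple arithmetic. The sketch below is illustrative only: the function names are hypothetical, and it uses nothing beyond the figures quoted in the abstract (sex-specific map lengths and SNP counts).

```python
# Recompute two summary figures from the numbers quoted in the abstract.

FEMALE_MAP_CM = 1632.9   # female map length in cM (from the abstract)
MALE_MAP_CM = 1359.6     # male map length in cM (from the abstract)
HIGH_QUALITY_SNPS = 43_588  # SNPs with high-quality genotype data
MAPPED_SNPS = 40_186        # SNPs placed on the integrated map


def map_length_ratio(female_cm: float, male_cm: float) -> float:
    """Ratio of the female to the male genetic map length."""
    return female_cm / male_cm


def fraction_mapped(mapped: int, total: int) -> float:
    """Fraction of high-quality SNPs that ended up on the integrated map."""
    return mapped / total


ratio = map_length_ratio(FEMALE_MAP_CM, MALE_MAP_CM)
placed = fraction_mapped(MAPPED_SNPS, HIGH_QUALITY_SNPS)

print(f"Female/male map length ratio: {ratio:.2f}")
print(f"Share of high-quality SNPs on the integrated map: {placed:.1%}")
```

The ratio agrees with the factor of 1.2 given in the abstract. The share of mapped SNPs is not stated there and is derived here only from the two reported counts.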
A pan-genome is defined as the set of all unique gene families found in one or more strains of a prokaryotic species. Due to the extensive within-species diversity in the microbial world, the pan-genome is often many times larger than a single genome. Studies of pan-genomes have become popular due to the easy access to whole-genome sequence data for prokaryotes. A pan-genome study reveals species diversity and gene families that may be of special interest, e.g., because of their role in bacterial survival or their ability to discriminate strains. We present an R package for the study of prokaryotic pan-genomes. The R computing environment harbors endless possibilities with respect to statistical analyses and graphics. External free software is used for the heavy computations involved, and the R package provides functions for building a computational pipeline. We demonstrate parts of the package on a data set for the Gram-positive bacterium Enterococcus faecalis. The package is free to download and install from The Comprehensive R Archive Network.
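The pan-genome definition given in the abstract above can be illustrated with plain set operations. This is a minimal Python sketch and not the R package described in the abstract: the strain names and gene family labels are invented toy data, and the "core genome" (families shared by every strain) is a commonly used companion concept added here for contrast.

```python
# Toy illustration of a pan-genome as a union of gene-family sets.
# All strain names and gene-family identifiers below are hypothetical.
from typing import Dict, Set


def pan_genome(strains: Dict[str, Set[str]]) -> Set[str]:
    """All unique gene families found in one or more strains (set union)."""
    return set().union(*strains.values())


def core_genome(strains: Dict[str, Set[str]]) -> Set[str]:
    """Gene families present in every strain (set intersection)."""
    family_sets = list(strains.values())
    return set.intersection(*family_sets) if family_sets else set()


# Hypothetical gene-family content of three strains of one species.
toy_strains: Dict[str, Set[str]] = {
    "strain_A": {"fam1", "fam2", "fam3"},
    "strain_B": {"fam1", "fam2", "fam4"},
    "strain_C": {"fam1", "fam5"},
}

pan = pan_genome(toy_strains)
core = core_genome(toy_strains)

print(f"Pan-genome size:  {len(pan)} gene families")
print(f"Core genome size: {len(core)} gene families")
```

Even in this toy case the pan-genome (five families) is larger than any single strain's gene content, which mirrors the abstract's remark that a pan-genome is often many times larger than a single genome.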
Comprehensive molecular, genomic and phenotypic analysis of a major clone of Enterococcus faecalis MLST ST40.
Enterococcus faecalis is a multifaceted microorganism known to act as a beneficial intestinal commensal bacterium. It is also a dreaded nosocomial pathogen causing life-threatening infections in hospitalised patients. Isolates of a distinct MLST type ST40 represent the most frequent strain type of this species, distributed worldwide and originating from various sources (animal, human, environmental) and different conditions (colonisation/infection). Since enterococci are known to be highly recombinogenic, we set out to analyse the microevolution and niche adaptation of this highly distributed clonal type. We compared a set of 42 ST40 isolates by assessing key molecular determinants, performing whole genome sequencing (WGS) and a number of phenotypic assays including resistance profiling, formation of biofilm and utilisation of carbon sources. We generated the first circular closed reference genome of an E. faecalis isolate D32 of animal origin and compared it with the genomes of other reference strains. D32 was used as a template for detailed WGS comparisons of high-quality draft genomes of 14 ST40 isolates. Genomic and phylogenetic analyses suggest a high level of similarity regarding the core genome, also demonstrated by similar carbon utilisation patterns. Distribution of known and putative virulence-associated genes did not differentiate between ST40 strains from a commensal and clinical background or an animal or human source. Further analyses of mobile genetic elements (MGE) revealed genomic diversity owing to: (1) a modularly structured pathogenicity island; (2) a site-specifically integrated and previously unknown genomic island of 138 kb in two strains putatively involved in exopolysaccharide synthesis; and (3) isolate-specific plasmid and phage patterns. Moreover, we used different cell-biological and animal experiments to compare the isolate D32 with a closely related ST40 endocarditis isolate whose draft genome sequence was also generated.
D32 generally showed a greater capacity of adherence to human cell lines and an increased pathogenic potential in various animal models in combination with an even faster growth in vivo (not in vitro). Molecular, genomic and phenotypic analysis of representative isolates of a major clone of E. faecalis MLST ST40 revealed new insights into the microbiology of a commensal bacterium which can turn into a conditional pathogen.
Co-cultivation and transcriptome sequencing of two co-existing fish pathogens Moritella viscosa and Aliivibrio wodanis.
Aliivibrio wodanis and Moritella viscosa have often been isolated concurrently from fish with winter-ulcer disease. Little is known about the interaction between the two bacterial species and how the presence of one bacterial species affects the behaviour of the other. The impact on bacterial growth in co-culture was investigated in vitro, and the presence of A. wodanis had an inhibitory effect on M. viscosa. Further, we have sequenced the complete genomes of these two marine Gram-negative species, and have performed transcriptome analysis of the bacterial gene expression levels from in vivo samples. Using bacterial implants in the fish abdomen, we demonstrate that the presence of A. wodanis alters the gene expression levels of M. viscosa compared to when the bacteria are implanted separately. From expression profiling of the transcriptomes, it is evident that the presence of A. wodanis alters the global gene expression of M. viscosa. Co-cultivation studies showed that A. wodanis impedes the growth of M. viscosa, and that the inhibitory effect is not contact-dependent.
Genome sequences of Corynebacterium pseudotuberculosis strains 48252 (human, pneumonia), CS_10 (lab strain), Ft_2193/67 (goat, pus), and CCUG 27541.
Here we report the genome sequences of four Corynebacterium pseudotuberculosis strains. These include a strain isolated from a patient with C. pseudotuberculosis pneumonia (48252), a strain isolated from pus in a goat (Ft_2193/67), a laboratory strain originating from strain Ft_2193/67 (CS_10), and the draft genome of an equine reference strain, CCUG 27541. Copyright © 2014 Håvelsrud et al.
The first Atlantic cod (Gadus morhua) genome assembly published in 2011 was one of the early genome assemblies exclusively based on high-throughput 454 pyrosequencing. Since then, rapid advances in sequencing technologies have led to a multitude of assemblies generated for complex genomes, although many of these are of a fragmented nature with a significant fraction of bases in gaps. The development of long-read sequencing and improved software now enable the generation of more contiguous genome assemblies. By combining data from Illumina, 454 and the longer PacBio sequencing technologies, as well as integrating the results of multiple assembly programs, we have created a substantially improved version of the Atlantic cod genome assembly. The sequence contiguity of this assembly is increased fifty-fold and the proportion of gap-bases has been reduced fifteen-fold. Compared to other vertebrates, the assembly contains an unusually high density of tandem repeats (TRs). Indeed, retrospective analyses reveal that gaps in the first genome assembly were largely associated with these TRs. We show that 21% of the TRs across the assembly, 19% in the promoter regions and 12% in the coding sequences are heterozygous in the sequenced individual. The inclusion of PacBio reads combined with the use of multiple assembly programs drastically improved the Atlantic cod genome assembly by successfully resolving long TRs. The high frequency of heterozygous TRs within or in the vicinity of genes in the genome indicates a considerable standing genomic variation in Atlantic cod populations, which is likely of evolutionary importance.
Genome sequencing and population genomic analyses provide insights into the adaptive landscape of silver birch.
Silver birch (Betula pendula) is a pioneer boreal tree that can be induced to flower within 1 year. Its rapid life cycle, small (440-Mb) genome, and advanced germplasm resources make birch an attractive model for forest biotechnology. We assembled and chromosomally anchored the nuclear genome of an inbred B. pendula individual. Gene duplicates from the paleohexaploid event were enriched for transcriptional regulation, whereas tandem duplicates were overrepresented by environmental responses. Population resequencing of 80 individuals showed effective population size crashes at major points of climatic upheaval. Selective sweeps were enriched among polyploid duplicates encoding key developmental and physiological triggering functions, suggesting that local adaptation has tuned the timing of and cross-talk between fundamental plant processes. Variation around the tightly linked light response genes PHYC and FRS10 correlated with latitude, longitude, and temperature, and with precipitation for PHYC. Similar associations characterized the growth-promoting cytokinin response regulator ARR1, and the wood development genes KAK and MED5A.
The genus Pectobacterium, which belongs to the bacterial family Enterobacteriaceae, contains numerous species that cause soft rot diseases in a wide range of plants. The species Pectobacterium carotovorum is highly heterogeneous, indicating a need for re-evaluation and a better classification of the species. PacBio was used for sequencing of two soft-rot-causing bacterial strains (NIBIO1006T and NIBIO1392), initially identified as P. carotovorum strains by fatty acid analysis and sequencing of three housekeeping genes (dnaX, icdA and mdh). Their taxonomic relationship to other Pectobacterium species was determined and the distance from any described species within the genus Pectobacterium was less than 94% average nucleotide identity (ANI). Based on ANI, phylogenetic data and genome-to-genome distance, strains NIBIO1006T, NIBIO1392 and NCPPB3395 are suggested to represent a novel species of the genus Pectobacterium, for which the name Pectobacterium polaris sp. nov. is proposed. The type strain is NIBIO1006T (=DSM 105255T=NCPPB 4611T).
The whole-genome duplication 80 million years ago of the common ancestor of salmonids (salmonid-specific fourth vertebrate whole-genome duplication, Ss4R) provides unique opportunities to learn about the evolutionary fate of a duplicated vertebrate genome in 70 extant lineages. Here we present a high-quality genome assembly for Atlantic salmon (Salmo salar), and show that large genomic reorganizations, coinciding with bursts of transposon-mediated repeat expansions, were crucial for the post-Ss4R rediploidization process. Comparisons of duplicate gene expression patterns across a wide range of tissues with orthologous genes from a pre-Ss4R outgroup unexpectedly demonstrate far more instances of neofunctionalization than subfunctionalization. Surprisingly, we find that genes that were retained as duplicates after the teleost-specific whole-genome duplication 320 million years ago were not more likely to be retained after the Ss4R, and that the duplicate retention was not influenced to a great extent by the nature of the predicted protein interactions of the gene products. Finally, we demonstrate that the Atlantic salmon assembly can serve as a reference sequence for the study of other salmonids for a range of purposes.
Development of molecular markers linked to powdery mildew resistance gene Pm4b by combining SNP discovery from transcriptome sequencing data with bulked segregant analysis (BSR-Seq) in wheat.
Powdery mildew resistance gene Pm4b, originating from Triticum persicum, is effective against the prevalent Blumeria graminis f. sp. tritici (Bgt) isolates from certain regions of wheat production in China. The lack of molecular markers tightly linked to the target gene prevents the precise identification of Pm4b during the application of molecular marker-assisted selection (MAS). The strategy that combines the RNA-Seq technique and the bulked segregant analysis (BSR-Seq) was applied in an F2:3 mapping population (237 families) derived from a pair of isogenic lines VPM1/7*Bainong 3217 F4 (carrying Pm4b) and Bainong 3217 to develop more closely linked molecular markers. RNA-Seq analysis of the two phenotypically contrasting RNA bulks prepared from the representative F2:3 families generated 20,745,939 and 25,867,480 high-quality read pairs, and 82.8 and 80.2% of them were uniquely mapped to the wheat whole genome draft assembly for the resistant and susceptible RNA bulks, respectively. Variant calling identified 283,866 raw single nucleotide polymorphisms (SNPs) and InDels between the two bulks. The SNPs that were closely associated with the powdery mildew resistance were concentrated on chromosome 2AL. Among the 84 variants that were potentially associated with the disease resistance trait, 46 variants were enriched in an approximately 25 Mb region at the distal end of chromosome arm 2AL. Four Pm4b-linked SNP markers were developed from these variants. Based on the sequences of Chinese Spring where these polymorphic SNPs were located, 98 SSR primer pairs were designed to develop distal markers flanking the Pm4b gene. Three SSR markers, Xics13, Xics43, and Xics76, were incorporated in the new genetic linkage map, which located Pm4b in a 3.0 cM genetic interval spanning a 6.7 Mb physical genomic region. This region had a collinear relationship with Brachypodium distachyon chromosome 5, rice chromosome 4, and sorghum chromosome 6.
Seven genes associated with disease resistance were predicted in this collinear genomic region, which included C2 domain protein, peroxidase activity protein, protein kinases of PKc_like super family, Mlo family protein, and catalytic domain of the serine/threonine kinases (STKc_IRAK like super family). The markers developed in the present study facilitate identification of Pm4b during MAS.