September 22, 2019

100K Pathogen Genome Project.

The 100K Pathogen Genome Project is producing draft and closed genome sequences from diverse pathogens. This project expanded globally to include a snapshot of global bacterial genome diversity. The genomes form a sequence database that has a variety of uses from systematics to public health. Copyright © 2017 Weimer.


September 22, 2019

Single-cell (meta-)genomics of a dimorphic Candidatus Thiomargarita nelsonii reveals genomic plasticity.

The genus Thiomargarita includes the world’s largest bacteria. However, because they remain uncultured, their physiology, metabolism, and the basis for their gigantism are not well understood. Thus, a genomics approach, applied to a single Candidatus Thiomargarita nelsonii cell, was employed to explore the genetic potential of one of these enigmatic giant bacteria. The Thiomargarita cell was obtained from an assemblage of budding Ca. T. nelsonii attached to a provannid gastropod shell from Hydrate Ridge, a methane seep offshore of Oregon, USA. Here we present a manually curated genome of Bud S10 resulting from a hybrid assembly of long Pacific Biosciences and short Illumina sequencing reads. With respect to inorganic carbon fixation and sulfur oxidation pathways, the Ca. T. nelsonii Hydrate Ridge Bud S10 genome was similar to marine sister taxa within the family Beggiatoaceae. However, the Bud S10 genome contains genes suggestive of the genetic potential for lithotrophic growth on arsenite and perhaps hydrogen. The genome also revealed that Bud S10 likely respires nitrate via two pathways: a complete denitrification pathway and a dissimilatory nitrate reduction to ammonia pathway. Both pathways have been predicted, but not previously fully elucidated, in the genomes of other large, vacuolated, sulfur-oxidizing bacteria. Surprisingly, the genome also had a high number of features that are unusual for a bacterium, including the largest number of metacaspases and introns ever reported in a bacterium. Also present are a large number of other mobile genetic elements, such as insertion sequence (IS) transposable elements and miniature inverted-repeat transposable elements (MITEs). In some cases, mobile genetic elements disrupted key genes in metabolic pathways. For example, a MITE interrupts hupL, which encodes the large subunit of the hydrogenase in hydrogen oxidation. Moreover, we detected a group I intron in one of the most critical genes in the sulfur oxidation pathway, dsrA. The dsrA group I intron also carried a MITE sequence that, like the hupL MITE family, occurs broadly across the genome. The presence of a high degree of mobile elements in genes central to Thiomargarita’s core metabolism has not been previously reported in free-living bacteria and suggests a highly mutable genome.


September 22, 2019

Cultivation and sequencing of rumen microbiome members from the Hungate1000 Collection.

Productivity of ruminant livestock depends on the rumen microbiota, which ferment indigestible plant polysaccharides into nutrients used for growth. Understanding the functions carried out by the rumen microbiota is important for reducing greenhouse gas production by ruminants and for developing biofuels from lignocellulose. We present 410 cultured bacteria and archaea, together with their reference genomes, representing every cultivated rumen-associated archaeal and bacterial family. We evaluate polysaccharide degradation, short-chain fatty acid production and methanogenesis pathways, and assign specific taxa to functions. A total of 336 organisms were present in available rumen metagenomic data sets, and 134 were present in human gut microbiome data sets. Comparison with the human microbiome revealed rumen-specific enrichment for genes encoding de novo synthesis of vitamin B12, ongoing evolution by gene loss and potential vertical inheritance of the rumen microbiome based on underrepresentation of markers of environmental stress. We estimate that our Hungate genome resource represents ~75% of the genus-level bacterial and archaeal taxa present in the rumen.


September 22, 2019

MCF-7 breast cancer cell line PacBio generated transcriptome has ~300 novel transcribed regions, un-annotated in both RefSeq and GENCODE, and absent in the liver, heart and brain transcriptomes

Illuminating the “dark” regions of the human genome remains an ongoing effort a decade and a half after the human genome was sequenced, with RefSeq and GENCODE being two of the major annotation databases. Pacific Biosciences (PacBio) has provided open access to the transcriptome of MCF-7, a breast cancer cell line that has contributed significantly to therapeutic advances in breast cancer research since the 1970s. PacBio sequencing generates much longer reads than second-generation sequencing technologies, with the trade-offs of lower throughput, a higher error rate and a higher cost per base. Here, this transcriptome was analyzed using the YeATS pipeline, with newly introduced k-mer-based algorithms that reduce computation time to a few hours on a simple workstation. Out of ~300 transcripts that have no match in either RefSeq or GENCODE, ~250 are absent in the transcriptomes of the heart, liver and brain, also provided by PacBio. Also, ~200 transcripts are absent in a recent catalogue of un-annotated long non-coding RNAs from 6,503 samples (~43 Terabases of sequence data) [1], and only two are shared with RACE-Seq, an experimental workflow that reported 2,556 novel transcripts [2]. ~100 transcripts have open reading frames of >100 amino acids and have the potential to be protein-coding genes. ORF-based annotation also identified a few bacterial transcripts in the PacBio database that were mapped to the human genome, and one human transcript that has been annotated as bacterial in the NCBI database. The current work reiterates the under-utilization of transcriptomes for annotating genomes. It also provides new leads for investigating breast cancer by virtue of transcripts that are absent from the other tissues examined and that, subject to further investigation, are prospective breast cancer biomarkers.
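
The abstract does not describe the k-mer-based algorithms added to YeATS, so the following is only a minimal sketch of the general idea, under stated assumptions: a transcript is flagged as a candidate novel region when few of its k-mers occur in an index built from annotated reference sequences. The function names, the value of k and the threshold are hypothetical choices for illustration and are not taken from the pipeline.

```python
def kmers(seq, k):
    """Return the set of all length-k substrings of a sequence."""
    seq = seq.upper()
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def build_index(references, k):
    """Collect every k-mer seen in any reference sequence."""
    index = set()
    for ref in references:
        index |= kmers(ref, k)
    return index


def shared_kmer_fraction(query, index, k):
    """Fraction of the query's distinct k-mers that occur in the index."""
    query_kmers = kmers(query, k)
    if not query_kmers:  # query shorter than k
        return 0.0
    return len(query_kmers & index) / len(query_kmers)


def candidate_novel(transcripts, references, k=4, max_shared=0.1):
    """Names of transcripts sharing at most max_shared of their k-mers
    with the references (threshold is an illustrative assumption)."""
    index = build_index(references, k)
    return [
        name
        for name, seq in transcripts.items()
        if shared_kmer_fraction(seq, index, k) <= max_shared
    ]


# Toy data, not real sequences.
references = ["ACGTACGTAC"]
transcripts = {"known": "CGTACGTA", "novel": "TTTTGGGGCC"}
print(candidate_novel(transcripts, references))
```

Set lookups make each k-mer test constant time, which is one plausible reason a k-mer screen can run faster than full alignment; real transcripts would need a much larger k and some tolerance for sequencing errors.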


September 22, 2019

Long-term changes of bacterial and viral compositions in the intestine of a recovered Clostridium difficile patient after fecal microbiota transplantation

Fecal microbiota transplantation (FMT) is an effective treatment for recurrent Clostridium difficile infections (RCDIs). However, long-term effects on the patients’ gut microbiota and the role of viruses remain to be elucidated. Here, we characterized bacterial and viral microbiota in the feces of a cured RCDI patient at various time points until 4.5 yr post-FMT compared with the stool donor. Feces were subjected to DNA sequencing to characterize bacteria and double-stranded DNA (dsDNA) viruses including phages. The patient’s microbial communities varied over time and showed little overall similarity to the donor until 7 mo post-FMT, indicating ongoing gut microbiota adaption in this time period. After 4.5 yr, the patient’s bacteria attained donor-like compositions at phylum, class, and order levels with similar bacterial diversity. Differences in the bacterial communities between donor and patient after 4.5 yr were seen at lower taxonomic levels. C. difficile remained undetectable throughout the entire timespan. This demonstrated sustainable donor feces engraftment and verified long-term therapeutic success of FMT on the molecular level. Full engraftment apparently required longer than previously acknowledged, suggesting the implementation of year-long patient follow-up periods into clinical practice. The identified dsDNA viruses were mainly Caudovirales phages. Unexpectedly, sequences related to giant algae–infecting Chlorella viruses were also detected. Our findings indicate that intestinal viruses may be implicated in the establishment of gut microbiota. Therefore, virome analyses should be included in gut microbiota studies to determine the roles of phages and other viruses—such as Chlorella viruses—in human health and disease, particularly during RCDI.


September 22, 2019

Design of primers for evaluation of lactic acid bacteria populations in complex biological samples.

Lactic acid bacteria (LAB) are important for human health. However, the relative abundance of LAB in complex samples, such as fecal samples, is low and their presence and diversity (at the species level) are understudied. Therefore, we designed LAB-specific primer pairs based on 16S rRNA gene consensus sequences from 443 species of LAB from seven genera. The LAB strains selected were genetically similar and known to play a role in human health. Prior to primer design, we obtained consistent sequences for the primer-binding sites by comparing the 16S rRNA gene sequences, manually identifying single-stranded primers and modifying these primers using degenerate bases. We assembled primer pairs with product sizes of >400 bp. Optimal LAB-specific primers were screened using three methods: PCR amplification, agarose gel electrophoresis and single-molecule real-time (SMRT) sequencing analysis. During the SMRT analysis procedure, we focused on sequence reads and diversity at the species level of target LAB in three fecal samples, using the universal bacterial primer pair 27f/1492r as a reference control. We created a phylogenetic tree to confirm the ability of the best candidate primer pair to differentiate amongst species. The results revealed that LAB-specific primer L5, with a product size of 750 bp, could generate 3222, 2552, and 3405 sequence reads from fecal Samples 1, 2, and 3. This represented 14, 13 and 10% of all target LAB sequence reads, respectively, compared with 2, 0.8, and 0.8% using the 27f/1492r primer. In addition, L5 detected LAB that were in low abundance and could not be detected using the 27f/1492r primer. The phylogenetic tree based on the alignments between the forward and reverse primer of L5 showed that species within the seven target LAB genera could be distinguished from each other, confirming L5 is a powerful tool for inferring phylogenetic relationships amongst LAB species.
In conclusion, L5 is a LAB-specific primer that can be used for high-throughput sequencing and identification of taxa to the species level, especially in complex samples with relatively low LAB content. This enables further research on LAB population diversity in complex ecosystems, and on relationships between LAB and their hosts.
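
The abstract mentions modifying primers with degenerate bases but gives no sequences, so the sketch below only illustrates how a degenerate primer written with the standard IUPAC nucleotide codes expands into concrete sequences and how it can be checked against a binding site. The primer and site strings are made-up examples, not the L5 primer, and the function names are hypothetical.

```python
from itertools import product

# Standard IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def expand_degenerate(primer):
    """List every concrete sequence a degenerate primer stands for."""
    choices = (IUPAC[base] for base in primer.upper())
    return ["".join(combo) for combo in product(*choices)]


def primer_matches(primer, site):
    """True if each base of the site is allowed by the primer position."""
    if len(primer) != len(site):
        return False
    return all(s in IUPAC[p] for p, s in zip(primer.upper(), site.upper()))


# Hypothetical 3-base primer: R = A or G, Y = C or T.
print(expand_degenerate("ARY"))      # ['AAC', 'AAT', 'AGC', 'AGT']
print(primer_matches("ARY", "AGT"))  # True
print(primer_matches("ARY", "ACT"))  # False
```

The number of expanded sequences is the product of the choices at each position, which is why degeneracy is usually kept to the few positions where the aligned target sequences actually differ.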


September 22, 2019

Comparative genome and transcriptome analysis reveals distinctive surface characteristics and unique physiological potentials of Pseudomonas aeruginosa ATCC 27853.

Pseudomonas aeruginosa ATCC 27853 was isolated from a hospital blood specimen in 1971 and has been widely used as a model strain to survey antibiotic susceptibilities, biofilm development, and metabolic activities of Pseudomonas spp. Although four draft genomes of P. aeruginosa ATCC 27853 have been sequenced, the complete genome of this strain is still lacking, hindering a comprehensive understanding of its physiology and functional genome. Here we sequenced and assembled the complete genome of P. aeruginosa ATCC 27853 using the Pacific Biosciences SMRT (PacBio) technology and Illumina sequencing platform. We found that accessory genes of ATCC 27853 including prophages and genomic islands (GIs) mainly contribute to the difference between P. aeruginosa ATCC 27853 and other P. aeruginosa strains. Seven prophages were identified within the genome of P. aeruginosa ATCC 27853. Of the predicted 25 GIs, three contain genes that encode monooxygenases, dioxygenases and hydrolases that could be involved in the metabolism of aromatic compounds. Surveying virulence-related genes revealed that a series of genes that encode the B-band O-antigen of LPS are lacking in ATCC 27853. Distinctive SNPs in genes of cellular adhesion proteins such as type IV pili and flagella biosynthesis were also observed in this strain. Colony morphology analysis confirmed an enhanced biofilm formation capability of ATCC 27853 on solid agar surface compared to Pseudomonas aeruginosa PAO1. We then performed transcriptome analysis of ATCC 27853 and PAO1 using RNA-seq and compared the expression of orthologous genes to understand the functional genome and the genomic details underlying the distinctive colony morphogenesis. These analyses revealed an increased expression of genes involved in cellular adhesion and biofilm maturation such as type IV pili, exopolysaccharide and electron transport chain components in ATCC 27853 compared with PAO1. In addition, distinctive expression profiles of the virulence genes lecA, lasB, quorum sensing regulators LasI/R, and the type I, III and VI secretion systems were observed in the two strains. The complete genome sequence of P. aeruginosa ATCC 27853 reveals the comprehensive genetic background of the strain, and provides a genetic basis for several interesting findings about the functions of surface-associated proteins, prophages, and genomic islands. Comparative transcriptome analysis of P. aeruginosa ATCC 27853 and PAO1 revealed several classes of differentially expressed genes in the two strains, underlying the genetic and molecular details of several known and yet to be explored morphological and physiological potentials of P. aeruginosa ATCC 27853.


September 22, 2019

Genomic insights into the acid adaptation of novel methanotrophs enriched from acidic forest soils.

Soil acidification is accelerated by anthropogenic and agricultural activities, which could significantly affect global methane cycles. However, detailed knowledge of the genomic properties of methanotrophs adapted to acidic soils remains scarce. Using metagenomic approaches, we analyzed methane-utilizing communities enriched from acidic forest soils with pH 3 and 4, and recovered near-complete genomes of proteobacterial methanotrophs. Novel methanotroph genomes designated KS32 and KS41, belonging to two representative clades of methanotrophs (Methylocystis of Alphaproteobacteria and Methylobacter of Gammaproteobacteria), were dominant. Comparative genomic analysis revealed diverse systems of membrane transporters for ensuring pH homeostasis and defense against toxic chemicals. Various potassium transporter systems, sodium/proton antiporters, and two copies of proton-translocating F1F0-type ATP synthase genes were identified, which might participate in the key pH homeostasis mechanisms in KS32. In addition, the V-type ATP synthase and urea assimilation genes might be used for pH homeostasis in KS41. Genes involved in the modification of membranes by incorporation of cyclopropane fatty acids and hopanoid lipids might be used for reducing proton influx into cells. The two methanotroph genomes possess genes for elaborate heavy metal efflux pumping systems, possibly owing to increased heavy metal toxicity in acidic conditions. Phylogenies of key genes involved in acid adaptation, methane oxidation, and antiviral defense in KS41 were incongruent with that of 16S rRNA. Thus, the detailed analysis of the genome sequences provides new insights into the ecology of methanotrophs responding to soil acidification.


September 22, 2019

Shift in fungal communities and associated enzyme activities along an age gradient of managed Pinus sylvestris stands.

Forestry reshapes ecosystems with respect to tree age structure, soil properties and vegetation composition. These changes are likely to be paralleled by shifts in microbial community composition with potential feedbacks on ecosystem functioning. Here, we assessed fungal communities across a chronosequence of managed Pinus sylvestris stands and investigated correlations between taxonomic composition and extracellular enzyme activities. Not surprisingly, clear-cutting had a negative effect on ectomycorrhizal fungal abundance and diversity. In contrast, clear-cutting favoured proliferation of saprotrophic fungi correlated with enzymes involved in holocellulose decomposition. During stand development, the re-establishing ectomycorrhizal fungal community shifted in composition from dominance by Atheliaceae in younger stands to Cortinarius and Russula species in older stands. Late successional ectomycorrhizal taxa correlated with enzymes involved in mobilisation of nutrients from organic matter, indicating intensified nutrient limitation. Our results suggest that maintenance of functional diversity in the ectomycorrhizal fungal community may sustain long-term forest production by retaining a capacity for symbiosis-driven recycling of organic nutrient pools.


September 22, 2019

A comparative transcriptional landscape of maize and sorghum obtained by single-molecule sequencing.

Maize and sorghum are both important crops with similar overall plant architectures, but they have key differences, especially in regard to their inflorescences. To better understand these two organisms at the molecular level, we compared expression profiles of both protein-coding and noncoding transcripts in 11 matched tissues using single-molecule, long-read, deep RNA sequencing. This comparative analysis revealed large numbers of novel isoforms in both species. Evolutionarily young genes were likely to be generated in reproductive tissues and usually had fewer isoforms than old genes. We also observed similarities and differences in alternative splicing patterns and activities, both among tissues and between species. The maize subgenomes exhibited no bias in isoform generation; however, genes in the B genome were more highly expressed in pollen tissue, whereas genes in the A genome were more highly expressed in endosperm. We also identified a number of splicing events conserved between maize and sorghum. In addition, we generated comprehensive and high-resolution maps of poly(A) sites, revealing similarities and differences in mRNA cleavage between the two species. Overall, our results reveal considerable splicing and expression diversity between sorghum and maize, well beyond what was reported in previous studies, likely reflecting the differences in architecture between these two species. © 2018 Wang et al.; Published by Cold Spring Harbor Laboratory Press.


September 22, 2019

Accurate determination of bacterial abundances in human metagenomes using full-length 16S sequencing reads

DNA sequencing of PCR-amplified marker genes, especially but not limited to the 16S rRNA gene, is perhaps the most common approach for profiling microbial communities. Due to technological constraints of commonly available DNA sequencing, these approaches usually take the form of short reads sequenced from a narrow, targeted variable region, with a corresponding loss of taxonomic resolution relative to the full-length marker gene. We use Pacific Biosciences single-molecule, real-time circular consensus sequencing to sequence amplicons spanning the entire length of the 16S rRNA gene. However, this sequencing technology suffers from a high sequencing error rate that needs to be addressed in order to take full advantage of the longer sequence. Here, we present a method to model the sequencing error process using a generalized pair hidden Markov chain model and estimate bacterial abundances in microbial samples. We demonstrate, with simulated and real data, that our model and its associated estimation procedure are able to give accurate estimates at the species (or subspecies) level, and are more flexible than existing methods like SImple Non-Bayesian TAXonomy (SINTAX).
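
The abstract does not spell out the estimation procedure, so the following is not the authors' pair hidden Markov model. It is a much simpler sketch of one common way to turn per-read likelihoods into abundance estimates: an expectation-maximization (EM) update for mixture proportions. It assumes the likelihood of each read under each reference sequence has already been computed by some error model; the function name and the toy numbers are hypothetical.

```python
def em_abundances(likelihoods, iterations=200):
    """Estimate relative abundances from a reads-by-references likelihood table.

    likelihoods[i][j] is the (assumed, precomputed) probability of read i
    given reference j. Returns one proportion per reference, summing to 1
    as long as at least one read has a nonzero likelihood.
    """
    n_reads = len(likelihoods)
    n_refs = len(likelihoods[0])
    proportions = [1.0 / n_refs] * n_refs  # start from a uniform guess

    for _ in range(iterations):
        expected_counts = [0.0] * n_refs
        used_reads = 0
        for row in likelihoods:
            weighted = [p * l for p, l in zip(proportions, row)]
            total = sum(weighted)
            if total == 0.0:  # read explained by no reference: skip it
                continue
            used_reads += 1
            # E-step: share this read among references by posterior weight.
            for j, w in enumerate(weighted):
                expected_counts[j] += w / total
        if used_reads == 0:
            break
        # M-step: new proportions are the normalized expected counts.
        proportions = [c / used_reads for c in expected_counts]

    return proportions


# Toy table: three reads fit reference 0 only, one read fits reference 1 only.
toy = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 0.0]]
print(em_abundances(toy))  # [0.75, 0.25]
```

With ambiguous reads (nonzero likelihood under several references) the same loop splits each read fractionally, which is where an accurate error model matters most.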


September 22, 2019

Meeting report: 31st International Mammalian Genome Conference, Mammalian Genetics and Genomics: From Molecular Mechanisms to Translational Applications.

High on the Heidelberg hills, inside the Advanced Training Centre of the European Molecular Biology Laboratory (EMBL) campus with its unique double-helix staircase, scientists gathered for the EMBL conference “Mammalian Genetics and Genomics: From Molecular Mechanisms to Translational Applications,” organized in cooperation with the International Mammalian Genome Society (IMGS) and the Mouse Molecular Genetics (MMG) group. The conference attracted 205 participants from 30 countries, representing 6 of the 7 continents, all except Antarctica. It was a richly diverse group of geneticists, clinicians, and bioinformaticians, with presentations by established and junior investigators, including many trainees. From the 24th to the 27th of October 2017, they shared exciting advances in mammalian genetics and genomics research, from the introduction of cutting-edge technologies to descriptions of translational studies involving highly relevant models of human disease.


September 22, 2019

Young genes have distinct gene structure, epigenetic profiles, and transcriptional regulation.

Species-specific, new, or “orphan” genes account for 10%-30% of eukaryotic genomes. Although initially considered to have limited function, an increasing number of orphan genes have been shown to provide important phenotypic innovation. How new genes acquire regulatory sequences for proper temporal and spatial expression is unknown. Orphan gene regulation may rely in part on origination in open chromatin adjacent to preexisting promoters, although this has not yet been assessed by genome-wide analysis of chromatin states. Here, we combine taxon-rich nematode phylogenies with Iso-Seq, RNA-seq, ChIP-seq, and ATAC-seq to identify the gene structure and epigenetic signature of orphan genes in the satellite model nematode Pristionchus pacificus. Consistent with previous findings, we find young genes are shorter, contain fewer exons, and are on average less strongly expressed than older genes. However, the subset of orphan genes that are expressed exhibit distinct chromatin states from similarly expressed conserved genes. Orphan gene transcription is determined by a lack of repressive histone modifications, confirming long-held hypotheses that open chromatin is important for new gene formation. Yet orphan gene start sites more closely resemble enhancers defined by H3K4me1, H3K27ac, and ATAC-seq peaks, in contrast to conserved genes that exhibit traditional promoters defined by H3K4me3 and H3K27ac. Although the majority of orphan genes are located on chromosome arms that contain high recombination rates and repressive histone marks, strongly expressed orphan genes are more randomly distributed. Our results support a model of new gene origination by rare integration into open chromatin near enhancers. © 2018 Werner et al.; Published by Cold Spring Harbor Laboratory Press.


September 22, 2019

A community-based culture collection for targeting novel plant growth-promoting bacteria from the sugarcane microbiome.

The soil-plant ecosystem harbors an immense microbial diversity that challenges investigative approaches to study traits underlying plant-microbe association. Studies solely based on culture-dependent techniques have overlooked most microbial diversity. Here we describe the concomitant use of culture-dependent and -independent techniques to target plant-beneficial microbial groups from the sugarcane microbiome. The community-based culture collection (CBC) approach was used to access microbes from roots and stalks. The CBC recovered 399 unique bacteria representing 15.9% of the rhizosphere core microbiome and 61.6-65.3% of the endophytic core microbiomes of stalks. By cross-referencing the CBC (culture-dependent) with the sugarcane microbiome profile (culture-independent), we designed a synthetic community composed of naturally occurring, highly abundant bacterial groups from roots and stalks, most of which have been poorly explored so far. We then used maize as a model to probe the abundance-based synthetic inoculant. We show that when inoculated in maize plants, members of the synthetic community efficiently colonize plant organs, displace the natural microbiota and dominate at 53.9% of the rhizosphere microbial abundance. As a result, inoculated plants increased biomass by 3.4-fold as compared to uninoculated plants. The results demonstrate that abundance-based synthetic inoculants can be successfully applied to recover beneficial plant microbes from plant microbiota.

