July 7, 2019

BusyBee Web: metagenomic data analysis by bootstrapped supervised binning and annotation.

Metagenomics-based studies of mixed microbial communities are impacting biotechnology, life sciences and medicine. Computational binning of metagenomic data is a powerful approach for the culture-independent recovery of population-resolved genomic sequences, i.e. from individual or closely related, constituent microorganisms. Existing binning solutions often require a priori characterized reference genomes and/or dedicated compute resources. Extending currently available reference-independent binning tools, we developed the BusyBee Web server for the automated deconvolution of metagenomic data into population-level genomic bins using assembled contigs (Illumina) or long reads (Pacific Biosciences, Oxford Nanopore Technologies). A reversible compression step as well as bootstrapped supervised binning enable quick turnaround times. The binning results are represented in interactive 2D scatterplots. Moreover, bin quality estimates, taxonomic annotations and annotations of antibiotic resistance genes are computed and visualized. Ground truth-based benchmarks of BusyBee Web demonstrate comparably high performance to state-of-the-art binning solutions for assembled contigs and markedly improved performance for long reads (median F1 scores: 70.02-95.21%). Furthermore, the applicability to real-world metagenomic datasets is shown. In conclusion, our reference-independent approach automatically bins assembled contigs or long reads, exhibits high sensitivity and precision, enables intuitive inspection of the results, and only requires FASTA-formatted input. The web-based application is freely accessible at: https://ccb-microbe.cs.uni-saarland.de/busybee. © The Author(s) 2017. Published by Oxford University Press on behalf of Nucleic Acids Research.


July 7, 2019

Sequencing a piece of history: complete genome sequence of the original Escherichia coli strain.

In 1885, Theodor Escherich first described the Bacillus coli commune, which was subsequently renamed Escherichia coli. We report the complete genome sequence of this original strain (NCTC 86). The 5 144 392 bp circular chromosome encodes the genes for 4805 proteins, which include antigens, virulence factors, antimicrobial-resistance factors and secretion systems, of a commensal organism from the pre-antibiotic era. It is located in the E. coli A subgroup and is closely related to E. coli K-12 MG1655. E. coli strain NCTC 86 and the non-pathogenic K-12, C, B and HS strains share a common backbone that is largely co-linear. The exception is a large 2 803 932 bp inversion that spans the replication terminus from gmhB to clpB. Comparison with E. coli K-12 reveals 41 regions of difference (577 351 bp) distributed across the chromosome. For example, and contrary to current dogma, E. coli NCTC 86 includes a nine-gene sil locus that encodes a silver-resistance efflux pump acquired before the current widespread use of silver nanoparticles as an antibacterial agent, possibly resulting from the widespread use of silver utensils and currency in Germany in the 1800s. In summary, phylogenetic comparisons with other E. coli strains confirmed that the original strain isolated by Escherich is most closely related to the non-pathogenic commensal strains. It is more distant from the root than the pathogenic organisms E. coli 042 and O157:H7; therefore, it is not an ancestral state for the species.


July 7, 2019

Genome-wide identification of the mutation underlying fleece variation and discriminating ancestral hairy species from modern woolly sheep.

The composition and structure of fleece variation observed in mammals is a consequence of a strong selective pressure for fiber production after domestication. In sheep, fleece variation discriminates ancestral species carrying a long and hairy fleece from modern domestic sheep (Ovis aries) bearing a short and woolly fleece. Here, we report that the “woolly” allele results from the insertion of an antisense EIF2S2 retrogene (called asEIF2S2) into the 3′ UTR of the IRF2BP2 gene leading to an abnormal IRF2BP2 transcript. We provide evidence that this chimeric IRF2BP2/asEIF2S2 messenger 1) targets the genuine sense EIF2S2 RNA and 2) creates a long endogenous double-stranded RNA which alters the expression of both EIF2S2 and IRF2BP2 mRNA. This represents a unique example of a phenotype arising via an RNA-RNA hybrid, itself generated through a retroposition mechanism. Our results bring new insights on the sheep population history thanks to the identification of the molecular origin of an evolutionary phenotypic variation. © The Author 2017. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.


July 7, 2019

Nonomuraea sp. ATCC 55076 harbours the largest actinomycete chromosome to date and the kistamicin biosynthetic gene cluster.

Glycopeptide antibiotics (GPAs) have served as potent clinical drugs and as an inspiration to chemists in various disciplines. Among known GPAs, complestatin, chloropeptin, and kistamicin are unique in that they contain an unusual indole-phenol crosslink. The mechanism of formation of this linkage is unknown, and to date, the biosynthetic gene cluster of only one GPA with an indole-phenol crosslink, that of complestatin, has been identified. Here, we report the genome sequence of the kistamicin producer Nonomuraea sp. ATCC 55076. We find that this strain harbours the largest actinobacterial chromosome to date, consisting of a single linear chromosome of ~13.1 Mbp. AntiSMASH analysis shows that ~32 biosynthetic gene clusters and ~10% of the genome are devoted to production of secondary metabolites, which include 1,6-dihydroxyphenazine and nomuricin, a new anthraquinone-type pentacyclic compound that we report herein. The kistamicin gene cluster (kis) was identified bioinformatically. A unique feature of kis is that it contains two cytochrome P450 enzymes, which likely catalyze three crosslinking reactions. These findings set the stage for examining the biosynthesis of kistamicin and its unusual indole-phenol crosslink in the future.


July 7, 2019

Evolutionary strata on young mating-type chromosomes despite the lack of sexual antagonism.

Sex chromosomes can display successive steps of recombination suppression known as “evolutionary strata,” which are thought to result from the successive linkage of sexually antagonistic genes to sex-determining genes. However, there is little evidence to support this explanation. Here we investigate whether evolutionary strata can evolve without sexual antagonism using fungi that display suppressed recombination extending beyond loci determining mating compatibility despite lack of male/female roles associated with their mating types. By comparing full-length chromosome assemblies from five anther-smut fungi with or without recombination suppression in their mating-type chromosomes, we inferred the ancestral gene order and derived chromosomal arrangements in this group. This approach shed light on the chromosomal fusion underlying the linkage of mating-type loci in fungi and provided evidence for multiple clearly resolved evolutionary strata over a range of ages (0.9-2.1 million years) in mating-type chromosomes. Several evolutionary strata did not include genes involved in mating-type determination. The existence of strata devoid of mating-type genes, despite the lack of sexual antagonism, calls for a unified theory of sex-related chromosome evolution, incorporating, for example, the influence of partially linked deleterious mutations and the maintenance of neutral rearrangement polymorphism due to balancing selection on sexes and mating types.


July 7, 2019

Emergence of a new Neisseria meningitidis clonal complex 11 lineage 11.2 clade as an effective urogenital pathogen.

Neisseria meningitidis (Nm) clonal complex 11 (cc11) lineage is a hypervirulent pathogen responsible for outbreaks of invasive meningococcal disease, including among men who have sex with men, and is increasingly associated with urogenital infections. Recently, clusters of Nm urethritis have emerged primarily among heterosexual males in the United States. We determined that nonencapsulated meningococcal isolates from an ongoing Nm urethritis outbreak among epidemiologically unrelated men in Columbus, Ohio, are linked to increased Nm urethritis cases in multiple US cities, including Atlanta and Indianapolis, and that they form a unique clade (the US Nm urethritis clade, US_NmUC). The isolates belonged to the cc11 lineage 11.2/ET-15 with fine type of PorA P1.5-1, 10-8; FetA F3-6; PorB 2-2 and express a unique FHbp allele. A common molecular fingerprint of US_NmUC isolates was an IS1301 element in the intergenic region separating the capsule ctr-css operons and adjacent deletion of cssA/B/C and a part of csc, encoding the serogroup C capsule polymerase. This resulted in the loss of encapsulation and intrinsic lipooligosaccharide sialylation that may promote adherence to mucosal surfaces. Furthermore, we detected an IS1301-mediated inversion of an ~20-kb sequence near the cps locus. Surprisingly, these isolates had acquired by gene conversion the complete gonococcal denitrification norB-aniA gene cassette, and strains grow well anaerobically. The cc11 US_NmUC isolates causing urethritis clusters in the United States may have adapted to a urogenital environment by loss of capsule and gene conversion of the Neisseria gonorrhoeae norB-aniA cassette promoting anaerobic growth.


July 7, 2019

Phylogeography of Burkholderia pseudomallei isolates, Western Hemisphere.

The bacterium Burkholderia pseudomallei causes melioidosis, which is mainly associated with tropical areas. We analyzed single-nucleotide polymorphisms (SNPs) among genome sequences from isolates of B. pseudomallei that originated in the Western Hemisphere by comparing them with genome sequences of isolates that originated in the Eastern Hemisphere. Analysis indicated that isolates from the Western Hemisphere form a distinct clade, which supports the hypothesis that these isolates were derived from a constricted seeding event from Africa. Subclades have been resolved that are associated with specific regions within the Western Hemisphere and suggest that isolates might be correlated geographically with cases of melioidosis. One isolate associated with a former World War II prisoner of war was believed to represent illness 62 years after exposure in Southeast Asia. However, analysis suggested the isolate originated in Central or South America.


July 7, 2019

Rare Pyrenophora teres hybridization events revealed by development of sequence-specific PCR markers.

Pyrenophora teres f. teres and P. teres f. maculata cause net form and spot form, respectively, of net blotch on barley (Hordeum vulgare). The two forms reproduce sexually, producing hybrids with genetic and pathogenic variability. Phenotypic identification of hybrids is challenging because lesions induced by hybrids on host plants resemble lesions induced by either P. teres f. teres or P. teres f. maculata. In this study, 12 sequence-specific polymerase chain reaction markers were developed based on expressed regions spread across the genome. The primers were validated using 210 P. teres isolates, 2 putative field hybrids (WAC10721 and SNB172), 50 laboratory-produced hybrids, and 7 isolates collected from barley grass (H. leporinum). The sequence-specific markers confirmed isolate WAC10721 as a hybrid. Only four P. teres f. teres markers amplified on DNA of barley grass isolates. Amplified fragment length polymorphism markers suggested that P. teres barley grass isolates are genetically different from P. teres barley isolates and that the second putative hybrid (SNB172) is a barley grass isolate. We developed a suite of markers which clearly distinguish the two forms of P. teres and enable unambiguous identification of hybrids.


July 7, 2019

Automated structural variant verification in human genomes using single-molecule electronic DNA mapping.

The importance of structural variation in human disease and the difficulty of detecting structural variants larger than 50 base pairs have led to the development of several long-read sequencing technologies and optical mapping platforms. Frequently, multiple technologies and ad hoc methods are required to obtain a consensus regarding the location, size and nature of a structural variant, with no approach able to reliably bridge the gap of variant sizes between the domain of short-read approaches and the largest rearrangements observed with optical mapping. To address this unmet need, we have developed a new software package, SV-Verify™, which utilizes data collected with the Nabsys High Definition Mapping (HD-Mapping™) system, to perform hypothesis-based verification of putative deletions. We demonstrate that whole genome maps, constructed from electronic detection of tagged DNA, hundreds of kilobases in length, can be used effectively to facilitate calling of structural variants ranging in size from 300 base pairs to hundreds of kilobase pairs. SV-Verify implements hypothesis-based verification of putative structural variants using a set of support vector machines and is capable of concurrently testing several thousand independent hypotheses. We describe support vector machine training, utilizing a well-characterized human genome, and application of the resulting classifiers to another human genome, demonstrating high sensitivity and specificity for deletions >= 300 base pairs.


July 7, 2019

ALUMINUM RESISTANCE TRANSCRIPTION FACTOR 1 (ART1) contributes to natural variation in aluminum resistance in diverse genetic backgrounds of rice (O. sativa)

Transcription factors (TFs) regulate the expression of other genes to indirectly mediate stress resistance mechanisms. Therefore, when studying TF-mediated stress resistance, it is important to understand how TFs interact with genes in the genetic background. Here, we fine-mapped the aluminum (Al) resistance QTL Alt12.1 to a 44-kb region containing six genes. Among them is ART1, which encodes a C2H2-type zinc finger TF required for Al resistance in rice. The mapping parents, Al-resistant cv Azucena (tropical japonica) and Al-sensitive cv IR64 (indica), have extensive sequence polymorphism within the ART1 coding region, but similar ART1 expression levels. Using reciprocal near-isogenic lines (NILs) we examined how allele-swapping the Alt12.1 locus would affect plant responses to Al. Analysis of global transcriptional responses to Al stress in roots of the NILs alongside their recurrent parents demonstrated that the presence of the Alt12.1 from Al-resistant Azucena led to greater changes in gene expression in response to Al when compared to the Alt12.1 from IR64 in both genetic backgrounds. The presence of the ART1 allele from the opposite parent affected the expression of several genes not previously implicated in rice Al tolerance. We highlight examples where putatively functional variation in cis-regulatory regions of ART1-regulated genes interacts with ART1 to determine gene expression in response to Al. This ART1–promoter interaction may be associated with transgressive variation for Al resistance in the Azucena × IR64 population. These results illustrate how ART1 interacts with the genetic background to contribute to quantitative phenotypic variation in rice Al resistance.


July 7, 2019

Genome analysis of Endomicrobium proavitum suggests loss and gain of relevant functions during the evolution of intracellular symbionts.

Bacterial endosymbionts of eukaryotes show progressive genome erosion, but detailed investigations of the evolutionary processes involved in the transition to an intracellular lifestyle are generally hampered by the lack of extant free-living lineages. Here, we characterize the genome of the recently isolated, free-living Endomicrobium proavitum, the second member of the Elusimicrobia phylum brought into pure culture, and compare it to the closely related “Candidatus Endomicrobium trichonymphae” strain Rs-D17, a previously described but uncultured endosymbiont of termite gut flagellates. A reconstruction of the metabolic pathways of Endomicrobium proavitum matched the fermentation products formed in pure culture and underscored its restriction to glucose as the substrate. However, several pathways present in the free-living strain, e.g., for the uptake and activation of glucose and its subsequent fermentation, ammonium assimilation, and outer membrane biogenesis, were absent or disrupted in the endosymbiont, probably lost during the massive genome rearrangements that occurred during symbiogenesis. While the majority of the genes in strain Rs-D17 have orthologs in Endomicrobium proavitum, the endosymbiont also possesses a number of functions that are absent from the free-living strain and may represent adaptations to the intracellular lifestyle. Phylogenetic analysis revealed that the genes encoding glucose 6-phosphate and amino acid transporters, acetaldehyde/alcohol dehydrogenase, and the pathways of glucuronic acid catabolism and thiamine pyrophosphate biosynthesis were either acquired by horizontal gene transfer or may represent ancestral traits that were lost in the free-living strain. 
The polyphyletic origin of Endomicrobia in different flagellate hosts makes them excellent models for future studies of convergent and parallel evolution during symbiogenesis. IMPORTANCE: The isolation of a free-living relative of intracellular symbionts provides the rare opportunity to identify the evolutionary processes that occur in the course of symbiogenesis. Our study documents that the genome of “Candidatus Endomicrobium trichonymphae,” which represents a clade of endosymbionts that have coevolved with termite gut flagellates for more than 40 million years, is not simply a subset of the genes present in Endomicrobium proavitum, a member of the ancestral, free-living lineage. Rather, comparative genomics revealed that the endosymbionts possess several relevant functions that were either prerequisites for colonization of the intracellular habitat or might have served to compensate for gene losses that occurred during genome erosion. Some gene sets found only in the endosymbiont were apparently acquired by horizontal transfer from other gut bacteria, which suggests that the intracellular bacteria of flagellates are not entirely cut off from gene flow. Copyright © 2017 American Society for Microbiology.


July 7, 2019

Genomics-enabled analysis of the emergent disease cotton bacterial blight.

Cotton bacterial blight (CBB), an important disease of cotton (Gossypium hirsutum) in the early 20th century, had been controlled by resistant germplasm for over half a century. Recently, CBB re-emerged as an agronomic problem in the United States. Here, we report analysis of cotton variety planting statistics that indicate a steady increase in the percentage of susceptible cotton varieties grown each year since 2009. Phylogenetic analysis revealed that strains from the current outbreak cluster with race 18 Xanthomonas citri pv. malvacearum (Xcm) strains. Illumina-based draft genomes were generated for thirteen Xcm isolates and analyzed along with four previously published Xcm genomes. These genomes encode 24 conserved and nine variable type three effectors. Strains in the race 18 clade contain three to five more effectors than other Xcm strains. SMRT sequencing of two geographically and temporally diverse strains of Xcm yielded circular chromosomes and accompanying plasmids. These genomes encode eight and thirteen distinct transcription activator-like effector genes. RNA-sequencing revealed 52 genes induced within two cotton cultivars by both tested Xcm strains. This gene list includes a homeologous pair of genes with homology to the known susceptibility gene MLO. In contrast, the two strains of Xcm induce different clade III SWEET sugar transporters. Subsequent genome-wide analysis revealed patterns in the overall expression of homeologous gene pairs in cotton after inoculation by Xcm. These data reveal important insights into the Xcm-G. hirsutum disease complex and strategies for future development of resistant cultivars.


July 7, 2019

Correspondence on Lovell et al.: response to Bornelöv et al.

While the analysis of Bornelöv et al. is informative, they provide evidence for the existence of only 3% of the reported avian missing genes set, and thus do not significantly challenge our main findings that specific groups of syntenic protein-coding genes are missing in birds. This is a response to the Correspondence article: https://www.dx.doi.org/10.1186/s13059-017-1231-1.


July 7, 2019

Genomic exploration of individual giant ocean viruses.

Viruses are major pathogens in all biological systems. Virus propagation and downstream analysis remain a challenge, particularly in the ocean where the majority of their microbial hosts remain recalcitrant to current culturing techniques. We used a cultivation-independent approach to isolate and sequence individual viruses. The protocol uses high-speed fluorescence-activated virus sorting flow cytometry, multiple displacement amplification (MDA), and downstream genomic sequencing. We focused on ‘giant viruses’ that are readily distinguishable by flow cytometry. From a single-milliliter sample of seawater collected from off the dock at Boothbay Harbor, ME, USA, we sorted almost 700 single virus particles, and subsequently focused on a detailed genome analysis of 12. A wide diversity of viruses was identified that included Iridoviridae, extended Mimiviridae and even a taxonomically novel (unresolved) giant virus. We discovered a viral metacaspase homolog in one of our sorted virus particles and discussed its implications in rewiring host metabolism to enhance infection. In addition, we demonstrated that viral metacaspases are widespread in the ocean. We also discovered a virus that contains both a reverse transcriptase and a transposase; although highly speculative, we suggest such a genetic complement would potentially allow this virus to exploit a latency propagation mechanism. Application of single virus genomics provides a powerful opportunity to circumvent cultivation of viruses, moving directly to genomic investigation of naturally occurring viruses, with the assurance that the sequence data is virus-specific, non-chimeric and contains no cellular contamination.

