Genome mining Archives - Page 2 of 16

April 21, 2020

Comparative genomic analysis of Lactobacillus mucosae LM1 identifies potential niche-specific genes and pathways for gastrointestinal adaptation.

Lactobacillus mucosae is currently of interest as a putative probiotic due to its metabolic capabilities and ability to colonize host mucosal niches. L. mucosae LM1 has been studied for its functions in cell adhesion and pathogen inhibition, among others. It demonstrated unique abilities to use energy from carbohydrate and non-carbohydrate sources. Because of these functions, we report the first complete genome sequence of an L. mucosae strain, L. mucosae LM1. Analysis of the pan-genome in comparison with closely-related Lactobacillus species identified a complete glycogen metabolism pathway, as well as folate biosynthesis, complementing previous proteomic data on the LM1 strain. It also revealed common and unique niche-adaptation genes among the various L. mucosae strains. The aim of this study was to derive genomic information that would reveal the probable mechanisms underlying the probiotic effect of L. mucosae LM1, and provide a better understanding of the nature of L. mucosae sp. Copyright © 2017 Elsevier Inc. All rights reserved.

April 21, 2020

Genome mining reveals the origin of a bald phenotype and a cryptic nucleocidin gene cluster in Streptomyces asterosporus DSM 41452.

Streptomyces asterosporus DSM 41452 is a producer of the polyketide annimycin and the non-ribosomal depsipeptide WS9326A. This strain is also notable for exhibiting a bald phenotype that is devoid of spores and aerial mycelium when grown on solid media. Based on the similarity of the 16S rRNA sequence to Streptomyces calvus, the only known producer of the fluorometabolite nucleocidin, the genome of S. asterosporus DSM 41452 was sequenced and analyzed. Twenty-nine natural product gene clusters were detected in the genome, including a gene cluster predicted to encode the fluorometabolite nucleocidin. Through genome analysis and gene complementation experiments, we demonstrate that the bald phenotype arises from a transposon gene inserted within the promoter sequence for the pleiotropic regulator adpA. Complementation of S. asterosporus DSM 41452 with a functional adpA sequence restored morphological differentiation and promoted the production of nucleocidin. Copyright © 2019 Elsevier B.V. All rights reserved.

April 21, 2020

Structural and functional characterization of an intradiol ring-cleavage dioxygenase from the polyphagous spider mite herbivore Tetranychus urticae Koch.

Genome analyses of the polyphagous spider mite herbivore Tetranychus urticae (two-spotted spider mite) revealed the presence of a set of 17 genes that code for secreted proteins belonging to the “intradiol dioxygenase-like” subgroup. Phylogenetic analyses indicate that this novel enzyme family has been acquired by horizontal gene transfer. In order to better understand the role of these proteins in T. urticae, we have structurally and functionally characterized one paralog (tetur07g02040). It was demonstrated that this protein is indeed an intradiol ring-cleavage dioxygenase, as the enzyme is able to cleave catechol between two hydroxyl-groups using atmospheric dioxygen. The enzyme was characterized functionally and structurally. The active site of the T. urticae enzyme contains an Fe3+ cofactor that is coordinated by two histidine and two tyrosine residues, an arrangement that is similar to those observed in bacterial homologs. However, the active site is significantly more solvent exposed than in bacterial proteins. Moreover, the mite enzyme is monomeric, while almost all structurally characterized bacterial homologs form oligomeric assemblies. Tetur07g02040 is not only the first spider mite dioxygenase that has been characterized at the molecular level, but is also the first structurally characterized intradiol ring-cleavage dioxygenase originating from a eukaryote. Copyright © 2018 Elsevier Ltd. All rights reserved.

April 21, 2020

Complete genome and data mining of Aeromicrobium sp. A1-2 isolated from the Southern Ocean

Aeromicrobium sp. A1-2, a putative new species isolated from marine sediments on King George Island, Antarctica, was completely sequenced. The genome data showed biosynthetic potential for new natural products and clues to the environmental adaptation of this actinobacterium.

April 21, 2020

Genomics-driven discovery of a biosynthetic gene cluster required for the synthesis of BII-Rafflesfungin from the fungus Phoma sp. F3723.

Phomafungin is a recently reported broad spectrum antifungal compound but its biosynthetic pathway is unknown. We combed publicly available Phoma genomes but failed to find any putative biosynthetic gene cluster that could account for its biosynthesis. Therefore, we sequenced the genome of one of our Phoma strains (F3723) previously identified as having antifungal activity in a high-throughput screen. We found a biosynthetic gene cluster that was predicted to synthesize a cyclic lipodepsipeptide that differs in amino acid composition from Phomafungin. Antifungal activity-guided isolation yielded a new compound, BII-Rafflesfungin, the structure of which was determined. We describe the NRPS-t1PKS cluster ‘BIIRfg’ compatible with the synthesis of the cyclic lipodepsipeptide BII-Rafflesfungin [HMHDA-L-Ala-L-Glu-L-Asn-L-Ser-L-Ser-D-Ser-D-allo-Thr-Gly]. We report new Stachelhaus codes for Ala, Glu, Asn, Ser, Thr, and Gly. We propose a mechanism for BII-Rafflesfungin biosynthesis, which involves the formation of the lipid part by BIIRfg_PKS followed by activation and transfer of the lipid chain by a predicted AMP-ligase onto the first PCP domain of the BIIRfg_NRPS gene.

April 21, 2020

Direct pathway cloning of the sodorifen biosynthetic gene cluster and recombinant generation of its product in E. coli.

Serratia plymuthica WS3236 was selected for whole genome sequencing based on preliminary genetic and chemical screening indicating the presence of multiple natural product pathways. This led to the identification of a putative sodorifen biosynthetic gene cluster (BGC). The natural product sodorifen is a volatile organic compound (VOC) with an unusual polymethylated hydrocarbon bicyclic structure (C16H26) produced by selected strains of S. plymuthica. The BGC encoding sodorifen consists of four genes, two of which (sodA, sodB) are homologs of genes encoding enzymes of the non-mevalonate pathway and are thought to enhance the amounts of available farnesyl pyrophosphate (FPP), the precursor of sodorifen. Proceeding from FPP, only two enzymes are necessary to produce sodorifen: an S-adenosyl methionine dependent methyltransferase (SodC) with additional cyclisation activity and a terpene-cyclase (SodD). Previous analysis of S. plymuthica found sodorifen production titers are generally low and vary significantly among different producer strains. This precludes studies on the still elusive biological function of this structurally and biosynthetically fascinating bacterial terpene. Sequencing and mining of the S. plymuthica WS3236 genome revealed the presence of 38 BGCs according to antiSMASH analysis, including a putative sodorifen BGC. Further genome mining for sodorifen and sodorifen-like BGCs throughout bacteria was performed using SodC and SodD as queries and identified a total of 28 sod-like gene clusters. Using direct pathway cloning (DiPaC) we intercepted the 4.6 kb candidate sodorifen BGC from S. plymuthica WS3236 (sodA-D) and transformed it into Escherichia coli BL21. Heterologous expression under the control of the tetracycline inducible PtetO promoter firmly linked this BGC to sodorifen production. By utilizing this newly established expression system, we increased the production yields by approximately 26-fold when compared to the native producer. In addition, sodorifen was easily isolated in high purity by simple head-space sampling. Genome mining of all available genomes within the NCBI and JGI IMG databases led to the identification of a wealth of sod-like pathways which may be responsible for producing a range of structurally unknown sodorifen analogs. Introduction of the S. plymuthica WS3236 sodorifen BGC into the fast-growing heterologous expression host E. coli with a very low VOC background led to a significant increase in both sodorifen product yield and purity compared to the native producer. By providing a reliable, high-level production system, this study sets the stage for future investigations of the biological role and function of sodorifen and for functionally unlocking the bioinformatically identified putative sod-like pathways.

October 23, 2019

Rapid CRISPR/Cas9-mediated cloning of full-length Epstein-Barr virus genomes from latently infected cells.

Herpesviruses have relatively large DNA genomes of more than 150 kb that are difficult to clone and sequence. Bacterial artificial chromosome (BAC) cloning of herpesvirus genomes is a powerful technique that greatly facilitates whole viral genome sequencing as well as functional characterization of reconstituted viruses. We describe recently invented technologies for rapid BAC cloning of herpesvirus genomes using CRISPR/Cas9-mediated homology-directed repair. We focus on recent BAC cloning techniques for Epstein-Barr virus (EBV) genomes and discuss the possible advantages of a CRISPR/Cas9-mediated strategy compared with previous EBV-BAC cloning strategies. We also describe the design decisions of this technology as well as possible pitfalls and points to be improved in the future. The obtained EBV-BAC clones are subjected to long-read sequencing analysis to determine the complete EBV genome sequence, including repetitive regions. Rapid cloning and sequence determination of various EBV strains will greatly contribute to the understanding of their global geographical distribution. This technology can also be used to clone disease-associated EBV strains and test the hypothesis that they have special features that distinguish them from strains that infect asymptomatically.

September 22, 2019

Evolution of selective-sequencing approaches for virus discovery and virome analysis.

Recent advances in sequencing technologies have transformed the field of virus discovery and virome analysis. Traditional Sanger sequencing-based discovery of individual viruses, to which the field was once mostly confined, is now entirely replaced by high-throughput sequencing (HTS)-based virus metagenomics, which can be used to characterize the nature and composition of entire viromes. To better harness the potential of HTS for the study of viromes, sample preparation methodologies use different approaches to exclude amplification of non-viral components that can overshadow low-titer viruses. These virus-sequence enrichment approaches mostly focus on the sample preparation methods, like enzymatic digestion of non-viral nucleic acids and size exclusion of non-viral constituents by column filtration, ultrafiltration or density gradient centrifugation. However, a new approach to virus-sequence enrichment called virome-capture sequencing, focused on the amplification or HTS library preparation stage, was recently developed to improve virome characterization. This new approach has the potential to further transform the field of virus discovery and virome analysis, but its technical complexity and sequence-dependence warrant further improvements. In this review we discuss the different methods for selective sequencing-based virome analysis, their applications and evolution, and also propose refinements needed to harness the full potential of HTS for virome analysis. Copyright © 2017 Elsevier B.V. All rights reserved.

September 22, 2019

Widespread polycistronic transcripts in fungi revealed by single-molecule mRNA sequencing.

Genes in prokaryotic genomes are often arranged into clusters and co-transcribed into polycistronic RNAs. Isolated examples of polycistronic RNAs were also reported in some higher eukaryotes but their presence was generally considered rare. Here we developed a long-read sequencing strategy to identify polycistronic transcripts in several mushroom-forming fungal species including Plicaturopsis crispa, Phanerochaete chrysosporium, Trametes versicolor, and Gloeophyllum trabeum. We found genome-wide prevalence of polycistronic transcription in these Agaricomycetes, involving up to 8% of the transcribed genes. Unlike polycistronic mRNAs in prokaryotes, these co-transcribed genes are also independently transcribed. We show that polycistronic transcription may interfere with expression of the downstream tandem gene. Further comparative genomic analysis indicates that polycistronic transcription is conserved among a wide range of mushroom-forming fungi. In summary, our study revealed, for the first time, the genome-wide prevalence of polycistronic transcription across a phylogenetic range of higher fungi. Furthermore, we systematically show that our long-read sequencing approach, combined with a bioinformatics pipeline, is a generic and powerful tool for precise characterization of complex transcriptomes that enables identification of mRNA isoforms not recovered via short-read assembly.

September 22, 2019

Cultivation and sequencing of rumen microbiome members from the Hungate1000 Collection.

Productivity of ruminant livestock depends on the rumen microbiota, which ferment indigestible plant polysaccharides into nutrients used for growth. Understanding the functions carried out by the rumen microbiota is important for reducing greenhouse gas production by ruminants and for developing biofuels from lignocellulose. We present 410 cultured bacteria and archaea, together with their reference genomes, representing every cultivated rumen-associated archaeal and bacterial family. We evaluate polysaccharide degradation, short-chain fatty acid production and methanogenesis pathways, and assign specific taxa to functions. A total of 336 organisms were present in available rumen metagenomic data sets, and 134 were present in human gut microbiome data sets. Comparison with the human microbiome revealed rumen-specific enrichment for genes encoding de novo synthesis of vitamin B12, ongoing evolution by gene loss and potential vertical inheritance of the rumen microbiome based on underrepresentation of markers of environmental stress. We estimate that our Hungate genome resource represents ~75% of the genus-level bacterial and archaeal taxa present in the rumen.

September 22, 2019

The state of play in higher eukaryote gene annotation.

A genome sequence is worthless if it cannot be deciphered; therefore, efforts to describe – or ‘annotate’ – genes began as soon as DNA sequences became available. Whereas early work focused on individual protein-coding genes, the modern genomic ocean is a complex maelstrom of alternative splicing, non-coding transcription and pseudogenes. Scientists – from clinicians to evolutionary biologists – need to navigate these waters, and this has led to the design of high-throughput, computationally driven annotation projects. The catalogues that are being produced are key resources for genome exploration, especially as they become integrated with expression, epigenomic and variation data sets. Their creation, however, remains challenging.

September 22, 2019

Interpreting microbial biosynthesis in the genomic age: Biological and practical considerations.

Genome mining has become an increasingly powerful, scalable, and economically accessible tool for the study of natural product biosynthesis and drug discovery. However, there remain important biological and practical problems that can complicate or obscure biosynthetic analysis in genomic and metagenomic sequencing projects. Here, we focus on limitations of available technology as well as computational and experimental strategies to overcome them. We review the unique challenges and approaches in the study of symbiotic and uncultured systems, as well as those associated with biosynthetic gene cluster (BGC) assembly and product prediction. Finally, to explore sequencing parameters that affect the recovery and contiguity of large and repetitive BGCs assembled de novo, we simulate Illumina and PacBio sequencing of the Salinispora tropica genome focusing on assembly of the salinilactam (slm) BGC.

September 22, 2019

Resolving the complexity of human skin metagenomes using single-molecule sequencing.

Deep metagenomic shotgun sequencing has emerged as a powerful tool to interrogate composition and function of complex microbial communities. Computational approaches to assemble genome fragments have been demonstrated to be an effective tool for de novo reconstruction of genomes from these communities. However, the resultant “genomes” are typically fragmented and incomplete due to the limited ability of short-read sequence data to assemble complex or low-coverage regions. Here, we use single-molecule, real-time (SMRT) sequencing to reconstruct a high-quality, closed genome of a previously uncharacterized Corynebacterium simulans and its companion bacteriophage from a skin metagenomic sample. Considerable improvement in assembly quality occurs in hybrid approaches incorporating short-read data, with even relatively small amounts of long-read data being sufficient to improve metagenome reconstruction. Using short-read data to evaluate strain variation of this C. simulans in its skin community at single-nucleotide resolution, we observed a dominant C. simulans strain with moderate allelic heterozygosity throughout the population. We demonstrate the utility of SMRT sequencing and hybrid approaches in metagenome quantitation, reconstruction, and annotation. The species comprising a microbial community are often difficult to deconvolute due to technical limitations inherent to most short-read sequencing technologies. Here, we leverage new advances in sequencing technology, single-molecule sequencing, to significantly improve reconstruction of a complex human skin microbial community. With this long-read technology, we were able to reconstruct and annotate a closed, high-quality genome of a previously uncharacterized skin species. We demonstrate that hybrid approaches with short-read technology are sufficiently powerful to reconstruct even single-nucleotide polymorphism level variation of species in this community. Copyright © 2016 Tsai et al.

September 22, 2019

Complete genome sequence of Geobacillus thermodenitrificans T12, a potential host for biotechnological applications.

In an attempt to obtain a thermophilic host for the conversion of lignocellulose-derived substrates into lactic acid, Geobacillus thermodenitrificans T12 was isolated from a compost heap. It was selected from over 500 isolates as a genetically tractable hemicellulolytic lactic acid producer, requiring few nutrients. The strain is able to ferment glucose and xylose simultaneously and can produce lactic acid from xylan, making it a potential host for biotechnological applications. The genome of strain T12 consists of a 3.64 Mb chromosome and two plasmids of 59 and 56 kb. It has a total of 3,676 genes with an average genomic GC content of 48.7%. The T12 genome encodes a denitrification pathway, allowing for anaerobic respiration. The identity and localization of the responsible genes are similar to those of the denitrification pathways found in strain NG80-2. The hemicellulose utilization (HUS) locus was identified based on sequence homology against G. stearothermophilus T-6. It appeared that T12 has all the genes that are present in strain T-6 except for the arabinan degradation cluster. Instead, the HUS locus of strain T12 contains genes for both an inositol and a pectate degradation pathway. Strain T12 has complete pathways for the synthesis of purine and pyrimidine, all 20 amino acids and several vitamins except D-biotin. The host-defense systems present comprise a Type II and a Type III restriction-modification system, as well as a CRISPR-Cas Type II system. It is concluded that G. thermodenitrificans T12 is a potentially interesting candidate for industrial applications.

September 22, 2019

Identification of the biosynthetic pathway for the antibiotic bicyclomycin.

Diketopiperazines (DKPs) make up a large group of natural products with diverse structures and biological activities. Bicyclomycin is a broad-spectrum DKP antibiotic with unique structure and function: it contains a highly oxidized bicyclic [4.2.2] ring and is the only known selective inhibitor of the bacterial transcription termination factor, Rho. Here, we identify the biosynthetic gene cluster for bicyclomycin containing six iron-dependent oxidases. We demonstrate that the DKP core is made by a tRNA-dependent cyclodipeptide synthase, and hydroxylations on two unactivated sp³ carbons are performed by two mononuclear iron, α-ketoglutarate-dependent hydroxylases. Using bioinformatics, we also identify a homologous gene cluster prevalent in the human pathogen Pseudomonas aeruginosa. We detect bicyclomycin by overexpressing this gene cluster and establish P. aeruginosa as a new producer of bicyclomycin. Our work uncovers the biosynthetic pathway for bicyclomycin and sheds light on the intriguing oxidation chemistry that converts a simple DKP into a powerful antibiotic.
