July 7, 2019

Complete genome sequence of the halophile bacterium Kushneria marisflavi KCCM 80003T, isolated from seawater in Korea

We present the genome sequence of Kushneria marisflavi KCCM 80003T, isolated from the Yellow Sea in Korea. The complete genome of KCCM 80003T consists of a single, circular chromosome of 3,667,185 bp with an average G+C content of 59.05%, 3287 coding sequences, 12 rRNAs, and 66 tRNAs. Kushneria marisflavi KCCM 80003T, which belongs to the family Halomonadaceae, exhibited resistance to high salt concentrations and possesses coding sequences related to potassium metabolism or osmotic stress, including features for potassium homeostasis, ectoine biosynthesis and regulation, choline and betaine uptake, and betaine biosynthesis. These results provide a basis for understanding resistance strategies to osmotic stress at the genetic level and accordingly have implications for genetic engineering and biotechnology.
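The G+C content quoted above is the percentage of bases in the sequence that are G or C. As a minimal sketch of that arithmetic (the function name `gc_content` and the choice to skip ambiguous bases such as N are assumptions made for illustration, not details taken from the paper):

```python
def gc_content(sequence: str) -> float:
    """Return the G+C percentage of a nucleotide sequence.

    Assumption for this sketch: only unambiguous bases (A, C, G, T)
    are counted; anything else (e.g. N) is ignored.
    """
    counted = [base for base in sequence.upper() if base in "ACGT"]
    if not counted:
        raise ValueError("sequence contains no A, C, G or T bases")
    gc = sum(1 for base in counted if base in "GC")
    return 100.0 * gc / len(counted)
```

For a real genome the sequence would typically be read from a FASTA file first; the toy strings used to check this function are far shorter than the 3.67 Mbp chromosome described here.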


July 7, 2019

Complete genome sequence of the marine Rhodococcus sp. H-CA8f isolated from Comau fjord in Northern Patagonia, Chile

Rhodococcus sp. H-CA8f was isolated from marine sediments obtained from the Comau fjord, located in Northern Chilean Patagonia. Whole-genome sequencing on the PacBio RS II platform yielded one closed, complete chromosome of 6.19 Mbp with a 62.45% G+C content. The chromosome harbours several metabolic pathways providing a wide catabolic potential, where the upper biphenyl route is described. Rhodococcus sp. H-CA8f also bears one linear mega-plasmid of 301 Kbp with a 62.34% G+C content; genomic analyses demonstrated that it consists mostly of putative ORFs with unknown functions, representing a novel genetic feature. These genetic characteristics provide relevant insights regarding Chilean marine actinobacterial strains.


July 7, 2019

Synthetic biology, genome mining, and combinatorial biosynthesis of NRPS-derived antibiotics: a perspective.

Combinatorial biosynthesis of novel secondary metabolites derived from nonribosomal peptide synthetases (NRPSs) has been in slow development for about a quarter of a century. Progress has been hampered by the complexity of the giant multimodular multienzymes. More recently, advances have been made on understanding the chemical and structural biology of these complex megaenzymes, and on learning the design rules for engineering functional hybrid enzymes. In this perspective, I address what has been learned about successful engineering of complex lipopeptides related to daptomycin, and discuss how synthetic biology and microbial genome mining can converge to broaden the scope and enhance the speed and robustness of combinatorial biosynthesis of NRPS-derived natural products for drug discovery.


July 7, 2019

Strategies for high-altitude adaptation revealed from high-quality draft genome of non-violacein producing Janthinobacterium lividum ERGS5:01.

A light pink coloured bacterial strain, ERGS5:01, isolated from glacial stream water of the Sikkim Himalaya, was affiliated to Janthinobacterium lividum based on 16S rRNA gene sequence identity and phylogenetic clustering. Whole genome sequencing was performed to confirm the strain's taxonomy, as it lacked the typical violet pigmentation of the genus, and to decipher its survival strategy in the aquatic ecosystem of high elevation. PacBio RSII sequencing generated a genome of 5,168,928 bp with 4575 protein-coding genes and 118 RNA genes. Whole genome-based multilocus sequence analysis clustering, an in silico DDH similarity value of 95.1%, and an ANI value of 99.25% established the identity of strain ERGS5:01 (MCC 2953) as a non-violacein producing J. lividum. Genome comparisons across the genus Janthinobacterium revealed an open pan-genome, with scope for the addition of new orthologous clusters to complete the genomic inventory. The genome provided the genetic basis of tolerance to freezing and frequent freeze-thaw cycles, and of industrially important enzymes. Extended insight into the genome provided clues to crucial genes associated with adaptation to the harsh aquatic ecosystem of high altitude.


July 7, 2019

Satellite DNA evolution: old ideas, new approaches.

A substantial portion of the genomes of most multicellular eukaryotes consists of large arrays of tandemly repeated sequence, collectively called satellite DNA. The processes generating and maintaining different satellite DNA abundances across lineages are important to understand as satellites have been linked to chromosome mis-segregation, disease phenotypes, and reproductive isolation between species. While much theory has been developed to describe satellite evolution, empirical tests of these models have fallen short because of the challenges in assessing satellite repeat regions of the genome. Advances in computational tools and sequencing technologies now enable identification and quantification of satellite sequences genome-wide. Here, we describe some of these tools and how their applications are furthering our knowledge of satellite evolution and function. Copyright © 2018 Elsevier Ltd. All rights reserved.


July 7, 2019

FusorSV: an algorithm for optimally combining data from multiple structural variation detection methods.

Comprehensive and accurate identification of structural variations (SVs) from next generation sequencing data remains a major challenge. We develop FusorSV, which uses a data mining approach to assess performance and merge callsets from an ensemble of SV-calling algorithms. It includes a fusion model built using analysis of 27 deep-coverage human genomes from the 1000 Genomes Project. We identify 843 novel SV calls that were not reported by the 1000 Genomes Project for these 27 samples. Experimental validation of a subset of these calls yields a validation rate of 86.7%. FusorSV is available at https://github.com/TheJacksonLaboratory/SVE .


July 7, 2019

Phylogeny of dermatophytes with genomic character evaluation of clinically distinct Trichophyton rubrum and T. violaceum

Trichophyton rubrum and T. violaceum are prevalent agents of human dermatophyte infections, the former being found on glabrous skin and nail, while the latter is confined to the scalp. The two species are phenotypically different but are highly similar phylogenetically. The taxonomy of dermatophytes is currently being reconsidered on the basis of molecular phylogeny. Molecular species definitions do not always coincide with existing concepts which are guided by ecological and clinical principles. In this article, we aim to bring phylogenetic and ecological data together in an attempt to develop new species concepts for anthropophilic dermatophytes. Focus is on the T. rubrum complex with analysis of rDNA ITS supplemented with LSU, TUB2, TEF3 and ribosomal protein L10 gene sequences. In order to explore genomic differences between T. rubrum and T. violaceum, one representative of each species was whole genome sequenced. Draft sequences were compared with currently available dermatophyte genomes. Potential virulence factors of adhesins and secreted proteases were predicted and compared phylogenetically. General phylogeny showed clear gaps between geophilic species of Arthroderma, but multilocus distances between species were often very small in the derived anthropophilic and zoophilic genus Trichophyton. Significant genome conservation between T. rubrum and T. violaceum was observed, with a high similarity at the nucleic acid level of 99.38% identity. Trichophyton violaceum contains more paralogs than T. rubrum. About 30 adhesion genes were predicted among dermatophytes. Seventeen adhesins were common between T. rubrum and T. violaceum, while four were specific for the former and eight for the latter. Phylogenetic analysis of secreted proteases reveals considerable expansion and conservation among the analyzed species. Multilocus phylogeny and genome comparison of T. rubrum and T. violaceum underlined their close affinity. The possibility that they represent a single species exhibiting different phenotypes due to different localizations on the human body is discussed.


July 7, 2019

The case for not masking away repetitive DNA

In the course of analyzing whole-genome data, it is common practice to mask or filter out repetitive regions of a genome, such as transposable elements and endogenous retroviruses, in order to focus only on genes and thus simplify the results. This Commentary is a plea from one member of the Mobile DNA community to all gene-centric researchers: please do not ignore the repetitive fraction of the genome. Please stop narrowing your findings by only analyzing a minority of the genome, and instead broaden your analyses to include the rich biology of repetitive and mobile DNA. In this article, I present four arguments supporting a case for retaining repetitive DNA in your genome-wide analysis.


July 7, 2019

Short genome report of cellulose-producing commensal Escherichia coli 1094.

Bacterial surface colonization and biofilm formation often rely on the production of an extracellular polymeric matrix that mediates cell-cell and cell-surface contacts. In Escherichia coli and many Betaproteobacteria and Gammaproteobacteria, cellulose is often the main component of the extracellular matrix. Here we report the complete genome sequence of the cellulose-producing strain E. coli 1094 and compare it with five other closely related genomes within E. coli phylogenetic group A. We present a comparative analysis of the regions encoding genes responsible for cellulose biosynthesis and discuss the changes that could have led to the loss of this important adaptive advantage in several E. coli strains. Data deposition: The annotated genome sequence has been deposited at the European Nucleotide Archive under the accession number PRJEB21000.


July 7, 2019

Complete genome sequence of “Thiodictyon syntrophicum” sp. nov. strain Cad16T, a photolithoautotrophic purple sulfur bacterium isolated from the alpine meromictic Lake Cadagno.

Thiodictyon syntrophicum sp. nov. strain Cad16T is a photoautotrophic purple sulfur bacterium belonging to the family of Chromatiaceae in the class of Gammaproteobacteria. The type strain Cad16T was isolated from the chemocline of the alpine meromictic Lake Cadagno in Switzerland. Strain Cad16T represents a key species within this sulfur-driven bacterial ecosystem with respect to carbon fixation. The 7.74-Mbp genome of strain Cad16T has been sequenced and annotated. It encodes 6237 predicted protein sequences and 59 RNA sequences. Phylogenetic comparison based on 16S rRNA revealed that Thiodictyon elegans strain DSM 232T is the most closely related species. Genes involved in sulfur oxidation, central carbon metabolism and transmembrane transport were found. Notably, clusters of genes encoding the photosynthetic machinery and pigment biosynthesis are found on the 0.48 Mb plasmid pTs485. We provide a detailed insight into the Cad16T genome and analyze it in the context of the microbial ecosystem of Lake Cadagno.


July 7, 2019

Complete genome sequence of Gordonia sp. YC-JH1, a bacterium efficiently degrading a wide range of phthalic acid esters.

Phthalic acid esters (PAEs) are a family of recalcitrant pollutants mainly used as plasticizers. The strain Gordonia sp. YC-JH1, isolated from petroleum-contaminated soil, is capable of efficiently degrading a wide range of PAEs. To investigate the genetic mechanism of PAE catabolism by strain YC-JH1, its complete genome was sequenced using SMRT sequencing technology. The genome comprises a circular chromosome and a plasmid with sizes of 4,101,557 bp and 91,767 bp, respectively. Based on the genome sequence, 3563 protein-coding genes are predicted, among which the genes responsible for PAE degradation are identified, including the two PAE hydrolase genes and the gene clusters for phthalic acid and protocatechuic acid degradation. The genome information provides a genomic basis for the complete metabolism of PAEs. The wide substrate spectrum of this strain and its genetic basis should expand its application potential for environmental bioremediation, provide novel gene resources involved in PAE degradation for biotechnology and gene engineering, and help shed light on the mechanism of PAE metabolism. Copyright © 2018. Published by Elsevier B.V.


July 7, 2019

RIFRAF: a frame-resolving consensus algorithm.

Protein coding genes can be studied using long-read next generation sequencing. However, high rates of indel sequencing errors are problematic, corrupting the reading frame. Even the consensus of multiple independent sequence reads retains indel errors. To solve this problem, we introduce Reference-Informed Frame-Resolving multiple-Alignment Free template inference algorithm (RIFRAF), a sequence consensus algorithm that takes a set of error-prone reads and a reference sequence and infers an accurate in-frame consensus. RIFRAF uses a novel structure, analogous to a two-layer hidden Markov model: the consensus is optimized to maximize alignment scores with both the set of noisy reads and with a reference. The template-to-reads component of the model encodes the preponderance of indels, and is sensitive to the per-base quality scores, giving greater weight to more accurate bases. The reference-to-template component of the model penalizes frame-destroying indels. A local search algorithm proceeds in stages to find the best consensus sequence for both objectives. Using Pacific Biosciences SMRT sequences from an HIV-1 env clone, NL4-3, we compare our approach to other consensus and frame correction methods. RIFRAF consistently finds a consensus sequence that is more accurate and in-frame, especially with small numbers of reads. It was able to perfectly reconstruct over 80% of consensus sequences from as few as three reads, whereas the best alternative required twice as many. RIFRAF is able to achieve these results and keep the consensus in-frame even with a distantly related reference sequence. Moreover, unlike other frame correction methods, RIFRAF can detect and keep true indels while removing erroneous ones. RIFRAF is implemented in Julia, and source code is publicly available at https://github.com/MurrellGroup/Rifraf.jl. Supplementary data are available at Bioinformatics online.
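To illustrate why an indel can corrupt the reading frame, and what penalizing "frame-destroying" indels might look like, here is a minimal Python sketch. It is not RIFRAF's scoring model or code (RIFRAF itself is written in Julia); the helper names `codons` and `indel_penalty`, the cost values, and the toy template are all hypothetical and chosen only for illustration.

```python
def codons(seq: str) -> list[str]:
    """Split a sequence into complete codons, dropping any trailing partial codon."""
    usable = len(seq) - len(seq) % 3
    return [seq[i:i + 3] for i in range(0, usable, 3)]


def indel_penalty(length: int,
                  frameshift_cost: float = 10.0,
                  codon_cost: float = 1.0) -> float:
    """Toy penalty (hypothetical costs): indels whose length is a multiple
    of 3 keep the reading frame and are cheap; all others shift the frame
    and are expensive."""
    if length <= 0:
        raise ValueError("indel length must be positive")
    if length % 3 == 0:
        return codon_cost * (length // 3)
    return frameshift_cost


template = "ATGGCTAAAGGTTAA"                      # ATG GCT AAA GGT TAA
one_base_deleted = template[:4] + template[5:]    # 1-base deletion: frame shifts
codon_deleted = template[:3] + template[6:]       # 3-base deletion: frame kept

print(codons(template))
print(codons(one_base_deleted))
print(codons(codon_deleted))
```

In this toy case the single-base deletion changes every downstream codon, while the three-base deletion only removes one codon. That difference is why a consensus method can reasonably treat the two kinds of indel differently, keeping plausible in-frame indels while discouraging frameshifts.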


July 7, 2019

Assembly of a complete genome sequence for Gemmata obscuriglobus reveals a novel prokaryotic rRNA operon gene architecture.

Gemmata obscuriglobus is a Gram-negative bacterium with several intriguing biological features. Here, we present a complete, de novo whole genome assembly for G. obscuriglobus which consists of a single, circular 9 Mb chromosome, with no plasmids detected. The genome was annotated using the NCBI Prokaryotic Genome Annotation pipeline to generate common gene annotations. Analysis of the rRNA genes revealed three interesting features for a bacterium. First, linked G. obscuriglobus rrn operons have a unique gene order, 23S-5S-16S, compared to typical prokaryotic rrn operons (16S-23S-5S). Second, G. obscuriglobus rrn operons can either be linked or unlinked (a 16S gene is in a separate genomic location from a 23S and 5S gene pair). Third, all of the 23S genes (5 in total) have unique polymorphisms. Genome analysis of a different Gemmata species (SH-PL17), revealed a similar 23S-5S-16S gene order in all of its linked rrn operons and the presence of an unlinked operon. Together, our findings show that unique and rare features in Gemmata rrn operons among prokaryotes provide a means to better define the evolutionary relatedness of Gemmata species and the divergence time for different Gemmata species. Additionally, these rrn operon differences provide important insights into the rrn operon architecture of common ancestors of the planctomycetes.

