July 7, 2019

Probiotic genomes: Sequencing and annotation in the past decade

Probiotics are live microorganisms that confer many health benefits on the host when administered in adequate quantities. These health benefits have drawn considerable attention to probiotics and have given an impetus to their use as dietary supplements for the improvement of general health and as adjuvant therapies for certain diseases. The increased demand for probiotic products in recent times has driven probiotic research across several areas of human biology. Advances in genomic technologies have further facilitated the sequencing of the genomes of such probiotic bacteria and their genomic analysis to identify the genes responsible for the beneficial effects they are known to exert. This work reviews the application of genomic strategies to probiotic bacteria and provides details of the probiotic strains whose genome sequences are available. It also consolidates the genomic tools used for the sequencing, assembly and annotation of the probiotic genes and describes how they have helped in comparative genomic analyses.


July 7, 2019

Thauera sinica sp. nov., a phenol derivative-degrading bacterium isolated from activated sludge.

A bacterial strain, K11T, capable of degrading phenol derivatives was isolated from activated sludge of a sewage treatment plant in China. This strain, which can degrade more than ten phenol derivatives, was identified as a Gram-stain negative, rod-shaped, asporogenous, facultative anaerobic bacterium with a polar flagellum. The strain was found to grow in tryptic soy broth in the presence of 0-2.5% (w/v) NaCl (optimum 0-1%), at 4-43 °C (optimum 30-35 °C) and pH 4.5-10.5 (optimum 7.5-8). Comparative analysis of nearly full-length 16S rRNA gene sequences showed that this strain belongs to the genus Thauera. The 16S rRNA gene sequence was found to show high similarity (97.5%) to that of Thauera chlorobenzoica 3CB-1T, with lesser similarity to other recognised Thauera strains. The G+C content of the DNA of the strain was determined to be 67.8 mol%. The DNA-DNA hybridization value between K11T and Thauera aromatica DSM6984T was 10.4 ± 4.5%. The genomic OrthoANI values of K11T with the other nine type strains of genus Thauera were less than 81.1%. Chemotaxonomic analysis of strain K11T revealed that Q-8 is the predominant quinone; the polar lipids contain phosphatidylglycerol, diphosphatidylglycerol, phosphatidylethanolamine, two unidentified phospholipids and five uncharacterised lipids; the major cellular fatty acid was identified as summed feature 3 (C16:1 ω7c and/or iso-C15:0 2-OH; 45.9%), followed by C16:0 (20.5%) and C18:1 ω7c (15.8%). Based on the phenotypic and phylogenetic evidence, DNA-DNA hybridisation, OrthoANI, chemotaxonomic analysis and results of the physiological and biochemical tests, a new species named Thauera sinica sp. nov. is proposed with strain K11T (= CGMCC 1.15731T = KACC 19216T) designated as the type strain.
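
The abstract reports a G+C content of 67.8 mol% without stating how it was computed. As a minimal sketch, assuming the value is derived from a genome sequence by counting bases, G+C mol% can be calculated as below. The function name and the toy sequence are hypothetical and are not taken from the paper.

```python
def gc_content_mol_percent(sequence: str) -> float:
    """Return G+C content as a percentage of the unambiguous bases A, C, G, T.

    Ambiguity codes such as N are ignored (an assumption of this sketch).
    """
    seq = sequence.upper()
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T")
    total = gc + at
    if total == 0:
        raise ValueError("sequence contains no unambiguous bases")
    return 100.0 * gc / total


# Hypothetical toy sequence, not from strain K11T: 6 of 8 counted bases are G or C.
example = "GCGCGCATNN"
print(round(gc_content_mol_percent(example), 1))  # prints 75.0
```

For a multi-contig assembly, the same count would be accumulated over all contigs before dividing.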


July 7, 2019

Darwin: A genomics co-processor provides up to 15,000 X acceleration on long read assembly

of life in fundamental ways. Genomics data, however, is far outpacing Moore’s Law. Third-generation sequencing technologies produce 100× longer reads than second generation technologies and reveal a much broader mutation spectrum of disease and evolution. However, these technologies incur prohibitively high computational costs. Over 1,300 CPU hours are required for reference-guided assembly of the human genome (using [47]), and over 15,600 CPU hours are required for de novo assembly [57]. This paper describes “Darwin”, a co-processor for genomic sequence alignment that, without sacrificing sensitivity, provides up to 15,000× speedup over the state-of-the-art software for reference-guided assembly of third-generation reads. Darwin achieves this speedup through hardware/algorithm co-design, trading more easily accelerated alignment for less memory-intensive filtering, and by optimizing the memory system for filtering. Darwin combines a hardware-accelerated version of D-SOFT, a novel filtering algorithm, with a hardware-accelerated version of GACT, a novel alignment algorithm. GACT generates near-optimal alignments of arbitrarily long genomic sequences using constant memory for the compute-intensive step. Darwin is adaptable, with tunable speed and sensitivity to match emerging sequencing technologies and to meet the requirements of genomic applications beyond read assembly.
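
The abstract does not describe how D-SOFT filters candidate locations, so the following is only an illustration of the generic seed-and-filter idea that precedes alignment: exact k-mer seeds are looked up in a reference index, and hits are counted per diagonal band. It is not the D-SOFT algorithm. All function names, parameters, and sequences are assumptions made for this sketch.

```python
from collections import defaultdict


def build_seed_index(reference: str, k: int) -> dict:
    """Map every k-mer in the reference to the list of positions where it starts."""
    index = defaultdict(list)
    for pos in range(len(reference) - k + 1):
        index[reference[pos:pos + k]].append(pos)
    return index


def candidate_bands(query: str, index: dict, k: int, band_width: int, min_hits: int) -> list:
    """Return diagonal bands that collect at least min_hits exact seed hits.

    A hit at reference position r and query position q lies on diagonal r - q;
    diagonals are grouped into bands of band_width so that small indels
    do not scatter the hits of one true match across many bins.
    """
    hits_per_band = defaultdict(int)
    for q_pos in range(len(query) - k + 1):
        for r_pos in index.get(query[q_pos:q_pos + k], ()):
            hits_per_band[(r_pos - q_pos) // band_width] += 1
    return sorted(band for band, hits in hits_per_band.items() if hits >= min_hits)


# Hypothetical toy data: the query is an exact substring starting at reference position 7.
reference = "GATTACAGGCCTTAAGCTAGCTA"
query = "GGCCTTAAGC"
index = build_seed_index(reference, k=4)
print(candidate_bands(query, index, k=4, band_width=4, min_hits=3))  # prints [1]
```

Only the bands that pass the filter would be handed to the more expensive alignment step, which is the trade-off between filtering and alignment that the abstract refers to.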


July 7, 2019

Emerging mechanisms of antimicrobial resistance in bacteria and fungi: advances in the era of genomics.

Bacteria and fungi continue to develop new ways to adapt and survive the lethal or biostatic effects of antimicrobials through myriad mechanisms. Novel antibiotic resistance genes such as lsa(C), erm(44), VCC-1, mcr-1, mcr-2, mcr-3, mcr-4, bla KLUC-3 and bla KLUC-4 were discovered through comparative genomics and further functional studies. In addition, mutations in genes that hitherto were unknown to confer resistance to antimicrobials, such as trm, PP2C, rpsJ, HSC82, FKS2 and Rv2887, were shown by genomics and transcomplementation assays to mediate antimicrobial resistance in Acinetobacter baumannii, Staphylococcus aureus, Enterococcus faecium, Saccharomyces cerevisiae, Candida glabrata and Mycobacterium tuberculosis, respectively. Thus, genomics, transcriptomics and metagenomics, coupled with functional studies, are the future of antimicrobial resistance research and novel drug discovery or design.


July 7, 2019

Genome sequence resources for the wheat stripe rust pathogen (Puccinia striiformis f. sp. tritici) and the barley stripe rust pathogen (Puccinia striiformis f. sp. hordei)

Puccinia striiformis f. sp. tritici causes devastating stripe (yellow) rust on wheat and P. striiformis f. sp. hordei causes stripe rust on barley. Several P. striiformis f. sp. tritici genomes are available, but no P. striiformis f. sp. hordei genome is available. More genomes of P. striiformis f. sp. tritici and P. striiformis f. sp. hordei are needed to understand the genome evolution and molecular mechanisms of their pathogenicity. We sequenced P. striiformis f. sp. tritici isolate 93-210 and P. striiformis f. sp. hordei isolate 93TX-2, using PacBio and Illumina technologies and RNA sequencing. Their genomic sequences were assembled into contigs with high continuity and showed significant structural differences. The circular mitochondrial genomes of both were complete. These genomes provide high-quality resources for deciphering the genomic basis of rapid evolution and host adaptation, identifying genes for avirulence and other important traits, and studying host-pathogen interactions.


July 7, 2019

Optimise wheat A-genome.

The wild einkorn wheat Triticum urartu (Tu) is the A-genome progenitor of tetraploid (AABB) and hexaploid (AABBDD) wheat. A draft genome of Tu was published in 2013, but a better reference sequence is urgently needed by scientists and breeders. Hong-Qing Ling, from the Institute of Genetics and Developmental Biology, Chinese Academy of Sciences, and colleagues have now completed a high-quality Tu genome using multiple methods.


July 7, 2019

RIFRAF: a frame-resolving consensus algorithm.

Protein coding genes can be studied using long-read next generation sequencing. However, high rates of indel sequencing errors are problematic, corrupting the reading frame. Even the consensus of multiple independent sequence reads retains indel errors. To solve this problem, we introduce Reference-Informed Frame-Resolving multiple-Alignment Free template inference algorithm (RIFRAF), a sequence consensus algorithm that takes a set of error-prone reads and a reference sequence and infers an accurate in-frame consensus. RIFRAF uses a novel structure, analogous to a two-layer hidden Markov model: the consensus is optimized to maximize alignment scores with both the set of noisy reads and with a reference. The template-to-reads component of the model encodes the preponderance of indels, and is sensitive to the per-base quality scores, giving greater weight to more accurate bases. The reference-to-template component of the model penalizes frame-destroying indels. A local search algorithm proceeds in stages to find the best consensus sequence for both objectives. Using Pacific Biosciences SMRT sequences from an HIV-1 env clone, NL4-3, we compare our approach to other consensus and frame correction methods. RIFRAF consistently finds a consensus sequence that is more accurate and in-frame, especially with small numbers of reads. It was able to perfectly reconstruct over 80% of consensus sequences from as few as three reads, whereas the best alternative required twice as many. RIFRAF is able to achieve these results and keep the consensus in-frame even with a distantly related reference sequence. Moreover, unlike other frame correction methods, RIFRAF can detect and keep true indels while removing erroneous ones. RIFRAF is implemented in Julia, and source code is publicly available at https://github.com/MurrellGroup/Rifraf.jl. Supplementary data are available at Bioinformatics online.
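
To make the notion of a frame-destroying indel concrete: an indel shifts the reading frame when its length is not a multiple of three. The sketch below only applies that criterion to a gapped pairwise alignment of a consensus against a reference. It is not the RIFRAF algorithm or its scoring model, and it is written in Python for illustration rather than taken from the Julia implementation. The function name and the toy alignment are hypothetical.

```python
import re


def frame_destroying_gaps(aligned_consensus: str, aligned_reference: str) -> list:
    """Return (start, length, kind) for each gap run whose length is not a multiple of 3.

    A gap run in the consensus row means the consensus lacks bases present in
    the reference ("deletion"); a gap run in the reference row means the
    consensus carries extra bases ("insertion"). Positions are alignment columns.
    """
    if len(aligned_consensus) != len(aligned_reference):
        raise ValueError("aligned sequences must have equal length")
    findings = []
    rows = (("deletion", aligned_consensus), ("insertion", aligned_reference))
    for kind, row in rows:
        for match in re.finditer(r"-+", row):
            length = match.end() - match.start()
            if length % 3 != 0:
                findings.append((match.start(), length, kind))
    return sorted(findings)


# Hypothetical toy alignment: a 1-base deletion (shifts the frame)
# and a 3-base insertion (keeps the frame).
consensus = "ATGGC-AAATTTGGG"
reference = "ATGGCCAAA---GGG"
print(frame_destroying_gaps(consensus, reference))  # prints [(5, 1, 'deletion')]
```

The 3-base insertion is not reported, which mirrors the distinction the abstract draws between indels that can be true biological variation and those that corrupt the reading frame.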


July 7, 2019

Isolation and identification of an anthracimycin analogue from Nocardiopsis kunsanensis, a halophile from a saltern, by genomic mining strategy.

Modern medicine is unthinkable without antibiotics; yet, growing issues with microbial drug resistance require an intensified search for new active compounds. Natural products generated by Actinobacteria have been a rich source of candidate antibiotics, for example anthracimycin, which, so far, is only known to be produced by Streptomyces species. Based on sequence similarity with the respective biosynthetic cluster, we sifted through available microbial genome data with the goal of finding alternative anthracimycin-producing organisms. In this work, we report the prediction and experimental verification of the production of anthracimycin derivatives by Nocardiopsis kunsanensis, a non-Streptomyces actinobacterial microorganism. We discovered that N. kunsanensis predominantly produces a new anthracimycin derivative with a methyl group at C-8 and none at C-2, labeled anthracimycin BII-2619, besides a minor amount of anthracimycin. It displays activity against Gram-positive bacteria with a similarly low level of mammalian cytotoxicity to that of anthracimycin.


July 7, 2019

The draft genome of the lichen-forming fungus Lasallia hispanica (Frey) Sancho & A. Crespo

Lasallia hispanica (Frey) Sancho & A. Crespo is one of three Lasallia species occurring in central-western Europe. It is an orophytic, photophilous Mediterranean endemic which is sympatric with the closely related, widely distributed, highly clonal sister taxon L. pustulata in the supra- and oro-Mediterranean belts. We sequenced the genome of L. hispanica from a multispore isolate. The total genome length is 41.2 Mb, including 8488 gene models. We present the annotation of a variety of genes that are involved in protein secretion, mating processes and secondary metabolism, and we report transposable elements. Additionally, we compared the genome of L. hispanica to the closely related, yet ecologically distant, L. pustulata and found high synteny in gene content and order. The newly assembled and annotated L. hispanica genome represents a useful resource for future investigations into niche differentiation, speciation and microevolution in L. hispanica and other members of the genus.


July 7, 2019

Complete genome sequence of Mycobacterium shigaense.

Mycobacterium shigaense is a slowly growing scotochromogenic species and a member of the Mycobacterium simiae complex group. Here, we report the complete sequence of its genome, comprising a 5.2-Mb chromosome. The sequence will represent the essential data for future phylogenetic and comparative genome studies of the Mycobacterium simiae complex group. Copyright © 2018 Yoshida et al.


July 7, 2019

The challenge of analyzing the sugarcane genome.

Reference genome sequences have become key platforms for genetics and breeding of the major crop species. Sugarcane is probably the largest crop produced in the world (in weight of crop harvested) but lacks a reference genome sequence. Sugarcane has one of the most complex genomes in crop plants due to the extreme level of polyploidy. The genome of modern sugarcane hybrids includes sub-genomes from two progenitors Saccharum officinarum and S. spontaneum with some chromosomes resulting from recombination between these sub-genomes. Advancing DNA sequencing technologies and strategies for genome assembly are making the sugarcane genome more tractable. Advances in long read sequencing have allowed the generation of a more complete set of sugarcane gene transcripts. This is supporting transcript profiling in genetic research. The progenitor genomes are being sequenced. A monoploid coverage of the hybrid genome has been obtained by sequencing BAC clones that cover the gene space of the closely related sorghum genome. The complete polyploid genome is now being sequenced and assembled. The emerging genome will allow comparison of related genomes and increase understanding of the functioning of this polyploidy system. Sugarcane breeding for traditional sugar and new energy and biomaterial uses will be enhanced by the availability of these genomic resources.


July 7, 2019

Analysis of resistance genes of clinical Pannonibacter phragmitetus strain 31801 by complete genome sequencing.

To clarify the resistance mechanisms of Pannonibacter phragmitetus 31801, isolated from the blood of a liver abscess patient, at the genomic level, we performed whole genomic sequencing using a PacBio RS II single-molecule real-time long-read sequencer. Bioinformatic analysis of the resulting sequence was then carried out to identify any possible resistance genes. Analyses included Basic Local Alignment Search Tool searches against the Antibiotic Resistance Genes Database, ResFinder analysis of the genome sequence, and Resistance Gene Identifier analysis within the Comprehensive Antibiotic Resistance Database. Prophages, clustered regularly interspaced short palindromic repeats (CRISPR), and other putative virulence factors were also identified using PHAST, CRISPRfinder, and the Virulence Factors Database, respectively. The circular chromosome and single plasmid of P. phragmitetus 31801 contained multiple antibiotic resistance genes, including those coding for three different types of β-lactamase [NPS β-lactamase (EC 3.5.2.6), β-lactamase class C, and a metal-dependent hydrolase of β-lactamase superfamily I]. In addition, genes coding for subunits of several multidrug-resistance efflux pumps were identified, including those targeting macrolides (adeJ, cmeB), tetracycline (acrB, adeAB), fluoroquinolones (acrF, ceoB), and aminoglycosides (acrD, amrB, ceoB, mexY, smeB). However, apart from the tripartite macrolide efflux pump macAB-tolC, the genome did not appear to contain the complete complement of subunit genes required for production of most of the major multidrug-resistance efflux pumps.


July 7, 2019

Assembly of a complete genome sequence for Gemmata obscuriglobus reveals a novel prokaryotic rRNA operon gene architecture.

Gemmata obscuriglobus is a Gram-negative bacterium with several intriguing biological features. Here, we present a complete, de novo whole genome assembly for G. obscuriglobus which consists of a single, circular 9 Mb chromosome, with no plasmids detected. The genome was annotated using the NCBI Prokaryotic Genome Annotation pipeline to generate common gene annotations. Analysis of the rRNA genes revealed three interesting features for a bacterium. First, linked G. obscuriglobus rrn operons have a unique gene order, 23S-5S-16S, compared to typical prokaryotic rrn operons (16S-23S-5S). Second, G. obscuriglobus rrn operons can either be linked or unlinked (a 16S gene is in a separate genomic location from a 23S and 5S gene pair). Third, all of the 23S genes (5 in total) have unique polymorphisms. Genome analysis of a different Gemmata species (SH-PL17) revealed a similar 23S-5S-16S gene order in all of its linked rrn operons and the presence of an unlinked operon. Together, our findings show that unique and rare features in Gemmata rrn operons among prokaryotes provide a means to better define the evolutionary relatedness of Gemmata species and the divergence time for different Gemmata species. Additionally, these rrn operon differences provide important insights into the rrn operon architecture of common ancestors of the planctomycetes.
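
The linked versus unlinked distinction and the gene-order comparison can be illustrated with a small sketch that groups annotated rRNA genes by genomic distance and reports the order within each group. This is not the authors' analysis. The distance threshold, the coordinates, the function names, and the simplifying assumption that all genes lie on the same strand are hypothetical.

```python
def group_rrn_genes(genes: list, max_gap: int) -> list:
    """Group rRNA genes into clusters when neighbours are at most max_gap bases apart.

    genes is a list of (name, start, end) tuples, all assumed to be on one strand.
    Returns one tuple of gene names per cluster, in genomic order.
    """
    clusters = []
    current = []
    last_end = None
    for name, start, end in sorted(genes, key=lambda gene: gene[1]):
        if current and start - last_end > max_gap:
            clusters.append(tuple(current))
            current = []
        current.append(name)
        last_end = end
    if current:
        clusters.append(tuple(current))
    return clusters


def classify_cluster(cluster: tuple) -> str:
    """Label a cluster by the gene orders named in the abstract."""
    if cluster == ("16S", "23S", "5S"):
        return "linked, typical 16S-23S-5S order"
    if cluster == ("23S", "5S", "16S"):
        return "linked, 23S-5S-16S order"
    return "unlinked or partial"


# Hypothetical coordinates, not taken from the G. obscuriglobus assembly.
annotations = [
    ("23S", 1000, 3900), ("5S", 4000, 4120), ("16S", 4300, 5800),
    ("16S", 900000, 901500),
    ("23S", 2000000, 2002900), ("5S", 2003000, 2003120),
]
for cluster in group_rrn_genes(annotations, max_gap=2000):
    print("-".join(cluster), "->", classify_cluster(cluster))
```

Genes annotated on the reverse strand would need their order reversed before classification, which this sketch leaves out.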


July 7, 2019

Activation of the mismatch-specific endonuclease EndoMS/NucS by the replication clamp is required for high fidelity DNA replication.

The mismatch repair (MMR) system, exemplified by the MutS/MutL proteins, is widespread in Bacteria and Eukarya. However, the molecular mechanisms by which numerous archaea and bacteria lacking the mutS/mutL genes maintain high replication fidelity and genome stability have remained elusive. EndoMS is a recently discovered hyperthermophilic mismatch-specific endonuclease encoded by nucS in Thermococcales. We deleted nucS from the actinobacterium Corynebacterium glutamicum and demonstrated a drastic increase of spontaneous transition mutations in the nucS deletion strain. The observed spectra of these mutations were consistent with the enzymatic properties of EndoMS in vitro. Robust mismatch-specific endonuclease activity was detected with the purified C. glutamicum EndoMS protein, but only in the presence of the β-clamp (DnaN). Our biochemical and genetic data suggest that the frequently occurring G/T mismatch is efficiently repaired by the bacterial EndoMS-β-clamp complex formed via a carboxy-terminal sequence motif of EndoMS proteins. Our study thus has great implications for understanding how the activity of the novel MMR system is coordinated with the replisome and provides new mechanistic insight into genetic diversity and mutational patterns in industrially and clinically (e.g. Mycobacteria) important archaeal and bacterial phyla previously thought to be devoid of the MMR system.

