September 22, 2019

Comparative genomics and genotype-phenotype associations in Bifidobacterium breve.

Bifidobacteria are common members of the gastro-intestinal microbiota of a broad range of animal hosts. Their successful adaptation to this particular niche is linked to their saccharolytic metabolism, which is supported by a wide range of glycosyl hydrolases. In the current study a large-scale gene-trait matching (GTM) effort was performed to explore glycan degradation capabilities in B. breve. By correlating the presence/absence of genes and associated genomic clusters with growth/no-growth patterns across a dataset of 20 Bifidobacterium breve strains and nearly 80 different potential growth substrates, we not only validated the approach for a number of previously characterized carbohydrate utilization clusters, but we were also able to discover novel genetic clusters linked to the metabolism of salicin and sucrose. Using GTM, genetic associations were also established for antibiotic resistance and exopolysaccharide production, thereby identifying (novel) bifidobacterial antibiotic resistance markers and showing that the GTM approach is applicable to a variety of phenotypes. Overall, the GTM findings clearly expand our knowledge on members of the B. breve species, in particular how their variable genetic features can be linked to specific phenotypes.
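The gene-trait matching idea described above can be illustrated with a minimal sketch. It is not the authors' pipeline: the function name, the agreement score, the threshold, and the toy strains, genes, and growth calls are all hypothetical, chosen only to show how presence/absence patterns can be compared with growth/no-growth patterns.

```python
def gene_trait_matching(presence, growth, min_agreement=1.0):
    """Score each (gene, substrate) pair by the fraction of strains in which
    gene presence/absence agrees with growth/no-growth.

    presence: {gene: {strain: bool}}   (hypothetical input layout)
    growth:   {substrate: {strain: bool}}
    """
    matches = []
    for gene, has_gene in presence.items():
        for substrate, grows in growth.items():
            strains = has_gene.keys() & grows.keys()
            # Skip traits that do not vary across strains: they are uninformative.
            if len({grows[s] for s in strains}) < 2:
                continue
            agree = sum(has_gene[s] == grows[s] for s in strains)
            score = agree / len(strains)
            if score >= min_agreement:
                matches.append((gene, substrate, score))
    return sorted(matches)


# Toy data (invented for illustration only).
presence = {
    "geneA": {"s1": True, "s2": True, "s3": False, "s4": False},
    "geneB": {"s1": True, "s2": False, "s3": True, "s4": False},
}
growth = {
    "salicin": {"s1": True, "s2": True, "s3": False, "s4": False},
    "sucrose": {"s1": True, "s2": False, "s3": True, "s4": True},
}

perfect = gene_trait_matching(presence, growth)
relaxed = gene_trait_matching(presence, growth, min_agreement=0.75)
print(perfect)
print(relaxed)
```

A real analysis would also need a statistical test and correction for the many gene-substrate pairs compared; the exact-agreement score here is only the simplest possible stand-in.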


September 22, 2019

Integrating long-range connectivity information into de Bruijn graphs.

The de Bruijn graph is a simple and efficient data structure that is used in many areas of sequence analysis including genome assembly, read error correction and variant calling. The data structure has a single parameter k, is straightforward to implement and is tractable for large genomes with high sequencing depth. It also enables representation of multiple samples simultaneously to facilitate comparison. However, unlike the string graph, a de Bruijn graph does not retain long range information that is inherent in the read data. For this reason, applications that rely on de Bruijn graphs can produce sub-optimal results given their input data. We present a novel assembly graph data structure: the Linked de Bruijn Graph (LdBG). Constructed by adding annotations on top of a de Bruijn graph, it stores long range connectivity information through the graph. We show that with error-free data it is possible to losslessly store and recover sequence from a Linked de Bruijn graph. With assembly simulations we demonstrate that the LdBG data structure outperforms both our de Bruijn graph and the String Graph Assembler (SGA). Finally we apply the LdBG to Klebsiella pneumoniae short read data to make large (12 kbp) variant calls, which we validate using PacBio sequencing data, and to characterize the genomic context of drug-resistance genes. Linked de Bruijn Graphs and associated algorithms are implemented as part of McCortex, which is available under the MIT license at https://github.com/mcveanlab/mccortex. Supplementary data are available at Bioinformatics online.
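To make the loss of long-range information concrete, here is a minimal sketch of a plain de Bruijn graph. It does not reproduce McCortex or the LdBG annotations; the function names and the two toy reads are hypothetical, and the convention used (k-mers as nodes, edges between k-mers adjacent in a read) is one of several in use.

```python
from collections import defaultdict


def build_de_bruijn(reads, k):
    """Map each k-mer to the set of k-mers that directly follow it in some read."""
    graph = defaultdict(set)
    for read in reads:
        for i in range(len(read) - k):
            graph[read[i:i + k]].add(read[i + 1:i + k + 1])
    return graph


def spell_walks(graph, start, n_nodes):
    """Spell every sequence obtainable by walking n_nodes nodes from start."""
    k = len(start)
    paths = [start]
    for _ in range(n_nodes - 1):
        paths = [
            path + nxt[-1]
            for path in paths
            for nxt in sorted(graph.get(path[-k:], ()))
        ]
    return paths


# Two toy reads that share the k-mer "TCG" (invented for illustration).
reads = ["ATCGA", "TTCGC"]
graph = build_de_bruijn(reads, k=3)

branches = sorted(graph["TCG"])
walks = spell_walks(graph, "ATC", 3)
print(branches)
print(walks)
```

Walking from `ATC` spells both `ATCGA`, which is a real read, and `ATCGC`, which appears in neither read: the graph alone no longer records which branch each read took after the shared k-mer. Storing that kind of per-read routing information on top of the graph is the gap the abstract says the LdBG is designed to fill.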


September 22, 2019

Periodic variation of mutation rates in bacterial genomes associated with replication timing

The causes and consequences of spatiotemporal variation in mutation rates remain to be explored in nearly all organisms. Here we examine relationships between local mutation rates and replication timing in three bacterial species whose genomes have multiple chromosomes: Vibrio fischeri, Vibrio cholerae, and Burkholderia cenocepacia. Following five mutation accumulation experiments with these bacteria conducted in the near absence of natural selection, the genomes of clones from each lineage were sequenced and analyzed to identify variation in mutation rates and spectra. In lineages lacking mismatch repair, base substitution mutation rates vary in a mirrored wave-like pattern on opposing replichores of the large chromosomes of V. fischeri and V. cholerae, where concurrently replicated regions experience similar base substitution mutation rates. The base substitution mutation rates on the small chromosome are less variable in both species but occur at similar rates to those in the concurrently replicated regions of the large chromosome. Neither nucleotide composition nor frequency of nucleotide motifs differed among regions experiencing high and low base substitution rates, which along with the inferred ~800-kb wave period suggests that the source of the periodicity is not sequence specific but rather a systematic process related to the cell cycle. These results support the notion that base substitution mutation rates are likely to vary systematically across many bacterial genomes, which exposes certain genes to elevated deleterious mutational load.

IMPORTANCE: That mutation rates vary within bacterial genomes is well known, but the detailed study of these biases has been made possible only recently with contemporary sequencing methods. We applied these methods to understand how bacterial genomes with multiple chromosomes, like those of Vibrio and Burkholderia, might experience heterogeneous mutation rates because of their unusual replication and the greater genetic diversity found on smaller chromosomes. This study captured thousands of mutations and revealed wave-like rate variation that is synchronized with replication timing and not explained by sequence context. The scale of this rate variation over hundreds of kilobases of DNA strongly suggests that a temporally regulated cellular process may generate wave-like variation in mutation risk. These findings add to our understanding of how mutation risk is distributed across bacterial and likely also eukaryotic genomes, owing to their highly conserved replication and repair machinery. Copyright © 2018 Dillon et al.


September 22, 2019

Sequencing of pT5282-CTXM, p13190-KPC and p30860-NR, and comparative genomics analysis of IncX8 plasmids.

This study proposes a replicon-based scheme for typing IncX plasmids into nine separately clustering subgroups, including IncX1α, IncX1β and IncX2-8. The complete nucleotide sequences of three IncX8 plasmids, namely pT5282-CTXM and p30860-NR from Enterobacter cloacae and p13190-KPC from Klebsiella pneumoniae, were determined and were compared with two other previously sequenced IncX8 plasmids (pCAV1043-58 and pCAV1741-16). These five plasmids possessed conserved IncX8 backbones with limited genetic variation with respect to gene content and organisation, and each of them carried one or three accessory modules that harboured resistance markers and metabolic gene clusters as well as transposons, insertion sequence (IS)-based transposition units and miniature inverted repeat transposable elements (MITEs), indicating that the relatively small IncX8 backbones were able to integrate various foreign genetic contents. The resistance genes blaCTX-M-3 and blaTEM-1 (β-lactam resistance), blaKPC-2 (carbapenem resistance) and ΔblaTEM-1, and tet(A) (tetracycline resistance) and mph(E) (macrolide resistance) were found in pT5282-CTXM, p13190-KPC and pCAV1741-16, respectively, whilst p30860-NR and pCAV1043-58 carried no resistance genes. The data presented here provide an insight into the diversification and evolution history of IncX8 plasmids. Copyright © 2018 Elsevier B.V. and International Society of Chemotherapy. All rights reserved.


September 22, 2019

Tracing genomic divergence of Vibrio bacteria in the Harveyi clade.

The mechanism of bacterial speciation remains a topic of tremendous interest. To understand the ecological and evolutionary mechanisms of speciation in Vibrio bacteria, we analyzed the genomic dissimilarities between three closely related species in the so-called Harveyi clade of the genus Vibrio: V. campbellii, V. jasicida, and V. hyugaensis. The analysis focused on strains isolated from diverse geographic locations over a long period of time. The results of phylogenetic analyses and calculations of average nucleotide identity (ANI) supported the classification of V. jasicida and V. hyugaensis into two species. These analyses also identified two well-supported clades in V. campbellii; however, strains from both clades were classified as members of the same species. Comparative analyses of the complete genome sequences of representative strains from the three species identified higher syntenic coverage between genomes of V. jasicida and V. hyugaensis than that between the genomes from the two V. campbellii clades. The results from comparative analyses of gene content between bacteria from the three species did not support the hypothesis that gene gain and/or loss contributed to their speciation. We also did not find support for the hypothesis that ecological diversification toward associations with marine animals contributed to the speciation of V. jasicida and V. hyugaensis. Overall, based on the results obtained in this study, we propose that speciation in Harveyi clade species is a result of stochastic diversification of local populations, which was influenced by multiple evolutionary processes, followed by extinction events.

IMPORTANCE: To investigate the mechanisms underlying speciation in the genus Vibrio, we provided a well-assembled reference of genomes and performed systematic genomic comparisons among three evolutionarily closely related species. We resolved taxonomic ambiguities and identified genomic features separating the three species. Based on the study results, we propose a hypothesis explaining how species in the Harveyi clade of Vibrio bacteria diversified. Copyright © 2018 American Society for Microbiology.


September 22, 2019

Analysis of the draft genome of the red seaweed Gracilariopsis chorda provides insights into genome size evolution in Rhodophyta.

Red algae (Rhodophyta) underwent two phases of large-scale genome reduction during their early evolution. The red seaweeds did not attain genome sizes or gene inventories typical of other multicellular eukaryotes. We generated a high-quality 92.1 Mb draft genome assembly from the red seaweed Gracilariopsis chorda, including methylation and small (s)RNA data. We analyzed these and other Archaeplastida genomes to address three questions: 1) What is the role of repeats and transposable elements (TEs) in explaining Rhodophyta genome size variation, 2) what is the history of genome duplication and gene family expansion/reduction in these taxa, and 3) is there evidence for TE suppression in red algae? We find that the number of predicted genes in red algae is relatively small (4,803-13,125 genes), particularly when compared with land plants, with no evidence of polyploidization. Genome size variation is primarily explained by TE expansion with the red seaweeds having the largest genomes. Long terminal repeat elements and DNA repeats are the major contributors to genome size growth. About 8.3% of the G. chorda genome undergoes cytosine methylation among gene bodies, promoters, and TEs, and 71.5% of TEs contain methylated-DNA with 57% of these regions associated with sRNAs. These latter results suggest a role for TE-associated sRNAs in RNA-dependent DNA methylation to facilitate silencing. We postulate that the evolution of genome size in red algae is the result of the combined action of TE spread and the concomitant emergence of its epigenetic suppression, together with other important factors such as changes in population size.


September 22, 2019

Genetic and biochemical characterization of 5-hydroxypicolinic acid metabolism in Alcaligenes faecalis JQ135.

5-Hydroxypicolinic acid (5HPA), a natural pyridine derivative, is microbially degraded in the environment. However, the physiological, biochemical, and genetic foundations of 5HPA metabolism remain unknown. In this study, an operon (hpa) responsible for 5HPA degradation was cloned from Alcaligenes faecalis JQ135. HpaM was a monocomponent FAD-dependent monooxygenase and shared low identity (only 28-31%) with reported monooxygenases. HpaM catalyzed the ortho decarboxylative hydroxylation of 5HPA, generating 2,5-dihydroxypyridine (2,5DHP). The monooxygenase activity of HpaM was FAD- and NADH-dependent. The apparent Km values of HpaM for 5HPA and NADH were 45.4 µM and 37.8 µM, respectively. The genes hpaX, hpaD, and hpaF were found to encode 2,5DHP dioxygenase, N-formylmaleamic acid deformylase, and maleamate amidohydrolase, respectively; however, the three genes were not essential for 5HPA degradation in A. faecalis JQ135. Furthermore, the gene maiA, which encodes a maleic acid cis-trans isomerase, was essential for the metabolism of 5HPA, nicotinic acid, and picolinic acid in A. faecalis JQ135, indicating that it might be a key gene in the metabolism of pyridine derivatives. The genes and proteins identified in this study showed a novel degradation mechanism of pyridine derivatives.

IMPORTANCE: Unlike in the benzene ring, the uneven distribution of electron density in the pyridine ring influences positional reactivity and interaction with enzymes; e.g., ortho and para oxidations are more difficult than meta oxidations. Hydroxylation is an important oxidation process in the metabolism of pyridine derivatives. In previous reports, the ortho hydroxylation of pyridine derivatives was catalyzed by multicomponent molybdenum-containing monooxygenases, while the meta hydroxylations were catalyzed by monocomponent FAD-dependent monooxygenases. This study identified the new monocomponent FAD-dependent monooxygenase HpaM, which catalyzed the ortho decarboxylative hydroxylation of 5HPA. In addition, we found that maiA, coding for maleic acid cis-trans isomerase, was pivotal for the metabolism of 5HPA, nicotinic acid, and picolinic acid in A. faecalis JQ135. This study provides novel insights into the microbial metabolism of pyridine derivatives. Copyright © 2018 American Society for Microbiology.
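As a rough guide to what the apparent Km values reported above mean in practice, the sketch below evaluates the standard Michaelis-Menten expression v/Vmax = [S] / (Km + [S]). It assumes simple Michaelis-Menten behaviour with the second substrate saturating, which the abstract does not state; the function name and the chosen substrate concentrations are hypothetical.

```python
def fraction_of_vmax(substrate_um, km_um):
    """Michaelis-Menten fractional velocity v/Vmax for a substrate concentration in µM."""
    return substrate_um / (km_um + substrate_um)


# Apparent Km values quoted in the abstract (µM).
KM_5HPA = 45.4
KM_NADH = 37.8

# Assumed, illustrative substrate concentrations.
half_rate = fraction_of_vmax(KM_5HPA, KM_5HPA)        # [S] equal to Km
near_saturation = fraction_of_vmax(10 * KM_NADH, KM_NADH)  # [S] ten times Km

print(f"5HPA at Km: {half_rate:.2f} of Vmax")
print(f"NADH at 10 x Km: {near_saturation:.3f} of Vmax")
```

Under these assumptions, a substrate present at its Km gives half-maximal rate, and a tenfold excess over Km gives about 91% of the maximal rate.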


September 22, 2019

Chromosomally encoded mcr-5 in colistin non-susceptible Pseudomonas aeruginosa.

Whole genome sequencing (WGS) of historical Pseudomonas aeruginosa clinical isolates identified a chromosomal copy of mcr-5 within a Tn3-like transposon in P. aeruginosa MRSN 12280. The isolate was non-susceptible to colistin by broth microdilution and genome analysis revealed no mutations known to confer colistin resistance. To the best of our knowledge, this is the first report of mcr in colistin non-susceptible P. aeruginosa.


September 22, 2019

Analysis of the Gli-D2 locus identifies a genetic target for simultaneously improving the breadmaking and health-related traits of common wheat.

Gliadins are a major component of wheat seed proteins. However, the complex homoeologous Gli-2 loci (Gli-A2, -B2 and -D2) that encode the α-gliadins in commercial wheat are still poorly understood. Here we analyzed the Gli-D2 locus of Xiaoyan 81 (Xy81), a winter wheat cultivar. A total of 421.091 kb of the Gli-D2 sequence was assembled from sequencing multiple bacterial artificial clones, and 10 α-gliadin genes were annotated. Comparative genomic analysis showed that Xy81 carried only eight of the α-gliadin genes of the D genome donor Aegilops tauschii, with two of them each experiencing a tandem duplication. A mutant line lacking Gli-D2 (DLGliD2) consistently exhibited better breadmaking quality and dough functionalities than its progenitor Xy81, but without penalties in other agronomic traits. It also had an elevated lysine content in the grains. Transcriptome analysis verified the lack of Gli-D2 α-gliadin gene expression in DLGliD2. Furthermore, the transcript and protein levels of protein disulfide isomerase were both upregulated in DLGliD2 grains. Consistent with this finding, DLGliD2 had increased disulfide content in the flour. Our work sheds light on the structure and function of Gli-D2 in commercial wheat, and suggests that the removal of Gli-D2 and the gliadins specified by it is likely to be useful for simultaneously enhancing the end-use and health-related traits of common wheat. Because gliadins and homologous proteins are widely present in grass species, the strategy and information reported here may be broadly useful for improving the quality traits of diverse cereal crops. © 2018 The Authors. The Plant Journal © 2018 John Wiley & Sons Ltd.


September 22, 2019

Human copy number variants are enriched in regions of low mappability.

Copy number variants (CNVs) are known to affect a large portion of the human genome and have been implicated in many diseases. Although whole-genome sequencing (WGS) can help identify CNVs, most analytical methods suffer from limited sensitivity and specificity, especially in regions of low mappability. To address this, we use PopSV, a CNV caller that relies on multiple samples to control for technical variation. We demonstrate that our calls are stable across different types of repeat-rich regions and validate the accuracy of our predictions using orthogonal approaches. Applying PopSV to 640 human genomes, we find that low-mappability regions are approximately 5 times more likely to harbor germline CNVs, in stark contrast to the nearly uniform distribution observed for somatic CNVs in 95 cancer genomes. In addition to known enrichments in segmental duplication and near centromeres and telomeres, we also report that CNVs are enriched in specific types of satellite and in some of the most recent families of transposable elements. Finally, using this comprehensive approach, we identify 3455 regions with recurrent CNVs that were missing from existing catalogs. In particular, we identify 347 genes with a novel exonic CNV in low-mappability regions, including 29 genes previously associated with disease.


September 22, 2019

Genomic signatures of mitonuclear coevolution across populations of Tigriopus californicus.

The copepod Tigriopus californicus shows extensive population divergence and is becoming a model for understanding allopatric differentiation and the early stages of speciation. Here, we report a high-quality reference genome for one population (~190 megabases across 12 scaffolds, and ~15,500 protein-coding genes). Comparison with other arthropods reveals 2,526 genes presumed to be specific to T. californicus, with an apparent proliferation of genes involved in ion transport and receptor activity. Beyond the reference population, we report re-sequenced genomes of seven additional populations, spanning the continuum of reproductive isolation. Populations show extreme mitochondrial DNA divergence, with higher levels of amino acid differentiation than observed in other taxa. Across the nuclear genome, we find elevated protein evolutionary rates and positive selection in genes predicted to interact with mitochondrial DNA and the proteins and RNA it encodes in multiple pathways. Together, these results support the hypothesis that rapid mitochondrial evolution drives compensatory nuclear evolution within isolated populations, thereby providing a potentially important mechanism for causing intrinsic reproductive isolation.


September 22, 2019

Extensive genomic diversity among Mycobacterium marinum strains revealed by whole genome sequencing.

Mycobacterium marinum is the causative agent of the tuberculosis-like disease mycobacteriosis in fish and of skin lesions in humans. Ubiquitous in its geographical distribution, M. marinum is known to occupy diverse fish as hosts. However, information about its genomic diversity is limited. Here, we provide the genome sequences for 15 M. marinum strains isolated from infected humans and fish. Comparative genomic analysis of these and four available genomes of the M. marinum strains M, E11, MB2 and Europe reveals high genomic diversity among the strains, leading to the conclusion that M. marinum should be divided into two different clusters, the “M”- and the “Aronson”-type. We suggest that these two clusters should be considered to represent two M. marinum subspecies. Our data also show that the M. marinum pan-genome for both groups is open and expanding, and we provide data showing a high number of mutational hotspots in M. marinum relative to other mycobacteria such as Mycobacterium tuberculosis. This high genomic diversity might be related to the ability of M. marinum to occupy different ecological niches.


September 22, 2019

Creating a functional single-chromosome yeast.

Eukaryotic genomes are generally organized in multiple chromosomes. Here we have created a functional single-chromosome yeast from a Saccharomyces cerevisiae haploid cell containing sixteen linear chromosomes, by successive end-to-end chromosome fusions and centromere deletions. The fusion of sixteen native linear chromosomes into a single chromosome results in marked changes to the global three-dimensional structure of the chromosome due to the loss of all centromere-associated inter-chromosomal interactions, most telomere-associated inter-chromosomal interactions and 67.4% of intra-chromosomal interactions. However, the single-chromosome and wild-type yeast cells have nearly identical transcriptome and similar phenome profiles. The giant single chromosome can support cell life, although this strain shows reduced growth across environments, competitiveness, gamete production and viability. This synthetic biology study demonstrates an approach to exploration of eukaryote evolution with respect to chromosome structure and function.


September 22, 2019

Characterization of LE3 and LE4, the only lytic phages known to infect the spirochete Leptospira.

Leptospira is a phylogenetically unique group of bacteria, and includes the causative agents of leptospirosis, the most globally prevalent zoonosis. Bacteriophages in Leptospira are largely unexplored. To date, a genomic sequence is available for only one temperate leptophage, called LE1. Here, we sequenced and analysed the first genomes of the lytic phages LE3 and LE4, which can infect the saprophyte Leptospira biflexa using the lipopolysaccharide O-antigen as receptor. Bioinformatics analysis showed that the 48-kb LE3 and LE4 genomes are similar and that 62% of their genes have no predictable function. Mass spectrometry led to the identification of 21 and 23 phage proteins in LE3 and LE4, respectively. However, we did not identify significant similarities with other phage genomes. A search for prophages close to LE4 in the Leptospira genomes allowed for the identification of a related plasmid in L. interrogans and a prophage-like region in the draft genome of a clinical isolate of L. mayottensis. Long-read whole genome sequencing of the L. mayottensis isolate revealed that the genome contained an LE4 phage-like circular plasmid. Further isolation and genomic comparison of leptophages should reveal their role in the genetic evolution of Leptospira.


September 22, 2019

Linking genotype and phenotype in an economically viable propionic acid biosynthesis process

Propionic acid (PA) is used as a food preservative and, increasingly, as a precursor for the synthesis of monomers. PA is produced mainly through hydrocarboxylation of ethylene, also known as the 'oxo-process'; however, Propionibacterium species are promising biological PA producers that natively produce PA as their main fermentation product. For fermentation to be competitive, though, a PA yield of at least 0.6 g/g is required.

