The causes and consequences of spatiotemporal variation in mutation rates remain to be explored in nearly all organisms. Here we examine relationships between local mutation rates and replication timing in three bacterial species whose genomes have multiple chromosomes: Vibrio fischeri, Vibrio cholerae, and Burkholderia cenocepacia. Following five mutation accumulation experiments with these bacteria conducted in the near absence of natural selection, the genomes of clones from each lineage were sequenced and analyzed to identify variation in mutation rates and spectra. In lineages lacking mismatch repair, base substitution mutation rates vary in a mirrored wave-like pattern on opposing replichores of the large chromosomes of V. fischeri and V. cholerae, where concurrently replicated regions experience similar base substitution mutation rates. The base substitution mutation rates on the small chromosome are less variable in both species but occur at similar rates to those in the concurrently replicated regions of the large chromosome. Neither nucleotide composition nor frequency of nucleotide motifs differed among regions experiencing high and low base substitution rates, which along with the inferred ~800-kb wave period suggests that the source of the periodicity is not sequence specific but rather a systematic process related to the cell cycle. These results support the notion that base substitution mutation rates are likely to vary systematically across many bacterial genomes, which exposes certain genes to elevated deleterious mutational load.
IMPORTANCE: That mutation rates vary within bacterial genomes is well known, but the detailed study of these biases has been made possible only recently with contemporary sequencing methods. 
We applied these methods to understand how bacterial genomes with multiple chromosomes, like those of Vibrio and Burkholderia, might experience heterogeneous mutation rates because of their unusual replication and the greater genetic diversity found on smaller chromosomes. This study captured thousands of mutations and revealed wave-like rate variation that is synchronized with replication timing and not explained by sequence context. The scale of this rate variation over hundreds of kilobases of DNA strongly suggests that a temporally regulated cellular process may generate wave-like variation in mutation risk. These findings add to our understanding of how mutation risk is distributed across bacterial and likely also eukaryotic genomes, owing to their highly conserved replication and repair machinery. Copyright © 2018 Dillon et al.
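The comparison of "concurrently replicated regions" above can be illustrated with a minimal sketch, not the authors' pipeline. It assumes a circular chromosome with a single origin and terminus and equal fork speed on both replichores, so that equal distances from the origin are replicated at the same time. The function names, coordinates and bin size are hypothetical.

```python
from collections import Counter

def replichore_bin(pos, ori, ter, genome_len, bin_size):
    """Map a position on a circular chromosome to (replichore, bin index).

    Bins are indexed by distance from the origin, so bin i on the "right"
    replichore and bin i on the "left" replichore are assumed to be
    replicated concurrently (equal fork speeds are an assumption).
    """
    clockwise = (pos - ori) % genome_len   # distance travelling clockwise from ori
    right_len = (ter - ori) % genome_len   # length of the clockwise replichore
    if clockwise <= right_len:
        return "right", clockwise // bin_size
    return "left", (genome_len - clockwise) // bin_size

def count_by_bin(positions, ori, ter, genome_len, bin_size):
    """Count mutations per (replichore, bin) for a list of mutation positions."""
    return Counter(replichore_bin(p, ori, ter, genome_len, bin_size) for p in positions)

# Toy coordinates only; real analyses would use annotated ori/ter positions.
counts = count_by_bin([150, 50, 900, 350, 360], ori=100, ter=600,
                      genome_len=1000, bin_size=100)
```

Comparing `counts[("right", i)]` with `counts[("left", i)]` across bins is one simple way to look for a mirrored pattern on the two replichores.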
Prevalence and genomic structure of bacteriophage phi3 in human derived livestock-associated MRSA from 2000 to 2015.
Whereas the emergence of livestock-associated methicillin-resistant Staphylococcus aureus (LA-MRSA) clonal complex 398 (CC398) in animal husbandry and its transmission to humans are well documented, less is known about factors driving the epidemic spread of this zoonotic lineage within the human population. One factor could be the bacteriophage phi3, which is rarely detected in S. aureus isolates from animals but commonly found among isolates from humans, including those of the human-adapted methicillin-susceptible S. aureus (MSSA) CC398 clade. The proportion of phi3-carrying MRSA spa-CC011 isolates, which constitute presumptive LA-MRSA within the multilocus sequence type (MLST) clonal complex 398, was systematically assessed for a period of 16 years to investigate the role of phi3 in the adaptation process of LA-MRSA to the human host. For this purpose, 632 MRSA spa-CC011 isolates from patients of a university hospital located in a pig farming-dense area in Germany were analyzed. Livestock-associated acquisition of MRSA spa-CC011 was previously reported as having increased from 1.8% in 2000 to 29.4% in 2014 in MRSA-positive patients admitted to this hospital. However, in this study, the proportion of phi3-carrying isolates rose only from 1.1% (2000 to 2006) to 3.9% (2007 to 2015). Characterization of the phi3 genomes revealed 12 different phage types ranging in size from 40,712 bp up to 44,003 bp, with four hitherto unknown integration sites (genes or intergenic regions) and several modified bacterial attachment (attB) sites. In contrast to the MSSA CC398 clade, phi3 acquisition does not appear to be a major driver for the readaptation of MRSA spa-CC011 to the human host. Copyright © 2018 American Society for Microbiology.
Therapy for bacteremia caused by Staphylococcus aureus is often ineffective, even when treatment conditions are optimal according to experimental protocols. Adapted subclones, such as those bearing mutations that attenuate agr-mediated virulence activation, are associated with persistent infection and patient mortality. To identify additional alterations in agr-defective mutants, we sequenced and assembled the complete genomes of clone pairs from colonizing and infected sites of several patients in whom S. aureus demonstrated a within-host loss of agr function. We report that events associated with agr inactivation result in agr-defective blood and nares strain pairs that are enriched in mutations compared to pairs from wild-type controls. The random distribution of mutations between colonizing and infecting strains from the same patient, and between strains from different patients, suggests that much of the genetic complexity of agr-defective strains results from prolonged infection or therapy-induced stress. However, in one of the agr-defective infecting strains, multiple genetic changes resulted in increased virulence in a murine model of bloodstream infection, bypassing the mutation of agr and raising the possibility that some changes were selected. Expression profiling correlated the elevated virulence of this agr-defective mutant to restored expression of the agr-regulated ESAT6-like type VII secretion system, a known virulence factor. Thus, additional mutations outside the agr locus can contribute to diversification and adaptation during infection by S. aureus agr mutants associated with poor patient outcomes. Copyright © 2018 Altman et al.
Cushing’s disease (CD) is caused by pituitary corticotroph adenomas that secrete excess adrenocorticotropic hormone (ACTH). In these tumors, somatic mutations in the gene USP8 have been identified as recurrent and pathogenic and are the sole known molecular driver for CD. Although other somatic mutations were reported in these studies, their contribution to the pathogenesis of CD remains unexplored. No molecular drivers have been established for a large proportion of CD cases, and tumor heterogeneity has not yet been investigated using genomic methods. Moreover, even USP8-mutant tumors may harbor additional contributing mutations, following the paradigm from other neoplasm types in which multiple somatic alterations contribute to neoplastic transformation. The current study uses whole-exome discovery sequencing on the Illumina platform, followed by targeted amplicon-validation sequencing on the Pacific Biosciences platform, to interrogate the somatic mutation landscape in a corticotroph adenoma resected from a CD patient. In this USP8-mutated tumor, we identified a somatic mutation in the gene RASD1, which is a component of the corticotropin-releasing hormone receptor signaling system. This finding may provide insight into a novel mechanism involving loss of feedback control to the corticotropin-releasing hormone receptor and subsequent deregulation of ACTH production in corticotroph tumors.
Genome-wide mapping of methylated adenine residues in pathogenic Escherichia coli using single-molecule real-time sequencing.
Single-molecule real-time (SMRT) DNA sequencing allows the systematic detection of chemical modifications such as methylation but has not previously been applied on a genome-wide scale. We used this approach to detect 49,311 putative 6-methyladenine (m6A) residues and 1,407 putative 5-methylcytosine (m5C) residues in the genome of a pathogenic Escherichia coli strain. We obtained strand-specific information for methylation sites and a quantitative assessment of the frequency of methylation at each modified position. We deduced the sequence motifs recognized by the methyltransferase enzymes present in this strain without prior knowledge of their specificity. Furthermore, we found that deletion of a phage-encoded methyltransferase-endonuclease (restriction-modification; RM) system induced global transcriptional changes and led to gene amplification, suggesting that the role of RM systems extends beyond protecting host genomes from foreign DNA.
Modeling kinetic rate variation in third generation DNA sequencing data to detect putative modifications to DNA bases.
Current generation DNA sequencing instruments are moving closer to seamlessly sequencing genomes of entire populations as a routine part of scientific investigation. However, while significant inroads have been made identifying small nucleotide variation and structural variations in DNA that impact phenotypes of interest, progress has not been as dramatic regarding epigenetic changes and base-level damage to DNA, largely due to technological limitations in assaying all known and unknown types of modifications at genome scale. Recently, single-molecule real time (SMRT) sequencing has been reported to identify kinetic variation (KV) events that have been demonstrated to reflect epigenetic changes of every known type, providing a path forward for detecting base modifications as a routine part of sequencing. However, to date no statistical framework has been proposed to enhance the power to detect these events while also controlling for false-positive events. By modeling enzyme kinetics in the neighborhood of an arbitrary location in a genomic region of interest as a conditional random field, we provide a statistical framework for incorporating kinetic information at a test position of interest as well as at neighboring sites that help enhance the power to detect KV events. The performance of this and related models is explored, with the best-performing model applied to plasmid DNA isolated from Escherichia coli and mitochondrial DNA isolated from human brain tissue. We highlight widespread kinetic variation events, some of which strongly associate with known modification events, while others represent putative chemically modified sites of unknown types.
Detecting DNA modifications from SMRT sequencing data by modeling sequence context dependence of polymerase kinetic.
DNA modifications such as methylation and DNA damage can play critical regulatory roles in biological systems. Single molecule, real time (SMRT) sequencing technology generates DNA sequences as well as DNA polymerase kinetic information that can be used for the direct detection of DNA modifications. We demonstrate that local sequence context has a strong impact on DNA polymerase kinetics in the neighborhood of the incorporation site during the DNA synthesis reaction, allowing for the possibility of estimating the expected kinetic rate of the enzyme at the incorporation site using kinetic rate information collected from existing SMRT sequencing data (historical data) covering the same local sequence contexts of interest. We develop an empirical Bayesian hierarchical model for incorporating historical data. Our results show that the model can greatly increase DNA modification detection accuracy and reduce the control data coverage required. For some DNA modifications that have a strong signal, a control sample is not needed at all, because historical data can serve as an alternative to the control. Thus, sequencing costs can be greatly reduced by using the model. We implemented the model in an R package named seqPatch, which is available at https://github.com/zhixingfeng/seqPatch.
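The idea of borrowing strength from historical data can be illustrated with a generic normal-normal empirical Bayes shrinkage estimator. This is a minimal sketch under simplifying assumptions, not the hierarchical model implemented in seqPatch. The function names, the known noise variance and the toy kinetic values are all hypothetical.

```python
from statistics import mean, pvariance

def prior_from_history(historical_means):
    """Estimate a prior (mean, variance) for one sequence context from the
    per-dataset mean kinetic values observed in historical data."""
    return mean(historical_means), pvariance(historical_means)

def expected_kinetic_rate(control_values, prior_mean, prior_var, noise_var):
    """Posterior mean of the expected kinetic value at a position.

    With no control reads the estimate falls back to the historical prior;
    with many control reads it approaches the control sample mean.
    noise_var is the assumed per-read measurement variance (must be > 0).
    """
    n = len(control_values)
    if n == 0:
        return prior_mean
    weight = prior_var / (prior_var + noise_var / n)  # trust placed in the control data
    return weight * mean(control_values) + (1.0 - weight) * prior_mean

# Toy values only: four historical datasets and two control reads.
prior_mean, prior_var = prior_from_history([1.0, 1.2, 0.8, 1.0])
estimate = expected_kinetic_rate([1.6, 1.4], prior_mean, prior_var, noise_var=0.04)
```

The weight shows why low control coverage can be tolerated in such a scheme: as the number of control reads shrinks, the estimate leans more heavily on the historical prior.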
Comprehensive methylome characterization of Mycoplasma genitalium and Mycoplasma pneumoniae at single-base resolution.
In the bacterial world, methylation is most commonly associated with restriction-modification systems that provide a defense mechanism against invading foreign genomes. In addition, it is known that methylation plays functionally important roles, including timing of DNA replication, chromosome partitioning, DNA repair, and regulation of gene expression. However, full DNA methylome analyses are scarce due to the lack of a simple methodology for rapid and sensitive detection of common epigenetic marks (i.e., N(6)-methyladenine (6mA) and N(4)-methylcytosine (4mC)) in these organisms. Here, we use Single-Molecule Real-Time (SMRT) sequencing to determine the methylomes of two related human pathogen species, Mycoplasma genitalium G-37 and Mycoplasma pneumoniae M129, with single-base resolution. Our analysis identified two new methylation motifs not previously described in bacteria: a widespread 6mA methylation motif common to both bacteria (5′-CTAT-3′), as well as a more complex Type I 6mA sequence motif in M. pneumoniae (5′-GAN(7)TAY-3′/3′-CTN(7)ATR-5′). We identify the methyltransferase responsible for the common motif and suggest the one involved in M. pneumoniae only. Analysis of the distribution of methylation sites across the genome of M. pneumoniae suggests a potential role for methylation in regulating the cell cycle, as well as in regulation of gene expression. To our knowledge, this is one of the first direct methylome profiling studies with single-base resolution from a bacterial organism.
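As an illustration of how motifs such as 5′-CTAT-3′ and 5′-GAN(7)TAY-3′ can be located in a sequence, the following sketch expands IUPAC codes into a regular expression and reports every start position on the given strand. It only finds candidate sites by sequence; it does not detect methylation. The helper names and the toy sequence are hypothetical.

```python
import re

# Subset of IUPAC nucleotide codes needed for the motifs in the text.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T",
         "R": "[AG]", "Y": "[CT]", "N": "[ACGT]"}

def motif_to_regex(motif):
    """Compile an IUPAC motif into a regex; the lookahead allows overlapping hits."""
    pattern = "".join(IUPAC[base] for base in motif)
    return re.compile("(?=(" + pattern + "))")

def find_motif(sequence, motif):
    """Return 0-based start positions of the motif on the given strand."""
    return [m.start() for m in motif_to_regex(motif).finditer(sequence)]

# GAN(7)TAY written out with seven explicit N positions.
TYPE_I_MOTIF = "GA" + "N" * 7 + "TAY"

toy_sequence = "CTATGAACGTACGTACCTAT"
ctat_sites = find_motif(toy_sequence, "CTAT")
type_i_sites = find_motif(toy_sequence, TYPE_I_MOTIF)
```

Scanning the reverse complement of the sequence in the same way would cover the opposite strand.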
Effective targeted cancer therapeutic development depends upon distinguishing disease-associated ‘driver’ mutations, which have causative roles in malignancy pathogenesis, from ‘passenger’ mutations, which are dispensable for cancer initiation and maintenance. Translational studies of clinically active targeted therapeutics can definitively discriminate driver from passenger lesions and provide valuable insights into human cancer biology. Activating internal tandem duplication (ITD) mutations in FLT3 (FLT3-ITD) are detected in approximately 20% of acute myeloid leukaemia (AML) patients and are associated with a poor prognosis. Abundant scientific and clinical evidence, including the lack of convincing clinical activity of early FLT3 inhibitors, suggests that FLT3-ITD probably represents a passenger lesion. Here we report point mutations at three residues within the kinase domain of FLT3-ITD that confer substantial in vitro resistance to AC220 (quizartinib), an active investigational inhibitor of FLT3, KIT, PDGFRA, PDGFRB and RET; evolution of AC220-resistant substitutions at two of these amino acid positions was observed in eight of eight FLT3-ITD-positive AML patients with acquired resistance to AC220. Our findings demonstrate that FLT3-ITD can represent a driver lesion and valid therapeutic target in human AML. AC220-resistant FLT3 kinase domain mutants represent high-value targets for future FLT3 inhibitor development efforts.
The architecture of a scrambled genome reveals massive levels of genomic rearrangement during development.
Programmed DNA rearrangements in the single-celled eukaryote Oxytricha trifallax completely rewire its germline into a somatic nucleus during development. This elaborate, RNA-mediated pathway eliminates noncoding DNA sequences that interrupt gene loci and reorganizes the remaining fragments by inversions and permutations to produce functional genes. Here, we report the Oxytricha germline genome and compare it to the somatic genome to present a global view of its massive scale of genome rearrangements. The remarkably encrypted genome architecture contains >3,500 scrambled genes, as well as >800 predicted germline-limited genes expressed, and some posttranslationally modified, during genome rearrangements. Gene segments for different somatic loci often interweave with each other. Single gene segments can contribute to multiple, distinct somatic loci. Terminal precursor segments from neighboring somatic loci map extremely close to each other, often overlapping. This genome assembly provides a draft of a scrambled genome and a powerful model for studies of genome rearrangement. Copyright © 2014 Elsevier Inc. All rights reserved.
Prior to the epidemic that emerged in Haiti in October of 2010, cholera had not been documented in this country. After its introduction, a strain of Vibrio cholerae O1 spread rapidly throughout Haiti, where it caused over 600,000 cases of disease and >7,500 deaths in the first two years of the epidemic. We applied whole-genome sequencing to a temporal series of V. cholerae isolates from Haiti to gain insight into the mode and tempo of evolution in this isolated population of V. cholerae O1. Phylogenetic and Bayesian analyses supported the hypothesis that all isolates in the sample set diverged from a common ancestor within a time frame that is consistent with epidemiological observations. A pangenome analysis showed nearly homogeneous genomic content, with no evidence of gene acquisition among Haiti isolates. Nine nearly closed genomes assembled from continuous-long-read data showed evidence of genome rearrangements and supported the observation of no gene acquisition among isolates. Thus, intrinsic mutational processes can account for virtually all of the observed genetic polymorphism, with no demonstrable contribution from horizontal gene transfer (HGT). Consistent with this, the 12 Haiti isolates tested by laboratory HGT assays were severely impaired for transformation, although unlike previously characterized noncompetent V. cholerae isolates, each expressed hapR and possessed a functional quorum-sensing system. Continued monitoring of V. cholerae in Haiti will illuminate the processes influencing the origin and fate of genome variants, which will facilitate interpretation of genetic variation in future epidemics.
Vibrio cholerae is the cause of substantial morbidity and mortality worldwide, with over three million cases of disease each year. An understanding of the mode and rate of evolutionary change is critical for proper interpretation of genome sequence data and attribution of outbreak sources. 
The Haiti epidemic provides an unprecedented opportunity to study an isolated, single-source outbreak of Vibrio cholerae O1 over an established time frame. By using multiple approaches to assay genetic variation, we found no evidence that the Haiti strain has acquired any genes by horizontal gene transfer, an observation that led us to discover that it is also poorly transformable. We have found no evidence that environmental strains have played a role in the evolution of the outbreak strain.
Single-molecule sequencing to track plasmid diversity of hospital-associated carbapenemase-producing Enterobacteriaceae.
Public health officials have raised concerns that plasmid transfer between Enterobacteriaceae species may spread resistance to carbapenems, an antibiotic class of last resort, thereby rendering common health care-associated infections nearly impossible to treat. To determine the diversity of carbapenemase-encoding plasmids and assess their mobility among bacterial species, we performed comprehensive surveillance and genomic sequencing of carbapenem-resistant Enterobacteriaceae in the National Institutes of Health (NIH) Clinical Center patient population and hospital environment. We isolated a repertoire of carbapenemase-encoding Enterobacteriaceae, including multiple strains of Klebsiella pneumoniae, Klebsiella oxytoca, Escherichia coli, Enterobacter cloacae, Citrobacter freundii, and Pantoea species. Long-read genome sequencing with full end-to-end assembly revealed that these organisms carry the carbapenem resistance genes on a wide array of plasmids. K. pneumoniae and E. cloacae isolated simultaneously from a single patient harbored two different carbapenemase-encoding plasmids, indicating that plasmid transfer between organisms was unlikely within this patient. We did, however, find evidence of horizontal transfer of carbapenemase-encoding plasmids between K. pneumoniae, E. cloacae, and C. freundii in the hospital environment. Our data, including full plasmid identification, challenge assumptions about horizontal gene transfer events within patients and identify possible connections between patients and the hospital environment. In addition, we identified a new carbapenemase-encoding plasmid of potentially high clinical impact carried by K. pneumoniae, E. coli, E. cloacae, and Pantoea species, in unrelated patients and in the hospital environment. Copyright © 2014, American Association for the Advancement of Science.
Transmission of methicillin-resistant Staphylococcus aureus via deceased donor liver transplantation confirmed by whole genome sequencing.
Donor-derived bacterial infection is a recognized complication of solid organ transplantation (SOT). The present report describes the clinical details and successful outcome in a liver transplant recipient despite transmission of methicillin-resistant Staphylococcus aureus (MRSA) from a deceased donor with MRSA endocarditis and bacteremia. We further describe whole genome sequencing (WGS) and complete de novo assembly of the donor and recipient MRSA isolate genomes, which confirms that both isolates are genetically 100% identical. We propose that similar application of WGS techniques to future investigations of donor bacterial transmission would strengthen the definition of proven bacterial transmission in SOT, particularly in the presence of highly clonal bacteria such as MRSA. WGS will further improve our understanding of the epidemiology of bacterial transmission in SOT and the risk of adverse patient outcomes when it occurs. © Copyright 2014 The American Society of Transplantation and the American Society of Transplant Surgeons.
Resolving tandemly repeated genomic sequences is a necessary step in improving our understanding of the human genome. Short tandem repeats (TRs), or microsatellites, are often used as molecular markers in genetics, and clinically, variation in microsatellites can lead to genetic disorders such as Huntington’s disease. Accurately resolving repeats, and in particular TRs, remains a challenging task in genome alignment, assembly and variation calling. Though tools have been developed for detecting microsatellites in short-read sequencing data, these are limited in the size and types of events they can resolve. Single-molecule sequencing technologies may resolve a broader spectrum of TRs given their increased read length, but their significantly higher raw error rates make it difficult to identify TRs accurately and to estimate their copy number, so new approaches are required. Here we present PacmonSTR, a reference-based probabilistic approach to identify the TR region and estimate the number of TR elements in long DNA reads. The multistep approach requires as input a reference region and the reference TR element. First, the TR region is identified in the long DNA reads via a three-stage modified Smith-Waterman approach; then, the expected number of TR elements is calculated using a pair-hidden Markov model-based method. Finally, TR-based genotype selection (or clustering: homozygous/heterozygous) is performed with Gaussian mixture models, using the Akaike information criterion and coverage expectations. © The Author 2014. Published by Oxford University Press. All rights reserved.
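The final genotype-selection step can be sketched as a choice between a one-component and a two-component Gaussian mixture fitted to per-read TR copy-number estimates, keeping the model with the lower Akaike information criterion. This is an illustrative sketch, not PacmonSTR's implementation: the coverage expectations mentioned above are omitted, the simple EM routine and the variance floor `min_var` are assumptions, and the input values are made up.

```python
import math

def normal_pdf(x, mean, var):
    """Density of a normal distribution with the given mean and variance."""
    return math.exp(-((x - mean) ** 2) / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)

def log_likelihood(xs, components):
    """Log-likelihood of xs under a mixture of (weight, mean, variance) tuples."""
    total = 0.0
    for x in xs:
        density = sum(w * normal_pdf(x, m, v) for w, m, v in components)
        total += math.log(max(density, 1e-300))  # guard against log(0)
    return total

def fit_one(xs, min_var):
    """Maximum-likelihood single Gaussian with a floor on the variance."""
    mean = sum(xs) / len(xs)
    var = max(sum((x - mean) ** 2 for x in xs) / len(xs), min_var)
    return [(1.0, mean, var)]

def fit_two(xs, min_var, iterations=200):
    """Two-component mixture fitted by EM, initialised at the data extremes."""
    comps = [(0.5, min(xs), 1.0), (0.5, max(xs), 1.0)]
    for _ in range(iterations):
        # E-step: responsibility of each component for each read.
        resp = []
        for x in xs:
            p = [w * normal_pdf(x, m, v) for w, m, v in comps]
            s = max(sum(p), 1e-300)
            resp.append([pk / s for pk in p])
        # M-step: update weight, mean and variance of each component.
        updated = []
        for k in range(2):
            nk = sum(r[k] for r in resp)
            if nk < 1e-9:  # component has no support; keep it unchanged
                updated.append(comps[k])
                continue
            m = sum(r[k] * x for r, x in zip(resp, xs)) / nk
            v = max(sum(r[k] * (x - m) ** 2 for r, x in zip(resp, xs)) / nk, min_var)
            updated.append((nk / len(xs), m, v))
        comps = updated
    return comps

def call_genotype(copy_numbers, min_var=0.25):
    """Return ("homozygous" | "heterozygous", allele means) chosen by AIC."""
    one = fit_one(copy_numbers, min_var)
    two = fit_two(copy_numbers, min_var)
    aic_one = 2 * 2 - 2 * log_likelihood(copy_numbers, one)  # mean, variance
    aic_two = 2 * 5 - 2 * log_likelihood(copy_numbers, two)  # 2 means, 2 variances, 1 weight
    if aic_two < aic_one:
        return "heterozygous", sorted(m for _, m, _ in two)
    return "homozygous", [one[0][1]]
```

For example, `call_genotype([10.1, 9.8, 10.3, 9.9, 10.0, 20.2, 19.7, 20.1, 19.9, 20.3])` separates reads clustered near 10 and 20 repeat units into two alleles, whereas reads scattered tightly around a single value are called homozygous because the extra parameters of the second component are not justified.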
Staphylococcus aureus has evolved as a pathogen that causes a range of diseases in humans. There are two dominant modes of evolution thought to explain most of the virulence differences between strains. First, virulence genes may be acquired from other organisms. Second, mutations may cause changes in the regulation and expression of genes. Here we describe an evolutionary event in which transposition of an IS element has a direct impact on virulence gene regulation resulting in hypervirulence. Whole-genome analysis of a methicillin-resistant S. aureus (MRSA) strain USA500 revealed acquisition of a transposable element (IS256) that is absent from close relatives of this strain. Of the multiple copies of IS256 found in the USA500 genome, one was inserted in the promoter sequence of repressor of toxins (Rot), a master transcriptional regulator responsible for the expression of virulence factors in S. aureus. We show that insertion into the rot promoter by IS256 results in the derepression of cytotoxin expression and increased virulence. Taken together, this work provides new insight into evolutionary strategies by which S. aureus is able to modify its virulence properties and demonstrates a novel mechanism by which horizontal gene transfer directly impacts virulence through altering toxin regulation. © 2014 John Wiley & Sons Ltd.