Understanding the genetic basis of infectious diseases is critical to enacting effective treatments, and several large-scale sequencing initiatives are underway to collect this information. Sequencing bacterial samples is typically performed by mapping sequence reads against genomes of known reference strains. While such resequencing informs on the spectrum of single nucleotide differences relative to the chosen reference, it can miss numerous other forms of variation known to influence pathogenicity: structural variations (duplications, inversions), acquisition of mobile elements (phages, plasmids), homonucleotide length variation causing phase variation, and epigenetic marks (methylation, phosphorothioation) that influence gene expression to switch bacteria from non-pathogenic to pathogenic states. Therefore, sequencing methods which provide complete, de novo genome assemblies and epigenomes are necessary to fully characterize infectious disease agents in an unbiased, hypothesis-free manner. Hybrid assembly methods have been described that combine long sequence reads from SMRT DNA sequencing with short, high-accuracy reads (SMRT circular consensus sequencing (CCS) reads or second-generation reads) to generate long, highly accurate reads that are then used for assembly. We have developed a new paradigm for microbial de novo assemblies in which long SMRT sequencing reads (average read lengths >5,000 bases) are used exclusively to close the genome through a hierarchical genome assembly process, thereby obviating the need for a second sample preparation, sequencing run and data set. We have applied this method to numerous disease outbreak samples, including E. coli, Salmonella, Campylobacter, Listeria, Neisseria, and H. pylori, achieving closed de novo genomes with accuracies exceeding QV50 (>99.999%). The kinetic information from the same SMRT sequencing reads is utilized to determine epigenomes.
Approximately 70% of all methyltransferase specificities we have determined to date represent previously unknown bacterial epigenetic signatures. The process has been automated and requires less than 1 day from an unknown DNA sample to its complete de novo genome and epigenome.
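As a worked illustration of the QV50 figure quoted above, the following minimal sketch converts between a Phred-scaled quality value and per-base accuracy using the standard relation QV = -10 log10(error rate). The function names and the 5 Mb genome length are assumptions chosen for illustration; they are not taken from the abstract or from any assembly software.

```python
import math


def qv_to_accuracy(qv: float) -> float:
    """Per-base accuracy implied by a Phred-scaled quality value."""
    return 1.0 - 10.0 ** (-qv / 10.0)


def error_rate_to_qv(error_rate: float) -> float:
    """Phred-scaled quality value for a given per-base error rate."""
    if not 0.0 < error_rate <= 1.0:
        raise ValueError("error_rate must be in (0, 1]")
    return -10.0 * math.log10(error_rate)


def expected_errors(qv: float, genome_length: int) -> float:
    """Expected number of erroneous bases in a consensus of this length."""
    return genome_length * 10.0 ** (-qv / 10.0)


if __name__ == "__main__":
    # QV50 corresponds to one expected error per 100,000 bases.
    print(f"QV50 accuracy: {qv_to_accuracy(50):.5%}")
    # Hypothetical 5 Mb bacterial genome, used only as an example length.
    print(f"Expected errors at QV50: {expected_errors(50, 5_000_000):.1f}")
```

Under this relation, an assembly at QV50 would be expected to carry on the order of tens of erroneous bases across a typical bacterial genome of a few megabases.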
The evolution of Bordetella pertussis from a common ancestor similar to Bordetella bronchiseptica has occurred through large-scale gene loss, inactivation and rearrangements, largely driven by the spread of insertion sequence element repeats throughout the genome. B. pertussis is widely considered to be monomorphic, and recent evolution of the B. pertussis genome appears to, at least in part, be driven by vaccine-based selection. Given the recent global resurgence of whooping cough despite the wide-spread use of vaccination, a more thorough understanding of B. pertussis genomics could be highly informative. In this chapter we discuss the evolution of B. pertussis, including how vaccination is changing the circulating B. pertussis population at the gene-level, and how new sequencing technologies are revealing previously unknown levels of inter- and intra-strain variation at the genome-level.
Acinetobacter baumannii is an important Gram-negative pathogen in hospital-related infections. However, treatment options for A. baumannii infections have become limited due to multidrug resistance. Bacterial virulence is often associated with capsule genes found in the K locus, many of which are essential for biosynthesis of the bacterial envelope. However, the roles of other genes in the K locus remain largely unknown. From an in vitro evolution experiment, we obtained an isolate of the virulent and multidrug-resistant A. baumannii strain MDR-ZJ06, called MDR-ZJ06M, which has an insertion by the ISAba16 transposon in gnaA (encoding UDP-N-acetylglucosamine C-6 dehydrogenase), a gene found in the K locus. The isolate showed an increased resistance toward tigecycline, whereas the MIC decreased in the case of carbapenems, cephalosporins, colistin, and minocycline. By using knockout and complementation experiments, we demonstrated that gnaA is important for the synthesis of lipooligosaccharide and capsular polysaccharide and that disruption of the gene affects the morphology, drug susceptibility, and virulence of the pathogen. Copyright © 2019 American Society for Microbiology.
Genomic characterization of Kerstersia gyiorum SWMUKG01, an isolate from a patient with respiratory infection in China.
The Gram-negative bacterium Kerstersia gyiorum, a potential etiological agent of clinical infections, was isolated from several human patients presenting clinical symptoms. Its significance as a possible pathogen has been previously overlooked as no disease has thus far been definitively associated with this bacterium. To better understand how the organism contributes to infectious disease, we determined the complete genomic sequence of K. gyiorum SWMUKG01, the first clinical isolate from southwest China. The genomic data obtained displayed a single circular chromosome of 3,945,801 base pairs in length, which contains 3,441 protein-coding genes, 55 tRNA genes and 9 rRNA genes. Analysis of the full spectrum of protein-coding genes for cellular structures, two-component regulatory systems and iron uptake pathways that may be important for bacterial survival, colonization and establishment in the host provided new insights into the virulence characteristics of K. gyiorum. Phylogenomic comparisons with Alcaligenaceae species indicated that K. gyiorum SWMUKG01 had a close evolutionary relationship with Alcaligenes aquatilis and Alcaligenes faecalis. The comprehensive analysis presented in this work determines for the first time a complete genome sequence of K. gyiorum, which is expected to provide useful information for subsequent studies on the pathogenesis of this species.
We characterized 170 complete genome assemblies from clinical Bordetella pertussis isolates representing geographic and temporal diversity in the United States. These data capture genotypic shifts, including increased pertactin deficiency, occurring amid the current pertussis disease resurgence and provide a foundation for needed research to direct future public health control strategies.
Complete Genome Sequence of Saccharospirillum mangrovi HK-33T Sheds Light on the Ecological Role of a Bacterium in Mangrove Sediment Environment.
We present the genome sequence of Saccharospirillum mangrovi HK-33T, isolated from a mangrove sediment sample in Haikou, China. The complete genome of S. mangrovi HK-33T consisted of a single circular chromosome of 3,686,911 bp with an average G+C content of 57.37%, and contained 3,383 protein-coding genes, 4 operons of 16S-23S-5S rRNA genes, and 52 tRNA genes. Genomic annotation indicated that the genome of S. mangrovi HK-33T had many genes related to oligosaccharide and polysaccharide degradation and utilization of polyhydroxyalkanoate. For the nitrogen cycle, genes encoding nitrate and nitrite reductase, glutamate dehydrogenase, glutamate synthase, and glutamine synthetase were found. For the phosphorus cycle, genes related to polyphosphate kinases (ppk1 and ppk2), the high-affinity phosphate-specific transport (Pst) system, and the low-affinity inorganic phosphate transporter (pitA) were predicted. For the sulfur cycle, genes coding for cysteine synthase and type III acyl coenzyme A transferase (dddD) were identified. This study provides evidence about the carbon, nitrogen, phosphorus, and sulfur metabolic patterns of S. mangrovi HK-33T and broadens our understanding of the ecological roles of this bacterium in the mangrove sediment environment.
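The average G+C content reported above is a simple base-composition statistic. The sketch below shows one common way to compute it, as the percentage of G and C among unambiguous bases; the function name and the toy sequences are hypothetical, and the abstract does not state which tool or convention the authors used for ambiguous bases.

```python
def gc_content(sequence: str) -> float:
    """Percentage of G and C among unambiguous (A, C, G, T) bases."""
    seq = sequence.upper()
    gc = seq.count("G") + seq.count("C")
    acgt = gc + seq.count("A") + seq.count("T")
    if acgt == 0:
        raise ValueError("sequence contains no unambiguous bases")
    return 100.0 * gc / acgt


if __name__ == "__main__":
    # Toy sequence for illustration only, not real genome data.
    print(f"{gc_content('ATGCGCGCTA'):.2f}%")
```

Excluding ambiguous bases such as N from the denominator is a design choice made here; including them would slightly lower the reported percentage for assemblies that contain gaps.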
The development of clustered regularly interspaced short-palindromic repeat (CRISPR)-Cas systems for genome editing has transformed the way life science research is conducted and holds enormous potential for the treatment of disease as well as for many aspects of biotechnology. Here, I provide a personal perspective on the development of CRISPR-Cas9 for genome editing within the broader context of the field and discuss our work to discover novel Cas effectors and develop them into additional molecular tools. The initial demonstration of Cas9-mediated genome editing launched the development of many other technologies, enabled new lines of biological inquiry, and motivated a deeper examination of natural CRISPR-Cas systems, including the discovery of new types of CRISPR-Cas systems. These new discoveries in turn spurred further technological developments. I review these exciting discoveries and technologies as well as provide an overview of the broad array of applications of these technologies in basic research and in the improvement of human health. It is clear that we are only just beginning to unravel the potential within microbial diversity, and it is quite likely that we will continue to discover other exciting phenomena, some of which it may be possible to repurpose as molecular technologies. The transformation of mysterious natural phenomena to powerful tools, however, takes a collective effort to discover, characterize, and engineer them, and it has been a privilege to join the numerous researchers who have contributed to this transformation of CRISPR-Cas systems.
Complete Assembly of the Genome of an Acidovorax citrulli Strain Reveals a Naturally Occurring Plasmid in This Species.
Acidovorax citrulli is the causal agent of bacterial fruit blotch (BFB), a serious threat to cucurbit crop production worldwide. Based on genetic and phenotypic properties, A. citrulli strains are divided into two major groups: group I strains have been generally isolated from melon and other non-watermelon cucurbits, while group II strains are closely associated with watermelon. In a previous study, we reported the genome of the group I model strain, M6. At that time, the M6 genome was sequenced by MiSeq Illumina technology, with reads assembled into 139 contigs. Here, we report the assembly of the M6 genome following sequencing with PacBio technology. This approach not only allowed full assembly of the M6 genome, but it also revealed the occurrence of a ~53 kb plasmid. The M6 plasmid, named pACM6, was further confirmed by plasmid extraction, Southern-blot analysis of restricted fragments and generation of cured M6-derivative strains. pACM6 occurs at low copy numbers (average of ~4.1 ± 1.3 chromosome equivalents) in A. citrulli M6 and contains 63 open reading frames (ORFs), most of which (55.6%) encode hypothetical proteins. The plasmid contains several genes encoding type IV secretion components, and typical plasmid-borne genes involved in plasmid maintenance, replication and transfer. The plasmid also carries an operon encoding homologs of a Fic-VbhA toxin-antitoxin (TA) module. Transcriptome data from A. citrulli M6 revealed that, under the tested conditions, the genes encoding the components of this TA system are among the most highly expressed genes in pACM6. Whether this TA module plays a role in pACM6 maintenance is still to be determined. Leaf infiltration and seed transmission assays revealed that, under the tested conditions, the loss of pACM6 did not affect the virulence of A. citrulli M6. We also show that pACM6 or similar plasmids are present in several group I strains, but absent in all tested group II strains of A. citrulli.
Comparative Genomic Analyses Reveal Core-Genome-Wide Genes Under Positive Selection and Major Regulatory Hubs in Outlier Strains of Pseudomonas aeruginosa.
Genomic information for outlier strains of Pseudomonas aeruginosa is exiguous when compared with classical strains. We sequenced and constructed the complete genome of an environmental strain CR1 of P. aeruginosa and performed the comparative genomic analysis. It clustered with the outlier group, hence we scaled up the analyses to understand the differences in environmental and clinical outlier strains. We identified eight new regions of genomic plasticity and a plasmid pCR1 with a VirB/D4 complex followed by trimeric auto-transporter that can induce virulence phenotype in the genome of strain CR1. Virulence genotype analysis revealed that strain CR1 lacked hemolytic phospholipase C and D, three genes for LPS biosynthesis and had reduced antibiotic resistance genes when compared with clinical strains. Genes belonging to proteases, bacterial exporters and DNA stabilization were found to be under strong positive selection, thus facilitating pathogenicity and survival of the outliers. The outliers had the complete operon for the production of vibrioferrin, a siderophore present in plant growth promoting bacteria. The competence to acquire multidrug resistance and new virulence factors makes these strains a potential threat. However, we identified major regulatory hubs that can be used as drug targets against both the classical and outlier groups.
Currently, there is a critical need to rapidly identify infectious organisms in clinical samples. Next-Generation Sequencing (NGS) could surmount the deficiencies of culture-based methods; however, there are no standardized, automated programs to process NGS data. To address this deficiency, we developed the Rapid Infectious Disease Identification (RIDI™) system. The system requires minimal guidance, which reduces operator errors. The system is compatible with the three major NGS platforms. It automatically interfaces with the sequencing system, detects its data format, configures the analysis type, applies appropriate quality control, and analyzes the results. Sequence information is characterized using both the NCBI database and RIDI™ specific databases. RIDI™ was designed to identify high probability sequence matches and more divergent matches that could represent different or novel species. We challenged the system using defined American Type Culture Collection (ATCC) reference standards of 27 species, both individually and in varying combinations. The system was able to rapidly detect known organisms in <12 h with multi-sample throughput. The system accurately identifies 99.5% of the DNA sequence reads at the genus-level and 75.3% at the species-level in reference standards. It has a limit of detection of 146 cells/ml in simulated clinical samples, and is also able to identify the components of polymicrobial samples with 16.9% discrepancy at the genus-level and 31.2% at the species-level. Thus, the system's effectiveness may exceed current methods, especially in situations where culture methods could produce false negatives or where rapid results would influence patient outcomes. Copyright © 2016 Elsevier B.V. All rights reserved.
Old World vultures may carry and spread pathogens for emerging infections since they feed on the carcasses of dead animals and participate in the sky burials of humans, some of whom have died from communicable diseases. Therefore, we studied the precise fecal microbiome of the Old World vulture with metataxonomics, integrating the high-throughput sequencing of almost full-length small subunit ribosomal RNA (16S rRNA) gene amplicons in tandem with the operational phylogenetic unit (OPU) analysis strategy. Nine vultures of three species were sampled using rectal swabs on the Qinghai-Tibet Plateau, China. Using the Pacific Biosciences sequencing platform, we obtained 54 135 high-quality reads of 16S rRNA amplicons with an average of 1442±6.9 bp in length and 6015±1058 reads per vulture. Those sequences were classified into 314 OPUs, including 102 known species, 50 yet to be described species and 161 unknown new lineages of uncultured representatives. Forty-five species have been reported to be responsible for human outbreaks or infections, and 23 yet to be described species belong to genera that include pathogenic species. Only six species were common to all vultures. Clostridium perfringens was the most abundant and present in all vultures, accounting for 30.8% of the total reads. Therefore, using the new technology, we found that vultures are an important reservoir for C. perfringens as evidenced by the isolation of 107 strains encoding virulence genes, representing 45 sequence types. Our study suggests that the soil-related C. perfringens and other pathogens could have a reservoir in vultures and other animals.
Acquisition of genes through horizontal gene transfer (HGT) allows microbes to rapidly gain new capabilities and adapt to new or changing environments. Identifying widespread HGT regions within multispecies microbiomes can pinpoint the molecular mechanisms that play key roles in microbiome assembly. We sought to identify horizontally transferred genes within a model microbiome, the cheese rind. Comparing 31 newly sequenced and 134 previously sequenced bacterial isolates from cheese rinds, we identified over 200 putative horizontally transferred genomic regions containing 4733 protein coding genes. The largest of these regions are enriched for genes involved in siderophore acquisition, and are widely distributed in cheese rinds in both Europe and the US. These results suggest that HGT is prevalent in cheese rind microbiomes, and that identification of genes that are frequently transferred in a particular environment may provide insight into the selective forces shaping microbial communities.
Despite high vaccine coverage, pertussis cases in the United States have increased over the last decade. Growing evidence suggests that disease resurgence results, in part, from genetic divergence of circulating strain populations away from vaccine references. The United States employs acellular vaccines exclusively, and current Bordetella pertussis isolates are predominantly deficient in at least one immunogen, pertactin (Prn). First detected in the United States retrospectively in a 1994 isolate, the rapid spread of Prn deficiency is likely vaccine driven, raising concerns about whether other acellular vaccine immunogens experience similar pressures, as further antigenic changes could potentially threaten vaccine efficacy. We developed an electrochemiluminescent antibody capture assay to monitor the production of the acellular vaccine immunogen filamentous hemagglutinin (Fha). Screening 722 U.S. surveillance isolates collected from 2010 to 2016 identified two that were both Prn and Fha deficient. Three additional Fha-deficient laboratory strains were also identified from a historic collection of 65 isolates dating back to 1935. Whole-genome sequencing of deficient isolates revealed putative, underlying genetic changes. Only four isolates harbored mutations to known genes involved in Fha production, highlighting the complexity of its regulation. The chromosomes of two Fha-deficient isolates included unexpected structural variation that did not appear to influence Fha production. Furthermore, insertion sequence disruption of fhaB was also detected in a previously identified pertussis toxin-deficient isolate that still produced normal levels of Fha. These results demonstrate the genetic potential for additional vaccine immunogen deficiency and underscore the importance of continued surveillance of circulating B. pertussis evolution in response to vaccine pressure. Copyright © 2018 American Society for Microbiology.
Insights into the evolution of host association through the isolation and characterization of a novel human periodontal pathobiont, Desulfobulbus oralis.
The human oral microbiota encompasses representatives of many bacterial lineages that have not yet been cultured. Here we describe the isolation and characterization of previously uncultured Desulfobulbus oralis, the first human-associated representative of its genus. As mammalian-associated microbes rarely have free-living close relatives, D. oralis provides opportunities to study how bacteria adapt and evolve within a host. This sulfate-reducing deltaproteobacterium has adapted to the human oral subgingival niche by curtailing its physiological repertoire, losing some biosynthetic abilities and metabolic independence, and by dramatically reducing environmental sensing and signaling capabilities. The genes that enable free-living Desulfobulbus to synthesize the potent neurotoxin methylmercury were also lost by D. oralis, a notably positive outcome of host association. However, horizontal gene acquisitions from other members of the microbiota provided novel mechanisms of interaction with the human host, including toxins like leukotoxin and hemolysins. Proteomic and transcriptomic analysis revealed that most of those factors are actively expressed, including in the subgingival environment, and some are secreted. Similar to other known oral pathobionts, D. oralis can trigger a proinflammatory response in oral epithelial cells, suggesting a direct role in the development of periodontal disease. IMPORTANCE Animal-associated microbiota likely assembled as a result of numerous independent colonization events by free-living microbes followed by coevolution with their host and other microbes. Through specific adaptation to various body sites and physiological niches, microbes have a wide range of contributions, from beneficial to disease causing. Desulfobulbus oralis provides insights into genomic and physiological transformations associated with transition from an open environment to a host-dependent lifestyle and the emergence of pathogenicity.
Through a multifaceted mechanism triggering a proinflammatory response, D. oralis is a novel periodontal pathobiont. Even though culture-independent approaches can provide insights into the potential role of the human microbiome “dark matter,” cultivation and experimental characterization remain important to studying the roles of individual organisms in health and disease.
Contagious equine metritis is a disease of worldwide concern in equids. The United States is considered to be free of the disease although sporadic outbreaks have occurred over the last few decades that were thought to be associated with the importation of horses. The objective of this study was to create finished, reference quality genomes that characterize the diversity of Taylorella equigenitalis isolates introduced into the USA, and identify their differences. Five isolates of T. equigenitalis associated with introductions into the USA from unique sources were sequenced using both short and long read chemistries allowing for complete assembly and annotation. These sequences were compared to previously published genomes as well as the short read sequences of the 200 isolates in the National Veterinary Services Laboratories’ diagnostic repository to identify unique regions and genes, potential virulence factors, and characterize diversity. The 5 genomes varied in size by up to 100,000 base pairs, but averaged 1.68 megabases. The majority of that diversity in size can be explained by repeat regions and 4 main regions of difference, which ranged in size from 15,000 to 45,000 base pairs. The first region of difference contained mostly hypothetical proteins, the second contained the CRISPR, the third contained primarily hemagglutinin proteins, and the fourth contained primarily segments of a type IV secretion system. As expected and previously reported, little evidence of recombination was found within these genomes. Several additional areas of interest were also observed including a mechanism for streptomycin resistance and other virulence factors. A SNP distance comparison of the T. equigenitalis isolates and Mycobacterium tuberculosis complex (MTBC) showed that relatively, T. equigenitalis was a more diverse species than the entirety of MTBC.