July 7, 2019

Fallacy of the unique genome: sequence diversity within single Helicobacter pylori strains.

Many bacterial genomes are highly variable but nonetheless are typically published as a single assembled genome. Experiments tracking bacterial genome evolution have not looked at the variation present at a given point in time. Here, we analyzed the mouse-passaged Helicobacter pylori strain SS1 and its parent PMSS1 to assess intra- and intergenomic variability. Using high sequence coverage depth and experimental validation, we detected extensive genome plasticity within these H. pylori isolates, including movement of the transposable element IS607, large and small inversions, multiple single nucleotide polymorphisms, and variation in cagA copy number. The cagA gene was found as 1 to 4 tandem copies located off the cag island in both SS1 and PMSS1; this copy number variation correlated with protein expression. To gain insight into the changes that occurred during mouse adaptation, we also compared SS1 and PMSS1 and observed 46 differences that were distinct from the within-genome variation. The most substantial was an insertion in cagY, which encodes a protein required for type IV secretion system function. We detected modifications in genes coding for two proteins known to affect mouse colonization, the HpaA neuraminyllactose-binding protein and the FutB α-1,3 lipopolysaccharide (LPS) fucosyltransferase, as well as genes predicted to modulate diverse properties. In sum, our work suggests that data from consensus genome assemblies from single colonies may be misleading by failing to represent the variability present. Furthermore, we show that high-depth genomic sequencing data of a population can be analyzed to gain insight into the normal variation within bacterial strains. IMPORTANCE: Although it is well known that many bacterial genomes are highly variable, it is nonetheless traditional to refer to, analyze, and publish “the genome” of a bacterial strain.
Variability is usually reduced (“only sequence from a single colony”), ignored (“just publish the consensus”), or placed in the “too-hard” basket (“analysis of raw read data is more robust”). Now that whole-genome sequences are regularly used to assess virulence and track outbreaks, a better understanding of the baseline genomic variation present within single strains is needed. Here, we describe the variability seen in typical working stocks and colonies of SS1 and PMSS1, model strains of the pathogen Helicobacter pylori, as revealed by high-coverage mate-pair next-generation sequencing (NGS) and confirmed by traditional laboratory techniques. This work demonstrates that reliance on a consensus assembly as “the genome” of a bacterial strain may be misleading. Copyright © 2017 Draper et al.


July 7, 2019

Genome sequence of Streptomyces sp. H-KF8, a marine actinobacterium isolated from a northern Chilean Patagonian fjord.

Streptomyces sp. H-KF8 is a fjord-derived marine actinobacterium capable of producing antimicrobial activity. Streptomyces sp. H-KF8 was isolated from sediments of the Comau fjord, located in the northern Chilean Patagonia. Here, we report the 7.7-Mb genome assembly, which represents the first genome of a Chilean marine actinobacterium. Copyright © 2017 Undabarrena et al.


July 7, 2019

Complete genome sequences of three Cupriavidus strains isolated from various Malaysian environments.

Cupriavidus sp. USMAA1020, USMAA2-4, and USMAHM13 are capable of producing polyhydroxyalkanoate (PHA). This biopolymer is an alternative solution to synthetic plastics, whereby polyhydroxyalkanoate synthase is the key enzyme involved in PHA biosynthesis. Here, we report the complete genomes of three Cupriavidus sp. strains: USMAA1020, USMAA2-4, and USMAHM13. Copyright © 2017 Shafie et al.


July 7, 2019

Complete genome sequence of Thermus brockianus GE-1 reveals key enzymes of xylan/xylose metabolism.

Thermus brockianus strain GE-1 is a thermophilic, Gram-negative, rod-shaped and non-motile bacterium that was isolated from the Geysir geothermal area, Iceland. Like other thermophiles, Thermus species are often used as model organisms to understand the mechanism of action of extremozymes, especially focusing on their heat-activity and thermostability. Genome-specific features of T. brockianus GE-1 and their properties further help to explain how extremophiles adapt to elevated temperatures. Here we analyze the first whole genome sequence of T. brockianus strain GE-1. Insights into the genome sequence and the methodologies that were applied during de novo assembly and annotation are given in detail. The finished genome shows a phred quality value of QV50. The complete genome size is 2.38 Mb, comprising the chromosome (2,035,182 bp), the megaplasmid pTB1 (342,792 bp) and the smaller plasmid pTB2 (10,299 bp). Gene prediction revealed 2,511 genes in total, including 2,458 protein-encoding genes, 53 RNA genes and 66 pseudogenes. A unique genomic region on megaplasmid pTB1 was identified encoding key enzymes for xylan depolymerization and xylose metabolism. This is in agreement with the growth experiments in which xylan is utilized as the sole source of carbon. Accordingly, we identified sequences encoding the xylanase Xyn10, an endoglucanase, the membrane ABC sugar transporter XylH, the xylose-binding protein XylF, the xylose isomerase XylA, which catalyzes the first step of xylose metabolism, and the xylulokinase XylB, which is responsible for the second step. Our data indicate that an ancestor of T. brockianus obtained the ability to use xylose as an alternative carbon source by horizontal gene transfer.
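
The replicon sizes and gene counts quoted in this abstract can be cross-checked with a few lines of arithmetic. The sketch below uses only the figures stated above; the variable and function names are illustrative and not taken from the paper. Note that the protein-encoding and RNA gene counts already add up to the reported total, which suggests (as an inference, not a statement from the abstract) that the pseudogenes are not counted on top of that total.

```python
# Replicon sizes in base pairs, as quoted in the abstract.
REPLICONS_BP = {
    "chromosome": 2_035_182,
    "pTB1": 342_792,   # megaplasmid
    "pTB2": 10_299,    # smaller plasmid
}

# Gene counts as quoted in the abstract.
PROTEIN_CODING = 2_458
RNA_GENES = 53
REPORTED_TOTAL_GENES = 2_511


def total_genome_size_bp(replicons):
    """Sum the sizes of all replicons (chromosome plus plasmids)."""
    return sum(replicons.values())


total_bp = total_genome_size_bp(REPLICONS_BP)
total_mb = total_bp / 1_000_000

print(f"Total genome size: {total_bp:,} bp ({total_mb:.3f} Mb)")
print(f"Protein-coding + RNA genes: {PROTEIN_CODING + RNA_GENES:,}")
```

The sum is 2,388,273 bp, which matches the stated 2.38 Mb if the value is truncated rather than rounded to two decimals.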


July 7, 2019

Improving and correcting the contiguity of long-read genome assemblies of three plant species using optical mapping and chromosome conformation capture data.

Long-read sequencing can overcome the weaknesses of short reads in the assembly of eukaryotic genomes; however, at present additional scaffolding is needed to achieve chromosome-level assemblies. We generated PacBio long-read data of the genomes of three relatives of the model plant Arabidopsis thaliana and assembled all three genomes into only a few hundred contigs. To improve the contiguities of these assemblies, we generated BioNano Genomics optical mapping and Dovetail Genomics chromosome conformation capture data for genome scaffolding. Despite their technical differences, optical mapping and chromosome conformation capture performed similarly and doubled N50 values. After improving both integration methods, assembly contiguity reached the chromosome-arm level. We rigorously assessed the quality of contigs and scaffolds using Illumina mate-pair libraries and genetic map information. This showed that PacBio assemblies have high sequence accuracy but can contain several misassemblies, which join unlinked regions of the genome. Most, but not all, of these misjoins were removed during the integration of the optical mapping and chromosome conformation capture data. Even though none of the centromeres was fully assembled, the scaffolds revealed large parts of some centromeric regions, even including some of the heterochromatic regions, which are not present in gold standard reference sequences. Published by Cold Spring Harbor Laboratory Press.
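
For readers unfamiliar with the N50 statistic mentioned in this abstract, the following is a minimal sketch of how it is commonly computed: the length of the shortest sequence in the smallest set of longest sequences that together cover at least half of the assembly. The contig lengths are made-up illustration values, not data from the study, and the function name `n50` is an assumption of this sketch.

```python
def n50(lengths):
    """Return the N50 of a collection of contig or scaffold lengths.

    Sequences are taken from longest to shortest until their cumulative
    length reaches at least half of the total; the length of the last
    sequence taken is the N50.
    """
    if not lengths:
        raise ValueError("lengths must not be empty")
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length


# Hypothetical contig lengths (arbitrary units) before scaffolding.
contigs = [80, 70, 50, 40, 30, 20]

# Hypothetical result after scaffolding joins 80+70 and 50+40.
scaffolds = [150, 90, 30, 20]

print("contig N50:", n50(contigs))
print("scaffold N50:", n50(scaffolds))
```

In this toy case the total assembly length is unchanged, but joining contigs into scaffolds raises the N50 from 70 to 150, which is the kind of contiguity gain the abstract describes.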


July 7, 2019

Variant tolerant read mapping using min-hashing

DNA read mapping is a ubiquitous task in bioinformatics, and many tools have been developed to solve the read mapping problem. However, there are two trends that are changing the landscape of read mapping: First, new sequencing technologies provide very long reads with high error rates (up to 15%). Second, many genetic variants in the population are known, so the reference genome is not considered as a single string over ACGT, but as a complex object containing these variants. Most existing read mappers do not handle these new circumstances appropriately.
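
The title refers to min-hashing, which the abstract does not describe. As background only, here is a minimal sketch of generic MinHash similarity estimation on k-mer sets. It is not the mapper presented in the paper; the k-mer size, the number of hash functions, the use of salted BLAKE2b as the hash family, and the example sequences are all assumptions chosen for illustration.

```python
import hashlib


def kmers(sequence, k):
    """Return the set of all length-k substrings of a sequence."""
    if len(sequence) < k:
        raise ValueError("sequence is shorter than k")
    return {sequence[i:i + k] for i in range(len(sequence) - k + 1)}


def hash_kmer(kmer, seed):
    """Hash a k-mer to a 64-bit integer; the seed selects the hash function."""
    digest = hashlib.blake2b(
        kmer.encode("ascii"),
        digest_size=8,
        salt=seed.to_bytes(8, "little"),
    ).digest()
    return int.from_bytes(digest, "little")


def minhash_signature(sequence, k=5, num_hashes=64):
    """For each hash function, keep the minimum hash over all k-mers."""
    kmer_set = kmers(sequence, k)
    return [min(hash_kmer(km, seed) for km in kmer_set)
            for seed in range(num_hashes)]


def estimate_jaccard(sig_a, sig_b):
    """Fraction of matching signature positions estimates Jaccard similarity."""
    matches = sum(1 for a, b in zip(sig_a, sig_b) if a == b)
    return matches / len(sig_a)


# Hypothetical sequences: a reference window, a read with one substitution,
# and an unrelated low-complexity sequence.
reference = "ACGTTGCATGCCGTAAGCTTAGGCTAACGT"
read = "ACGTTGCATGCCGTCAGCTTAGGCTAACGT"
unrelated = "TTTTTTTTTTTTTTTTTTTTTTTTTTTTTT"

ref_sig = minhash_signature(reference)
print("read vs reference:", estimate_jaccard(minhash_signature(read), ref_sig))
print("unrelated vs reference:", estimate_jaccard(minhash_signature(unrelated), ref_sig))
```

Because a single substitution changes only the k-mers that overlap it, the read keeps most of its k-mers in common with the reference and its signature still agrees at many positions. That tolerance to small differences is the general reason min-hashing is attractive for error-prone reads.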


July 7, 2019

Whole-genome sequence of Escherichia coli serotype O157:H7 strain PA20.

Escherichia coli serotype O157:H7 strain PA20 is a Pennsylvania Department of Health clinical isolate. It has been used to study biofilm formation in O157:H7 clinical isolates, where the high incidence of prophage insertions in the mlrA transcription factor disrupts traditional csgD biofilm regulation. Here, we report the complete PA20 genome sequence. Copyright © 2017 Uhlich et al.


July 7, 2019

First complete genome sequence of Haemophilus influenzae serotype a.

Haemophilus influenzae is an important human pathogen that primarily infects small children. In recent years, H. influenzae serotype a has emerged as a significant cause of invasive disease among indigenous populations. Here, we present the first complete whole-genome sequence of H. influenzae serotype a. © Crown copyright 2017.


July 7, 2019

Whole-genome sequences of Mycobacterium tuberculosis TB282 and TB284, a widespread and a unique strain, respectively, identified in a previous study of tuberculosis transmission in central Los Angeles, California, USA.

We report here the genome sequences, obtained using a PacBio platform, of two Mycobacterium tuberculosis clinical isolates previously identified in central Los Angeles, CA, in the 1990s. Isolate TB282 represents a large-cluster strain that caused 27% of the tuberculosis cases, while TB284 represents a strain that caused disease in only one patient. Copyright © 2017 Zhang and Yang.


July 7, 2019

Antibiotic discovery throughout the Small World Initiative: A molecular strategy to identify biosynthetic gene clusters involved in antagonistic activity.

The emergence of bacterial pathogens resistant to all known antibiotics is a global health crisis. Adding to this problem is that major pharmaceutical companies have shifted away from antibiotic discovery due to low profitability. As a result, the pipeline of new antibiotics is essentially dry and many bacteria now resist the effects of most commonly used drugs. To address this global health concern, citizen science through the Small World Initiative (SWI) was formed in 2012. As part of SWI, students isolate bacteria from their local environments, characterize the strains, and assay for antibiotic production. During the 2015 fall semester at Bowling Green State University, students isolated 77 soil-derived bacteria and genetically characterized strains using the 16S rRNA gene, identified strains exhibiting antagonistic activity, and performed an expanded SWI workflow using transposon mutagenesis to identify a biosynthetic gene cluster involved in toxigenic compound production. We identified one mutant with loss of antagonistic activity and through subsequent whole-genome sequencing and linker-mediated PCR identified a 24.9 kb biosynthetic gene locus likely involved in inhibitory activity in that mutant. Further assessment against human pathogens demonstrated the inhibition of Bacillus cereus, Listeria monocytogenes, and methicillin-resistant Staphylococcus aureus in the presence of this compound, thus supporting our molecular strategy as an effective research pipeline for SWI antibiotic discovery and genetic characterization. © 2017 The Authors. MicrobiologyOpen published by John Wiley & Sons Ltd.


July 7, 2019

Complete genome sequence of Mycoplasma pneumoniae type 2 reference strain FH using single-molecule real-time sequencing technology.

Mycoplasma pneumoniae type 2 strain FH was previously sequenced with Illumina (FH-Illumina) and 454 (FH-454) technologies according to Xiao et al. (2015) and Krishnakumar et al. (2010). Comparative analyses revealed differences in genomic content between these sequences, including a 6-kb region absent from the FH-454 submission. Here, we present a complete genome sequence of FH sequenced with the Pacific Biosciences RSII platform. Copyright © 2017 Desai et al.


July 7, 2019

Genome sequence of the filamentous actinomycete Kitasatospora viridifaciens.

The vast majority of antibiotics are produced by filamentous soil bacteria called actinomycetes. We report here the genome sequence of the tetracycline producer “Streptomyces viridifaciens” DSM 40239. Given that this species has the hallmark signatures characteristic of the Kitasatospora genus, we previously proposed to rename this organism Kitasatospora viridifaciens. Copyright © 2017 Ramijan et al.


July 7, 2019

Toward a complete North American Borrelia miyamotoi genome.

Borrelia miyamotoi, of the relapsing-fever spirochete group, is an emerging tick-borne pathogen causing human illness in the northern hemisphere. Here, we present the chromosome, eight extrachromosomal linear plasmids, and a draft sequence for five circular and one linear plasmid of a Borrelia miyamotoi strain isolated from an Ixodes sp. tick from Connecticut, USA. Copyright © 2017 Kingry et al.

