July 19, 2019

Monitoring microevolution of OXA-48-producing Klebsiella pneumoniae ST147 in a hospital setting by SMRT sequencing.

Carbapenemase-producing Klebsiella pneumoniae pose an increasing risk for healthcare facilities worldwide. Continuous monitoring of ST distribution and its association with resistance and virulence genes is required for early detection of successful K. pneumoniae lineages. In this study, we used WGS to characterize MDR blaOXA-48-positive K. pneumoniae isolated from inpatients at the University Medical Center Göttingen, Germany, between March 2013 and August 2014. Closed genomes for 16 isolates of carbapenemase-producing K. pneumoniae were generated by single molecule real-time technology using the PacBio RSII platform. Eight of the 16 isolates showed identical XbaI macrorestriction patterns and shared the same MLST, ST147. The eight ST147 isolates differed by only 1-25 SNPs in their core genome, indicating a clonal origin. Most of the eight ST147 isolates carried four plasmids with sizes of 246.8, 96.1, 63.6 and 61.0 kb and a novel linear plasmid prophage, named pKO2, of 54.6 kb. The blaOXA-48 gene was located on a 63.6 kb IncL plasmid and is part of composite transposon Tn1999.2. The ST147 isolates expressed the yersiniabactin system as a major virulence factor. The comparative whole-genome analysis revealed several rearrangements of mobile genetic elements and losses of chromosomal and plasmidic regions in the ST147 isolates. Single molecule real-time sequencing allowed monitoring of the genetic and epigenetic microevolution of MDR OXA-48-producing K. pneumoniae and revealed, in addition to SNPs, complex rearrangements of genetic elements. © The Author 2017. Published by Oxford University Press on behalf of the British Society for Antimicrobial Chemotherapy. All rights reserved. For Permissions, please email: journals.permissions@oup.com.
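The clonality argument above rests on counting core-genome SNP differences between pairs of isolates. The abstract does not say how the counts were produced, so the following is only a minimal sketch of the idea, assuming the core genomes are already aligned to equal length; the function names and the toy alignment are hypothetical.

```python
from itertools import combinations

BASES = set("ACGT")


def snp_distance(a: str, b: str) -> int:
    """Count positions where two aligned sequences carry different bases.

    Positions with gaps or ambiguity codes in either sequence are skipped,
    which is an assumption of this sketch, not something stated in the abstract.
    """
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(
        1
        for x, y in zip(a.upper(), b.upper())
        if x != y and x in BASES and y in BASES
    )


def pairwise_snp_distances(alignment: dict) -> dict:
    """Return the SNP distance for every unordered pair of isolates."""
    return {
        (n1, n2): snp_distance(alignment[n1], alignment[n2])
        for n1, n2 in combinations(sorted(alignment), 2)
    }


# Hypothetical toy alignment; real core-genome alignments are megabases long.
toy_alignment = {
    "iso1": "ACGTACGT",
    "iso2": "ACGTACGA",
    "iso3": "ACGAACG-",
}
distances = pairwise_snp_distances(toy_alignment)
```

On real data, a small maximum pairwise distance (1-25 SNPs for the ST147 isolates) is the kind of result that supports a common clonal origin.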


July 19, 2019

The diversity, structure, and function of heritable adaptive immunity sequences in the Aedes aegypti genome.

The Aedes aegypti mosquito transmits arboviruses, including dengue, chikungunya, and Zika virus. Understanding the mechanisms underlying mosquito immunity could provide new tools to control arbovirus spread. Insects exploit two different RNAi pathways to combat viral and transposon infection: short interfering RNAs (siRNAs) and PIWI-interacting RNAs (piRNAs) [1, 2]. Endogenous viral elements (EVEs) are sequences from non-retroviral viruses that are inserted into the mosquito genome and can act as templates for the production of piRNAs [3, 4]. EVEs therefore represent a record of past infections and a reservoir of potential immune memory [5]. The large-scale organization of EVEs has been difficult to resolve with short-read sequencing because they tend to integrate into repetitive regions of the genome. To define the diversity, organization, and function of EVEs, we took advantage of the contiguity associated with long-read sequencing to generate a high-quality assembly of the Ae. aegypti-derived Aag2 cell line genome, an important and widely used model system. We show EVEs are acquired through recombination with specific classes of long terminal repeat (LTR) retrotransposons and organize into large loci (>50 kbp) characterized by high LTR density. These EVE-containing loci have increased density of piRNAs compared to similar regions without EVEs. Furthermore, we detected EVE-derived piRNAs consistent with a targeted processing of persistently infecting virus genomes. We propose that comparisons of EVEs across mosquito populations may explain differences in vector competence, and further study of the structure and function of these elements in the genome of mosquitoes may lead to epidemiological interventions. Copyright © 2017 Elsevier Ltd. All rights reserved.


July 19, 2019

Analysis of recombinational switching at the antigenic variation locus of the Lyme spirochete using a novel PacBio sequencing pipeline.

The Lyme disease spirochete evades the host immune system by combinatorial variation of VlsE, a surface antigen. Antigenic variation occurs via segmental gene conversion from contiguous silent cassettes into the vlsE locus. Because of the high degree of similarity between switch variants and the size of vlsE, short-read NGS technologies have been unsuitable for sequencing vlsE populations. Here we use PacBio sequencing technology coupled with the first fully-automated software pipeline (VAST) to accurately process NGS data by minimizing error frequency, eliminating heteroduplex errors and accurately aligning switch variants. We extend earlier studies by showing use of almost all of the vlsE SNP repertoire. In different tissues of the same mouse, 99.6% of the variants were unique, suggesting that dissemination of Borrelia burgdorferi is predominantly unidirectional with little tissue-to-tissue hematogenous dissemination. We also observed a similar number of variants in SCID and wild-type mice, present a heatmap of the location and frequency of amino acid changes on the 3D structure, and note differences between SCID and wild-type mice that hint at possible amino acid function. Our observed selection against diversification of residues at the dimer interface in wild-type mice strongly suggests that dimerization is required for in vivo functionality of VlsE. © 2017 John Wiley & Sons Ltd.


July 19, 2019

Long-read genome sequence assembly provides insight into ongoing retroviral invasion of the koala germline.

The koala retrovirus (KoRV) is implicated in several diseases affecting the koala (Phascolarctos cinereus). KoRV provirus can be present in the genome of koalas as an endogenous retrovirus (present in all cells via germline integration) or as exogenous retrovirus responsible for somatic integrations of proviral KoRV (present in a limited number of cells). This ongoing invasion of the koala germline by KoRV provides a powerful opportunity to assess the viral strategies used by KoRV in an individual. Analysis of a high-quality genome sequence of a single koala revealed 133 KoRV integration sites. Most integrations contain full-length, endogenous provirus; KoRV-A subtype. The second most frequent integrations contain an endogenous recombinant element (recKoRV) in which most of the KoRV protein-coding region has been replaced with an ancient, endogenous retroelement. A third set of integrations, with very low sequence coverage, may represent somatic cell integrations of KoRV-A, KoRV-B and two recently designated additional subgroups, KoRV-D and KoRV-E. KoRV-D and KoRV-E are missing several genes required for viral processing, suggesting they have been transmitted as defective viruses. Our results represent the first comprehensive analyses of KoRV integration and variation in a single animal and provide further insights into the process of retroviral-host species interactions.


July 19, 2019

A comparative study on the characterization of hepatitis B virus quasispecies by clone-based sequencing and third-generation sequencing.

Hepatitis B virus (HBV) has a high mutation rate due to its extremely high replication rate and the lack of proofreading during reverse transcription. The resulting genetically heterogeneous variants are described as viral quasispecies (QS). Clone-based sequencing (CBS) is regarded as the ‘gold standard’ for assessing QS complexity and diversity of HBV, but it is laborious and not cost-effective. In this study, we investigated the utility of third-generation sequencing (TGS) to characterize the genetic heterogeneity of HBV QS and assessed the possible contribution of TGS technology to HBV QS studies. Parallel experiments, including 3 control samples, which consisted of HBV full gene genotype B and genotype C plasmids, and 10 patient samples, were performed by using CBS and TGS to analyze HBV whole-genome QS. Characterization of QS heterogeneity was conducted by using comprehensive statistical analysis. The results showed that TGS had a high consistency with CBS when measuring the complexity and diversity of QS. In addition, TGS conferred strong advantages in detecting rare variants. In summary, TGS was considered to be practicable in HBV QS studies and it might have a relevant role in the clinical management of HBV infection in the future.
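The abstract compares CBS and TGS on quasispecies "complexity and diversity" without naming the metrics. Two measures commonly used for this purpose are normalized Shannon entropy (complexity) and mean pairwise genetic distance (diversity); the sketch below assumes those two, assumes aligned sequences of equal length, and uses hypothetical function names and toy data.

```python
import math
from collections import Counter
from itertools import combinations


def normalized_shannon_entropy(seqs: list) -> float:
    """Shannon entropy of variant frequencies, divided by ln(N).

    0.0 means all sequences are identical; 1.0 means every sequence is unique.
    """
    n = len(seqs)
    if n < 2:
        return 0.0
    counts = Counter(seqs)
    entropy = -sum((c / n) * math.log(c / n) for c in counts.values())
    return entropy / math.log(n)


def mean_pairwise_distance(seqs: list) -> float:
    """Mean proportion of differing sites over all pairs of aligned sequences."""
    pairs = list(combinations(seqs, 2))
    if not pairs:
        return 0.0
    total = sum(
        sum(x != y for x, y in zip(a, b)) / len(a)
        for a, b in pairs
    )
    return total / len(pairs)


# Hypothetical toy quasispecies sample: four aligned reads or clones.
sample = ["AAAA", "AAAA", "AAAT", "AATT"]
complexity = normalized_shannon_entropy(sample)
diversity = mean_pairwise_distance(sample)
```

Computing the same metrics on CBS clones and on TGS reads from one sample is one way to check the consistency the abstract reports; deeper TGS sampling also makes low-frequency variants visible in the counts.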


July 19, 2019

Genome and methylome variation in Helicobacter pylori with a cag pathogenicity island during early stages of human infection.

Helicobacter pylori is remarkable for its genetic variation. Yet little is known about its genetic changes during early stages of human infection, as the bacteria adapt to their new environment. We analyzed genome and methylome variations in a fully virulent strain of H pylori during experimental infection. We performed a randomized Phase 1 and 2, observer-blind, placebo-controlled study of 12 healthy, H pylori-negative adults in Germany from October 2008 through March 2010. The volunteers were given a prophylactic vaccine candidate (n=7) or placebo (n=5) and then challenged with H pylori strain BCM-300. Biopsy samples were collected and H pylori were isolated. Genomes of the challenge strain and 12 re-isolates, obtained 12 weeks after (or in 1 case, 62 weeks after) infection, were sequenced by single-molecule, real-time technology, which, in parallel, permitted determination of genome-wide methylation patterns for all strains. Functional effects of genetic changes observed in H pylori strains during human infection were assessed by measuring release of interleukin 8 from AGS cells (to detect cag PAI function), neutral red uptake (to detect vacuolating cytotoxin activity), and adhesion assays. The observed mutation rate was in agreement with rates previously determined from patients with chronic H pylori infections, without evidence of a mutation burst. A loss of cag PAI function was observed in 3 re-isolates. In addition, 3 re-isolates from the vaccine group acquired mutations in the vacuolating cytotoxin gene vacA, resulting in loss of vacuolization activity from gastric epithelial cells. We observed inter-strain variation in methylomes due to phase variation in genes encoding methyltransferases. We analyzed adaptation of a fully virulent strain of H pylori to 12 different volunteers to obtain a robust estimate of the frequency of genetic and epigenetic changes in the absence of inter-strain recombination.
Our findings indicate that the large amount of genetic variation in H pylori poses a challenge to vaccine development. ClinicalTrials.gov no: NCT00736476. Copyright © 2017 AGA Institute. Published by Elsevier Inc. All rights reserved.


July 19, 2019

Genomic analysis of hospital plumbing reveals diverse reservoir of bacterial plasmids conferring carbapenem resistance.

The hospital environment is a potential reservoir of bacteria with plasmids conferring carbapenem resistance. Our Hospital Epidemiology Service routinely performs extensive sampling of high-touch surfaces, sinks, and other locations in the hospital. Over a 2-year period, additional sampling was conducted at a broader range of locations, including housekeeping closets, wastewater from hospital internal pipes, and external manholes. We compared these data with previously collected information from 5 years of patient clinical and surveillance isolates. Whole-genome sequencing and analysis of 108 isolates provided comprehensive characterization of blaKPC/blaNDM-positive isolates, enabling an in-depth genetic comparison. Strikingly, despite a very low prevalence of patient infections with blaKPC-positive organisms, all samples from the intensive care unit pipe wastewater and external manholes contained carbapenemase-producing organisms (CPOs), suggesting a vast, resilient reservoir. We observed a diverse set of species and plasmids, and we noted species and susceptibility profile differences between environmental and patient populations of CPOs. However, there were plasmid backbones common to both populations, highlighting a potential environmental reservoir of mobile elements that may contribute to the spread of resistance genes. Clear associations between patient and environmental isolates were uncommon based on sequence analysis and epidemiology, suggesting reasonable infection control compliance at our institution. Nonetheless, a probable nosocomial transmission of Leclercia sp. from the housekeeping environment to a patient was detected by this extensive surveillance.
These data and analyses further our understanding of CPOs in the hospital environment and are broadly relevant to the design of infection control strategies in many infrastructure settings. IMPORTANCE: Carbapenemase-producing organisms (CPOs) are a global concern because of the morbidity and mortality associated with these resistant Gram-negative bacteria. Horizontal plasmid transfer spreads the resistance mechanism to new bacteria, and understanding the plasmid ecology of the hospital environment can assist in the design of control strategies to prevent nosocomial infections. A 5-year genomic and epidemiological survey was undertaken to study the CPOs in the patient-accessible environment, as well as in the plumbing system removed from the patient. This comprehensive survey revealed a vast, unappreciated reservoir of CPOs in wastewater, which was in contrast to the low positivity rate in both the patient population and the patient-accessible environment. While there were few patient-environmental isolate associations, there were plasmid backbones common to both populations. These results are relevant to all hospitals for which CPO colonization may not yet be defined through extensive surveillance.


July 19, 2019

A high-throughput approach for identification of nontuberculous mycobacteria in drinking water reveals relationship between water age and Mycobacterium avium.

Nontuberculous mycobacteria (NTM) frequently detected in drinking water (DW) include species associated with human infections, as well as species rarely linked to disease. Methods for improved recovery of NTM DNA and high-throughput identification of NTM are needed for risk assessment of NTM infection through DW exposure. In this study, different methods of recovering bacterial DNA from DW were compared, revealing that a phenol-chloroform DNA extraction method yielded two to four times as much total DNA and eight times as much NTM DNA as two commercial DNA extraction kits. This method, combined with high-throughput, single-molecule real-time sequencing of NTM rpoB genes, allowed the identification of NTM to the species, subspecies, and (in some cases) strain levels. This approach was applied to DW samples collected from 15 households serviced by a chloraminated distribution system, with homes located in areas representing short (<24 h) and long (>24 h) distribution system residence times. Multivariate statistical analysis revealed that greater water age (i.e., combined distribution system residence time and home plumbing stagnation time) was associated with a greater relative abundance of Mycobacterium avium subsp. avium, one of the most prevalent NTM causing infections in humans.
DW from homes closer to the treatment plant (with a shorter water age) contained more diverse NTM species, including Mycobacterium abscessus and Mycobacterium chelonae. Overall, our approach allows NTM identification to the species and subspecies levels and can be used in future studies to assess the risk of waterborne infection by providing insight into the similarity between environmental and infection-associated NTM. IMPORTANCE: An extraction method for improved recovery of DNA from nontuberculous mycobacteria (NTM), combined with single-molecule real-time sequencing (PacBio) of NTM rpoB genes, was used for high-throughput characterization of NTM species and in some cases strains in drinking water (DW). The extraction procedure recovered, on average, eight times as much NTM DNA and three times as much total DNA from DW as two widely used commercial DNA extraction kits. The combined DNA extraction and sequencing approach allowed high-throughput screening of DW samples to identify NTM, revealing that the relative abundance of Mycobacterium avium subsp. avium increased with water age. Furthermore, the two-step barcoding approach developed as part of the PacBio sequencing method makes this procedure highly adaptable, allowing it to be used for other target genes and species. Copyright © 2018 Haig et al.


July 19, 2019

Phasevarions of bacterial pathogens: Methylomics sheds new light on old enemies.

A wide variety of bacterial pathogens express phase-variable DNA methyltransferases that control expression of multiple genes via epigenetic mechanisms. These randomly switching regulons – phasevarions – regulate genes involved in pathogenesis, host adaptation, and antibiotic resistance. Individual phase-variable genes can be identified in silico as they contain easily recognized features such as simple sequence repeats (SSRs) or inverted repeats (IRs) that mediate the random switching of expression. Conversely, phasevarion-controlled genes do not contain any easily identifiable features. The study of DNA methyltransferase specificity using Single-Molecule, Real-Time (SMRT) sequencing and methylome analysis has rapidly advanced the analysis of phasevarions by allowing methylomics to be combined with whole-transcriptome/proteome analysis to comprehensively characterize these systems in a number of important bacterial pathogens. Copyright © 2018 Elsevier Ltd. All rights reserved.


July 19, 2019

Survey on the use of whole-genome sequencing for infectious diseases surveillance: Rapid expansion of European national capacities, 2015-2016.

Whole-genome sequencing (WGS) has become an essential tool for public health surveillance and molecular epidemiology of infectious diseases and antimicrobial drug resistance. It provides precise geographical delineation of spread and enables incidence monitoring of pathogens at genotype level. Coupled with epidemiological and environmental investigations, it delivers ultimate resolution for tracing sources of epidemic infections. To ascertain the level of implementation of WGS-based typing for national public health surveillance and investigation of prioritized diseases in the European Union (EU)/European Economic Area (EEA), two surveys were conducted in 2015 and 2016. The surveys were designed to determine the national public health reference laboratories’ access to WGS and operational WGS-based typing capacity for national surveillance of selected foodborne pathogens, antimicrobial-resistant pathogens, and vaccine-preventable diseases identified as priorities for European genomic surveillance. Twenty-eight and twenty-nine out of the 30 EU/EEA countries participated in the survey in 2015 and 2016, respectively. National public health reference laboratories in 22 and 25 countries had access to WGS-based typing for public health applications in 2015 and 2016, respectively. Reported reasons for limited or no access were lack of funding, staff, and expertise. Illumina technology was the most frequently used followed by Ion Torrent technology. The access to bioinformatics expertise and competence for routine WGS data analysis was limited. By mid-2016, half of the EU/EEA countries were using WGS analysis either as first- or second-line typing method for surveillance of the pathogens and antibiotic resistance issues identified as EU priorities. The sampling frame as well as bioinformatics analysis varied by pathogen/resistance issue and country. 
Core genome multilocus allelic profiling, also called cgMLST, was the most frequently used annotation approach for typing bacterial genomes suggesting potential bioinformatics pipeline compatibility. Further capacity development for WGS-based typing is ongoing in many countries and upon consolidation and harmonization of methods should enable pan-EU data exchange for genomic surveillance in the medium-term subject to the development of suitable data management systems and appropriate agreements for data sharing.


July 19, 2019

Genomic repeats, misassembly and reannotation: a case study with long-read resequencing of Porphyromonas gingivalis reference strains.

Without knowledge of their genomic sequences, it is impossible to make functional models of the bacteria that make up human and animal microbiota. Unfortunately, the vast majority of publicly available genomes are only working drafts, an incompleteness that causes numerous problems and constitutes a major obstacle to genotypic and phenotypic interpretation. In this work, we began with an example from the class Bacteroidia in the phylum Bacteroidetes, which is preponderant among human orodigestive microbiota. We successfully identify the genetic loci responsible for assembly breaks and misassemblies and demonstrate the importance and usefulness of long-read sequencing and curated reannotation. We showed that the fragmentation in Bacteroidia draft genomes assembled from massively parallel sequencing linearly correlates with genomic repeats of the same or greater size than the reads. We also demonstrated that some of these repeats, especially the long ones, correspond to misassembled loci in three reference Porphyromonas gingivalis genomes marked as circularized (thus complete or finished). We prove that even at modest coverage (30X), long-read resequencing together with PCR contiguity verification (rrn operons and an integrative and conjugative element or ICE) can be used to identify and correct the wrongly combined or assembled regions. Finally, although time-consuming and labor-intensive, consistent manual biocuration of three P. gingivalis strains allowed us to compare and correct the existing genomic annotations, resulting in a more accurate interpretation of the genomic differences among these strains. In this study, we demonstrate the usefulness and importance of long-read sequencing in verifying published genomes (even when complete) and generating assemblies for new bacterial strains/species with high genomic plasticity.
We also show that when combined with biological validation processes and diligent biocurated annotation, this strategy helps reduce the propagation of errors in shared databases, thus limiting false conclusions based on incomplete or misleading information.


July 19, 2019

The complete and fully assembled genome sequence of Aeromonas salmonicida subsp. pectinolytica and its comparative analysis with other Aeromonas species: investigation of the mobilome in environmental and pathogenic strains.

Due to the predominant usage of short-read sequencing to date, most bacterial genome sequences reported in recent years remain at the draft level. This precludes certain types of analyses, such as the in-depth analysis of genome plasticity. Here we report the finalized genome sequence of the environmental strain Aeromonas salmonicida subsp. pectinolytica 34mel, for which only a draft genome with 253 contigs is currently available. Successful completion of the transposon-rich genome critically depended on the PacBio long read sequencing technology. Using finalized genome sequences of A. salmonicida subsp. pectinolytica and other Aeromonads, we report the detailed analysis of the transposon composition of these bacterial species. Mobilome evolution is exemplified by a complex transposon, which has shifted from pathogenicity-related to environmental-related gene content in A. salmonicida subsp. pectinolytica 34mel. Obtaining the complete, circular genome of A. salmonicida subsp. pectinolytica allowed us to perform an in-depth analysis of its mobilome. We demonstrate the mobilome-dependent evolution of this strain’s genetic profile from pathogenic to environmental.


July 19, 2019

Full-Length Envelope Analyzer (FLEA): A tool for longitudinal analysis of viral amplicons.

Next generation sequencing of viral populations has advanced our understanding of viral population dynamics, the development of drug resistance, and escape from host immune responses. Many applications require complete gene sequences, which can be impossible to reconstruct from short reads. HIV env, the protein of interest for HIV vaccine studies, is exceptionally challenging for long-read sequencing and analysis due to its length, high substitution rate, and extensive indel variation. While long-read sequencing is attractive in this setting, the analysis of such data is not well handled by existing methods. To address this, we introduce FLEA (Full-Length Envelope Analyzer), which performs end-to-end analysis and visualization of long-read sequencing data. FLEA consists of both a pipeline (optionally run on a high-performance cluster), and a client-side web application that provides interactive results. The pipeline transforms FASTQ reads into high-quality consensus sequences (HQCSs) and uses them to build a codon-aware multiple sequence alignment. The resulting alignment is then used to infer phylogenies, selection pressure, and evolutionary dynamics. The web application provides publication-quality plots and interactive visualizations, including an annotated viral alignment browser, time series plots of evolutionary dynamics, visualizations of gene-wide selective pressures (such as dN/dS) across time and across protein structure, and a phylogenetic tree browser. We demonstrate how FLEA may be used to process Pacific Biosciences HIV env data and describe recent examples of its use. Simulations show how FLEA dramatically reduces the error rate of this sequencing platform, providing an accurate portrait of complex and variable HIV env populations. A public instance of FLEA is hosted at http://flea.datamonkey.org. The Python source code for the FLEA pipeline can be found at https://github.com/veg/flea-pipeline. 
The client-side application is available at https://github.com/veg/flea-web-app. A live demo of the P018 results can be found at http://flea.murrell.group/view/P018.


July 19, 2019

Expanding an expanded genome: long-read sequencing of Trypanosoma cruzi.

Although the genome of Trypanosoma cruzi, the causative agent of Chagas disease, was first made available in 2005, with additional strains reported later, the intrinsic genome complexity of this parasite (the abundance of repetitive sequences and genes organized in tandem) has traditionally hindered high-quality genome assembly and annotation. This also limits diverse types of analyses that require high degrees of precision. Long reads generated by third-generation sequencing technologies are particularly suitable to address the challenges associated with T. cruzi’s genome since they permit direct determination of the full sequence of large clusters of repetitive sequences without collapsing them. This, in turn, not only allows accurate estimation of gene copy numbers but also circumvents assembly fragmentation. Here, we present the analysis of the genome sequences of two T. cruzi clones: the hybrid TCC (TcVI) and the non-hybrid Dm28c (TcI), determined by PacBio Single Molecule Real-Time (SMRT) technology. The improved assemblies obtained here permitted us to accurately estimate gene copy numbers and the abundance and distribution of repetitive sequences (including satellites and retroelements). We found that the genome of T. cruzi is composed of a ‘core compartment’ and a ‘disruptive compartment’ which exhibit opposite GC content and gene composition. Novel tandem and dispersed repetitive sequences were identified, including some located inside coding sequences. Additionally, homologous chromosomes were separately assembled, allowing us to retrieve haplotypes as separate contigs instead of a unique mosaic sequence. Finally, manual annotation of surface multigene families, mucins and trans-sialidases now allows a better overview of these complex groups of genes.


July 19, 2019

Ultradeep single-molecule real-time sequencing of HIV envelope reveals complete compartmentalization of highly macrophage-tropic R5 proviral variants in brain and CXCR4-using variants in immune and peripheral tissues.

Despite combined antiretroviral therapy (cART), HIV+ patients still develop neurological disorders, which may be due to persistent HIV infection and selective evolution in brain tissues. Single-molecule real-time (SMRT) sequencing technology offers an improved opportunity to study the relationship among HIV isolates in the brain and lymphoid tissues because it is capable of generating thousands of long sequence reads in a single run. Here, we used SMRT sequencing to generate ~50,000 high-quality full-length HIV envelope sequences (>2200 bp) from seven autopsy tissues from an HIV+/cART+ subject, including three brain and four non-brain sites. Sanger sequencing was used for comparison with SMRT data and to clone functional pseudoviruses for in vitro tropism assays. Phylogenetic analysis demonstrated that brain-derived HIV was compartmentalized from HIV outside the brain and that the variants from each of the three brain tissues grouped independently. Variants from all peripheral tissues were intermixed on the tree but independent of the brain clades. Due to the large number of sequences, a clustering analysis at three similarity thresholds (99, 99.5, and 99.9%) was also performed. All brain sequences clustered exclusive of any non-brain sequences at all thresholds; however, frontal lobe sequences clustered independently of occipital and parietal lobes. Translated sequences revealed potentially functional differences between brain and non-brain sequences in the location of putative N-linked glycosylation sites (N-sites), V1 length, V3 charge, and the number of V4 N-sites. All brain sequences were predicted to use the CCR5 co-receptor, while most non-brain sequences were predicted to use CXCR4 co-receptor. Tropism results were confirmed by in vitro infection assays. The study is the first to use a SMRT sequencing approach to study HIV compartmentalization in tissues and supports other reports of limited trafficking between brain and non-brain sequences during cART.
Due to the long sequence length, we could observe changes along the entire envelope gene, likely caused by differential selective pressure in the brain that may contribute to neurological disease.
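The clustering step described above (grouping sequences at 99, 99.5, and 99.9% similarity, then checking whether brain and non-brain sequences ever share a cluster) can be illustrated with a small sketch. The abstract does not name the clustering tool or algorithm, so the greedy centroid approach below is an assumption chosen for brevity; it also assumes pre-aligned sequences of equal length, and all names and toy data are hypothetical.

```python
def identity(a: str, b: str) -> float:
    """Fraction of identical positions between two aligned sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(x == y for x, y in zip(a, b)) / len(a)


def greedy_cluster(seqs: dict, threshold: float) -> list:
    """Assign each sequence to the first cluster whose centroid it matches.

    A sequence that matches no existing centroid at >= threshold starts a
    new cluster and becomes that cluster's centroid.
    """
    clusters = []  # list of (centroid_sequence, [member names])
    for name, seq in seqs.items():
        for centroid, members in clusters:
            if identity(seq, centroid) >= threshold:
                members.append(name)
                break
        else:
            clusters.append((seq, [name]))
    return [members for _, members in clusters]


def mixed_clusters(clusters: list, compartment_of: dict) -> list:
    """Return clusters that contain sequences from more than one compartment."""
    return [c for c in clusters if len({compartment_of[n] for n in c}) > 1]


# Hypothetical toy data: 10-base "envelope" sequences from two compartments.
toy_seqs = {
    "brain1": "AAAAAAAAAA",
    "brain2": "AAAAAAAAAT",
    "spleen1": "CCCCCCCCCC",
    "spleen2": "CCCCCCCCCA",
}
compartment = {
    "brain1": "brain",
    "brain2": "brain",
    "spleen1": "non-brain",
    "spleen2": "non-brain",
}
clusters_90 = greedy_cluster(toy_seqs, 0.9)
clusters_95 = greedy_cluster(toy_seqs, 0.95)
```

With full-length reads, the same check would be repeated at thresholds of 0.99, 0.995, and 0.999; an empty result from `mixed_clusters` at every threshold corresponds to the complete compartmentalization reported in the abstract.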

