With Single Molecule, Real-Time (SMRT) Sequencing and the Sequel Systems, you can affordably assemble reference-quality microbial genomes that are >99.999% (Q50) accurate.
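The pairing of ">99.999%" with "Q50" follows from the standard Phred scale, where Q = -10 · log10(error probability). The sketch below illustrates that conversion; the function names and the 5 Mb genome size in the usage note are illustrative assumptions, not figures from the source.

```python
import math


def phred_to_accuracy(q: float) -> float:
    """Convert a Phred quality value to per-base accuracy (0..1)."""
    return 1.0 - 10 ** (-q / 10.0)


def accuracy_to_phred(accuracy: float) -> float:
    """Convert per-base accuracy (0..1, exclusive of 1) to a Phred quality value."""
    if not 0.0 <= accuracy < 1.0:
        raise ValueError("accuracy must be in [0, 1)")
    return -10.0 * math.log10(1.0 - accuracy)


def expected_errors(genome_length: int, q: float) -> float:
    """Expected number of erroneous bases in a consensus of the given length."""
    return genome_length * 10 ** (-q / 10.0)
```

For example, `phred_to_accuracy(50)` corresponds to 99.999% accuracy, and under the assumption of a hypothetical 5 Mb microbial genome, `expected_errors(5_000_000, 50)` gives roughly 50 expected erroneous bases.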
Complete HIV-1 genomes from single molecules: Diversity estimates in two linked transmission pairs using clustering and mutual information.
We sequenced complete HIV-1 genomes from single molecules using Single Molecule, Real-Time (SMRT) Sequencing and derived de novo full-length genome sequences. SMRT sequencing yields long-read sequencing results from individual DNA molecules with a rapid time-to-result. These attributes make it a useful tool for continuous monitoring of viral populations. The single-molecule nature of the sequencing method allows us to estimate variant subspecies and relative abundances by counting methods. We detail mathematical techniques used in viral variant subspecies identification including clustering distance metrics and mutual information. Sequencing was performed in order to better understand the relationships between the specific sequences of transmitted viruses in linked transmission pairs. Samples representing HIV transmission pairs were selected from the Zambia Emory HIV Research Project (Lusaka, Zambia) and sequenced. We examine Single Genome Amplification (SGA) prepped samples and samples containing complex mixtures of genomes. Whole genome consensus estimates for each of the samples were made. Genome reads were clustered using a simple distance metric on aligned reads. Appropriate thresholds were chosen to yield distinct clusters of HIV genomes within samples. Mutual information between columns in the genome alignments was used to measure dependence. In silico mixtures of reads from the SGA samples were made to simulate samples containing exactly controlled complex mixtures of genomes and our clustering methods were applied to these complex mixtures. SMRT Sequencing data contained multiple full-length (greater than 9 kb) continuous reads for each sample. Simple whole genome consensus estimates easily identified transmission pairs. The clustering of the genome reads showed diversity differences between the samples, allowing us to characterize the diversity of the individual quasi-species comprising the patient viral populations across the full genome.
Mutual information identified possible dependencies of different positions across the full HIV-1 genome. The SGA consensus genomes agreed with prior Sanger sequencing. Our clustering methods correctly segregated reads to their correct originating genome for the synthetic SGA mixtures. The results open up the potential for reference-agnostic and cost effective full genome sequencing of HIV-1.
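The abstract above mentions clustering aligned reads with "a simple distance metric" and measuring dependence between alignment columns with mutual information, without giving details. The following is a minimal sketch of one way this could look, assuming Hamming distance on equal-length aligned reads, single-linkage grouping at a fixed threshold, and mutual information in bits from empirical column frequencies. The metric, linkage rule, and all function names are assumptions made for illustration, not the authors' stated method.

```python
import math
from collections import Counter


def hamming_distance(a: str, b: str) -> int:
    """Number of mismatching positions between two aligned reads."""
    if len(a) != len(b):
        raise ValueError("aligned reads must have equal length")
    return sum(x != y for x, y in zip(a, b))


def cluster_reads(reads: list[str], threshold: int) -> list[list[int]]:
    """Single-linkage clustering: reads within `threshold` share a cluster.

    Returns clusters as sorted lists of read indices.
    """
    parent = list(range(len(reads)))

    def find(i: int) -> int:
        # Union-find root lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(len(reads)):
        for j in range(i + 1, len(reads)):
            if hamming_distance(reads[i], reads[j]) <= threshold:
                parent[find(i)] = find(j)

    clusters: dict[int, list[int]] = {}
    for i in range(len(reads)):
        clusters.setdefault(find(i), []).append(i)
    return sorted(clusters.values())


def mutual_information(col_x: list[str], col_y: list[str]) -> float:
    """Mutual information (bits) between two alignment columns."""
    if len(col_x) != len(col_y) or not col_x:
        raise ValueError("columns must be non-empty and of equal length")
    n = len(col_x)
    count_x, count_y = Counter(col_x), Counter(col_y)
    count_xy = Counter(zip(col_x, col_y))
    mi = 0.0
    for (x, y), c in count_xy.items():
        p_xy = c / n
        mi += p_xy * math.log2(p_xy / ((count_x[x] / n) * (count_y[y] / n)))
    return mi


def column(reads: list[str], index: int) -> list[str]:
    """Extract one alignment column from a list of aligned reads."""
    return [read[index] for read in reads]
```

With the toy alignment `["ACGT", "ACGT", "TCGA", "TCGA"]` and a threshold of 1, the reads separate into two clusters, columns 0 and 3 covary perfectly, and the invariant column 1 carries no information about column 0. Real reads would additionally need handling for gaps and sequencing errors, which this sketch omits.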
In this AGBT 2017 poster, the University of Helsinki’s Petri Auvinen reports on efforts to understand bacteria that grow on, and subsequently spoil, food. This analysis monitored DNA modifications and…
Complete genome screening of clinical MRSA isolates identifies lineage diversity and provides full resolution of transmission and outbreak events
Whole-genome sequencing (WGS) of Staphylococcus aureus is increasingly used as part of infection prevention practices, but most applications are focused on conserved core genomic regions due to limitations of short-read technologies. In this study we established a long-read technology-based WGS screening program of all first-episode MRSA blood infections at a major urban hospital. A survey of 132 MRSA genomes assembled from long reads revealed widespread gain/loss of accessory mobile genetic elements among established hospital- and community-associated lineages impacting >10% of each genome, and frequent megabase-scale inversions between endogenous prophages. We also characterized an outbreak of a CC5/ST105/USA100 clone among 3 adults and 18 infants in a neonatal intensive care unit (NICU) lasting 7 months. The pattern of changes among complete outbreak genomes provided full spatiotemporal resolution of its origins and progression, which was characterized by multiple sub-transmissions and likely precipitated by equipment sharing. Compared to other hospital strains, the outbreak strain carried distinct mutations and accessory genetic elements that impacted genes with roles in metabolism, resistance and persistence. This included a DNA-recognition domain recombination in the hsdS gene of a Type-I restriction-modification system that altered DNA methylation. RNA-Seq profiling showed that the (epi)genetic changes in the outbreak clone attenuated agr gene expression and upregulated genes involved in stress response and biofilm formation. Overall our findings demonstrate that long-read sequencing substantially improves our ability to characterize accessory genomic elements that impact MRSA virulence and persistence, and provides valuable information for infection control efforts.
Cryptococcus neoformans (C. neoformans var. grubii) is an environmentally acquired pathogen causing 181,000 HIV-associated deaths each year. We sequenced 699 isolates, primarily C. neoformans from HIV-infected patients, from 5 countries in Asia and Africa. The phylogeny of C. neoformans reveals a recent exponential population expansion, consistent with the increase in the number of susceptible hosts. In our study population, this expansion has been driven by three sub-clades of the C. neoformans VNIa lineage; VNIa-4, VNIa-5 and VNIa-93. These three sub-clades account for 91% of clinical isolates sequenced in our study. Combining the genome data with clinical information, we find that the VNIa-93 sub-clade, the most common sub-clade in Uganda and Malawi, was associated with better outcomes than VNIa-4 and VNIa-5, which predominate in Southeast Asia. This study lays the foundation for further work investigating the dominance of VNIa-4, VNIa-5 and VNIa-93 and the association between lineage and clinical phenotype.
PacBio RS II is the first commercialized third-generation DNA sequencer able to sequence single DNA molecules in real time without amplification. PacBio RS II’s sequencing technology is novel and unique, enabling the direct observation of DNA synthesis by DNA polymerase. PacBio RS II confers four major advantages compared to other sequencing technologies: long read lengths, high consensus accuracy, a low degree of bias, and simultaneous capability of epigenetic characterization. These advantages surmount the obstacle of sequencing genomic regions such as high/low G+C, tandem repeat, and interspersed repeat regions. Moreover, PacBio RS II is ideal for whole genome sequencing, targeted sequencing, complex population analysis, RNA sequencing, and epigenetics characterization. With PacBio RS II, we have sequenced and analyzed the genomes of many species, from viruses to humans. Herein, we summarize and review some of our key genome sequencing projects, including full-length viral sequencing, complete bacterial genome and almost-complete plant genome assemblies, and long amplicon sequencing of a disease-associated gene region. We believe that PacBio RS II is an effective tool not only in the basic biological sciences but also in the medical/clinical setting.
Application of manure from antibiotic-treated animals to crops facilitates the dissemination of antibiotic resistance determinants into the environment. However, our knowledge of the identity, diversity, and patterns of distribution of these antibiotic resistance determinants remains limited. We used a new combination of methods to examine the resistome of dairy cow manure, a common soil amendment. Metagenomic libraries constructed with DNA extracted from manure were screened for resistance to beta-lactams, phenicols, aminoglycosides, and tetracyclines. Functional screening of fosmid and small-insert libraries identified 80 different antibiotic resistance genes whose deduced protein sequences were on average 50 to 60% identical to sequences deposited in GenBank. The resistance genes were frequently found in clusters and originated from a taxonomically diverse set of species, suggesting that some microorganisms in manure harbor multiple resistance genes. Furthermore, amid the great genetic diversity in manure, we discovered a novel clade of chloramphenicol acetyltransferases. Our study combined functional metagenomics with third-generation PacBio sequencing to significantly extend the roster of functional antibiotic resistance genes found in animal gut bacteria, providing a particularly broad resource for understanding the origins and dispersal of antibiotic resistance genes in agriculture and clinical settings. IMPORTANCE The increasing prevalence of antibiotic resistance among bacteria is one of the most intractable challenges in 21st-century public health. The origins of resistance are complex, and a better understanding of the impacts of antibiotics used on farms would produce a more robust platform for public policy. Microbiomes of farm animals are reservoirs of antibiotic resistance genes, which may affect distribution of antibiotic resistance genes in human pathogens. 
Previous studies have focused on antibiotic resistance genes in manures of animals subjected to intensive antibiotic use, such as pigs and chickens. Cow manure has received less attention, although it is commonly used in crop production. Here, we report the discovery of novel and diverse antibiotic resistance genes in the cow microbiome, demonstrating that it is a significant reservoir of antibiotic resistance genes. The genomic resource presented here lays the groundwork for understanding the dispersal of antibiotic resistance from the agroecosystem to other settings.
Horizontal gene transfer has played a role in developing the global public health crisis of antimicrobial resistance (AMR). However, the dynamics of AMR transfer through bacterial populations and its direct impact on human disease is poorly elucidated. Here, we study parallel epidemic emergences of multiple Shigella species, a priority AMR organism, in men who have sex with men to gain insight into AMR emergence and spread. Using genomic epidemiology, we show that repeated horizontal transfer of a single AMR plasmid among Shigella enhanced existing and facilitated new epidemics. These epidemic patterns contrasted with slighter, slower increases in disease caused by organisms with vertically inherited (chromosomally encoded) AMR. This demonstrates that horizontal transfer of AMR directly affects epidemiological outcomes of globally important AMR pathogens and highlights the need for integration of genomic analyses into all areas of AMR research, surveillance and management.
Computational comparison of availability in CTL/gag epitopes among patients with acute and chronic HIV-1 infection.
Recent studies indicate that there is selection bias for transmission of viral polymorphisms associated with higher viral fitness. Furthermore, after transmission and before a specific immune response is mounted in the recipient, the virus undergoes a number of reversions which allow an increase in its replicative capacity. These aspects, and others, affect the viral population characteristic of early acute infection. 160 single gag-gene amplifications were obtained by limiting-dilution RT-PCR from plasma samples of 8 ARV-naïve patients with early acute infection (<30 days, 22 days average) and 8 ARV-naïve patients with approximately a year of infection (10 amplicons per patient). Sanger sequencing and NGS SMRT technology (Pacific Biosciences) were implemented to sequence the amplicons. Phylogenetic analysis was performed by using MEGA 6.06. HLA-I (A and B) typing was performed by SSOP-PCR method. The chromatograms were analyzed with Sequencher 4.10. Epitope and immunoproteasomal cleavage prediction was performed with the CBS prediction server for the 30 HLA-A and -B alleles most prevalent in our population with peptide lengths from 8 to 14 mer. Cytotoxic response prediction was performed by using IEDB Analysis Resource. After implementing epitope prediction analysis, we identified a total of 325 possible viral epitopes present in two or more acute or chronic patients. 60.3% (n = 196) of them were present only in acute infection (prevalent acute epitopes) while 39.7% (n = 129) were present only in chronic infection (prevalent chronic epitopes). Within p24, the difference was equally dramatic with 59.4% (79/133) being acute epitopes (p?0.05). This is consistent with progressive viral adaptation to immune response over time and further supported by the fact that cytotoxic response prediction showed that acute epitopes are more likely to generate immune response than chronic epitopes.
Interestingly, only 27.5% of acute epitopes match the population-level consensus sequence of the virus. Our results indicate that certain non-consensus viral residues might be transmitted more frequently than consensus residues when located in immunologically relevant positions (epitopes). This observation might be relevant to the rationale behind development of an effective vaccine to reduce viral reservoir and induce functional cure of HIV infection based on prevalent acute epitopes. Copyright © 2018 Elsevier Ltd. All rights reserved.
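The counts and percentages quoted in the abstract above can be cross-checked with a few lines of arithmetic. This is only a consistency sketch using the numbers as printed; the helper name `percent` is an illustrative assumption.

```python
def percent(part: int, whole: int) -> float:
    """Share of `part` in `whole`, as a percentage rounded to one decimal."""
    return round(100.0 * part / whole, 1)


# Figures as reported in the abstract.
acute_only, chronic_only, total_epitopes = 196, 129, 325
p24_acute, p24_total = 79, 133
patients, amplicons_per_patient = 8 + 8, 10

# The two epitope groups should add up to the reported total.
groups_sum_to_total = acute_only + chronic_only == total_epitopes

acute_share = percent(acute_only, total_epitopes)      # reported as 60.3%
chronic_share = percent(chronic_only, total_epitopes)  # reported as 39.7%
p24_acute_share = percent(p24_acute, p24_total)        # reported as 59.4%
total_amplicons = patients * amplicons_per_patient     # reported as 160
```

All four derived values agree with the figures stated in the abstract, so the reported counts are internally consistent.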
Transmission of methicillin-resistant Staphylococcus aureus via deceased donor liver transplantation confirmed by whole genome sequencing.
Donor-derived bacterial infection is a recognized complication of solid organ transplantation (SOT). The present report describes the clinical details and successful outcome in a liver transplant recipient despite transmission of methicillin-resistant Staphylococcus aureus (MRSA) from a deceased donor with MRSA endocarditis and bacteremia. We further describe whole genome sequencing (WGS) and complete de novo assembly of the donor and recipient MRSA isolate genomes, which confirm that both isolates are genetically 100% identical. We propose that similar application of WGS techniques to future investigations of donor bacterial transmission would strengthen the definition of proven bacterial transmission in SOT, particularly in the presence of highly clonal bacteria such as MRSA. WGS will further improve our understanding of the epidemiology of bacterial transmission in SOT and the risk of adverse patient outcomes when it occurs. © Copyright 2014 The American Society of Transplantation and the American Society of Transplant Surgeons.
Vertical transmission of highly similar bla CTX-M-1-harboring IncI1 plasmids in Escherichia coli with different MLST types in the poultry production pyramid.
The purpose of this study was to characterize sets of extended-spectrum β-lactamase (ESBL)-producing Enterobacteriaceae collected longitudinally from different flocks of broiler breeders, meconium of 1-day-old broilers from these breeder flocks, as well as from these broiler flocks before slaughter. Five sets of ESBL-producing Escherichia coli were studied by multi-locus sequence typing (MLST), phylogenetic grouping, PCR-based replicon typing and resistance profiling. The bla CTX-M-1-harboring plasmids of one set (pHV295.1, pHV114.1, and pHV292.1) were fully sequenced and subjected to comparative analysis. Eleven different MLST sequence types (ST) were identified with ST1056 the predominant one, isolated in all five sets either on the broiler breeder or meconium level. Plasmid sequencing revealed that bla CTX-M-1 was carried by highly similar IncI1/ST3 plasmids that were 105 076 bp, 110 997 bp, and 117 269 bp in size, respectively. The fact that genetically similar IncI1/ST3 plasmids were found in ESBL-producing E. coli of different MLST types isolated at the different levels in the broiler production pyramid provides strong evidence for a vertical transmission of these plasmids from a common source (nucleus poultry flocks).
Sequencing plasmids can reveal the transmission of resistance among bacteria from patients in a clinical setting.
Parallel epidemics of community-associated methicillin-resistant Staphylococcus aureus USA300 infection in North and South America.
The community-associated methicillin-resistant Staphylococcus aureus (CA-MRSA) epidemic in the United States is attributed to the spread of the USA300 clone. An epidemic of CA-MRSA closely related to USA300 has occurred in northern South America (USA300 Latin-American variant, USA300-LV). Using phylogenomic analysis, we aimed to understand the relationships between these 2 epidemics. We sequenced the genomes of 51 MRSA clinical isolates collected between 1999 and 2012 from the United States, Colombia, Venezuela, and Ecuador. Phylogenetic analysis was used to infer the relationships and times since the divergence of the major clades. Phylogenetic analyses revealed 2 dominant clades that segregated by geographical region, had a putative common ancestor in 1975, and originated in 1989 in North America and in 1985 in South America. Emergence of these parallel epidemics coincides with the independent acquisition of the arginine catabolic mobile element (ACME) in North American isolates and a novel copper and mercury resistance (COMER) mobile element in South American isolates. Our results reveal the existence of 2 parallel USA300 epidemics that shared a recent common ancestor. The simultaneous rapid dissemination of these 2 epidemic clades suggests the presence of shared, potentially convergent adaptations that enhance fitness and ability to spread. © The Author 2015. Published by Oxford University Press on behalf of the Infectious Diseases Society of America. All rights reserved.
Heterosexual transmission of subtype C HIV-1 selects consensus-like variants without increased replicative capacity or interferon-α resistance.
Heterosexual transmission of HIV-1 is characterized by a genetic bottleneck that selects a single viral variant, the transmitted/founder (TF), during most transmission events. To assess viral characteristics influencing HIV-1 transmission, we sequenced 167 near full-length viral genomes and generated 40 infectious molecular clones (IMC) including TF variants and multiple non-transmitted (NT) HIV-1 subtype C variants from six linked heterosexual transmission pairs near the time of transmission. Consensus-like genomes sensitive to donor antibodies were selected for during transmission in these six transmission pairs. However, TF variants did not demonstrate increased viral fitness in terms of particle infectivity or viral replicative capacity in activated peripheral blood mononuclear cells (PBMC) and monocyte-derived dendritic cells (MDDC). In addition, resistance of the TF variant to the antiviral effects of interferon-α (IFN-α) was not significantly different from that of non-transmitted variants from the same transmission pair. Thus neither in vitro viral replicative capacity nor IFN-α resistance discriminated the transmission potential of viruses in the quasispecies of these chronically infected individuals. However, our findings support the hypothesis that within-host evolution of HIV-1 in response to adaptive immune responses reduces viral transmission potential.
Influenza A virus is characterized by high genetic diversity. However, most of what is known about influenza evolution has come from consensus sequences sampled at the epidemiological scale that only represent the dominant virus lineage within each infected host. Less is known about the extent of within-host virus diversity and what proportion of this diversity is transmitted between individuals. To characterize virus variants that achieve sustainable transmission in new hosts, we examined within-host virus genetic diversity in household donor-recipient pairs from the first wave of the 2009 H1N1 pandemic when seasonal H3N2 was co-circulating. Although the same variants were found in multiple members of the community, the relative frequencies of variants fluctuated, with patterns of genetic variation more similar within than between households. We estimated the effective population size of influenza A virus across donor-recipient pairs to be approximately 100-200 contributing members, which enabled the transmission of multiple lineages, including antigenic variants.