July 19, 2019  |  

Completing bacterial genome assemblies: strategy and performance comparisons.

Determining the genomic sequences of microorganisms is the basis and prerequisite for understanding their biology and functional characterization. While the advent of low-cost, extremely high-throughput second-generation sequencing technologies and the parallel development of assembly algorithms have generated rapid and cost-effective genome assemblies, such assemblies are often unfinished, fragmented draft genomes as a result of short read lengths and long repeats present in multiple copies. Third-generation PacBio sequencing technologies circumvented this problem by greatly increasing read length. Hybrid approaches including ALLPATHS-LG, PacBio corrected reads pipeline, SPAdes, and SSPACE-LongRead, and non-hybrid approaches, namely the hierarchical genome-assembly process (HGAP) and the PacBio corrected reads pipeline via self-correction, have therefore been proposed to utilize the PacBio long reads that can span many thousands of bases to facilitate the assembly of complete microbial genomes. However, standardized procedures for evaluating and comparing these approaches are currently insufficient. To address the issue, we herein provide a comprehensive comparison by collecting datasets for the comparative assessment of the above-mentioned five assemblers. In addition to offering explicit and beneficial recommendations to practitioners, this study aims to aid in the design of a paradigm positioned to complete bacterial genome assembly.


July 19, 2019  |  

PacBio-LITS: a large-insert targeted sequencing method for characterization of human disease-associated chromosomal structural variations.

Generation of long (>5 kb) DNA sequencing reads provides an approach for interrogation of complex regions in the human genome. Currently, large-insert whole genome sequencing (WGS) technologies from Pacific Biosciences (PacBio) enable analysis of chromosomal structural variations (SVs), but the cost to achieve the required sequence coverage across the entire human genome is high. We developed a method (termed PacBio-LITS) that combines oligonucleotide-based DNA target-capture enrichment technologies with PacBio large-insert library preparation to facilitate SV studies at specific chromosomal regions. PacBio-LITS provides deep sequence coverage at the specified sites at substantially reduced cost compared with PacBio WGS. The efficacy of PacBio-LITS is illustrated by delineating the breakpoint junctions of low copy repeat (LCR)-associated complex structural rearrangements on chr17p11.2 in patients diagnosed with Potocki-Lupski syndrome (PTLS; MIM#610883). We successfully identified previously determined breakpoint junctions in three PTLS cases, and also were able to discover novel junctions in repetitive sequences, including LCR-mediated breakpoints. The new information has enabled us to propose mechanisms for formation of these structural variants. The new method leverages the cost efficiency of targeted capture-sequencing as well as the mappability and scaffolding capabilities of long sequencing reads generated by the PacBio platform. It is therefore suitable for studying complex SVs, especially those involving LCRs, inversions, and the generation of chimeric Alu elements at the breakpoints. Other genomic research applications, such as haplotype phasing and small insertion and deletion validation, could also benefit from this technology.


July 19, 2019  |  

Intrahost dynamics of antiviral resistance in influenza A virus reflect complex patterns of segment linkage, reassortment, and natural selection.

Resistance following antiviral therapy is commonly observed in human influenza viruses. Although this evolutionary process is initiated within individual hosts, little is known about the pattern, dynamics, and drivers of antiviral resistance at this scale, including the role played by reassortment. In addition, the short duration of human influenza virus infections limits the available time window in which to examine intrahost evolution. Using single-molecule sequencing, we mapped, in detail, the mutational spectrum of an H3N2 influenza A virus population sampled from an immunocompromised patient who shed virus over a 21-month period. In this unique natural experiment, we were able to document the complex dynamics underlying the evolution of antiviral resistance. Individual resistance mutations appeared weeks before they became dominant, evolved independently on cocirculating lineages, led to a genome-wide reduction in genetic diversity through a selective sweep, and were placed into new combinations by reassortment. Notably, despite frequent reassortment, phylogenetic analysis also provided evidence for specific patterns of segment linkage, with a strong association between the hemagglutinin (HA)- and matrix (M)-encoding segments that matches that previously observed at the epidemiological scale. In sum, we were able to reveal, for the first time, the complex interaction between multiple evolutionary processes as they occur within an individual host. Understanding the evolutionary forces that shape the genetic diversity of influenza virus is crucial for predicting the emergence of drug-resistant strains but remains challenging because multiple processes occur concurrently. We characterized the evolution of antiviral resistance in a single persistent influenza virus infection, representing the first case in which reassortment and the complex patterns of drug resistance emergence and evolution have been determined within an individual host. Deep-sequence data from multiple time points revealed that the evolution of antiviral resistance reflects a combination of frequent mutation, natural selection, and a complex pattern of segment linkage and reassortment. In sum, these data show how immunocompromised hosts may help reveal the drivers of strain emergence. Copyright © 2015 Rogers et al.


July 19, 2019  |  

Molecular analysis of asymptomatic bacteriuria Escherichia coli strain VR50 reveals adaptation to the urinary tract by gene acquisition.

Urinary tract infections (UTIs) are among the most common infectious diseases of humans, with Escherichia coli responsible for >80% of all cases. One extreme of UTI is asymptomatic bacteriuria (ABU), which occurs as an asymptomatic carrier state that resembles commensalism. To understand the evolution and molecular mechanisms that underpin ABU, the genome of the ABU E. coli strain VR50 was sequenced. Analysis of the complete genome indicated that it most resembles E. coli K-12, with the addition of a 94-kb genomic island (GI-VR50-pheV), eight prophages, and multiple plasmids. GI-VR50-pheV has a mosaic structure and contains genes encoding a number of UTI-associated virulence factors, namely, Afa (afimbrial adhesin), two autotransporter proteins (Ag43 and Sat), and aerobactin. We demonstrated that the presence of this island in VR50 confers its ability to colonize the murine bladder, as a VR50 mutant with GI-VR50-pheV deleted was attenuated in a mouse model of UTI in vivo. We established that Afa is the island-encoded factor responsible for this phenotype using two independent deletion (Afa operon and AfaE adhesin) mutants. E. coli VR50afa and VR50afaE displayed significantly decreased ability to adhere to human bladder epithelial cells. In the mouse model of UTI, VR50afa and VR50afaE displayed reduced bladder colonization compared to wild-type VR50, similar to the colonization level of the GI-VR50-pheV mutant. Our study suggests that E. coli VR50 is a commensal-like strain that has acquired fitness factors that facilitate colonization of the human bladder. Copyright © 2015, American Society for Microbiology. All Rights Reserved.


July 19, 2019  |  

Sequence data for Clostridium autoethanogenum using three generations of sequencing technologies.

During the past decade, DNA sequencing output has been mostly dominated by the second generation sequencing platforms (for example, Illumina), which are characterized by low cost, high throughput and shorter read lengths. The emergence and development of so-called third generation sequencing platforms such as PacBio has permitted exceptionally long reads (over 20 kb) to be generated. Due to read length increases, algorithm improvements and hybrid assembly approaches, the concept of one chromosome, one contig and automated finishing of microbial genomes is now a realistic and achievable task for many microbial laboratories. In this paper, we describe high quality sequence datasets which span three generations of sequencing technologies, containing six types of data from four NGS platforms and originating from a single microorganism, Clostridium autoethanogenum. The dataset reported here will be useful for the scientific community to evaluate upcoming NGS platforms, enabling comparison of existing and novel bioinformatics approaches, and will encourage interest in the development of innovative experimental and computational methods for NGS data.


July 19, 2019  |  

Genome-wide methylation patterns in Salmonella enterica subsp. enterica serovars.

The methylation of DNA bases plays an important role in numerous biological processes including development, gene expression, and DNA replication. Salmonella is an important foodborne pathogen, and methylation in Salmonella is implicated in virulence. Using single molecule real-time (SMRT) DNA-sequencing, we sequenced and assembled the complete genomes of eleven Salmonella enterica isolates from nine different serovars, and analysed the whole-genome methylation patterns of each genome. We describe 16 distinct N6-methyladenine (m6A) methylated motifs, one N4-methylcytosine (m4C) motif, and one combined m6A-m4C motif. Eight of these motifs are novel, i.e., they have not been previously described. We also identified the methyltransferases (MTases) associated with 13 of the motifs. Some motifs are conserved across all Salmonella serovars tested, while others were found only in a subset of serovars. Eight of the nine serovars contained a unique methylated motif that was not found in any other serovar (most of these motifs were part of Type I restriction modification systems), indicating the high diversity of methylation patterns present in Salmonella.


July 19, 2019  |  

Targeted single molecule sequencing methodology for ovarian hyperstimulation syndrome.

One of the most significant issues surrounding next generation sequencing is the cost and the difficulty of assembling short read lengths. Targeted capture enrichment of longer fragments using single molecule sequencing (SMS) is expected to improve both sequence assembly and base-call accuracy but, at present, there are very few examples of successful application of these technologic advances in translational research and clinical testing. We developed a targeted single molecule sequencing (T-SMS) panel for genes implicated in ovarian response to controlled ovarian hyperstimulation (COH) for infertility. Target enrichment was carried out using droplet-based multiplex polymerase chain reaction (PCR) technology (RainDance®) designed to yield amplicons averaging 1 kb fragment size from 44 candidate loci (99.8% unique base-pair coverage). The total targeted sequence was 3.18 Mb per sample. SMS was carried out using single molecule, real-time DNA sequencing (SMRT®, Pacific Biosciences®; average raw read length = 1178 nucleotides, 5% of the amplicons >6000 nucleotides). After filtering with circular consensus (CCS) reads, the mean read length was 3200 nucleotides (97% CCS accuracy). Primary data analyses, alignment and filtering utilized the Pacific Biosciences® SMRT portal. Secondary analysis was conducted using the Genome Analysis Toolkit for SNP discovery and wANNOVAR for functional analysis of variants. Of the filtered functional variants, 18 of 19 (94.7%) were further confirmed using conventional Sanger sequencing. CCS reads were able to accurately detect zygosity. Coverage within GC rich regions (i.e., VEGFR; 72% GC rich) was achieved by capturing long genomic DNA (gDNA) fragments and reading into regions that flank the capture regions. As proof of concept, a non-synonymous LHCGR variant was captured in two severe OHSS cases and verified by conventional sequencing. Combining emulsion PCR-generated 1 kb amplicons and SMRT DNA sequencing permitted greater depth of coverage for T-SMS and facilitated easier sequence assembly. To the best of our knowledge, this is the first report combining emulsion PCR and T-SMS for long reads using human DNA samples, and an NGS panel designed for biomarker discovery in OHSS.


July 19, 2019  |  

Specificity of the ModA11, ModA12 and ModD1 epigenetic regulator N6-adenine DNA methyltransferases of Neisseria meningitidis.

Phase variation (random ON/OFF switching) of gene expression is a common feature of host-adapted pathogenic bacteria. Phase variably expressed N(6)-adenine DNA methyltransferases (Mod) alter global methylation patterns resulting in changes in gene expression. These systems constitute phase variable regulons called phasevarions. Neisseria meningitidis phasevarions regulate genes including virulence factors and vaccine candidates, and alter phenotypes including antibiotic resistance. The target site recognized by these Type III N(6)-adenine DNA methyltransferases is not known. Single molecule, real-time (SMRT) methylome analysis was used to identify the recognition site for three key N. meningitidis methyltransferases: ModA11 (exemplified by M.NmeMC58I) (5′-CGY(m6A)G-3′), ModA12 (exemplified by M.Nme77I, M.Nme18I and M.Nme579II) (5′-AC(m6A)CC-3′) and ModD1 (exemplified by M.Nme579I) (5′-CC(m6A)GC-3′). Restriction inhibition assays and mutagenesis confirmed the SMRT methylome analysis. The ModA11 site is complex and atypical and is dependent on the type of pyrimidine at the central position, in combination with the bases flanking the core recognition sequence 5′-CGY(m6A)G-3′. The observed efficiency of methylation in the modA11 strain (MC58) genome ranged from 4.6% at 5′-GCGC(m6A)GG-3′ sites, to 100% at 5′-ACGT(m6A)GG-3′ sites. Analysis of the distribution of modified sites in the respective genomes shows many cases of association with intergenic regions of genes with altered expression due to phasevarion switching. © The Author(s) 2015. Published by Oxford University Press on behalf of Nucleic Acids Research.


July 19, 2019  |  

An adenine code for DNA: A second life for N6-methyladenine.

DNA N6-methyladenine (6mA) protects against restriction enzymes in bacteria. However, isolated reports have suggested additional activities and its presence in other organisms, such as unicellular eukaryotes. New data now find that 6mA may have a gene regulatory function in green alga, worm, and fly, suggesting m6A as a potential “epigenetic” mark. Copyright © 2015 Elsevier Inc. All rights reserved.


July 19, 2019  |  

Genome modification in Enterococcus faecalis OG1RF assessed by bisulfite sequencing and Single-Molecule Real-Time Sequencing.

Enterococcus faecalis is a Gram-positive bacterium that natively colonizes the human gastrointestinal tract and opportunistically causes life-threatening infections. Multidrug-resistant (MDR) E. faecalis strains have emerged, reducing treatment options for these infections. MDR E. faecalis strains have large genomes containing mobile genetic elements (MGEs) that harbor genes for antibiotic resistance and virulence determinants. Bacteria commonly possess genome defense mechanisms to block MGE acquisition, and we hypothesize that these mechanisms have been compromised in MDR E. faecalis. In restriction-modification (R-M) defense, the bacterial genome is methylated at cytosine (C) or adenine (A) residues by a methyltransferase (MTase), such that nonself DNA can be distinguished from self DNA. A cognate restriction endonuclease digests improperly modified nonself DNA. Little is known about R-M in E. faecalis. Here, we use genome resequencing to identify DNA modifications occurring in the oral isolate OG1RF. OG1RF has one of the smallest E. faecalis genomes sequenced to date and possesses few MGEs. Single-molecule real-time (SMRT) and bisulfite sequencing revealed that OG1RF has global 5-methylcytosine (m5C) methylation at 5′-GCWGC-3′ motifs. A type II R-M system confers the m5C modification, and disruption of this system impacts OG1RF electrotransformability and conjugative transfer of an antibiotic resistance plasmid. A second DNA MTase was poorly expressed under laboratory conditions but conferred global N(4)-methylcytosine (m4C) methylation at 5′-CCGG-3′ motifs when expressed in Escherichia coli. Based on our results, we conclude that R-M can act as a barrier to MGE acquisition and likely influences antibiotic resistance gene dissemination in the E. faecalis species. The horizontal transfer of antibiotic resistance genes among bacteria is a critical public health concern. 
Enterococcus faecalis is an opportunistic pathogen that causes life-threatening infections in humans. Multidrug resistance acquired by horizontal gene transfer limits treatment options for these infections. In this study, we used innovative DNA sequencing methodologies to investigate how a model strain of E. faecalis discriminates its own DNA from foreign DNA, i.e., self versus nonself discrimination. We also assess the role of an E. faecalis genome modification system in modulating conjugative transfer of an antibiotic resistance plasmid. These results are significant because they demonstrate that differential genome modification impacts horizontal gene transfer frequencies in E. faecalis. Copyright © 2015, American Society for Microbiology. All Rights Reserved.
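As an illustration of what a recognition motif such as 5′-GCWGC-3′ or 5′-CCGG-3′ means in practice, the following minimal Python sketch locates candidate motif sites in a sequence by expanding IUPAC ambiguity codes (W = A or T) into a regular expression. It is not the analysis pipeline used in the study; the function names and the toy sequence are assumptions made purely for illustration, and real methylome analysis also requires per-base modification calls from the sequencing data.

```python
import re

# IUPAC nucleotide codes expanded to regular-expression character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}


def motif_to_regex(motif):
    """Compile an IUPAC motif into a regex; the lookahead allows overlapping hits."""
    body = "".join(IUPAC[base] for base in motif.upper())
    return re.compile("(?=(" + body + "))")


def find_motif_sites(sequence, motif):
    """Return 0-based start positions of every motif occurrence on the given strand."""
    pattern = motif_to_regex(motif)
    return [match.start() for match in pattern.finditer(sequence.upper())]


# Hypothetical toy sequence, not taken from the OG1RF genome.
toy = "AAGCAGCTTGCTGCGCCGGA"
print(find_motif_sites(toy, "GCWGC"))  # [2, 9]
print(find_motif_sites(toy, "CCGG"))   # [15]
```

Both motifs named in the abstract are palindromic, so scanning one strand finds the sites on both; a non-palindromic motif would also need a scan of the reverse complement.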


July 19, 2019  |  

Single-molecule sequencing reveals the molecular basis of multidrug-resistance in ST772 methicillin-resistant Staphylococcus aureus.

Methicillin-resistant Staphylococcus aureus (MRSA) is a major cause of hospital-associated infection, but there is growing awareness of the emergence of multidrug-resistant lineages in community settings around the world. One such lineage is ST772-MRSA-V, which has disseminated globally and is increasingly prevalent in India. Here, we present the complete genome sequence of DAR4145, a strain of the ST772-MRSA-V lineage from India, and investigate its genomic characteristics with regard to antibiotic resistance and virulence factors. Sequencing using single-molecule real-time technology resulted in the assembly of a single continuous chromosomal sequence, which was error-corrected, annotated and compared to nine draft genome assemblies of ST772-MRSA-V from Australia, Malaysia and India. We discovered numerous and redundant resistance genes associated with mobile genetic elements (MGEs) and known core genome mutations that explain the highly antibiotic resistant phenotype of DAR4145. Staphylococcal toxins and superantigens, including the leukotoxin Panton-Valentine leukocidin, were predominantly associated with genomic islands and the phage φ-IND772PVL. Some of these mobile resistance and virulence factors were variably present in other strains of the ST772-MRSA-V lineage. The genomic characteristics presented here emphasize the contribution of MGEs to the emergence of multidrug-resistant and highly virulent strains of community-associated MRSA. Antibiotic resistance was further augmented by chromosomal mutations and redundancy of resistance genes. The complete genome of DAR4145 provides a valuable resource for future investigations into the global dissemination and phylogeography of ST772-MRSA-V.


July 19, 2019  |  

Complete bypass of restriction systems for major Staphylococcus aureus lineages.

Staphylococcus aureus is a prominent global nosocomial and community-acquired bacterial pathogen. A strong restriction barrier presents a major hurdle for the introduction of recombinant DNA into clinical isolates of S. aureus. Here, we describe the construction and characterization of the IMXXB series of Escherichia coli strains that mimic the type I adenine methylation profiles of S. aureus clonal complexes 1, 8, 30, and ST93. The IMXXB strains enable direct, high-efficiency transformation and streamlined genetic manipulation of major S. aureus lineages. The genetic manipulation of clinical S. aureus isolates has been hampered by the presence of restriction modification barriers that detect and subsequently degrade inappropriately methylated DNA. Current methods allow the introduction of plasmid DNA into a limited subset of S. aureus strains at high efficiency after passage of plasmid DNA through the restriction-negative, modification-proficient strain RN4220. Here, we have constructed and validated a suite of E. coli strains that mimic the adenine methylation profiles of different clonal complexes and show high-efficiency plasmid DNA transfer. The ability to bypass RN4220 will reduce the cost and time involved for plasmid transfer into S. aureus. The IMXXB series of E. coli strains should expedite the process of mutant construction in diverse genetic backgrounds and allow the application of new techniques to the genetic manipulation of S. aureus. Copyright © 2015 Monk et al.


July 19, 2019  |  

The complete methylome of Helicobacter pylori UM032.

The genome of the human gastric pathogen Helicobacter pylori encodes a large number of DNA methyltransferases (MTases), some of which are shared among many strains, and others of which are unique to a given strain. The MTases have potential roles in the survival of the bacterium. In this study, we sequenced a Malaysian H. pylori clinical strain, designated UM032, by using a combination of PacBio Single Molecule, Real-Time (SMRT) and Illumina MiSeq next generation sequencing platforms, and used the SMRT data to characterize the set of methylated bases (the methylome). The N4-methylcytosine and N6-methyladenine modifications detected at single-base resolution using SMRT technology revealed 17 methylated sequence motifs corresponding to one Type I and 16 Type II restriction-modification (R-M) systems. Previously unassigned methylation motifs were now assigned to their respective MTase-coding genes. Furthermore, one gene that appears to be inactive in the H. pylori UM032 genome during normal growth was characterized by cloning. Consistent with previously studied H. pylori strains, we show that strain UM032 contains a relatively large number of R-M systems, including some MTase activities with novel specificities. Additional studies are underway to further elucidate the biological significance of the R-M systems in the physiology and pathogenesis of H. pylori.


July 19, 2019  |  

Single molecule-level detection and long read-based phasing of epigenetic variations in bacterial methylomes.

Beyond its role in host defense, bacterial DNA methylation also plays important roles in the regulation of gene expression, virulence and antibiotic resistance. Bacterial cells in a clonal population can generate epigenetic heterogeneity to increase population-level phenotypic plasticity. Single molecule, real-time (SMRT) sequencing enables the detection of N6-methyladenine and N4-methylcytosine, two major types of DNA modifications comprising the bacterial methylome. However, existing SMRT sequencing-based methods for studying bacterial methylomes rely on a population-level consensus that lacks the single-cell resolution required to observe epigenetic heterogeneity. Here, we present SMALR (single-molecule modification analysis of long reads), a novel framework for single molecule-level detection and phasing of DNA methylation. Using seven bacterial strains, we show that SMALR yields significantly improved resolution and reveals distinct types of epigenetic heterogeneity. SMALR is a powerful new tool that enables de novo detection of epigenetic heterogeneity and empowers investigation of its functions in bacterial populations.


July 19, 2019  |  

Assembly and diploid architecture of an individual human genome via single-molecule technologies.

We present the first comprehensive analysis of a diploid human genome that combines single-molecule sequencing with single-molecule genome maps. Our hybrid assembly markedly improves upon the contiguity observed from traditional shotgun sequencing approaches, with scaffold N50 values approaching 30 Mb, and we identified complex structural variants (SVs) missed by other high-throughput approaches. Furthermore, by combining Illumina short-read data with long reads, we phased both single-nucleotide variants and SVs, generating haplotypes with over 99% consistency with previous trio-based studies. Our work shows that it is now possible to integrate single-molecule and high-throughput sequence data to generate de novo assembled genomes that approach reference quality.
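For readers unfamiliar with the contiguity statistic quoted above, scaffold N50 is the length of the shortest scaffold in the smallest set of longest scaffolds that together cover at least half of the total assembly length. The short Python sketch below shows one common way to compute it; the function name and the example lengths are hypothetical and are not data from the study.

```python
def n50(lengths):
    """Return the N50 of a collection of contig or scaffold lengths.

    Lengths are sorted from longest to shortest and accumulated until the
    running sum reaches at least half of the total; the length at which
    that happens is the N50.
    """
    if not lengths:
        raise ValueError("at least one length is required")
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


# Hypothetical scaffold lengths (arbitrary units), total 290.
# 80 + 70 = 150 is the first running sum to reach 145, so N50 is 70.
print(n50([80, 70, 50, 40, 30, 20]))  # 70
```

A higher N50 indicates that more of the assembly sits in long scaffolds, which is why the statistic is used as a summary of contiguity; it says nothing by itself about assembly correctness.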

