July 7, 2019

Minimap and miniasm: fast mapping and de novo assembly for noisy long sequences.

Single Molecule Real-Time (SMRT) sequencing technology and Oxford Nanopore Technologies (ONT) produce reads over 10 kb in length, which have enabled high-quality genome assembly at an affordable cost. However, at present, long reads have an error rate as high as 10-15%. Complex and computationally intensive pipelines are required to assemble such reads. We present a new mapper, minimap, and a de novo assembler, miniasm, for efficiently mapping and assembling SMRT and ONT reads without an error correction stage. They can often assemble a sequencing run of bacterial data into a single contig in a few minutes, and assemble 45-fold Caenorhabditis elegans data in 9 min, orders of magnitude faster than the existing pipelines, though the consensus sequence error rate is as high as that of raw reads. We also introduce a pairwise read mapping format and a graphical fragment assembly format, and demonstrate the interoperability between our tools and current ones. Availability: https://github.com/lh3/minimap and https://github.com/lh3/miniasm. Contact: hengli@broadinstitute.org. Supplementary data are available at Bioinformatics online. © The Author 2016. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.


July 7, 2019

Whole genome DNA sequence analysis of Salmonella subspecies enterica serotype Tennessee obtained from related peanut butter foodborne outbreaks.

Establishing an association between possible food sources and clinical isolates requires discriminating the suspected pathogen from an environmental background, and distinguishing it from other closely-related foodborne pathogens. We applied whole genome sequencing (WGS) to Salmonella subspecies enterica serotype Tennessee (S. Tennessee) to describe genomic diversity across the serovar as well as among and within outbreak clades of strains associated with contaminated peanut butter. We analyzed 71 isolates of S. Tennessee from disparate food, environmental, and clinical sources and 2 other closely-related Salmonella serovars as outgroups (S. Kentucky and S. Cubana), which were also shotgun sequenced. A whole genome single nucleotide polymorphism (SNP) analysis was performed using a maximum likelihood approach to infer phylogenetic relationships. Several monophyletic lineages of S. Tennessee with limited SNP variability were identified that recapitulated several food contamination events. S. Tennessee clades were separated from outgroup salmonellae by more than sixteen thousand SNPs. Intra-serovar diversity of S. Tennessee was small compared to the chosen outgroups (1,153 SNPs), suggesting recent divergence of some S. Tennessee clades. Analysis of all 1,153 SNPs structuring an S. Tennessee peanut butter outbreak cluster revealed that isolates from several food, plant, and clinical sources were very closely related, as they had only a few SNP differences between them. SNP-based cluster analyses linked specific food sources to several clinical S. Tennessee strains isolated in separate contamination events. Environmental and clinical isolates had very similar whole genome sequences; no markers were found that could be used to discriminate between these sources. Finally, we identified SNPs within variable S. Tennessee genes that may be useful markers for the development of rapid surveillance and typing methods, potentially aiding in traceback efforts during future outbreaks. Using WGS can delimit contamination sources for foodborne illnesses across multiple outbreaks and reveal otherwise undetected DNA sequence differences essential to the tracing of bacterial pathogens as they emerge.
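
As a rough illustration of what "only a few SNP differences" between isolates means, the sketch below counts pairwise SNP differences between pre-aligned sequences. It is not the authors' pipeline (the abstract describes a maximum likelihood phylogenetic analysis); the function names, the toy isolate labels, and the choice to skip ambiguous (N) and gap (-) positions are all assumptions made for the example.

```python
from itertools import combinations


def snp_distance(seq_a, seq_b, ignore="N-"):
    """Count positions where two aligned sequences differ.

    Positions where either sequence has an ambiguous base or a gap
    (characters in `ignore`) are skipped; this is an assumption of the sketch.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(
        1
        for a, b in zip(seq_a.upper(), seq_b.upper())
        if a != b and a not in ignore and b not in ignore
    )


def pairwise_snp_distances(aligned):
    """Return {(name_1, name_2): SNP count} for every pair of isolates."""
    return {
        (x, y): snp_distance(aligned[x], aligned[y])
        for x, y in combinations(sorted(aligned), 2)
    }


# Hypothetical toy alignment; real inputs would be whole genome SNP matrices.
toy_alignment = {
    "clinical_1": "ACGTACGTAC",
    "food_1": "ACGTACGTAT",
    "outgroup": "TCGAACCTGT",
}
distances = pairwise_snp_distances(toy_alignment)
```

In this toy input the clinical and food sequences differ at a single site, while the outgroup differs from each at several sites, mirroring the contrast between within-cluster and outgroup distances described above.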


July 7, 2019

Extensive sequencing of seven human genomes to characterize benchmark reference materials.

The Genome in a Bottle Consortium, hosted by the National Institute of Standards and Technology (NIST) is creating reference materials and data for human genome sequencing, as well as methods for genome comparison and benchmarking. Here, we describe a large, diverse set of sequencing data for seven human genomes; five are current or candidate NIST Reference Materials. The pilot genome, NA12878, has been released as NIST RM 8398. We also describe data from two Personal Genome Project trios, one of Ashkenazim Jewish ancestry and one of Chinese ancestry. The data come from 12 technologies: BioNano Genomics, Complete Genomics paired-end and LFR, Ion Proton exome, Oxford Nanopore, Pacific Biosciences, SOLiD, 10X Genomics GemCode WGS, and Illumina exome and WGS paired-end, mate-pair, and synthetic long reads. Cell lines, DNA, and data from these individuals are publicly available. Therefore, we expect these data to be useful for revealing novel information about the human genome and improving sequencing technologies, SNP, indel, and structural variant calling, and de novo assembly.


July 7, 2019

Accelerated dysbiosis of gut microbiota during aggravation of DSS-induced colitis by a butyrate-producing bacterium.

Butyrate-producing bacteria (BPB) are potential probiotic candidates for inflammatory bowel diseases as they are often depleted in the diseased gut microbiota. However, here we found that augmentation of a human-derived butyrate-producing strain, Anaerostipes hadrus BPB5, significantly aggravated colitis in dextran sulphate sodium (DSS)-treated mice while exerting no detrimental effect in healthy mice. We explored how the interaction between BPB5 and gut microbiota may contribute to this differential impact on the hosts. Butyrate production and severity of colitis were assessed in both healthy and DSS-treated mice, and gut microbiota structural changes were analysed using high-throughput sequencing. BPB5-inoculated healthy mice showed no signs of colitis, but increased butyrate content in the gut. In DSS-treated mice, BPB5 augmentation did not increase butyrate content, but induced a significantly more severe disease activity index and much higher mortality. BPB5 did not induce significant changes of gut microbiota in healthy hosts, but expedited the structural shifts toward the disease phase, which occurred 3 days earlier in BPB5-augmented than in DSS-treated animals. The differential response of gut microbiota in healthy and DSS-treated mice to the same potentially beneficial bacterium, with drastically different health consequences, suggests that animals with dysbiotic gut microbiota should also be employed for the safety assessment of probiotic candidates.


July 7, 2019

Complete genomes of Bacillus coagulans S-lac and Bacillus subtilis TO-A JPC, two phylogenetically distinct probiotics.

Several spore-forming strains of Bacillus are marketed as probiotics due to their ability to survive harsh gastrointestinal conditions and confer health benefits to the host. We report the complete genomes of two commercially available probiotics, Bacillus coagulans S-lac and Bacillus subtilis TO-A JPC, and compare them with the genomes of other Bacillus and Lactobacillus. The taxonomic position of both organisms was established with a maximum-likelihood tree based on twenty-six housekeeping proteins. Analysis of all probiotic strains of Bacillus and Lactobacillus reveals that the essential sporulation proteins are conserved in all Bacillus probiotic strains while they are absent in Lactobacillus spp. We identified various antibiotic resistance, stress-related, and adhesion-related domains in these organisms, which likely provide support in exerting probiotic action by enabling adhesion to host epithelial cells and survival during antibiotic treatment and harsh conditions.


July 7, 2019

Atypical Salmonella enterica serovars in murine and human infection models: Is it time to reassess our approach to the study of salmonellosis?

Nontyphoidal Salmonella species are globally disseminated pathogens and the predominant cause of gastroenteritis. The pathogenesis of salmonellosis has been extensively studied using in vivo murine models and cell lines typically challenged with Salmonella Typhimurium. Although serovars Enteritidis and Typhimurium are responsible for most of the human infections reported to the CDC, several other serovars also contribute to clinical cases of salmonellosis. Despite their epidemiological importance, little is known about their infection phenotypes. Here, we report the virulence characteristics and genomes of 10 atypical S. enterica serovars linked to multistate foodborne outbreaks in the United States. We show that the murine RAW 264.7 macrophage model of infection is unsuitable for inferring human-relevant differences in nontyphoidal Salmonella infections, whereas differentiated human THP-1 macrophages allowed these isolates to be further characterised in a more relevant, human context.


July 7, 2019

A roadmap for gene system development in Clostridium.

Clostridium species are both heroes and villains. Some cause serious human and animal diseases, those present in the gut microbiota generally contribute to health and wellbeing, while others represent useful industrial chassis for the production of chemicals and fuels. To understand, counter or exploit them, there is a fundamental requirement for effective systems that may be used for directed or random genome modifications. We have formulated a simple roadmap whereby the necessary gene systems may be developed and deployed. At its heart is the use of ‘pseudo-suicide’ vectors and the creation of a pyrE mutant (a uracil auxotroph), initially aided by ClosTron technology, but ultimately made using a special form of allelic exchange termed ACE (Allele-Coupled Exchange). All mutants, regardless of the mutagen employed, are made in this host. This is because through the use of ACE vectors, mutants can be rapidly complemented concomitant with correction of the pyrE allele and restoration of uracil prototrophy. This avoids the phenotypic effects frequently observed with high copy number plasmids and dispenses with the need to add antibiotic to ensure plasmid retention. Once available, the pyrE host may be used to stably insert all manner of application-specific modules. Examples include a sigma factor to allow deployment of a mariner transposon, hydrolases involved in biomass deconstruction and therapeutic genes in cancer delivery vehicles. To date, provided DNA transfer is obtained, we have not encountered any clostridial species where this technology cannot be applied. These include Clostridium difficile, Clostridium acetobutylicum, Clostridium beijerinckii, Clostridium botulinum, Clostridium perfringens, Clostridium sporogenes, Clostridium pasteurianum, Clostridium ljungdahlii, Clostridium autoethanogenum and even Geobacillus thermoglucosidasius. Copyright © 2016 The Authors. Published by Elsevier Ltd. All rights reserved.


July 7, 2019

Evaluation of an optimal epidemiologic typing scheme for Legionella pneumophila with whole genome sequence data using validation guidelines.

Sequence-based typing (SBT), analogous to multi-locus sequence typing (MLST), is the current gold-standard typing method for investigation of legionellosis outbreaks caused by Legionella pneumophila. However, as common sequence types (STs) cause many infections, some investigations remain unresolved. Here, various whole genome sequencing (WGS)-based methods were evaluated according to published guidelines, including: i) single nucleotide polymorphism (SNP)-based; ii) extended multi-locus sequence typing (MLST) using different numbers of genes; iii) gene presence/absence; and iv) kmer-based. L. pneumophila serogroup 1 isolates (n=106) from the standard “typing panel”, previously used by the European Society for Clinical Microbiology Study Group on Legionella Infections (ESGLI), were tested together with another 229 isolates. Over 98% of isolates were considered typable using the mapping- and kmer-based methods. Percentages of isolates with complete extended MLST profiles ranged from 99.1% (50-gene) to 86.8% (1455-gene), whilst only 41.5% produced a full profile with the gene presence/absence scheme. Replicates demonstrated that all methods offer 100% reproducibility. Indices of discrimination range from 0.972 (ribosomal MLST) to 0.999 (SNP-based), and all values are higher than that achieved with SBT (0.940). Epidemiological concordance is generally inversely related to discriminatory power. We propose that an extended MLST scheme with ~50 genes provides optimal epidemiological concordance whilst substantially improving the discrimination offered by SBT, and can be used as part of a hierarchical typing scheme that should maintain backwards compatibility and increase discrimination where necessary. This analysis will be useful for the ESGLI to design a scheme that has the potential to become the new gold standard typing method for L. pneumophila. Copyright © 2016 David et al.
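
The abstract does not say how the indices of discrimination were computed. A common choice for typing schemes is the Hunter-Gaston form of Simpson's index of diversity, and the sketch below assumes that definition: the probability that two isolates drawn at random without replacement are assigned different types. The function name and the toy type labels are hypothetical.

```python
from collections import Counter


def discrimination_index(type_labels):
    """Hunter-Gaston discriminatory index (assumed definition).

    D = 1 - sum(n_j * (n_j - 1)) / (N * (N - 1)),
    where N is the number of isolates and n_j the number assigned to type j.
    """
    n_isolates = len(type_labels)
    if n_isolates < 2:
        raise ValueError("at least two isolates are required")
    counts = Counter(type_labels)
    same_type_pairs = sum(c * (c - 1) for c in counts.values())
    return 1.0 - same_type_pairs / (n_isolates * (n_isolates - 1))


# Hypothetical example: four isolates, two of which share a type.
example_index = discrimination_index(["ST1", "ST1", "ST23", "ST42"])
```

Under this definition, a scheme that gives every isolate its own type scores 1.0 and one that gives all isolates the same type scores 0.0, so the values quoted above (0.940 to 0.999) would correspond to schemes that separate nearly all randomly chosen pairs.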


July 7, 2019

Horizontal transfer of carbapenemase-encoding plasmids and comparison with hospital epidemiology data.

Carbapenemase-producing organisms have spread worldwide, and infections with these bacteria cause significant morbidity. Horizontal transfer of plasmids that encode carbapenemases plays an important role in the spread of multidrug resistant Gram-negative bacteria. Here we investigate parameters regulating conjugation using an E. coli laboratory strain that lacks plasmids or restriction-enzyme modification systems as a recipient and also using patient isolates as donors and recipients. Because conjugation is tightly regulated, we performed a systematic analysis of the transfer of Klebsiella pneumoniae carbapenemase (blaKPC)-encoding plasmids into multiple strains under different environmental conditions to investigate critical variables. We used four blaKPC-plasmids isolated from patient strains obtained from two hospitals: pKpQIL and pKPC-47e from the National Institutes of Health, and pKPC_UVA01 and pKPC_UVA02 from the University of Virginia. Plasmid transfer frequency differed substantially between different donor and recipient pairs, and was influenced by plasmid content, temperature, and substrate, in addition to donor and recipient strain. pKPC-47e was attenuated in conjugation efficiency across all conditions tested. Despite its presence in multiple clinical species, pKPC_UVA01 had lower conjugation efficiencies than pKpQIL into recipient strains. The conjugation frequency of these plasmids into K. pneumoniae and E. coli patient isolates ranged widely without a clear correlation with clinical epidemiological data. Our results highlight the importance of each variable examined in these controlled experiments. The in vitro models did not reliably predict plasmid mobilization observed in a patient population, indicating that further studies are needed to understand the most important variables affecting horizontal transfer in vivo. Copyright © 2016, American Society for Microbiology. All Rights Reserved.


July 7, 2019

Glutathione-S-transferase FosA6 of Klebsiella pneumoniae origin conferring fosfomycin resistance in ESBL-producing Escherichia coli.

The objectives of this study were to elucidate the genetic context of a novel plasmid-mediated fosA variant, fosA6, conferring fosfomycin resistance and to characterize the kinetic properties of FosA6. The genome of fosfomycin-resistant Escherichia coli strain YD786 was sequenced. Homologues of FosA6 were identified through BLAST searches. FosA6 and FosA(ST258) were purified and characterized using a steady-state kinetic approach. Inhibition of FosA activity was examined with sodium phosphonoformate. Plasmid-encoded glutathione-S-transferase (GST) FosA6 conferring high-level fosfomycin resistance was identified in a CTX-M-2-producing E. coli clinical strain at a US hospital. fosA6 was carried on a self-conjugative, 69 kb IncFII plasmid. The ΔlysR-fosA6-ΔyjiR_1 fragment, located between IS10R and ΔIS26, was nearly identical to those on the chromosomes of some Klebsiella pneumoniae strains (MGH78578, PMK1 and KPPR1). FosA6 shared >99% identity with chromosomally encoded FosA(PMK1) in K. pneumoniae of various STs and 98% identity with FosA(ST258), which is commonly found in K. pneumoniae clonal complex (CC) 258 including ST258. FosA6 and FosA(ST258) demonstrated robust GST activities that were comparable to each other. Sodium phosphonoformate, a GST inhibitor, reduced the fosfomycin MICs by 6- to 24-fold for K. pneumoniae and E. coli strains carrying fosA genes on the chromosomes and plasmids, respectively. fosA6, probably captured from the chromosome of K. pneumoniae, conferred high-level fosfomycin resistance in E. coli. FosA6 functioned as a GST and inactivated fosfomycin efficiently. K. pneumoniae may serve as a reservoir of fosfomycin resistance for E. coli. © The Author 2016. Published by Oxford University Press on behalf of the British Society for Antimicrobial Chemotherapy. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.


July 7, 2019

Structural variation detection using next-generation sequencing data: A comparative technical review.

Structural variations (SVs) are mutations in the genome of at least fifty nucleotides in size. They contribute to the phenotypic differences among healthy individuals, and cause severe diseases and even cancers by breaking or linking genes. Thus, it is crucial to systematically profile SVs in the genome. In the past decade, many next-generation sequencing (NGS)-based SV detection methods have been proposed due to the significant cost reduction of NGS experiments and their ability to detect SVs without bias at base-pair resolution. These SV detection methods vary in both sensitivity and specificity, since they use different SV-property-dependent and library-property-dependent features. As a result, predictions from different SV callers are often inconsistent. In addition, noise in the data (both platform-specific sequencing errors and artificial chimeric reads) impedes the specificity of SV detection. Poorly characterized regions in the human genome (e.g., repeat regions) greatly impact read mapping and in turn affect SV calling accuracy. Calling of complex SVs requires specialized SV callers. Apart from accuracy, the processing speed of an SV caller is another factor deciding its usability. Knowing the pros and cons of different SV calling techniques and the objectives of the biological study is essential for biologists and bioinformaticians to make informed decisions. This paper describes different components in the SV calling pipeline and reviews the techniques used by existing SV callers. Through a simulation study, we also demonstrate that library properties, especially insert size, greatly impact the sensitivity of different SV callers. We hope the community can benefit from this work both in designing new SV calling methods and in selecting the appropriate SV caller for specific biological studies. Copyright © 2016 Elsevier Inc. All rights reserved.


July 7, 2019

Normocyte-binding protein required for human erythrocyte invasion by the zoonotic malaria parasite Plasmodium knowlesi.

The dominant cause of malaria in Malaysia is now Plasmodium knowlesi, a zoonotic parasite of cynomolgus macaque monkeys found throughout South East Asia. Comparative genomic analysis of parasites adapted to in vitro growth in either cynomolgus or human RBCs identified a genomic deletion that includes the gene encoding normocyte-binding protein Xa (NBPXa) in parasites growing in cynomolgus RBCs but not in human RBCs. Experimental deletion of the NBPXa gene in parasites adapted to growth in human RBCs (which retain the ability to grow in cynomolgus RBCs) restricted them to cynomolgus RBCs, demonstrating that this gene is selectively required for parasite multiplication and growth in human RBCs. NBPXa-null parasites could bind to human RBCs, but invasion of these cells was severely impaired. Therefore, NBPXa is identified as a key mediator of P. knowlesi human infection and may be a target for vaccine development against this emerging pathogen.


July 7, 2019

Complete genome sequence of Mycobacterium chelonae type strain CCUG 47445, a rapidly growing species of nontuberculous mycobacteria.

Mycobacterium chelonae strains are ubiquitous rapidly growing mycobacteria associated with skin and soft tissue infections, cellulitis, abscesses, osteomyelitis, catheter infections, disseminated diseases, and postsurgical infections after implants with prostheses, transplants, and even hemodialysis procedures. Here, we report the complete genome sequence of M. chelonae type strain CCUG 47445. Copyright © 2016 Jaén-Luchoro et al.


July 7, 2019

Whole-genome sequence of Hafnia alvei HUMV-5920, a human isolate.

A clinical isolate of Hafnia alvei (strain HUMV-5920) was obtained from a urine sample from an adult patient. We report here its complete genome assembly using PacBio single-molecule real-time (SMRT) sequencing, which resulted in a chromosome with 4.5 Mb and a circular contig of 87 kb. About 4,146 protein-coding genes are predicted from this assembly. Copyright © 2016 Lázaro-Díez et al.


July 7, 2019

Escherichia coli harboring mcr-1 and blaCTX-M on a novel IncF plasmid: first report of mcr-1 in the United States.

The recent discovery of a plasmid-borne colistin resistance gene, mcr-1, in China heralds the emergence of truly pan-drug-resistant bacteria (1). The gene has been found primarily in Escherichia coli but has also been identified in other members of the Enterobacteriaceae in human, animal, food, and environmental samples on every continent (2–5). In response to this threat, starting in May 2016, all extended-spectrum-β-lactamase (ESBL)-producing E. coli clinical isolates submitted to the clinical microbiology laboratory at the Walter Reed National Military Medical Center (WRNMMC) have been tested for resistance to colistin by Etest. Here we report the presence of mcr-1 in an E. coli strain cultured from a patient with a urinary tract infection (UTI) in the United States. The strain was resistant to colistin, but it remained susceptible to several other agents, including amikacin, piperacillin-tazobactam, all carbapenems, and nitrofurantoin (Table 1).

