July 7, 2019

rHAT: fast alignment of noisy long reads with regional hashing.

Single Molecule Real-Time (SMRT) sequencing has been widely applied in cutting-edge genomic studies. However, it is still an expensive task to align the noisy long SMRT reads to reference genome by state-of-the-art aligners, which is becoming a bottleneck in applications with SMRT sequencing. Novel approach is on demand for improving the efficiency and effectiveness of SMRT read alignment. We propose Regional Hashing-based Alignment Tool (rHAT), a seed-and-extension-based read alignment approach specifically designed for noisy long reads. rHAT indexes reference genome by regional hash table (RHT), a hash table-based index which describes the short tokens within local windows of reference genome. In the seeding phase, rHAT utilizes RHT for efficiently calculating the occurrences of short token matches between partial read and local genomic windows to find highly possible candidate sites. In the extension phase, a sparse dynamic programming-based heuristic approach is used for reducing the cost of aligning read to the candidate sites. By benchmarking on the real and simulated datasets from various prokaryote and eukaryote genomes, we demonstrated that rHAT can effectively align SMRT reads with outstanding throughput. rHAT is implemented in C++; the source code is available at https://github.com/derekguan/rHAT. Contact: ydwang@hit.edu.cn. © The Author (2015). Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions@oup.com.
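
The seeding idea in the abstract above can be illustrated with a small sketch. This is not rHAT's implementation: the function names, the fixed non-overlapping windows, and the plain k-mer tokens are assumptions made for illustration only. It indexes which reference windows contain each short token, then ranks windows by how many tokens of a read they share.

```python
from collections import Counter, defaultdict


def build_regional_index(reference, window, k):
    """Map each k-mer to the ids of the reference windows it starts in.

    Assumption: windows are non-overlapping and a k-mer belongs to the
    window containing its first base.
    """
    index = defaultdict(set)
    for pos in range(len(reference) - k + 1):
        index[reference[pos:pos + k]].add(pos // window)
    return index


def rank_windows(read, index, k):
    """Count k-mer matches between the read and every window, best first."""
    hits = Counter()
    for pos in range(len(read) - k + 1):
        for window_id in index.get(read[pos:pos + k], ()):
            hits[window_id] += 1
    return hits.most_common()


if __name__ == "__main__":
    reference = "A" * 10 + "ACGTTGCAGT" + "C" * 10
    index = build_regional_index(reference, window=10, k=4)
    # The read shares all five of its 4-mers with window 1.
    print(rank_windows("CGTTGCAG", index, k=4))
```

The top-ranked windows would then be handed to an extension step; that step is not sketched here.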


July 7, 2019

hybridSPAdes: an algorithm for hybrid assembly of short and long reads.

Recent advances in single molecule real-time (SMRT) and nanopore sequencing technologies have enabled high-quality assemblies from long and inaccurate reads. However, these approaches require high coverage by long reads and remain expensive. On the other hand, the inexpensive short reads technologies produce accurate but fragmented assemblies. Thus, a hybrid approach that assembles long reads (with low coverage) and short reads has a potential to generate high-quality assemblies at reduced cost. We describe hybridSPAdes algorithm for assembling short and long reads and benchmark it on a variety of bacterial assembly projects. Our results demonstrate that hybridSPAdes generates accurate assemblies (even in projects with relatively low coverage by long reads) thus reducing the overall cost of genome sequencing. We further present the first complete assembly of a genome from single cells using SMRT reads. hybridSPAdes is implemented in C++ as a part of SPAdes genome assembler and is publicly available at http://bioinf.spbau.ru/en/spades. Contact: d.antipov@spbu.ru. Supplementary information: supplementary data are available at Bioinformatics online. © The Author 2015. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.


July 7, 2019

CauloBrowser: A systems biology resource for Caulobacter crescentus.

Caulobacter crescentus is a premier model organism for studying the molecular basis of cellular asymmetry. The Caulobacter community has generated a wealth of high-throughput spatiotemporal databases including data from gene expression profiling experiments (microarrays, RNA-seq, ChIP-seq, ribosome profiling, LC-MS proteomics), gene essentiality studies (Tn-seq), genome wide protein localization studies, and global chromosome methylation analyses (SMRT sequencing). A major challenge involves the integration of these diverse data sets into one comprehensive community resource. To address this need, we have generated CauloBrowser (www.caulobrowser.org), an online resource for Caulobacter studies. This site provides a user-friendly interface for quickly searching genes of interest and downloading genome-wide results. Search results about individual genes are displayed as tables, graphs of time resolved expression profiles, and schematics of protein localization throughout the cell cycle. In addition, the site provides a genome viewer that enables customizable visualization of all published high-throughput genomic data. The depth and diversity of data sets collected by the Caulobacter community makes CauloBrowser a unique and valuable systems biology resource. © The Author(s) 2015. Published by Oxford University Press on behalf of Nucleic Acids Research.


July 7, 2019

Tigecycline resistance in clinical isolates of Enterococcus faecium is mediated by an upregulation of plasmid-encoded tetracycline determinants tet(L) and tet(M).

Tigecycline represents one of the last-line therapeutics to combat multidrug-resistant bacterial pathogens, including VRE and MRSA. The German National Reference Centre for Staphylococci and Enterococci has received 73 tigecycline-resistant Enterococcus faecium and Enterococcus faecalis isolates in recent years. The precise mechanism of how enterococci become resistant to tigecycline remains undetermined. This study documents an analysis of the role of efflux pumps in tigecycline resistance in clinical isolates of Enterococcus spp. Various tigecycline MICs were found for the different isolates analysed. Tigecycline-resistant strains were analysed with respect to genome and transcriptome differences by means of WGS and RT-qPCR. Genes of interest were cloned and expressed in Listeria monocytogenes for verification of their functionality. Detailed comparative whole-genome analyses of three isogenic strains, showing different levels of tigecycline resistance, revealed the major facilitator superfamily (MFS) efflux pump TetL and the ribosomal protection protein TetM as possible drug resistance proteins. Subsequent RT-qPCR confirmed up-regulation of the respective genes. A correlation of gene copy number and level of MIC was inferred from further qPCR analyses. Expression of both tet(L) and tet(M) in L. monocytogenes unequivocally demonstrated the potential to increase tigecycline MICs upon acquisition of either locus. Our results indicate that increased expression of two tetracycline resistance determinants, a tet(L)-encoded MFS pump and a tet(M)-encoded ribosomal protection protein, is capable of conferring tigecycline resistance in enterococcal clinical isolates. © The Author 2015. Published by Oxford University Press on behalf of the British Society for Antimicrobial Chemotherapy. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.


July 7, 2019

The Vigna Genome Server, ‘VigGS’: A genomic knowledge base of the genus Vigna based on high-quality, annotated genome sequence of the azuki bean, Vigna angularis (Willd.) Ohwi & Ohashi.

The genus Vigna includes legume crops such as cowpea, mungbean and azuki bean, as well as >100 wild species. A number of the wild species are highly tolerant to severe environmental conditions including high-salinity, acid or alkaline soil; drought; flooding; and pests and diseases. These features of the genus Vigna make it a good target for investigation of genetic diversity in adaptation to stressful environments; however, a lack of genomic information has hindered such research in this genus. Here, we present a genome database of the genus Vigna, Vigna Genome Server (‘VigGS’, http://viggs.dna.affrc.go.jp), based on the recently sequenced azuki bean genome, which incorporates annotated exon-intron structures, along with evidence for transcripts and proteins, visualized in GBrowse. VigGS also facilitates user construction of multiple alignments between azuki bean genes and those of six related dicot species. In addition, the database displays sequence polymorphisms between azuki bean and its wild relatives and enables users to design primer sequences targeting any variant site. VigGS offers a simple keyword search in addition to sequence similarity searches using BLAST and BLAT. To incorporate up-to-date genomic information, VigGS automatically receives newly deposited mRNA sequences of pre-set species from the public database once a week. Users can refer to not only gene structures mapped on the azuki bean genome on GBrowse but also relevant literature of the genes. VigGS will contribute to genomic research into plant biotic and abiotic stresses and to the future development of new stress-tolerant crops. © The Author 2015. Published by Oxford University Press on behalf of Japanese Society of Plant Physiologists. All rights reserved. For permissions, please email: journals.permissions@oup.com.


July 7, 2019

N(6)-methyladenosine in mRNA disrupts tRNA selection and translation-elongation dynamics.

N(6)-methylation of adenosine (forming m(6)A) is the most abundant post-transcriptional modification within the coding region of mRNA, but its role during translation remains unknown. Here, we used bulk kinetic and single-molecule methods to probe the effect of m(6)A in mRNA decoding. Although m(6)A base-pairs with uridine during decoding, as shown by X-ray crystallographic analyses of Thermus thermophilus ribosomal complexes, our measurements in an Escherichia coli translation system revealed that m(6)A modification of mRNA acts as a barrier to tRNA accommodation and translation elongation. The interaction between an m(6)A-modified codon and cognate tRNA echoes the interaction between a near-cognate codon and tRNA, because delay in tRNA accommodation depends on the position and context of m(6)A within codons and on the accuracy level of translation. Overall, our results demonstrate that chemical modification of mRNA can change translational dynamics.


July 7, 2019

Multiple parallel pathways of translation initiation on the CrPV IRES.

The complexity of eukaryotic translation allows fine-tuned regulation of protein synthesis. Viruses use internal ribosome entry sites (IRESs) to minimize or, like the CrPV IRES, eliminate the need for initiation factors. Here, by exploiting the CrPV IRES, we observed the entire process of initiation and transition to elongation in real time. We directly tracked the CrPV IRES, 40S and 60S ribosomal subunits, and tRNA using single-molecule fluorescence spectroscopy and identified multiple parallel initiation pathways within the system. Our results distinguished two pathways of 80S:CrPV IRES complex assembly that produce elongation-competent complexes. Following 80S assembly, the requisite eEF2-mediated translocation results in an unstable intermediate that is captured by binding of the elongator tRNA. Whereas initiation can occur in the 0 and +1 frames, the arrival of the first tRNA defines the reading frame and strongly favors 0 frame initiation. Overall, even in the simplest system, an intricate reaction network regulates translation initiation. Copyright © 2016 Elsevier Inc. All rights reserved.


July 7, 2019

PEPR: pipelines for evaluating prokaryotic references.

The rapid adoption of microbial whole genome sequencing in public health, clinical testing, and forensic laboratories requires the use of validated measurement processes. Well-characterized, homogeneous, and stable microbial genomic reference materials can be used to evaluate measurement processes, improving confidence in microbial whole genome sequencing results. We have developed a reproducible and transparent bioinformatics tool, PEPR, Pipelines for Evaluating Prokaryotic References, for characterizing the reference genome of prokaryotic genomic materials. PEPR evaluates the quality, purity, and homogeneity of the reference material genome, and purity of the genomic material. The quality of the genome is evaluated using high coverage paired-end sequence data; coverage, paired-end read size and direction, as well as soft-clipping rates, are used to identify mis-assemblies. The homogeneity and purity of the material relative to the reference genome are characterized by comparing base calls from replicate datasets generated using multiple sequencing technologies. Genomic purity of the material is assessed by checking for DNA contaminants. We demonstrate the tool and its output using sequencing data while developing a Staphylococcus aureus candidate genomic reference material. PEPR is open source and available at https://github.com/usnistgov/pepr .


July 7, 2019

Microevolution of monophasic Salmonella Typhimurium during epidemic, United Kingdom, 2005-2010.

Microevolution associated with emergence and expansion of new epidemic clones of bacterial pathogens holds the key to epidemiologic success. To determine microevolution associated with monophasic Salmonella Typhimurium during an epidemic, we performed comparative whole-genome sequencing and phylogenomic analysis of isolates from the United Kingdom and Italy during 2005-2012. These isolates formed a single clade distinct from recent monophasic epidemic clones previously described from North America and Spain. The UK monophasic epidemic clones showed a novel genomic island encoding resistance to heavy metals and a composite transposon encoding antimicrobial drug resistance genes not present in other Salmonella Typhimurium isolates, which may have contributed to epidemiologic success. A remarkable amount of genotypic variation accumulated during clonal expansion that occurred during the epidemic, including multiple independent acquisitions of a novel prophage carrying the sopE gene and multiple deletion events affecting the phase II flagellin locus. This high level of microevolution may affect antigenicity, pathogenicity, and transmission.


July 7, 2019

SimLoRD: Simulation of Long Read Data.

Third generation sequencing methods provide longer reads than second generation methods and have distinct error characteristics. While there exist many read simulators for second generation data, there is a very limited choice for third generation data. We analyzed public data from Pacific Biosciences (PacBio) SMRT sequencing, developed an error model and implemented it in a new read simulator called SimLoRD. It offers options to choose the read length distribution and to model error probabilities depending on the number of passes through the sequencer. The new error model makes SimLoRD the most realistic SMRT read simulator available. SimLoRD is available open source at http://bitbucket.org/genomeinformatics/simlord/ and installable via Bioconda (http://bioconda.github.io). Contact: Bianca.Stoecker@uni-due.de or Sven.Rahmann@uni-due.de. Supplementary data are available at Bioinformatics online. © The Author 2016. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.


July 7, 2019

Minimap and miniasm: fast mapping and de novo assembly for noisy long sequences.

Single Molecule Real-Time (SMRT) sequencing technology and Oxford Nanopore technologies (ONT) produce reads over 10 kb in length, which have enabled high-quality genome assembly at an affordable cost. However, at present, long reads have an error rate as high as 10-15%. Complex and computationally intensive pipelines are required to assemble such reads. We present a new mapper, minimap, and a de novo assembler, miniasm, for efficiently mapping and assembling SMRT and ONT reads without an error correction stage. They can often assemble a sequencing run of bacterial data into a single contig in a few minutes, and assemble 45-fold Caenorhabditis elegans data in 9 min, orders of magnitude faster than the existing pipelines, though the consensus sequence error rate is as high as raw reads. We also introduce a pairwise read mapping format and a graphical fragment assembly format, and demonstrate the interoperability between ours and current tools. https://github.com/lh3/minimap and https://github.com/lh3/miniasm. Contact: hengli@broadinstitute.org. Supplementary data are available at Bioinformatics online. © The Author 2016. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.
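
The abstract above does not name its pairwise read mapping format. Assuming it refers to PAF, the tab-separated format documented in the minimap repository with twelve mandatory columns, a record could be read as in the sketch below. The class and function names are hypothetical, optional tag columns are ignored, and the matches-to-block-length ratio is only a rough indicator of alignment quality.

```python
from typing import NamedTuple


class PafRecord(NamedTuple):
    """The twelve mandatory PAF columns (assumed layout, see lead-in)."""
    query_name: str
    query_length: int
    query_start: int
    query_end: int
    strand: str
    target_name: str
    target_length: int
    target_start: int
    target_end: int
    matches: int
    block_length: int
    mapq: int


def parse_paf_line(line):
    """Parse one PAF line; trailing optional tags are dropped."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) < 12:
        raise ValueError("expected at least 12 tab-separated columns")
    converted = [
        value if i in (0, 4, 5) else int(value)  # names and strand stay text
        for i, value in enumerate(fields[:12])
    ]
    return PafRecord(*converted)


def match_fraction(record):
    """Matching residues divided by alignment block length."""
    return record.matches / record.block_length


if __name__ == "__main__":
    example = "read1\t1000\t10\t910\t+\tchr1\t50000\t2000\t2900\t810\t900\t60\n"
    rec = parse_paf_line(example)
    print(rec.query_name, rec.target_name, match_fraction(rec))
```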


July 7, 2019

Amino acid sequence repertoire of the bacterial proteome and the occurrence of untranslatable sequences.

Bioinformatic analysis of Escherichia coli proteomes revealed that all possible amino acid triplet sequences occur at their expected frequencies, with four exceptions. Two of the four underrepresented sequences (URSs) were shown to interfere with translation in vivo and in vitro. Enlarging the URS by a single amino acid resulted in increased translational inhibition. Single-molecule methods revealed stalling of translation at the entrance of the peptide exit tunnel of the ribosome, adjacent to ribosomal nucleotides A2062 and U2585. Interaction with these same ribosomal residues is involved in regulation of translation by longer, naturally occurring protein sequences. The E. coli exit tunnel has evidently evolved to minimize interaction with the exit tunnel and maximize the sequence diversity of the proteome, although allowing some interactions for regulatory purposes. Bioinformatic analysis of the human proteome revealed no underrepresented triplet sequences, possibly reflecting an absence of regulation by interaction with the exit tunnel.
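
As an illustration of the kind of comparison described above, the sketch below computes an observed-to-expected ratio for every amino acid triplet. The abstract does not state how expected frequencies were derived; here it is assumed, purely for illustration, that residues occur independently at their overall proteome frequencies. The function name and toy input are hypothetical.

```python
from collections import Counter
from itertools import product


def triplet_ratios(proteins):
    """Return observed/expected counts for every possible residue triplet.

    Assumption: the expected count of a triplet is the total number of
    triplets times the product of its residues' overall frequencies.
    """
    residues = Counter()
    triplets = Counter()
    for seq in proteins:
        residues.update(seq)
        triplets.update(seq[i:i + 3] for i in range(len(seq) - 2))

    total_residues = sum(residues.values())
    total_triplets = sum(triplets.values())
    if total_triplets == 0:
        return {}

    ratios = {}
    for combo in product(sorted(residues), repeat=3):
        expected = total_triplets
        for aa in combo:
            expected *= residues[aa] / total_residues
        tri = "".join(combo)
        ratios[tri] = triplets[tri] / expected  # 0.0 if never observed
    return ratios


if __name__ == "__main__":
    ratios = triplet_ratios(["ABAB"])
    # Triplets with a ratio far below 1 would be candidates for
    # underrepresented sequences.
    print(sorted(ratios.items(), key=lambda item: item[1])[:3])
```

On a real proteome the ratios would need a statistical test before any triplet is called underrepresented; the sketch only shows the counting step.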


July 7, 2019

Glutathione-S-transferase FosA6 of Klebsiella pneumoniae origin conferring fosfomycin resistance in ESBL-producing Escherichia coli.

The objectives of this study were to elucidate the genetic context of a novel plasmid-mediated fosA variant, fosA6, conferring fosfomycin resistance and to characterize the kinetic properties of FosA6. The genome of fosfomycin-resistant Escherichia coli strain YD786 was sequenced. Homologues of FosA6 were identified through BLAST searches. FosA6 and FosA(ST258) were purified and characterized using a steady-state kinetic approach. Inhibition of FosA activity was examined with sodium phosphonoformate. Plasmid-encoded glutathione-S-transferase (GST) FosA6 conferring high-level fosfomycin resistance was identified in a CTX-M-2-producing E. coli clinical strain at a US hospital. fosA6 was carried on a self-conjugative, 69 kb IncFII plasmid. The ΔlysR-fosA6-ΔyjiR_1 fragment, located between IS10R and ΔIS26, was nearly identical to those on the chromosomes of some Klebsiella pneumoniae strains (MGH78578, PMK1 and KPPR1). FosA6 shared >99% identity with chromosomally encoded FosA(PMK1) in K. pneumoniae of various STs and 98% identity with FosA(ST258), which is commonly found in K. pneumoniae clonal complex (CC) 258 including ST258. FosA6 and FosA(ST258) demonstrated robust GST activities that were comparable to each other. Sodium phosphonoformate, a GST inhibitor, reduced the fosfomycin MICs by 6- to 24-fold for K. pneumoniae and E. coli strains carrying fosA genes on the chromosomes and plasmids, respectively. fosA6, probably captured from the chromosome of K. pneumoniae, conferred high-level fosfomycin resistance in E. coli. FosA6 functioned as a GST and inactivated fosfomycin efficiently. K. pneumoniae may serve as a reservoir of fosfomycin resistance for E. coli. © The Author 2016. Published by Oxford University Press on behalf of the British Society for Antimicrobial Chemotherapy. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.


July 7, 2019

The novel 2016 WHO Neisseria gonorrhoeae reference strains for global quality assurance of laboratory investigations: phenotypic, genetic and reference genome characterization.

Gonorrhoea and MDR Neisseria gonorrhoeae remain public health concerns globally. Enhanced, quality-assured, gonococcal antimicrobial resistance (AMR) surveillance is essential worldwide. The WHO global Gonococcal Antimicrobial Surveillance Programme (GASP) was relaunched in 2009. We describe the phenotypic, genetic and reference genome characteristics of the 2016 WHO gonococcal reference strains intended for quality assurance in the WHO global GASP, other GASPs, diagnostics and research worldwide. The 2016 WHO reference strains (n = 14) constitute the eight 2008 WHO reference strains and six novel strains. The novel strains represent low-level to high-level cephalosporin resistance, high-level azithromycin resistance and a porA mutant. All strains were comprehensively characterized for antibiogram (n = 23), serovar, prolyliminopeptidase, plasmid types, molecular AMR determinants, N. gonorrhoeae multiantigen sequence typing STs and MLST STs. Complete reference genomes were produced using single-molecule PacBio sequencing. The reference strains represented all available phenotypes, susceptible and resistant, to antimicrobials previously and currently used or considered for future use in gonorrhoea treatment. All corresponding resistance genotypes and molecular epidemiological types were described. Fully characterized, annotated and finished reference genomes (n = 14) were presented. The 2016 WHO gonococcal reference strains are intended for internal and external quality assurance and quality control in laboratory investigations, particularly in the WHO global GASP and other GASPs, but also in phenotypic (e.g. culture, species determination) and molecular diagnostics, molecular AMR detection, molecular epidemiology and as fully characterized, annotated and finished reference genomes in WGS analysis, transcriptomics, proteomics and other molecular technologies and data analysis. © The Author 2016. Published by Oxford University Press on behalf of the British Society for Antimicrobial Chemotherapy. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.

