July 7, 2019

deBWT: parallel construction of Burrows-Wheeler Transform for large collection of genomes with de Bruijn-branch encoding.

With the development of high-throughput sequencing, the number of assembled genomes continues to rise. It is critical to organize and index these assembled genomes well to support future genomics studies. The Burrows-Wheeler Transform (BWT) is an important data structure for genome indexing with many fundamental applications; however, constructing the BWT for a large collection of genomes remains non-trivial, especially for highly similar or repetitive genomes. Moreover, state-of-the-art approaches do not support scalable parallel computing well owing to their incremental nature, which is a bottleneck when using modern computers to accelerate BWT construction. We propose the de Bruijn branch-based BWT constructor (deBWT), a novel parallel BWT construction approach. deBWT represents and organizes the suffixes of the input sequence with a novel data structure, de Bruijn branch encoding. This data structure takes advantage of the de Bruijn graph to facilitate comparison between suffixes with long common prefixes, which removes the bottleneck in BWT construction for repetitive genomic sequences. deBWT also uses the structure of the de Bruijn graph to reduce unnecessary comparisons between suffixes. Benchmarking suggests that deBWT is efficient and scalable for constructing the BWT of large datasets by parallel computing. It is well suited to indexing many genomes, such as a collection of individual human genomes, with multi-core servers or clusters. deBWT is implemented in C; the source code is available at https://github.com/hitbc/deBWT or https://github.com/DixianZhu/deBWT. Contact: ydwang@hit.edu.cn. Supplementary data are available at Bioinformatics online. © The Author 2016. Published by Oxford University Press.
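For readers unfamiliar with the transform itself, the following is a minimal sketch of a naive BWT built by sorting all rotations of a sentinel-terminated string. It is not the deBWT algorithm and does not use de Bruijn branch encoding; the function names `naive_bwt` and `inverse_bwt` and the `$` sentinel are assumptions made for illustration only. The sketch does show why suffix comparison dominates the cost: every sort comparison may scan a long shared prefix, which is the problem the abstract describes for repetitive genomes.

```python
def naive_bwt(text: str, sentinel: str = "$") -> str:
    """Naive BWT: sort all rotations of text + sentinel, keep the last column.

    Illustrative only; far too slow and memory-hungry for real genomes.
    Assumes the sentinel sorts before every character of the input.
    """
    if sentinel in text:
        raise ValueError("sentinel must not occur in the input")
    s = text + sentinel
    rotations = sorted(s[i:] + s[:i] for i in range(len(s)))
    return "".join(rotation[-1] for rotation in rotations)


def inverse_bwt(bwt: str, sentinel: str = "$") -> str:
    """Invert the transform by repeatedly prepending the BWT column and sorting."""
    table = [""] * len(bwt)
    for _ in range(len(bwt)):
        table = sorted(bwt[i] + table[i] for i in range(len(bwt)))
    # The original string is the row that ends with the sentinel.
    row = next(r for r in table if r.endswith(sentinel))
    return row[:-1]


if __name__ == "__main__":
    transformed = naive_bwt("banana")
    print(transformed)
    print(inverse_bwt(transformed))
```

The round trip through `inverse_bwt` is included only to show that the transform is reversible, which is what makes it usable as an index.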


July 7, 2019

Complete genome sequence of Bacillus subtilis BSD-2, a microbial germicide isolated from cultivated cotton.

Bacillus subtilis BSD-2, isolated from cotton (Gossypium spp.), showed strong antagonistic activity against Verticillium dahliae Kleb. and Botrytis cinerea. We sequenced and annotated the complete BSD-2 genome to support better use of this strain, which has surfactin, bacilysin, bacillibactin, subtilosin A, TasA and potential class IV lanthipeptide biosynthetic pathways. Copyright © 2016 Elsevier B.V. All rights reserved.


July 7, 2019

Glutathione-S-transferase FosA6 of Klebsiella pneumoniae origin conferring fosfomycin resistance in ESBL-producing Escherichia coli.

The objectives of this study were to elucidate the genetic context of a novel plasmid-mediated fosA variant, fosA6, conferring fosfomycin resistance and to characterize the kinetic properties of FosA6. The genome of fosfomycin-resistant Escherichia coli strain YD786 was sequenced. Homologues of FosA6 were identified through BLAST searches. FosA6 and FosA(ST258) were purified and characterized using a steady-state kinetic approach. Inhibition of FosA activity was examined with sodium phosphonoformate. Plasmid-encoded glutathione-S-transferase (GST) FosA6 conferring high-level fosfomycin resistance was identified in a CTX-M-2-producing E. coli clinical strain at a US hospital. fosA6 was carried on a self-conjugative, 69 kb IncFII plasmid. The ΔlysR-fosA6-ΔyjiR_1 fragment, located between IS10R and ΔIS26, was nearly identical to those on the chromosomes of some Klebsiella pneumoniae strains (MGH78578, PMK1 and KPPR1). FosA6 shared >99% identity with chromosomally encoded FosA(PMK1) in K. pneumoniae of various STs and 98% identity with FosA(ST258), which is commonly found in K. pneumoniae clonal complex (CC) 258 including ST258. FosA6 and FosA(ST258) demonstrated robust GST activities that were comparable to each other. Sodium phosphonoformate, a GST inhibitor, reduced the fosfomycin MICs by 6- to 24-fold for K. pneumoniae and E. coli strains carrying fosA genes on the chromosomes and plasmids, respectively. fosA6, probably captured from the chromosome of K. pneumoniae, conferred high-level fosfomycin resistance in E. coli. FosA6 functioned as a GST and inactivated fosfomycin efficiently. K. pneumoniae may serve as a reservoir of fosfomycin resistance for E. coli. © The Author 2016. Published by Oxford University Press on behalf of the British Society for Antimicrobial Chemotherapy. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.


July 7, 2019

The Solanum demissum R8 late blight resistance gene is an Sw-5 homologue that has been deployed worldwide in late blight resistant varieties.

The potato late blight resistance gene R8 has been cloned. R8 is found in five late blight resistant varieties deployed in three different continents. R8 recognises Avr8 and is homologous to the NB-LRR protein Sw-5 from tomato. The broad spectrum late blight resistance gene R8 from Solanum demissum was cloned based on a previously published coarse map position on the lower arm of chromosome IX. Fine mapping in a recombinant population and bacterial artificial chromosome (BAC) library screening resulted in a BAC contig spanning 170 kb of the R8 haplotype. Sequencing revealed a cluster of at least ten R gene analogues (RGAs). The seven RGAs in the genetic window were subcloned for complementation analysis. Only one RGA provided late blight resistance and caused recognition of Avr8. From these results, it was concluded that the newly cloned resistance gene was indeed R8. R8 encodes a typical intracellular immune receptor with an N-terminal coiled coil, a central nucleotide binding site and 13 C-terminal leucine rich repeats. Phylogenetic analysis of a set of representative Solanaceae R proteins shows that R8 resides in a clearly distinct clade together with the Sw-5 tospovirus R protein from tomato. It was found that the R8 gene is present in late blight resistant potato varieties from Europe (Sarpo Mira), USA (Jacqueline Lee, Missaukee) and China (PB-06, S-60). Indeed, when tested under field conditions, R8 transgenic potato plants showed broad spectrum resistance to the current late blight population in the Netherlands, similar to Sarpo Mira.


July 7, 2019

Long single-molecule reads can resolve the complexity of the influenza virus composed of rare, closely related mutant variants

As a result of a high rate of mutations and recombination events, an RNA virus exists as a heterogeneous “swarm” of mutant variants. The long read length offered by single-molecule sequencing technologies allows each mutant variant to be sequenced in a single pass. However, the high error rate limits the ability to reconstruct a heterogeneous viral population composed of rare, related mutant variants. In this paper, we present 2SNV, a method able to tolerate the high error rate of the single-molecule protocol and reconstruct mutant variants. 2SNV uses linkage between single nucleotide variations to efficiently distinguish them from read errors. To benchmark the sensitivity of 2SNV, we performed a single-molecule sequencing experiment on a sample containing a titrated level of known viral mutant variants. Our method is able to accurately reconstruct a clone with a frequency of 0.2% and distinguish clones that differ in only two nucleotides distantly located on the genome. 2SNV outperforms existing methods for full-length viral mutant reconstruction. The open source implementation of 2SNV is freely available for download at http://alan.cs.gsu.edu/NGS/?q=content/2snv.


July 7, 2019

1,135 genomes reveal the global pattern of polymorphism in Arabidopsis thaliana.

Arabidopsis thaliana serves as a model organism for the study of fundamental physiological, cellular, and molecular processes. It has also greatly advanced our understanding of intraspecific genome variation. We present a detailed map of variation in 1,135 high-quality re-sequenced natural inbred lines representing the native Eurasian and North African range and recently colonized North America. We identify relict populations that continue to inhabit ancestral habitats, primarily in the Iberian Peninsula. They have mixed with a lineage that has spread to northern latitudes from an unknown glacial refugium and is now found in a much broader spectrum of habitats. Insights into the history of the species and the fine-scale distribution of genetic diversity provide the basis for full exploitation of A. thaliana natural variation through integration of genomes and epigenomes with molecular and non-molecular phenotypes. Copyright © 2016 The Author(s). Published by Elsevier Inc. All rights reserved.


July 7, 2019

Structural variation detection using next-generation sequencing data: A comparative technical review.

Structural variations (SVs) are mutations in the genome of at least fifty nucleotides in size. They contribute to the phenotypic differences among healthy individuals and cause severe diseases, including cancers, by breaking or linking genes. Thus, it is crucial to systematically profile SVs in the genome. In the past decade, many next-generation sequencing (NGS)-based SV detection methods have been proposed owing to the significant cost reduction of NGS experiments and their ability to detect SVs without bias at base-pair resolution. These SV detection methods vary in both sensitivity and specificity, since they use different SV-property-dependent and library-property-dependent features. As a result, predictions from different SV callers are often inconsistent. In addition, noise in the data (both platform-specific sequencing errors and artificial chimeric reads) impedes the specificity of SV detection. Poorly characterized regions in the human genome (e.g., repeat regions) greatly affect read mapping and in turn SV calling accuracy. Calling complex SVs requires specialized SV callers. Apart from accuracy, the processing speed of an SV caller is another factor that determines its usability. Knowing the pros and cons of different SV calling techniques and the objectives of the biological study is essential for biologists and bioinformaticians to make informed decisions. This paper describes the different components of the SV calling pipeline and reviews the techniques used by existing SV callers. Through a simulation study, we also demonstrate that library properties, especially insert size, greatly affect the sensitivity of different SV callers. We hope the community can benefit from this work both in designing new SV calling methods and in selecting the appropriate SV caller for specific biological studies. Copyright © 2016 Elsevier Inc. All rights reserved.
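To make the point about insert size concrete, here is a minimal sketch of one common read-pair heuristic: a pair is flagged as discordant when its observed insert size deviates from the library mean by more than k standard deviations. This is an illustrative assumption, not a method taken from the review; the function name `is_discordant`, the threshold `k=3.0`, and the library statistics in the example are all hypothetical.

```python
def is_discordant(insert_size: float, lib_mean: float, lib_sd: float,
                  k: float = 3.0) -> bool:
    """Flag a read pair whose insert size lies more than k SDs from the library mean.

    lib_mean and lib_sd are assumed to be estimated elsewhere from
    concordantly mapped pairs; k = 3.0 is an arbitrary illustrative cutoff.
    """
    return abs(insert_size - lib_mean) > k * lib_sd


if __name__ == "__main__":
    # Hypothetical example: a 400 bp deletion stretches a 500 bp pair to 900 bp.
    observed = 900
    # Tight library (SD 50): the deviation of 400 exceeds 3 * 50 = 150.
    print(is_discordant(observed, lib_mean=500, lib_sd=50))
    # Broad library (SD 150): the deviation of 400 is below 3 * 150 = 450.
    print(is_discordant(observed, lib_mean=500, lib_sd=150))
```

Under these assumptions, the same event is flagged in the tight library and missed in the broad one, which is one simple way library properties can change a caller's sensitivity.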


July 7, 2019

Normocyte-binding protein required for human erythrocyte invasion by the zoonotic malaria parasite Plasmodium knowlesi.

The dominant cause of malaria in Malaysia is now Plasmodium knowlesi, a zoonotic parasite of cynomolgus macaque monkeys found throughout South East Asia. Comparative genomic analysis of parasites adapted to in vitro growth in either cynomolgus or human RBCs identified a genomic deletion that includes the gene encoding normocyte-binding protein Xa (NBPXa) in parasites growing in cynomolgus RBCs but not in human RBCs. Experimental deletion of the NBPXa gene in parasites adapted to growth in human RBCs (which retain the ability to grow in cynomolgus RBCs) restricted them to cynomolgus RBCs, demonstrating that this gene is selectively required for parasite multiplication and growth in human RBCs. NBPXa-null parasites could bind to human RBCs, but invasion of these cells was severely impaired. Therefore, NBPXa is identified as a key mediator of P. knowlesi human infection and may be a target for vaccine development against this emerging pathogen.


July 7, 2019

Complete genome sequence of Mycobacterium chelonae type strain CCUG 47445, a rapidly growing species of nontuberculous mycobacteria.

Mycobacterium chelonae strains are ubiquitous rapidly growing mycobacteria associated with skin and soft tissue infections, cellulitis, abscesses, osteomyelitis, catheter infections, disseminated diseases, and postsurgical infections after implants with prostheses, transplants, and even hemodialysis procedures. Here, we report the complete genome sequence of M. chelonae type strain CCUG 47445. Copyright © 2016 Jaén-Luchoro et al.


July 7, 2019

Complete genome and plasmid sequences for Rhodococcus fascians D188 and draft sequences for Rhodococcus isolates PBTS 1 and PBTS 2.

Rhodococcus fascians, a phytopathogen that alters plant development, inflicts significant losses in plant production around the world. We report here the complete genome sequence of R. fascians D188, a well-characterized model isolate, and Rhodococcus species PBTS (pistachio bushy top syndrome) 1 and 2, which were shown to be responsible for a disease outbreak in pistachios. Copyright © 2016 Stamler et al.


July 7, 2019

Complete genome sequence of Mesorhizobium ciceri bv. biserrulae strain WSM1284, an efficient nitrogen-fixing microsymbiont of the pasture legume Biserrula pelecinus.

We report the complete genome sequence of Mesorhizobium ciceri bv. biserrulae strain WSM1284, a nitrogen-fixing microsymbiont of the pasture legume Biserrula pelecinus. The genome consists of 6.88 Mb distributed between a single chromosome (6.33 Mb) and a single plasmid (0.55 Mb). Copyright © 2016 Haskett et al.

