July 7, 2019

Genomic dark matter illuminated: Anopheles Y chromosomes.

Hall et al. have strategically used long-read sequencing technology to characterize the structure and highly repetitive content of the Y chromosome in Anopheles malaria mosquitoes. Their work confirms that this important but elusive heterochromatic sex chromosome is evolving extremely rapidly and harbors a remarkably small number of genes. Copyright © 2016 Elsevier Ltd. All rights reserved.


July 7, 2019

deBWT: parallel construction of Burrows-Wheeler Transform for large collection of genomes with de Bruijn-branch encoding.

With the development of high-throughput sequencing, the number of assembled genomes continues to rise. It is critical to organize and index these assembled genomes well to promote future genomics studies. The Burrows-Wheeler Transform (BWT) is an important data structure for genome indexing and has many fundamental applications; however, it is still non-trivial to construct the BWT for large collections of genomes, especially for highly similar or repetitive genomes. Moreover, the state-of-the-art approaches cannot support scalable parallel computing well owing to their incremental nature, which is a bottleneck when using modern computers to accelerate BWT construction. We propose de Bruijn branch-based BWT constructor (deBWT), a novel parallel BWT construction approach. deBWT represents and organizes the suffixes of the input sequence with a novel data structure, de Bruijn branch encoding. This data structure takes advantage of the de Bruijn graph to facilitate the comparison between suffixes with long common prefixes, which breaks the bottleneck of BWT construction for repetitive genomic sequences. Meanwhile, deBWT also uses the structure of the de Bruijn graph to reduce unnecessary comparisons between suffixes. Benchmarking suggests that deBWT is efficient and scalable for constructing the BWT of large datasets by parallel computing. It is well suited to indexing many genomes, such as a collection of individual human genomes, with multiple-core servers or clusters. deBWT is implemented in C; the source code is available at https://github.com/hitbc/deBWT or https://github.com/DixianZhu/deBWT. Contact: ydwang@hit.edu.cn. Supplementary data are available at Bioinformatics online. © The Author 2016. Published by Oxford University Press.
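For readers unfamiliar with the transform itself, the following is a minimal sketch of a BWT built by sorting all rotations of a short string. It is a textbook illustration only, not the deBWT algorithm: the function name `bwt` and the `$` sentinel are assumptions made for this example, and the naive sort compares suffixes character by character, which is exactly the cost that becomes prohibitive when suffixes share long common prefixes, as in repetitive genomes.

```python
def bwt(text: str, sentinel: str = "$") -> str:
    """Naive Burrows-Wheeler Transform via sorted rotations.

    Illustrative only: quadratic memory, unsuitable for real genomes.
    The sentinel is assumed to sort before every character in `text`.
    """
    if sentinel in text:
        raise ValueError("sentinel must not occur in the input text")
    s = text + sentinel
    # Build every rotation of the sentinel-terminated string and sort them.
    rotations = sorted(s[i:] + s[:i] for i in range(len(s)))
    # The BWT is the last column of the sorted rotation matrix.
    return "".join(rotation[-1] for rotation in rotations)


if __name__ == "__main__":
    print(bwt("banana"))  # annb$aa
```

Equal characters tend to cluster in the output, which is what makes the BWT useful for compressed genome indexes.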


July 7, 2019

Complete genome sequence of Bacillus subtilis BSD-2, a microbial germicide isolated from cultivated cotton.

Bacillus subtilis BSD-2, isolated from cotton (Gossypium spp.), had strong antagonistic activity against Verticillium dahliae Kleb. and Botrytis cinerea. We sequenced and annotated the complete BSD-2 genome to support better use of this strain, which has biosynthetic pathways for surfactin, bacilysin, bacillibactin, subtilosin A, TasA and a potential class IV lanthipeptide. Copyright © 2016 Elsevier B.V. All rights reserved.


July 7, 2019

Long single-molecule reads can resolve the complexity of the influenza virus composed of rare, closely related mutant variants

As a result of a high rate of mutations and recombination events, an RNA virus exists as a heterogeneous “swarm” of mutant variants. The long read length offered by single-molecule sequencing technologies allows each mutant variant to be sequenced in a single pass. However, the high error rate limits the ability to reconstruct a heterogeneous viral population composed of rare, related mutant variants. In this paper, we present 2SNV, a method able to tolerate the high error rate of the single-molecule protocol and reconstruct mutant variants. 2SNV uses linkage between single nucleotide variations to efficiently distinguish them from read errors. To benchmark the sensitivity of 2SNV, we performed a single-molecule sequencing experiment on a sample containing a titrated level of known viral mutant variants. Our method is able to accurately reconstruct clones with a frequency of 0.2% and distinguish clones that differ in only two nucleotides distantly located on the genome. 2SNV outperforms existing methods for full-length viral mutant reconstruction. The open source implementation of 2SNV is freely available for download at http://alan.cs.gsu.edu/NGS/?q=content/2snv.


July 7, 2019

Structural variation detection using next-generation sequencing data: A comparative technical review.

Structural variations (SVs) are mutations in the genome of size at least fifty nucleotides. They contribute to the phenotypic differences among healthy individuals, and cause severe diseases and even cancers by breaking or linking genes. Thus, it is crucial to systematically profile SVs in the genome. In the past decade, many next-generation sequencing (NGS)-based SV detection methods have been proposed owing to the significant cost reduction of NGS experiments and their ability to detect SVs without bias at base-pair resolution. These SV detection methods vary in both sensitivity and specificity, since they use different SV-property-dependent and library-property-dependent features. As a result, predictions from different SV callers are often inconsistent. In addition, noise in the data (both platform-specific sequencing errors and artificial chimeric reads) impedes the specificity of SV detection. Poorly characterized regions in the human genome (e.g., repeat regions) greatly impact read mapping and in turn affect SV calling accuracy. Calling complex SVs requires specialized SV callers. Apart from accuracy, the processing speed of an SV caller is another factor deciding its usability. Knowing the pros and cons of different SV calling techniques and the objectives of the biological study is essential for biologists and bioinformaticians to make informed decisions. This paper describes different components in the SV calling pipeline and reviews the techniques used by existing SV callers. Through a simulation study, we also demonstrate that library properties, especially insert size, greatly impact the sensitivity of different SV callers. We hope the community can benefit from this work both in designing new SV calling methods and in selecting the appropriate SV caller for specific biological studies. Copyright © 2016 Elsevier Inc. All rights reserved.
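To make the point about insert size concrete, here is a small sketch of the read-pair idea used by many paired-end SV callers: a pair is flagged as discordant when its mapped distance deviates from the library's mean insert size by more than a few standard deviations. This is not taken from the review; the function name `is_discordant`, the cutoff `k = 3`, and the library numbers are all hypothetical values chosen for illustration.

```python
def is_discordant(observed: int, mean: float, sd: float, k: float = 3.0) -> bool:
    """Flag a read pair whose mapped insert size is an outlier.

    Assumed rule (illustrative): discordant if |observed - mean| > k * sd.
    """
    return abs(observed - mean) > k * sd


if __name__ == "__main__":
    # Hypothetical scenario: a 300 bp deletion makes a pair from a 500 bp
    # library map 800 bp apart on the reference.
    deletion_pair = 500 + 300

    # Tight library (sd = 30): cutoff is 90 bp, so the pair is flagged.
    print(is_discordant(deletion_pair, mean=500, sd=30))   # True

    # Broad library (sd = 150): cutoff is 450 bp, so the same pair is missed.
    print(is_discordant(deletion_pair, mean=500, sd=150))  # False
```

Under these assumed numbers, the same deletion is visible in the tight library and invisible in the broad one, which is one simple way library properties can change a caller's sensitivity.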


July 7, 2019

Normocyte-binding protein required for human erythrocyte invasion by the zoonotic malaria parasite Plasmodium knowlesi.

The dominant cause of malaria in Malaysia is now Plasmodium knowlesi, a zoonotic parasite of cynomolgus macaque monkeys found throughout South East Asia. Comparative genomic analysis of parasites adapted to in vitro growth in either cynomolgus or human RBCs identified a genomic deletion that includes the gene encoding normocyte-binding protein Xa (NBPXa) in parasites growing in cynomolgus RBCs but not in human RBCs. Experimental deletion of the NBPXa gene in parasites adapted to growth in human RBCs (which retain the ability to grow in cynomolgus RBCs) restricted them to cynomolgus RBCs, demonstrating that this gene is selectively required for parasite multiplication and growth in human RBCs. NBPXa-null parasites could bind to human RBCs, but invasion of these cells was severely impaired. Therefore, NBPXa is identified as a key mediator of P. knowlesi human infection and may be a target for vaccine development against this emerging pathogen.


July 7, 2019

Complete genome sequence of Mycobacterium chelonae type strain CCUG 47445, a rapidly growing species of nontuberculous mycobacteria.

Mycobacterium chelonae strains are ubiquitous rapidly growing mycobacteria associated with skin and soft tissue infections, cellulitis, abscesses, osteomyelitis, catheter infections, disseminated diseases, and postsurgical infections after implants with prostheses, transplants, and even hemodialysis procedures. Here, we report the complete genome sequence of M. chelonae type strain CCUG 47445. Copyright © 2016 Jaén-Luchoro et al.


July 7, 2019

Complete genome and plasmid sequences for Rhodococcus fascians D188 and draft sequences for Rhodococcus isolates PBTS 1 and PBTS 2.

Rhodococcus fascians, a phytopathogen that alters plant development, inflicts significant losses in plant production around the world. We report here the complete genome sequence of R. fascians D188, a well-characterized model isolate, and Rhodococcus species PBTS (pistachio bushy top syndrome) 1 and 2, which were shown to be responsible for a disease outbreak in pistachios. Copyright © 2016 Stamler et al.


July 7, 2019

Whole-genome sequence of Hafnia alvei HUMV-5920, a human isolate.

A clinical isolate of Hafnia alvei (strain HUMV-5920) was obtained from a urine sample from an adult patient. We report here its complete genome assembly using PacBio single-molecule real-time (SMRT) sequencing, which resulted in a chromosome with 4.5 Mb and a circular contig of 87 kb. About 4,146 protein-coding genes are predicted from this assembly. Copyright © 2016 Lázaro-Díez et al.


July 7, 2019

Chromosome and plasmids of the tick-borne relapsing fever agent Borrelia hermsii.

The zoonotic pathogen Borrelia hermsii bears its multiple paralogous genes for variable antigens on several linear plasmids. Application of combined long-read and short-read next-generation sequencing provided complete sequences for antigen-encoding plasmids as well as other linear and circular plasmids and the linear chromosome of the genome. Copyright © 2016 Barbour.


July 7, 2019

The rubber tree genome shows expansion of gene family associated with rubber biosynthesis.

Hevea brasiliensis Muell. Arg, a member of the family Euphorbiaceae, is the sole natural resource exploited for commercial production of high-quality natural rubber. The properties of natural rubber latex are almost irreplaceable by synthetic counterparts for many industrial applications. A paucity of knowledge on the molecular mechanisms of rubber biosynthesis in high yield traits still persists. Here we report the comprehensive genome-wide analysis of the widely planted H. brasiliensis clone, RRIM 600. The genome was assembled based on ~155-fold combined coverage with Illumina and PacBio sequence data and has a total length of 1.55 Gb with 72.5% comprising repetitive DNA sequences. A total of 84,440 high-confidence protein-coding genes were predicted. Comparative genomic analysis revealed strong synteny between H. brasiliensis and other Euphorbiaceae genomes. Our data suggest that H. brasiliensis’s capacity to produce high levels of latex can be attributed to the expansion of rubber biosynthesis-related genes in its genome and the high expression of these genes in latex. Using cap analysis gene expression data, we illustrate the tissue-specific transcription profiles of rubber biosynthesis-related genes, revealing alternative means of transcriptional regulation. Our study adds to the understanding of H. brasiliensis biology and provides valuable genomic resources for future agronomic-related improvement of the rubber tree.


July 7, 2019

Draft genome sequences of two strains of Paenibacillus glucanolyticus with the ability to degrade lignocellulose.

Paenibacillus glucanolyticus 5162, a bacterium isolated from soil, and Paenibacillus glucanolyticus SLM1, a bacterium isolated from pulp mill waste, can utilize cellulose, hemicellulose and lignin as sole carbon sources for growth. These two strains of Paenibacillus glucanolyticus were sequenced using PacBio and Illumina MiSeq technologies. Copyright © 2016 Mathews et al.


July 7, 2019

Bacillus pumilus SAFR-032 genome revisited: sequence update and re-annotation.

Bacillus pumilus strain SAFR-032 is a non-pathogenic spore-forming bacterium exhibiting an anomalously high persistence in bactericidal environments. In its dormant state, it is capable of withstanding doses of ultraviolet (UV) radiation or hydrogen peroxide, which are lethal for the vast majority of microorganisms. This unusual resistance profile has made SAFR-032 a reference strain for studies of bacterial spore resistance. The complete genome sequence of B. pumilus SAFR-032 was published in 2007 early in the genomics era. Since then, the SAFR-032 strain has frequently been used as a source of genetic/genomic information that was regarded as representative of the entire B. pumilus species group. Recently, our ongoing studies of conservation of gene distribution patterns in the complete genomes of various B. pumilus strains revealed indications of misassembly in the B. pumilus SAFR-032 genome. Synteny-driven local genome resequencing confirmed that the original SAFR-032 sequence contained assembly errors associated with long sequence repeats. The genome sequence was corrected according to the new findings. In addition, a significantly improved annotation is now available. Gene orders were compared and portions of the genome arrangement were found to be similar in a wide spectrum of Bacillus strains.


July 7, 2019

Complete genome sequence of a multidrug-resistant Acinetobacter baumannii isolate obtained from a Mexican hospital (sequence type 422).

Acinetobacter baumannii has emerged as a dangerous nosocomial pathogen, particularly for severely ill patients in intensive care units and patients with hematologic malignancies. Here, we present the complete genome sequence of a multidrug-resistant A. baumannii isolate, recovered from a Mexican hospital and classified as sequence type 422 according to the multilocus sequence typing Pasteur scheme. Copyright © 2016 Castro-Jaimes et al.

