July 7, 2019

Meta-aligner: long-read alignment based on genome statistics.

Sequencing technologies are currently developing towards longer and noisier reads. Accurate alignment of these reads plays an important role in any downstream analysis. Similarly, reducing the overall cost of sequencing is related to the time consumption of the aligner. The tradeoff between accuracy and speed is the main challenge in designing long-read aligners. We propose Meta-aligner, which aligns long and very long reads to the reference genome very efficiently and accurately. Meta-aligner incorporates available short/long aligners as subcomponents and uses statistics from the reference genome to increase performance. Meta-aligner estimates statistics from reads and the reference genome automatically. Meta-aligner is implemented in C++ and runs on popular POSIX-like operating systems such as Linux. Meta-aligner achieves high recall rates and precision, especially for long reads and high error rates. It also improves alignment performance for PacBio long reads in comparison with traditional schemes.


July 7, 2019

Innovations and challenges in detecting long read overlaps: an evaluation of the state-of-the-art.

Identifying overlaps between error-prone long reads, specifically those from Oxford Nanopore Technologies (ONT) and Pacific Biosciences (PB), is essential for certain downstream applications, including error correction and de novo assembly. Though akin to the read-to-reference alignment problem, read-to-read overlap detection is a distinct problem that can benefit from specialized algorithms that perform efficiently and robustly on high error rate long reads. Here, we review the current state-of-the-art read-to-read overlap tools for error-prone long reads, including BLASR, DALIGNER, MHAP, GraphMap and Minimap. These specialized bioinformatics tools differ not just in their algorithmic designs and methodology, but also in their robustness of performance on a variety of datasets, time and memory efficiency and scalability. We highlight the algorithmic features of these tools, as well as their potential issues and biases when utilizing any particular method. To supplement our review of the algorithms, we benchmarked these tools, tracking their resource needs and computational performance, and assessed the specificity and precision of each. In the versions of the tools tested, we observed that Minimap is the most computationally efficient, specific and sensitive method on the ONT datasets tested; whereas GraphMap and DALIGNER are the most specific and sensitive methods on the tested PB datasets. The concepts surveyed may apply to future sequencing technologies, as scalability is becoming more relevant with increased sequencing throughput. Contact: cjustin@bcgsc.ca, ibirol@bcgsc.ca. Supplementary data are available at Bioinformatics online.
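As a rough illustration of how read-to-read overlap candidates can be found without full alignment, the sketch below compares two reads by their shared minimizers (the smallest k-mer in each window of consecutive k-mers). It is only a toy under stated assumptions: the function names and the parameters `k` and `w` are hypothetical, and it does not reproduce the implementation of Minimap or of any other tool named in the abstract.

```python
def minimizers(seq, k=5, w=4):
    """Return the set of minimizers of seq.

    A minimizer is the lexicographically smallest k-mer in each window of
    w consecutive k-mers. Hypothetical helper for illustration only.
    """
    kmers = [seq[i:i + k] for i in range(len(seq) - k + 1)]
    if not kmers:
        return set()  # sequence shorter than k
    n_windows = max(1, len(kmers) - w + 1)
    return {min(kmers[i:i + w]) for i in range(n_windows)}


def shared_minimizer_fraction(read_a, read_b, k=5, w=4):
    """Fraction of the smaller minimizer set that is shared by both reads."""
    ma, mb = minimizers(read_a, k, w), minimizers(read_b, k, w)
    if not ma or not mb:
        return 0.0
    return len(ma & mb) / min(len(ma), len(mb))


# Two reads drawn from the same made-up sequence, overlapping by 15 bases.
genome = "ACGTTGCATGCCGATAGCTTAGGCATCGATCGGATTACAGCTAGCTA"
read1, read2 = genome[0:30], genome[15:45]
score = shared_minimizer_fraction(read1, read2)
```

A pair of reads with a high shared fraction would then be passed to a slower, more exact overlap check; real tools additionally use minimizer positions and tolerate sequencing errors.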


July 7, 2019

ThermoAlign: a genome-aware primer design tool for tiled amplicon resequencing.

Isolating and sequencing specific regions in a genome is a cornerstone of molecular biology. This has been facilitated by computationally encoding the thermodynamics of DNA hybridization for automated design of hybridization and priming oligonucleotides. However, the repetitive composition of genomes challenges the identification of target-specific oligonucleotides, which limits genetics and genomics research on many species. Here, a tool called ThermoAlign was developed that ensures the design of target-specific primer pairs for DNA amplification. This is achieved by evaluating the thermodynamics of hybridization for full-length oligonucleotide-template alignments – thermoalignments – across the genome to identify primers predicted to bind specifically to the target site. For amplification-based resequencing of regions that cannot be amplified by a single primer pair, a directed graph analysis method is used to identify minimum amplicon tiling paths. Laboratory validation by standard and long-range polymerase chain reaction and amplicon resequencing with maize, one of the most repetitive genomes sequenced to date (~85% repeat content), demonstrated the specificity-by-design functionality of ThermoAlign. ThermoAlign is released under an open source license and bundled in a dependency-free container for wide distribution. It is anticipated that this tool will facilitate multiple applications in genetics and genomics and be useful in the workflow of high-throughput targeted resequencing studies.
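One possible reading of the "minimum amplicon tiling path" idea can be sketched as a shortest-path search on a directed graph. The model below is an assumption made for illustration, not ThermoAlign's actual method: amplicons are hypothetical `(start, end)` tuples, an edge links an amplicon to any amplicon that overlaps it by at least `min_overlap` bases and extends further right, and breadth-first search returns a path with the fewest amplicons.

```python
from collections import deque


def min_tiling_path(amplicons, target_start, target_end, min_overlap=1):
    """Return the fewest amplicons that tile [target_start, target_end].

    Hypothetical model: nodes are (start, end) amplicons; a directed edge
    a -> b exists when b overlaps a by at least min_overlap bases and ends
    further right. Returns None when no tiling path exists.
    """
    # Start from every amplicon that covers the first target base.
    starts = [a for a in amplicons if a[0] <= target_start < a[1]]
    queue = deque((a, [a]) for a in starts)
    seen = set(starts)
    while queue:
        current, path = queue.popleft()
        if current[1] >= target_end:
            return path  # BFS guarantees the fewest amplicons
        for nxt in amplicons:
            overlaps = nxt[0] <= current[1] - min_overlap
            extends = nxt[1] > current[1]
            if nxt not in seen and overlaps and extends:
                seen.add(nxt)
                queue.append((nxt, path + [nxt]))
    return None


# Made-up candidate amplicons over a 1,600-base target region.
candidates = [(0, 500), (400, 900), (450, 1200), (850, 1300), (1100, 1600)]
path = min_tiling_path(candidates, 0, 1600)
```

With these made-up coordinates the search selects three amplicons rather than four. A real design step would presumably also weight edges by primer quality, which this sketch ignores.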


July 7, 2019

Brucella spp. of amphibians comprise genomically diverse motile strains competent for replication in macrophages and survival in mammalian hosts.

Twenty-one small Gram-negative motile coccobacilli were isolated from 15 systemically diseased African bullfrogs (Pyxicephalus edulis), and were initially identified as Ochrobactrum anthropi by standard microbiological identification systems. Phylogenetic reconstructions using combined molecular analyses and comparative whole genome analysis of the most diverse of the bullfrog strains verified affiliation with the genus Brucella and placed the isolates in a cluster containing B. inopinata and the other non-classical Brucella species but also revealed significant genetic differences within the group. Four representative but molecularly and phenotypically diverse strains were used for in vitro and in vivo infection experiments. All readily multiplied in macrophage-like murine J774-cells, and their overall intramacrophagic growth rate was comparable to that of B. inopinata BO1 and slightly higher than that of B. microti CCM 4915. In the BALB/c murine model of infection these strains replicated in both spleen and liver, but were less efficient than B. suis 1330. Some strains survived in the mammalian host for up to 12 weeks. The heterogeneity of these novel strains hampers a single species description but their phenotypic and genetic features suggest that they represent an evolutionary link between a soil-associated ancestor and the mammalian host-adapted pathogenic Brucella species.


July 7, 2019

Genomic innovation for crop improvement.

Crop production needs to increase to secure future food supplies, while reducing its impact on ecosystems. Detailed characterization of plant genomes and genetic diversity is crucial for meeting these challenges. Advances in genome sequencing and assembly are being used to access the large and complex genomes of crops and their wild relatives. These have helped to identify a wide spectrum of genetic variation and permitted the association of genetic diversity with diverse agronomic phenotypes. In combination with improved and automated phenotyping assays and functional genomic studies, genomics is providing new foundations for crop-breeding systems.


July 7, 2019

Complete genome sequence of Stenotrophomonas sp. KACC 91585, an efficient bacterium for unsaturated fatty acid hydration.

Hydroxy fatty acids (HFAs) such as 10-hydroxystearic acid (10-HSA) and 10-hydroxy-12(Z)-octadecenoic acid (10-HOD), which are similar to ricinoleic acid, are important starting materials and intermediates for the industrial manufacture of many commodities. Stenotrophomonas sp. KACC 91585, which was isolated from lake sediment, is an efficient bacterium for unsaturated fatty acid hydration that produces 10-HSA and 10-HOD from oleic acid and linoleic acid, respectively, with high conversion rates. The complete genome of this strain is 4,541,729 bp with 63.83% GC content and is devoid of plasmids. Sets of genes involved in fatty acid biosynthesis and modification, as well as in modified lipids, were identified in the genome, and these genes are associated with HFA production. This genome sequence provides molecular information that helps to elucidate HFA production, and the strain will be used as an efficient biocatalyst source for the biotechnological production of HFA. Copyright © 2016 Elsevier B.V. All rights reserved.


July 7, 2019

The histidine decarboxylase gene cluster of Lactobacillus parabuchneri was gained by horizontal gene transfer and is mobile within the species.

Histamine in food can cause intolerance reactions in consumers. Lactobacillus parabuchneri (L. parabuchneri) is one of the major causes of elevated histamine levels in cheese. Despite its significant economic impact and negative influence on human health, no genomic study has been published so far. We sequenced and analyzed 18 L. parabuchneri strains of which 12 were histamine positive and 6 were histamine negative. We determined the complete genome of the histamine positive strain FAM21731 with PacBio as well as Illumina and the genomes of the remaining 17 strains using the Illumina technology. We developed the synteny aware ortholog finding algorithm SynOrf to compare the genomes and we show that the histidine decarboxylase (HDC) gene cluster is located in a genomic island. It is very likely that the HDC gene cluster was transferred from other lactobacilli, as it is highly conserved within several lactobacilli species. Furthermore, we have evidence that the HDC gene cluster was transferred within the L. parabuchneri species.


July 7, 2019

Genomic changes associated with the evolutionary transition of an insect gut symbiont into a blood-borne pathogen.

The genus Bartonella comprises facultative intracellular bacteria with a unique lifestyle. After transmission by blood-sucking arthropods they colonize the erythrocytes of mammalian hosts causing acute and chronic infectious diseases. Although the pathogen-host interaction is well understood, little is known about the evolutionary origin of the infection strategy manifested by Bartonella species. Here we analyzed six genomes of Bartonella apis, a honey bee gut symbiont that to date represents the closest relative of pathogenic Bartonella species. Comparative genomics revealed that B. apis encodes a large set of vertically inherited genes for amino acid and cofactor biosynthesis and nitrogen metabolism. Most pathogenic bartonellae have lost these ancestral functions, but acquired specific virulence factors and expanded a vertically inherited gene family for harvesting cofactors from the blood. However, the deeply rooted pathogen Bartonella tamiae has retained many of the ancestral genome characteristics reflecting an evolutionary intermediate state toward a host-restricted intraerythrocytic lifestyle. Our findings suggest that the ancestor of the pathogen Bartonella was a gut symbiont of insects and that the adaptation to blood-feeding insects facilitated colonization of the mammalian bloodstream. This study highlights the importance of comparative genomics among pathogens and non-pathogenic relatives to understand disease emergence within an evolutionary-ecological framework.


July 7, 2019

An improved genome assembly uncovers prolific tandem repeats in Atlantic cod.

The first Atlantic cod (Gadus morhua) genome assembly published in 2011 was one of the early genome assemblies exclusively based on high-throughput 454 pyrosequencing. Since then, rapid advances in sequencing technologies have led to a multitude of assemblies generated for complex genomes, although many of these are of a fragmented nature with a significant fraction of bases in gaps. The development of long-read sequencing and improved software now enables the generation of more contiguous genome assemblies. By combining data from Illumina, 454 and the longer PacBio sequencing technologies, as well as integrating the results of multiple assembly programs, we have created a substantially improved version of the Atlantic cod genome assembly. The sequence contiguity of this assembly is increased fifty-fold and the proportion of gap-bases has been reduced fifteen-fold. Compared to other vertebrates, the assembly contains an unusually high density of tandem repeats (TRs). Indeed, retrospective analyses reveal that gaps in the first genome assembly were largely associated with these TRs. We show that 21% of the TRs across the assembly, 19% in the promoter regions and 12% in the coding sequences are heterozygous in the sequenced individual. The inclusion of PacBio reads combined with the use of multiple assembly programs drastically improved the Atlantic cod genome assembly by successfully resolving long TRs. The high frequency of heterozygous TRs within or in the vicinity of genes in the genome indicates considerable standing genomic variation in Atlantic cod populations, which is likely of evolutionary importance.


July 7, 2019

Solid-state fermentative production of aroma esters by Myroides sp. ZB35 and its complete genome sequence.

Consumers prefer biotechnological food products with high nutritional values and good flavors. Solid-state fermentation is a commonly used technique with a long history. In the present study, Myroides sp. ZB35 was used in solid-state fermentative production of aroma volatiles on a rice medium. Using headspace solid phase microextraction coupled with gas chromatography-mass spectrometry and authentic standards, 22 esters with molecular weights ranging from 102 to 172 were identified. At 192 h, the esters reached a total concentration of 1774 µg/kg. Subsequently, the complete genome of ZB35 was sequenced using the PacBio RS II platform. ZB35 has a single circular chromosome of 4,065,010 bp with a GC content of 34.1%, and six putative novel esterase genes were found. ZB35 is the first bacterium found to be capable of producing so many kinds of aroma esters. The data reported here provide helpful information for further developing this strain as a promising source of aroma esters relevant to the food and fragrance industries and as a source of novel enzymes with potential uses. Copyright © 2017 Elsevier B.V. All rights reserved.


July 7, 2019

The hidden perils of read mapping as a quality assessment tool in genome sequencing.

This article provides a comparative analysis of the various methods of genome sequencing focusing on verification of the assembly quality. The results of a comparative assessment of various de novo assembly tools, as well as sequencing technologies, are presented using a recently completed sequence of the genome of Lactobacillus fermentum 3872. In particular, quality of assemblies is assessed by using CLC Genomics Workbench read mapping and Optical mapping developed by OpGen. Over-extension of contigs without prior knowledge of contig location can lead to misassembled contigs, even when commonly used quality indicators such as read mapping suggest that a contig is well assembled. Precautions must also be undertaken when using long read sequencing technology, which may also lead to misassembled contigs.


July 7, 2019

Genomic analysis of ST88 community-acquired methicillin resistant Staphylococcus aureus in Ghana.

The emergence and evolution of community-acquired methicillin resistant Staphylococcus aureus (CA-MRSA) strains in Africa is poorly understood. However, one particular MRSA lineage, called ST88, appears to be rapidly establishing itself as an “African” CA-MRSA clone. In this study, we employed whole genome sequencing to provide more information on the genetic background of ST88 CA-MRSA isolates from Ghana and to describe in detail ST88 CA-MRSA isolates in comparison with other MRSA lineages worldwide. We first established a complete ST88 reference genome (AUS0325) using PacBio SMRT sequencing. We then used comparative genomics to assess relatedness among 17 ST88 CA-MRSA isolates recovered from patients attending Buruli ulcer treatment centres in Ghana, three non-African ST88s and 15 other MRSA lineages. We show that Ghanaian ST88 forms a discrete MRSA lineage (harbouring SCCmec-IV [2B]). Gene content analysis identified five distinct genomic regions enriched among ST88 isolates compared with the other S. aureus lineages. The Ghanaian ST88 isolates had only 658 core genome SNPs and there was no correlation between phylogeny and geography, suggesting the recent spread of this clone. The lineage was also resistant to multiple classes of antibiotics including β-lactams, tetracycline and chloramphenicol. This study reveals that S. aureus ST88-IV is a recently emerging and rapidly spreading CA-MRSA clone in Ghana. The study highlights the capacity of small snapshot genomic studies to provide actionable public health information in resource limited settings. To our knowledge this is the first genomic assessment of the ST88 CA-MRSA clone.


July 7, 2019

AidP, a novel N-Acyl homoserine lactonase gene from Antarctic Planococcus sp.

Planococcus is a Gram-positive halotolerant bacterial genus in the phylum Firmicutes, commonly found in various habitats in Antarctica. Quorum quenching (QQ) is the disruption of bacterial cell-to-cell communication (known as quorum sensing), which has previously been described in mesophilic bacteria. This study demonstrated the QQ activity of a psychrotolerant strain, Planococcus versutus strain L10.15(T), isolated from a soil sample obtained near an elephant seal wallow in Antarctica. Whole genome analysis of this bacterial strain revealed the presence of an N-acyl homoserine lactonase, an enzyme that hydrolyzes the ester bond of the homoserine lactone ring of N-acyl homoserine lactones (AHLs). Heterologous gene expression in E. coli confirmed its function in the hydrolysis of AHLs, and the gene was designated aidP (autoinducer degrading gene from Planococcus sp.). The low temperature activity of this enzyme suggested that it belongs to a novel and uncharacterized class of AHL lactonase. This study is the first report on QQ activity of bacteria isolated from the polar regions.


July 7, 2019

Efficient CNV breakpoint analysis reveals unexpected structural complexity and correlation of dosage-sensitive genes with clinical severity in genomic disorders.

Genomic disorders are the clinical conditions manifested by submicroscopic genomic rearrangements including copy number variants (CNVs). The CNVs can be identified by array-based comparative genomic hybridization (aCGH), the most commonly used technology for molecular diagnostics of genomic disorders. However, clinical aCGH only informs CNVs in the probe-interrogated regions. Neither orientational information nor the resulting genomic rearrangement structure is provided, which is a key to uncovering mutational and pathogenic mechanisms underlying genomic disorders. Long-range polymerase chain reaction (PCR) is a traditional approach to obtain CNV breakpoint junction, but this method is inefficient when challenged by structural complexity such as often found at the PLP1 locus in association with Pelizaeus-Merzbacher disease (PMD). Here we introduced ‘capture and single-molecule real-time sequencing’ (cap-SMRT-seq) and newly developed ‘asymmetry linker-mediated nested PCR walking’ (ALN-walking) for CNV breakpoint sequencing in 49 subjects with PMD-associated CNVs. Remarkably, 29 (94%) of the 31 CNV breakpoint junctions unobtainable by conventional long-range PCR were resolved by cap-SMRT-seq and ALN-walking. Notably, unexpected CNV complexities, including inter-chromosomal rearrangements that cannot be resolved by aCGH, were revealed by efficient breakpoint sequencing. These sequence-based structures of PMD-associated CNVs further support the role of DNA replicative mechanisms in CNV mutagenesis, and facilitate genotype-phenotype correlation studies. Intriguingly, the lengths of gained segments by CNVs are strongly correlated with clinical severity in PMD, potentially reflecting the functional contribution of other dosage-sensitive genes besides PLP1. Our study provides new efficient experimental approaches (especially ALN-walking) for CNV breakpoint sequencing and highlights their importance in uncovering CNV mutagenesis and pathogenesis in genomic disorders. © The Author 2017. Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions@oup.com.


July 7, 2019

A spontaneous mutation in kdsD, a biosynthesis gene for 3 Deoxy-D-manno-Octulosonic Acid, occurred in a ciprofloxacin resistant strain of Francisella tularensis and caused a high level of attenuation in murine models of tularemia.

Francisella tularensis, a Gram-negative facultative intracellular bacterial pathogen, is the causative agent of tularemia and is able to infect many mammalian species, including humans. Because of its ability to cause a lethal infection, low infectious dose, and aerosolizable nature, F. tularensis subspecies tularensis is considered a potential biowarfare agent. Due to its in vitro efficacy, ciprofloxacin is one of the antibiotics recommended for post-exposure prophylaxis of tularemia. In order to identify therapeutics that will be efficacious against infections caused by drug resistant select-agents and to better understand the threat, we sought to characterize an existing ciprofloxacin resistant (CipR) mutant in the Schu S4 strain of F. tularensis by determining its phenotypic characteristics and sequencing the chromosome to identify additional genetic alterations that may have occurred during the selection process. In addition to the previously described genetic alterations, the sequence of the CipR mutant strain revealed several additional mutations. Of particular interest was a frameshift mutation within kdsD, which encodes an enzyme necessary for the production of 3-Deoxy-D-manno-Octulosonic Acid (KDO), an integral component of the lipopolysaccharide (LPS). A kdsD mutant was constructed in the Schu S4 strain. Although it was not resistant to ciprofloxacin, the kdsD mutant shared many phenotypic characteristics with the CipR mutant, including growth defects under different conditions, sensitivity to hydrophobic agents, altered LPS profiles, and attenuation in multiple models of murine tularemia. This study demonstrates that the KdsD enzyme is essential for Francisella virulence and may be an attractive therapeutic target for developing novel medical countermeasures.

