July 7, 2019

Discovery and genotyping of novel sequence insertions in many sequenced individuals

Motivation: Despite recent advances in algorithm design to characterize structural variation using high-throughput short read sequencing (HTS) data, characterization of novel sequence insertions longer than the average read length remains a challenging task. This is mainly due to both computational difficulties and the complexities imposed by genomic repeats in generating reliable assemblies to accurately detect both the sequence content and the exact location of such insertions. Additionally, de novo genome assembly algorithms typically require a very high depth of coverage, which may be a limiting factor for most genome studies. Therefore, characterization of novel sequence insertions is not a routine part of most sequencing projects. There are only a handful of algorithms that are specifically developed for novel sequence insertion discovery that can bypass the need for whole-genome de novo assembly. Still, most such algorithms rely on high depth of coverage, and to our knowledge there is only one method (PopIns) that can use multi-sample data to “collectively” obtain a very high coverage dataset to accurately find insertions common in a given population. Results: Here, we present Pamir, a new algorithm to efficiently and accurately discover and genotype novel sequence insertions using either single or multiple genome sequencing datasets. Pamir is able to detect breakpoint locations of the insertions and calculate their zygosity (i.e. heterozygous versus homozygous) by analyzing multiple sequence signatures, matching one-end-anchored sequences to small-scale de novo assemblies of unmapped reads, and conducting strand-aware local assembly. We test the efficacy of Pamir on both simulated and real data, and demonstrate its potential use in accurate and routine identification of novel sequence insertions in genome projects. Availability and implementation: Pamir is available at https://github.com/vpc-ccg/pamir.
Contact: fhach@sfu.ca, prostatecentre.com or calkan@cs.bilkent.edu.tr Supplementary information: Supplementary data are available at Bioinformatics online.


July 7, 2019

Genome sequence of Acinetobacter lactucae OTEC-02, isolated from hydrocarbon-contaminated soil.

Acinetobacter lactucae OTEC-02 was isolated from hydrocarbon-contaminated soils. Whole-genome sequence analysis was performed to learn more about the strain’s ability to degrade different types of recalcitrant toxic monoaromatic hydrocarbons. The genome of this bacterium revealed its genomic properties and versatile metabolic features, as well as a complete prophage. Copyright © 2017 Rogel-Hernandez et al.


July 7, 2019

Multiple genome sequences of heteropolysaccharide-forming acetic acid bacteria.

We report here the complete genome sequences of the acetic acid bacteria (AAB) Acetobacter aceti TMW 2.1153, A. persici TMW 2.1084, and Neoasaia chiangmaiensis NBRC 101099, which secrete biotechnologically relevant heteropolysaccharides (HePSs) into their environments. Upon genome sequencing of these AAB strains, the corresponding HePS biosynthesis pathways were identified. Copyright © 2017 Brandt et al.


July 7, 2019

Multiple genome sequences of Lactobacillus plantarum strains.

We report here the genome sequences of four Lactobacillus plantarum strains which vary in surface hydrophobicity. Bioinformatic analysis, using additional genomes of Lactobacillus plantarum strains, revealed a possible correlation between the cell wall teichoic acid type and cell surface hydrophobicity and provides the basis for subsequent analyses. Copyright © 2017 Kafka et al.


July 7, 2019

Assessment of bacterial profiles in aged, home-made Sichuan paocai brine with varying titratable acidity by PacBio SMRT Sequencing technology

Sichuan paocai, a traditional Chinese fermented vegetable, is rich in lactic acid bacteria (LAB). However, the precise bacterial profiles of home-made Sichuan paocai brine (HSPB) remain unclear. In this study, the bacterial compositions of 38 aged HSPB samples with varying titratable acidity (TA) were determined by SMRT sequencing of the full-length 16S rRNA gene. The lactic and acetic acid contents of the HSPBs were also measured to assess their relationship to the bacterial profiles. The SMRT sequencing results reveal that the HSPB bacterial communities comprised numerous phylogenetic taxa, including 35 phyla, 371 genera, and 593 species; the bacterial diversity decreased as HSPB acidity increased. Lactobacillus acetotolerans, which was positively correlated with HSPB acidity, was the most dominant species, followed by Lactobacillus brevis, which was positively correlated with acetic acid in the samples. A few opportunistic pathogens (e.g. Serratia marcescens and Stenotrophomonas maltophilia) were also detected. Sample groups with lower acidity had higher bacterial diversity, more Lactobacillus species with relative abundance >1%, and more opportunistic pathogens than higher-acidity samples. The results presented here report, for the first time via SMRT sequencing technology, the comprehensive bacterial profiles of home-made Sichuan paocai and the correlation between TA and bacterial composition. The opportunistic pathogens detected in this work warrant further investigation because they relate to the safety and quality of paocai.


July 7, 2019

Generation of a collection of mutant tomato lines using pooled CRISPR libraries.

The high efficiency of clustered regularly interspaced short palindromic repeats (CRISPR)-mediated mutagenesis in plants enables the development of high-throughput mutagenesis strategies. By transforming pooled CRISPR libraries into tomato (Solanum lycopersicum), collections of mutant lines were generated with minimal transformation attempts and in a relatively short period of time. The targeted gene(s) were easily identified by sequencing the incorporated guide RNA(s) in the primary transgenic events. From a single transformation with a CRISPR library targeting the immunity-associated leucine-rich repeat subfamily XII genes, heritable mutations were recovered in 15 of the 54 genes targeted. To increase throughput, a second CRISPR library was made containing three guide RNAs per construct to target 18 putative transporter genes. This resulted in stable mutations in 15 of the 18 targeted genes, with some primary transgenic plants having as many as five mutated genes. Furthermore, the redundancy in this collection of plants allowed for the association of aberrant T0 phenotypes with the underlying targeted genes. Plants with mutations in a homolog of an Arabidopsis (Arabidopsis thaliana) boron efflux transporter displayed boron deficiency phenotypes. The strategy described here provides a technically simple yet high-throughput approach for generating a collection of lines with targeted mutations and should be applicable to any plant transformation system. © 2017 American Society of Plant Biologists. All Rights Reserved.


July 7, 2019

Long-read sequencing offers path to more accurate drug metabolism profiles

In the complex drug discovery process, one of the looming questions for any new compound is how it will be metabolised in a human body. While there are several methods for evaluating this, one of the most common involves CYP2D6, the enzyme encoded by the cytochrome P450-2D6 gene. This enzyme is involved in metabolising a quarter of all commonly used medications, making it an important target for ADME and pharmacogenomics studies. It is known to activate some drugs and to play a role in the deactivation or excretion of others.


July 7, 2019

The biofilm inhibitor carolacton enters Gram-negative cells: studies using a TolC-deficient strain of Escherichia coli.

The myxobacterial secondary metabolite carolacton inhibits growth of Streptococcus pneumoniae and kills biofilm cells of the caries- and endocarditis-associated pathogen Streptococcus mutans at nanomolar concentrations. Here, we studied the response to carolacton of an Escherichia coli strain that lacked the outer membrane protein TolC. Whole-genome sequencing of the laboratory E. coli strain TolC revealed the integration of an insertion element, IS5, at the tolC locus and a close phylogenetic relationship to the ancient E. coli K-12. We demonstrated via transcriptome sequencing (RNA-seq) and determination of MIC values that carolacton penetrates the phospholipid bilayer of the Gram-negative cell envelope and inhibits growth of E. coli TolC at concentrations similar to those for streptococci. This inhibition is completely lost for a C-9 (R) epimer of carolacton, a derivative with an inverted stereocenter at carbon atom 9 [(S) → (R)] as the sole difference from the native molecule, which is also inactive in S. pneumoniae and S. mutans, suggesting a specific interaction of native carolacton with a conserved cellular target present in bacterial phyla as distantly related as Firmicutes and Proteobacteria. The efflux pump inhibitor (EPI) phenylalanine arginine β-naphthylamide (PAβN), which specifically inhibits AcrAB-TolC, renders E. coli susceptible to carolacton. Our data indicate that carolacton has potential for use in antimicrobial chemotherapy against Gram-negative bacteria, as a single drug or in combination with EPIs. Strain E. coli TolC has been deposited at the DSMZ; together with the associated RNA-seq data and MIC values, it can be used as a reference during future screenings for novel bioactive compounds.
IMPORTANCE The emergence of pathogens resistant against most or all of the antibiotics currently used in human therapy is a global threat, and therefore the search for antimicrobials with novel targets and modes of action is of utmost importance. The myxobacterial secondary metabolite carolacton had previously been shown to inhibit biofilm formation and growth of streptococci. Here, we investigated if carolacton could act against Gram-negative bacteria, which are difficult targets because of their double-layered cytoplasmic envelope. We found that the model organism Escherichia coli is susceptible to carolacton, similar to the Gram-positive Streptococcus pneumoniae, if its multidrug efflux system AcrAB-TolC is either inactivated genetically, by disruption of the tolC gene, or physiologically by coadministering an efflux pump inhibitor. A carolacton epimer that has a different steric configuration at carbon atom 9 is completely inactive, suggesting that carolacton may interact with the same molecular target in both Gram-positive and Gram-negative bacteria.


July 7, 2019

Quantitative proteomics for the comprehensive analysis of stress responses of Lactobacillus paracasei subsp. paracasei F19.

Lactic acid bacteria are broadly employed as starter cultures in the manufacture of foods. Upon technological preparation, they are confronted with drying stress that amalgamates numerous stress conditions resulting in losses of fitness and survival. To better understand and differentiate physiological stress responses, discover general and specific markers for the investigated stress conditions, and predict optimal preconditioning for starter cultures, we performed a comprehensive genomic and quantitative proteomic analysis of a commonly used model system, Lactobacillus paracasei subsp. paracasei TMW 1.1434 (isogenic with F19) under 11 typical stress conditions, including, among others, oxidative, osmotic, pH, and pressure stress. We identified and quantified >1900 proteins in triplicate analyses, representing 65% of all genes encoded in the genome. The identified genes were thoroughly annotated in terms of subcellular localization prediction and biological functions, suggesting unbiased and comprehensive proteome coverage. In total, 427 proteins were significantly differentially expressed in at least one condition. Most notably, our analysis predicts that alkaline and high-pressure stress provide the optimal preconditioning toward drying. Taken together, we believe the presented strategy may serve as a prototypic example of the utility of quantitative mass-spectrometry-based proteomics for studying bacterial physiology.


July 7, 2019

A novel hybrid plasmid carrying multiple antimicrobial resistance and virulence genes in Salmonella enterica serovar Dublin.

Virulence plasmids and antibiotic resistance plasmids are usually maintained separately in Salmonella spp.; however, we report an instance of a hybrid plasmid (pN13-01125) in Salmonella enterica serovar Dublin. Review of the complete sequence of the 172,265-bp plasmid suggests that pN13-01125 is comprised of the previously described pSDVr and pSH696_135 plasmids and that the mechanism of hybridization likely involves IS6 (IS26) insertion sequence elements. The plasmid has a low conjugation frequency, confers resistance to six classes of antimicrobials, and contains a complete spv virulence operon. © Crown copyright 2017.


July 7, 2019

Hunting structural variants: Population by population

Until recently, most population-scale genome sequencing studies have focused on identifying single nucleotide variants (SNVs) to explore genetic differences between individuals. Like so many SNV-based genome-wide association studies, however, these efforts have had difficulty identifying causative genetic mechanisms underlying most complex functions. More and more, the genomics community has realised that structural variation is likely responsible for many of the traits and phenotypes that scientists have not been able to attribute to SNVs. This class of variants, defined as genetic differences of 50 bp or larger, accounts for most of the DNA sequence differences between any two people. Structural variants (SVs) are also already known to cause many common and rare diseases including ALS, schizophrenia, leukemia, Carney complex, and Huntington’s disease. Despite the importance of SVs, these larger variants have been understudied and underreported compared to their single-nucleotide counterparts. One reason is that they remain difficult to detect. Their length often means they cannot be fully spanned using short sequencing reads. They also often occur in highly repetitive or GC-rich regions of the genome, making them challenging targets. As such, this class of human genetic variation has remained vastly under-explored in global populations and is now ripe for discovery.


July 7, 2019

Rapid gene turnover as a significant source of genetic variation in a recently seeded population of a pathogen.

Genome sequencing has been useful to gain an understanding of bacterial evolution. It has been used for studying the phylogeography and/or the impact of mutation and recombination on bacterial populations. However, it has rarely been used to study gene turnover at microevolutionary scales. Here, we sequenced Mexican strains of the human pathogen Acinetobacter baumannii sampled from the same locale over a 3-year period to obtain insights into the microevolutionary dynamics of gene content variability. We found that the Mexican A. baumannii population was recently founded and has been emerging due to a rapid clonal expansion. Furthermore, we noticed that on average the Mexican strains differed from each other by over 300 genes and, notably, this gene content variation has accrued more frequently and faster than the accumulation of mutations. Moreover, due to its rapid pace, gene content variation reflects the phylogeny only over very short periods of time. Additionally, we found that the external branches of the phylogeny had almost 100 more genes than the internal branches. All in all, these results show that rapid gene turnover has been of paramount importance in producing genetic variation within this population and demonstrate the utility of genome sequencing to study alternative forms of genetic variation.


July 7, 2019

Research Highlights: Packing, trapping and sequencing

Ultralow concentrations of DNA can be optically sequenced with SMRT DNA sequencing. In principle, optical DNA-sequencing protocols have the advantage of reading long strands of DNA in real time and at high speeds. In practice, however, reading long DNA strands is a challenge with current methods, which require high concentrations and suffer from short-chain loading bias. To overcome these limitations, a research team led by Meni Wanunu at Northeastern University in Boston has now developed an efficient voltage-controlled DNA-loading technology that enables single-molecule, real-time (SMRT) sequencing of long DNA strands at ultralow concentrations.


July 7, 2019

Draft sequencing of the heterozygous diploid genome of Satsuma (Citrus unshiu Marc.) using a hybrid assembly approach.

Satsuma (Citrus unshiu Marc.) is one of the most abundantly produced mandarin varieties of citrus, known for its seedless fruit production and as a breeding parent of citrus. De novo assembly of the heterozygous diploid genome of Satsuma (“Miyagawa Wase”) was conducted by a hybrid assembly approach using short-read sequences, three mate-pair libraries, and a long-read sequence of PacBio by the PLATANUS assembler. The assembled sequence, with a total size of 359.7 Mb at the N50 length of 386,404 bp, consisted of 20,876 scaffolds. Pseudomolecules of Satsuma constructed by aligning the scaffolds to three genetic maps showed genome-wide synteny to the genomes of Clementine, pummelo, and sweet orange. Gene prediction by modeling with MAKER-P proposed 29,024 genes and 37,970 mRNA; additionally, gene prediction analysis found candidates for novel genes in several biosynthesis pathways for gibberellin and violaxanthin catabolism. BUSCO scores for the assembled scaffolds and predicted transcripts, and another analysis by BAC end sequence mapping, indicated that the consistency of the assembled genome was close to that of the haploid Clementine, pummelo, and sweet orange genomes. The numbers of repeat elements and long terminal repeat retrotransposons were comparable to those of the seven citrus genomes; this suggested no significant failure in the assembly at the repeat regions. A resequencing application using the assembled sequence confirmed that both kunenbo-A and Satsuma are offspring of Kishu, and Satsuma is a back-crossed offspring of Kishu. These results illustrated the performance of the hybrid assembly approach and its ability to construct an accurate heterozygous diploid genome.
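
For readers unfamiliar with the N50 statistic quoted in the abstract above, the following is a minimal sketch of how it is commonly computed from a list of scaffold lengths. The function name `n50` and the toy lengths are assumptions made for illustration; they are not taken from the Satsuma assembly or from the PLATANUS software.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths.

    N50 is the length L such that sequences of length >= L
    together cover at least half of the total assembly size.
    """
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length
    raise ValueError("lengths must be non-empty")


# Toy scaffold lengths, invented for illustration only.
toy_scaffolds = [100, 80, 60, 40, 20]
print(n50(toy_scaffolds))  # prints 80
```

With the toy input, the total is 300, so the running sum first reaches half of it (150) at the second-longest scaffold, giving an N50 of 80.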

