Bioinformatics Archives

September 22, 2019

Extensive and deep sequencing of the Venter/HuRef genome for developing and benchmarking genome analysis tools.

We produced an extensive collection of deep re-sequencing datasets for the Venter/HuRef genome using the Illumina massively parallel DNA sequencing platform. The original Venter genome sequence is a very high-quality phased assembly based on Sanger sequencing. Therefore, researchers developing novel computational tools for the analysis of human genome sequence variation for the dominant Illumina sequencing technology can test and hone their algorithms by making variant calls from these Venter/HuRef datasets and then immediately confirm the detected variants in the Sanger assembly, freeing them of the need for further experimental validation. This process also applies to implementing and benchmarking existing genome analysis pipelines. We prepared and sequenced 200 bp and 350 bp short-insert whole-genome sequencing libraries (sequenced to 100x and 40x genomic coverages respectively) as well as 2 kb, 5 kb, and 12 kb mate-pair libraries (49x, 122x, and 145x physical coverages respectively). Lastly, we produced a linked-read library (128x physical coverage) from which we also performed haplotype phasing.
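The abstract quotes both genomic (sequence) coverage and physical coverage, which are computed differently: sequence coverage counts sequenced bases, while physical coverage counts the genome span between the two ends of each read pair. The sketch below illustrates the arithmetic only. The genome size of 3 Gb and the function names are assumptions for illustration, not values or code from the study.

```python
import math


def sequence_coverage(n_reads, read_length, genome_size):
    """Average number of sequenced bases covering each genome position."""
    return n_reads * read_length / genome_size


def physical_coverage(n_pairs, insert_size, genome_size):
    """Average number of read-pair inserts spanning each genome position."""
    return n_pairs * insert_size / genome_size


def pairs_for_physical_coverage(target, insert_size, genome_size):
    """Read pairs needed to reach a target physical coverage."""
    return math.ceil(target * genome_size / insert_size)


# Assumed haploid genome size (illustrative, not taken from the abstract).
GENOME_SIZE = 3_000_000_000

# Insert sizes and physical coverages as quoted for the mate-pair libraries.
for insert_size, target in [(2_000, 49), (5_000, 122), (12_000, 145)]:
    pairs = pairs_for_physical_coverage(target, insert_size, GENOME_SIZE)
    print(f"{insert_size} bp inserts at {target}x: about {pairs:,} pairs")
```

Because physical coverage scales with insert size rather than read length, a long-insert library can reach a high physical coverage with comparatively few read pairs.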

September 22, 2019

Hybrid correction of highly noisy long reads using a variable-order de Bruijn graph.

The recent rise of long-read sequencing technologies such as Pacific Biosciences and Oxford Nanopore makes it possible to solve assembly problems for larger and more complex genomes than short-read technologies allowed. However, these long reads are very noisy, reaching an error rate of around 10-15% for Pacific Biosciences, and up to 30% for Oxford Nanopore. The error correction problem has been tackled by either self-correcting the long reads, or using complementary short reads in a hybrid approach. However, even though sequencing technologies promise to lower the error rate of the long reads below 10%, it is still higher in practice, and correcting such noisy long reads remains an issue. We present HG-CoLoR, a hybrid error correction method that focuses on a seed-and-extend approach based on the alignment of the short reads to the long reads, followed by the traversal of a variable-order de Bruijn graph, built from the short reads. Our experiments show that HG-CoLoR manages to efficiently correct highly noisy long reads that display an error rate as high as 44%. When compared to other state-of-the-art long read error correction methods, our experiments also show that HG-CoLoR provides the best trade-off between runtime and quality of the results, and is the only method able to efficiently scale to eukaryotic genomes. HG-CoLoR is implemented in C++, supported on Linux platforms and freely available at https://github.com/morispi/HG-CoLoR. Supplementary data are available at Bioinformatics online.
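To make the seed-and-extend idea concrete, here is a minimal sketch of building a de Bruijn graph from short reads and extending a seed through it. It is a simplification under stated assumptions: it uses a single fixed order k rather than a variable-order graph, it stops at any branch instead of choosing among paths, and the function names are hypothetical. It is not HG-CoLoR's implementation.

```python
from collections import defaultdict


def build_de_bruijn(reads, k):
    """Map each (k-1)-mer to the set of (k-1)-mers that follow it in a read."""
    graph = defaultdict(set)
    for read in reads:
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            graph[kmer[:-1]].add(kmer[1:])
    return graph


def extend_seed(graph, seed, k, max_steps=100):
    """Extend a seed base by base while the path through the graph is unambiguous."""
    path = seed
    for _ in range(max_steps):
        node = path[-(k - 1):]
        successors = graph.get(node, set())
        if len(successors) != 1:
            # Dead end or branch: a real corrector would need a policy here.
            break
        (next_node,) = successors
        path += next_node[-1]
    return path


# Toy short reads (made up for illustration).
short_reads = ["ATGGCGT", "GGCGTGC", "GTGCAAT"]
graph = build_de_bruijn(short_reads, k=4)
print(extend_seed(graph, "ATGG", k=4))
```

In a hybrid corrector, seeds would come from short reads aligned to the long read, and the traversal would bridge the unaligned, error-prone stretch between two consecutive seeds.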

September 22, 2019

Comparative genomic analysis revealed rapid differentiation in the pathogenicity-related gene repertoires between Pyricularia oryzae and Pyricularia penniseti isolated from a Pennisetum grass.

A number of Pyricularia species are known to infect different grass species. In the case of Pyricularia oryzae (syn. Magnaporthe oryzae), distinct populations are known to be adapted to a wide variety of grass hosts, including rice, wheat and many other grasses. The genome sizes of Pyricularia species are typical for filamentous ascomycete fungi [~40 Mbp for P. oryzae, and ~45 Mbp for P. grisea]. Genome plasticity, mediated in part by deletions promoted by recombination between repetitive elements [Genome Res 26:1091-1100, 2016; Nat Rev Microbiol 10:417-430, 2012] and transposable elements [Annu Rev Phytopathol 55:483-503, 2017], contributes to host adaptation. Therefore, comparisons of genome structure of individual species will provide insight into the evolution of host specificity. However, except for the P. oryzae subgroup, little is known about the gene content or genome organization of other Pyricularia species, such as those infecting Pennisetum grasses. Here, we report the genome sequence of P. penniseti strain P1609 isolated from a Pennisetum grass (JUJUNCAO) using PacBio SMRT sequencing technology. Phylogenomic analysis of 28 Magnaporthales species and 5 non-Magnaporthales species indicated that P1609 belongs to a Pyricularia subclade, which is genetically distant from P. oryzae. Comparative genomic analysis revealed that the pathogenicity-related gene repertoires had diverged between P1609 and the P. oryzae strain 70-15, including the known avirulence genes, other putative secreted proteins, as well as some other predicted Pathogen-Host Interaction (PHI) genes. Genomic sequence comparison also identified many genomic rearrangements relative to P. oryzae. Our results suggested that the genomic sequence of P. penniseti P1609 could be a useful resource for the genetic study of the Pennisetum-infecting Pyricularia species and provide new insight into the evolution of pathogen genomes during host adaptation.

September 22, 2019

Enterobacter cloacae Complex Sequence Type 171 Isolates Expressing KPC-4 Carbapenemase Recovered from Canine Patients in Ohio.

Companion animals are likely relevant in the transmission of antimicrobial-resistant bacteria. Enterobacter xiangfangensis sequence type 171 (ST171), a clone that has been implicated in clusters of infections in humans, was isolated from two dogs with clinical disease in Ohio. The canine isolates contained IncHI2 plasmids encoding blaKPC-4. Whole-genome sequencing was used to put the canine isolates in phylogenetic context with available human ST171 sequences, as well as to characterize their blaKPC-4 plasmids. Copyright © 2018 American Society for Microbiology.

September 22, 2019

Impact of index hopping and bias towards the reference allele on accuracy of genotype calls from low-coverage sequencing.

Inherent sources of error and bias that affect the quality of sequence data include index hopping and bias towards the reference allele. The impact of these artefacts is likely greater for low-coverage data than for high-coverage data because low-coverage data has scant information and many standard tools for processing sequence data were designed for high-coverage data. With the proliferation of cost-effective low-coverage sequencing, there is a need to understand the impact of these errors and bias on resulting genotype calls from low-coverage sequencing. We used a dataset of 26 pigs sequenced both at 2× with multiplexing and at 30× without multiplexing to show that index hopping and bias towards the reference allele due to alignment had little impact on genotype calls. However, pruning of alternative haplotypes supported by a number of reads below a predefined threshold, which is a default and desired step of some variant callers for removing potential sequencing errors in high-coverage data, introduced an unexpected bias towards the reference allele when applied to low-coverage sequence data. This bias reduced best-guess genotype concordance of low-coverage sequence data by 19.0 absolute percentage points. We propose a simple pipeline to correct the preferential bias towards the reference allele that can occur during variant discovery and we recommend that users of low-coverage sequence data be wary of unexpected biases that may be produced by bioinformatic tools that were designed for high-coverage sequence data.
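A small binomial calculation shows why a read-count threshold that is harmless at 30× can bias calls towards the reference at 2×. The sketch assumes a truly heterozygous site, a fixed depth at that site, each read sampling the alternative allele with probability 0.5, and a hypothetical pruning threshold of 2 supporting reads. None of these numbers come from the paper's pipeline.

```python
from math import comb


def prob_alt_pruned(depth, min_alt_reads, alt_fraction=0.5):
    """Probability that fewer than min_alt_reads of depth reads carry the alt allele."""
    upper = min(min_alt_reads, depth + 1)
    return sum(
        comb(depth, x) * alt_fraction**x * (1 - alt_fraction) ** (depth - x)
        for x in range(upper)
    )


# Hypothetical threshold: alt haplotypes with fewer than 2 reads are discarded.
THRESHOLD = 2
for depth in (2, 30):
    p = prob_alt_pruned(depth, THRESHOLD)
    print(f"depth {depth}x: P(alt allele pruned at a het site) = {p:.3g}")
```

Under these assumptions, the alternative allele at a heterozygous site is discarded in most cases at a depth of 2 but almost never at a depth of 30, which is the direction of bias the abstract describes.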

September 22, 2019

Quaternary ammonium compounds with multiple cationic moieties (multiQACs) provide antimicrobial activity against Campylobacter jejuni

Recently developed quaternary ammonium compounds (QACs) possessing multiple cationic moieties, referred to as multiQACs, were tested with strains of Campylobacter jejuni to determine their potential as antimicrobial compounds against this important foodborne pathogen. Eight multiQACs were tested against a cocktail of six C. jejuni strains isolated from environmental and clinical sources. The resulting reductions in C. jejuni numbers mediated by the multiQACs were compared to the reductions produced by the application of four commercially available QACs, each of which bears a single cation. Multiple concentrations and exposure times were utilized for all compounds. The compounds which yielded the maximum C. jejuni reductions at the lowest concentrations and applied over the shortest exposure times were judged to be the most successful. Of the eight multiQACs investigated, four demonstrated reductions in C. jejuni numbers superior to the commercial QACs; these four are biscationic, and two of them bear an additional uncharged nitrogen atom. The remaining four multiQACs, which contain three or four cations, did not produce reductions in bacterial numbers comparable to commercial QACs in the timeframes tested. At the intermediary compound concentration (0.05 mM) and exposure time (5 min) the most effective multiQACs (PQ-12,12 and 12(3)0(3)12) on average killed over 99% of the Campylobacter cells present while the best commercial compound at those parameters (cetyl pyridinium chloride, CPC) only killed on average 84.56% of the Campylobacter cells. At the highest compound concentration tested (0.1 mM) and shortest exposure time (1 min), the same two biscationic multiQACs averaged mean percent reductions of Campylobacter cell numbers around 99.5% while CPC at the same concentration/exposure only managed a percent reduction of 91.3%. The biscationic multiQACs demonstrate the potential for providing a new group of antimicrobial compounds superior to current commercially available QACs in their effectiveness against C. jejuni.
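Percent reductions close to 100% are easier to compare on a log scale, which is the convention often used for disinfectant efficacy. The sketch below converts the percent reductions quoted in the abstract into log10 reductions. The conversion is standard arithmetic; the function names are hypothetical and the abstract itself reports only percentages.

```python
import math


def percent_reduction(initial_count, surviving_count):
    """Percent of cells killed, given counts before and after exposure."""
    return 100.0 * (1.0 - surviving_count / initial_count)


def log_reduction_from_percent(percent):
    """Convert a percent reduction (below 100) into a log10 reduction."""
    return -math.log10(1.0 - percent / 100.0)


# Percent reductions as quoted in the abstract.
for label, pct in [("multiQAC, 0.05 mM, 5 min", 99.0),
                   ("CPC, 0.05 mM, 5 min", 84.56),
                   ("multiQAC, 0.1 mM, 1 min", 99.5),
                   ("CPC, 0.1 mM, 1 min", 91.3)]:
    print(f"{label}: {pct}% = {log_reduction_from_percent(pct):.2f} log10")
```

On this scale a 99% kill corresponds to a 2-log reduction, while a kill rate in the 80s stays below 1 log, which makes the gap between the compounds more visible than the raw percentages do.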

September 22, 2019

DNA Methylation by Restriction Modification Systems Affects the Global Transcriptome Profile in Borrelia burgdorferi.

Prokaryote restriction modification (RM) systems serve to protect bacteria from potentially detrimental foreign DNA. Recent evidence suggests that DNA methylation by the methyltransferase (MTase) components of RM systems can also have effects on transcriptome profiles. The type strain of the causative agent of Lyme disease, Borrelia burgdorferi B31, possesses two RM systems with N6-methyladenosine (m6A) MTase activity, which are encoded by the bbe02 gene located on linear plasmid lp25 and bbq67 on lp56. The specific recognition and/or methylation sequences had not been identified for either of these B. burgdorferi MTases, and it was not previously known whether these RM systems influence transcript levels. In the current study, single-molecule real-time sequencing was utilized to map genome-wide m6A sites and to identify consensus modified motifs in wild-type B. burgdorferi as well as MTase mutants lacking either the bbe02 gene alone or both bbe02 and bbq67 genes. Four novel conserved m6A motifs were identified and were fully attributable to the presence of specific MTases. Whole-genome transcriptome changes were observed in conjunction with the loss of MTase enzymes, indicating that DNA methylation by the RM systems has effects on gene expression. Genes with altered transcription in MTase mutants include those involved in vertebrate host colonization (e.g., rpoS regulon) and acquisition by/transmission from the tick vector (e.g., rrp1 and pdeB). The results of this study provide a comprehensive view of the DNA methylation pattern in B. burgdorferi, and the accompanying gene expression profiles add to the emerging body of research on RM systems and gene regulation in bacteria. IMPORTANCE: Lyme disease is the most prevalent vector-borne disease in North America and is classified by the Centers for Disease Control and Prevention (CDC) as an emerging infectious disease with an expanding geographical area of occurrence. Previous studies have shown that the causative bacterium, Borrelia burgdorferi, methylates its genome using restriction modification systems that enable the distinction from foreign DNA. Although much research has focused on the regulation of gene expression in B. burgdorferi, the effect of DNA methylation on gene regulation has not been evaluated. The current study characterizes the patterns of DNA methylation by restriction modification systems in B. burgdorferi and evaluates the resulting effects on gene regulation in this important pathogen. Copyright © 2018 American Society for Microbiology.

September 22, 2019

Novel energy conservation strategies and behaviour of Pelotomaculum schinkii driving syntrophic propionate catabolism.

Under methanogenic conditions, short-chain fatty acids are common byproducts from degradation of organic compounds and conversion of these acids is an important component of the global carbon cycle. Due to the thermodynamic difficulty of propionate degradation, this process requires syntrophic interaction between a bacterium and partner methanogen; however, the metabolic strategies and behaviour involved are not fully understood. In this study, the first genome analysis of obligately syntrophic propionate degraders (Pelotomaculum schinkii HH and P. propionicicum MGP) and comparison with other syntrophic propionate degrader genomes elucidated novel components of energy metabolism behind Pelotomaculum propionate oxidation. Combined with transcriptomic examination of P. schinkii behaviour in co-culture with Methanospirillum hungatei, we found that formate may be the preferred electron carrier for P. schinkii syntrophy. Propionate-derived menaquinol may be primarily re-oxidized to formate, and energy was conserved during formate generation through newly proposed proton-pumping formate extrusion. P. schinkii did not overexpress conventional energy metabolism associated with a model syntrophic propionate degrader Syntrophobacter fumaroxidans MPOB (i.e., CoA transferase, Fix and Rnf). We also found that P. schinkii and the partner methanogen may interact through flagellar contact and amino acid and fructose exchange. These findings provide new understanding of syntrophic energy acquisition and interactions. © 2018 Society for Applied Microbiology and John Wiley & Sons Ltd.

September 22, 2019

Meiotic drive of female-inherited supernumerary chromosomes in a pathogenic fungus.

Meiosis is a key cellular process of sexual reproduction that includes pairing of homologous sequences. In many species, however, meiosis can also involve the segregation of supernumerary chromosomes, which can lack a homolog. How these unpaired chromosomes undergo meiosis is largely unknown. In this study we investigated chromosome segregation during meiosis in the haploid fungus Zymoseptoria tritici that possesses a large complement of supernumerary chromosomes. We used isogenic whole chromosome deletion strains to compare meiotic transmission of chromosomes when paired and unpaired. Unpaired chromosomes inherited from the male parent as well as paired supernumerary chromosomes in general showed Mendelian inheritance. In contrast, unpaired chromosomes inherited from the female parent showed non-Mendelian inheritance but were amplified and transmitted to all meiotic products. We concluded that the supernumerary chromosomes of Z. tritici show a meiotic drive and propose an additional feedback mechanism during meiosis, which initiates amplification of unpaired female-inherited chromosomes. © 2018, Habig et al.

September 22, 2019

Insights into the biology of acidophilic members of the Acidiferrobacteraceae family derived from comparative genomic analyses.

The family Acidiferrobacteraceae (order Acidiferrobacterales) currently contains Gram-negative, neutrophilic sulfur oxidizers such as Sulfuricaulis and Sulfurifustis, as well as acidophilic iron and sulfur oxidizers belonging to the Acidiferrobacter genus. The diversity and taxonomy of the genus Acidiferrobacter have remained poorly explored. Although several metagenome and bioleaching studies have identified its presence worldwide, only two strains, namely Acidiferrobacter thiooxydans DSM 2932T and Acidiferrobacter spp. SP3/III, have been isolated and made publicly available. Using 16S rRNA sequence data publicly available for the Acidiferrobacteraceae, we herein shed light on the molecular taxonomy of this family. Results obtained support the presence of three clades: Acidiferrobacter, Sulfuricaulis and Sulfurifustis. Genomic analyses of the genome sequences of A. thiooxydansT and Acidiferrobacter spp. SP3/III indicate that ANI relatedness between the SPIII/3 strain and A. thiooxydansT is below 95-96%, supporting the classification of strain SP3/III as a new species within this genus. In addition, approximately 70% of Acidiferrobacter sp. SPIII/3 predicted genes have a conserved ortholog in A. thiooxydans strains. A comparative analysis of iron and sulfur oxidation pathways, genome plasticity and cell-cell communication mechanisms of Acidiferrobacter spp. is also discussed. Copyright © 2018 The Authors. Published by Elsevier Masson SAS. All rights reserved.

September 22, 2019

The Genome of Opium Poppy Reveals Evolutionary History of Morphinan Pathway.

Plants, as primary producers, play an indispensable role in the survival of other organisms and the balance of the whole ecosystem on Earth. In particular, they provide the main source of energy, food, and medicine for human beings, some of which are derived from primary or secondary metabolites [1]. Angiosperms, with more than 300,000 species on Earth, are by far the largest group of land plants. Most agricultural crops, fruits, ornamental plants, and medicinal herbs belong to this group. The medicinal herbs are usually rich in specialized metabolites that could provide safe and valuable resources for pharmaceutical development.

September 22, 2019

A large, refractory nosocomial outbreak of Klebsiella pneumoniae carbapenemase (KPC)-producing Escherichia coli demonstrates carbapenemase gene outbreaks involving sink sites require novel approaches to infection control.

Carbapenem-resistant Enterobacteriaceae (CRE) represent a health threat, but effective control interventions remain unclear. Hospital wastewater sites are increasingly being highlighted as important potential reservoirs. We investigated a large Klebsiella pneumoniae carbapenemase (KPC)-producing Escherichia coli outbreak and wider CRE incidence trends in the Central Manchester University Hospital NHS Foundation Trust (CMFT) (United Kingdom) over 8 years, to determine the impact of infection prevention and control measures. Bacteriology and patient administration data (2009 to 2017) were linked, and a subset of CMFT or regional hospital KPC-producing E. coli isolates (n = 268) were sequenced. Control interventions followed international guidelines and included cohorting, rectal screening (n = 184,539 screens), environmental sampling, enhanced cleaning, and ward closure and plumbing replacement. Segmented regression of time trends for CRE detections was used to evaluate the impact of interventions on CRE incidence. Genomic analysis (n = 268 isolates) identified the spread of a KPC-producing E. coli outbreak clone (strain A, sequence type 216 [ST216]; n = 125) among patients and in the environment, particularly on 2 cardiac wards (wards 3 and 4), despite control measures. ST216 strain A had caused an antecedent outbreak and shared its KPC plasmids with other E. coli lineages and Enterobacteriaceae species. CRE acquisition incidence declined after closure of wards 3 and 4 and plumbing replacement, suggesting an environmental contribution. However, ward 3/ward 4 wastewater sites were rapidly recolonized with CRE and patient CRE acquisitions recurred, albeit at lower rates. Patient relocation and plumbing replacement were associated with control of a clonal KPC-producing E. coli outbreak; however, environmental contamination with CRE and patient CRE acquisitions recurred rapidly following this intervention. The large numbers of cases and the persistence of blaKPC in E. coli, including pathogenic lineages, are of concern. Copyright © 2018 American Society for Microbiology.

September 22, 2019

Achieving Accurate Sequence and Annotation Data for Caulobacter vibrioides CB13.

Annotated sequence data are instrumental in nearly all realms of biology. However, the advent of next-generation sequencing has rapidly facilitated an imbalance between accurate sequence data and accurate annotation data. To increase the annotation accuracy of the Caulobacter vibrioides CB13b1a (CB13) genome, we compared the PGAP and RAST annotations of the CB13 genome. A total of 64 unique genes were identified in the PGAP annotation that were either completely or partially absent in the RAST annotation, and a total of 16 genes were identified in the RAST annotation that were not included in the PGAP annotation. Moreover, PGAP identified 73 frameshifted genes and 22 genes with an internal stop. In contrast, RAST annotated the larger segment of these frameshifted genes without indicating a change in reading frame may have occurred. The RAST annotation did not include any genes with internal stop codons, since it chose start codons that were after the internal stop. To confirm the discrepancies between the two annotations and verify the accuracy of the CB13 genome sequence data, we re-sequenced and re-annotated the entire genome and obtained an identical sequence, except in a small number of homopolymer regions. A genome sequence comparison between the two versions allowed us to determine the correct number of bases in each homopolymer region, which eliminated frameshifts for 31 genes annotated as frameshifted genes and removed 24 pseudogenes from the PGAP annotation. Both annotation systems correctly identified genes that were missed by the other system. In addition, PGAP identified conserved gene fragments that represented the beginning of genes, but it employed no corrective method to adjust the reading frame of frameshifted genes or the start sites of genes harboring an internal stop codon. In doing so, the PGAP annotation identified a large number of pseudogenes, which may reflect evolutionary history but likely do not produce gene products. 
These results demonstrate that re-sequencing and annotation comparisons can be used to increase the accuracy of genomic data and the corresponding gene annotation.
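The two annotation problems discussed above, internal stop codons and frameshifts caused by a miscounted homopolymer run, can be illustrated on a toy coding sequence. This is a hedged sketch with hypothetical function names and made-up sequences, assuming the standard stop codons TAA, TAG and TGA. It is not how PGAP or RAST detects these features.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def codons(cds):
    """Split a coding sequence into complete in-frame codons."""
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3
    return [cds[i:i + 3] for i in range(0, usable, 3)]


def internal_stops(cds):
    """Return 0-based nucleotide offsets of in-frame stops before the last codon."""
    return [i * 3 for i, codon in enumerate(codons(cds)[:-1])
            if codon in STOP_CODONS]


# Toy gene with a run of five A's; the second copy has one A too few,
# as might happen with a homopolymer miscount during sequencing.
correct = "ATGGCAAAAATGACCTGGTAA"
miscounted = "ATGGCAAAATGACCTGGTAA"

print(internal_stops(correct))      # no premature stop in the correct copy
print(internal_stops(miscounted))   # the shifted frame exposes a TGA
print(len(miscounted) % 3 != 0)     # length is no longer a multiple of 3
```

Restoring the correct homopolymer length puts the downstream codons back in frame, which is how correcting the homopolymer regions could remove frameshift and pseudogene calls from an annotation.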

September 22, 2019

The changing landscape of vancomycin-resistant Enterococcus faecium in Australia: a population-level genomic study.

Vancomycin-resistant Enterococcus faecium (VREfm) represent a major source of nosocomial infection worldwide. In Australia, there has been a recent concerning increase in bacteraemia associated with the vanA genotype, prompting investigation into the genomic epidemiology of VREfm. A population-level study of VREfm (10 November-9 December 2015) was conducted. A total of 321 VREfm isolates (from 286 patients) across Victoria State were collected and sequenced with Illumina NextSeq. SNPs were used to assess relatedness. STs and genes associated with resistance and virulence were identified. The vanA-harbouring plasmid from an isolate from each ST was assembled using long-read data. Illumina reads from remaining isolates were then mapped to these assemblies to identify their probable vanA-harbouring plasmid. vanA-VREfm comprised 17.8% of isolates. ST203, ST80 and a pstS(-) clade, ST1421, predominated (30.5%, 30.5% and 37.2%, respectively). Most vanB-VREfm were ST796 (77.7%). vanA-VREfm were more closely related within hospitals versus between them [core SNPs 10 (IQR 1-357) versus 356 (179-416), respectively], suggesting discrete introductions of vanA-VREfm, with subsequent intra-hospital transmission. In contrast, vanB-VREfm had similar core SNP distributions within versus between hospitals, due to widespread dissemination of ST796. Different vanA-harbouring plasmids were found across STs. With the exception of ST78 and ST796, Tn1546 transposons also varied. Phylogenetic analysis revealed Australian strains were often interspersed with those from other countries, suggesting ongoing cross-continental transmission. Emerging vanA-VREfm in Australia is polyclonal, indicating repeat introductions of vanA-VREfm into hospitals and subsequent dissemination. The close relationship to global strains reinforces the need for ongoing screening and control of VREfm in Australia and abroad.
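The within-hospital versus between-hospital comparison rests on pairwise core SNP distances. The sketch below shows one way such a comparison could be set up on a toy core-genome alignment: count differing positions for every pair of isolates, then group the distances by whether the pair shares a hospital. The isolate names, hospitals, sequences and function names are all made up for illustration and do not reflect the study's actual pipeline.

```python
from itertools import combinations
from statistics import median


def snp_distance(seq_a, seq_b):
    """Count positions where two aligned sequences differ, ignoring gaps and Ns."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must come from the same alignment")
    return sum(1 for a, b in zip(seq_a, seq_b)
               if a != b and a in "ACGT" and b in "ACGT")


def within_between(isolates):
    """Split pairwise SNP distances by whether both isolates share a hospital."""
    within, between = [], []
    for (_, (hosp_a, seq_a)), (_, (hosp_b, seq_b)) in combinations(isolates.items(), 2):
        distance = snp_distance(seq_a, seq_b)
        (within if hosp_a == hosp_b else between).append(distance)
    return within, between


# Toy data: isolate name -> (hospital, aligned core-genome sequence).
isolates = {
    "iso1": ("H1", "ACGTACGT"),
    "iso2": ("H1", "ACGTACGA"),
    "iso3": ("H2", "TCGAACGA"),
}
within, between = within_between(isolates)
print("within-hospital median:", median(within))
print("between-hospital median:", median(between))
```

A within-hospital median much smaller than the between-hospital median is the pattern the abstract interprets as discrete introductions followed by intra-hospital transmission.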

September 22, 2019

Genome-scale analysis of Acetobacterium bakii reveals the cold adaptation of psychrotolerant acetogens by post-transcriptional regulation.

Acetogens synthesize acetyl-CoA via CO2 or CO fixation, producing organic compounds. Despite their ecological and industrial importance, their transcriptional and post-transcriptional regulation has not been systematically studied. With completion of the genome sequence of Acetobacterium bakii (4.28 Mb), we measured changes in the transcriptome of this psychrotolerant acetogen in response to temperature variations under autotrophic and heterotrophic growth conditions. Unexpectedly, acetogenesis genes were highly up-regulated at low temperatures under heterotrophic, as well as autotrophic, growth conditions. To mechanistically understand the transcriptional regulation of acetogenesis genes via changes in RNA secondary structures of 5′-untranslated regions (5′-UTRs), the primary transcriptome was experimentally determined, and 1379 transcription start sites (TSSs) and 1100 5′-UTRs were found. Interestingly, acetogenesis genes contained longer 5′-UTRs with lower RNA-folding free energy than other genes, revealing that the 5′-UTRs control the RNA abundance of the acetogenesis genes under low temperature conditions. Our findings suggest that post-transcriptional regulation via RNA conformational changes of 5′-UTRs is necessary for cold-adaptive acetogenesis. © 2018 Shin et al.; Published by Cold Spring Harbor Laboratory Press for the RNA Society.
