Bioinformatics Archives

July 7, 2019

Draft genome sequence of Karnal bunt pathogen (Tilletia indica) of wheat provides insights into the pathogenic mechanisms of quarantined fungus.

Karnal bunt disease of wheat is caused by the hemibiotrophic fungus Tilletia indica, which is listed as a quarantine pest in more than 70 countries. Despite its economic importance, little is known about the molecular components of fungal pathogenesis. In this study, the genome sequence of T. indica was deciphered for the first time to unravel the functions of effectors in the molecular pathogenesis of Karnal bunt disease. The T. indica genome was sequenced using a hybrid approach combining the PacBio Single Molecule Real Time (SMRT) and Illumina HiSeq 2000 sequencing platforms. The genome was assembled into 10,957 contigs (N50 contig length 3 kb) with a total size of 26.7 Mb and a GC content of 53.99%. A total of 11,535 putative genes were predicted and annotated against Gene Ontology databases. Functional annotation of the Karnal bunt pathogen genome and classification of the identified effectors into protein families revealed functions related to pathogenesis. A search for effector genes using the pathogen-host interaction database identified 135 genes. The T. indica genome sequence and the putative genes involved in molecular pathogenesis should help in devising novel and effective disease management strategies, including development of resistant wheat genotypes, novel biomarkers for pathogen detection and new targets for fungicide development.
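The assembly statistics quoted in this abstract (N50 and GC content) have standard definitions. As a minimal sketch, assuming contig lengths and sequences are already loaded into Python, they could be computed as below; the function names and toy values are illustrative and are not taken from the study.

```python
def n50(lengths):
    """Return the N50: the contig length at which the running total of
    contigs, sorted longest first, reaches at least half the assembly size."""
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if running * 2 >= total:
            return length
    raise ValueError("no contigs given")


def gc_content(seq):
    """Return the percentage of G and C among unambiguous (A, C, G, T) bases."""
    seq = seq.upper()
    gc = seq.count("G") + seq.count("C")
    acgt = gc + seq.count("A") + seq.count("T")
    if acgt == 0:
        raise ValueError("sequence contains no A, C, G or T bases")
    return 100.0 * gc / acgt


# Toy contig lengths (hypothetical): total 22, so N50 is reached at the second contig.
toy_lengths = [8, 5, 4, 3, 2]
print(n50(toy_lengths))
print(gc_content("GGCCAATT"))
```

Ambiguous bases such as N are excluded from the GC denominator here; other tools may count them differently, so reported percentages can vary slightly.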

July 7, 2019

The unique genomic landscape surrounding the EPSPS gene in glyphosate resistant Amaranthus palmeri: a repetitive path to resistance.

The expanding number and global distributions of herbicide resistant weedy species threaten food, fuel, fiber and bioproduct sustainability and agroecosystem longevity. Amongst the most competitive weeds, Amaranthus palmeri S. Wats has rapidly evolved resistance to glyphosate primarily through massive amplification and insertion of the 5-enolpyruvylshikimate-3-phosphate synthase (EPSPS) gene across the genome. Increased EPSPS gene copy numbers result in higher titers of the EPSPS enzyme, the target of glyphosate, and confer resistance to glyphosate treatment. To understand the genomic unit and mechanism of EPSPS gene copy number proliferation, we developed and used a bacterial artificial chromosome (BAC) library from a highly resistant biotype to sequence the local genomic landscape flanking the EPSPS gene. By sequencing overlapping BACs, a 297 kb sequence was generated, hereafter referred to as the “EPSPS cassette.” This region included several putative genes, dense clusters of tandem and inverted repeats, putative helitron and autonomous replication sequences, and regulatory elements. Whole genome shotgun sequencing (WGS) of two biotypes exhibiting high and no resistance to glyphosate was performed to compare genomic representation across the EPSPS cassette. Mapping of sequences for both biotypes to the reference EPSPS cassette revealed significant differences in upstream and downstream sequences relative to EPSPS with regard to both repetitive units and coding content between these biotypes. The differences in sequence may have resulted from a compounded-building mechanism such as repetitive transpositional events. The association of putative helitron sequences with the cassette suggests a possible amplification and distribution mechanism. Flow cytometry revealed that the EPSPS cassette added measurable genomic content. The adoption of glyphosate resistant cropping systems in major crops such as corn, soybean, cotton and canola, coupled with excessive use of glyphosate herbicide, has led to evolved glyphosate resistance in several important weeds. In Amaranthus palmeri, the amplification of the EPSPS cassette, characterized by a complex array of repetitive elements and putative helitron sequences, suggests an adaptive structural genomic mechanism that drives amplification and distribution around the genome. The added genomic content not found in glyphosate sensitive plants may be driving evolution through genome expansion.

July 7, 2019

A genomic view of short tandem repeats.

Short tandem repeats (STRs) are some of the fastest mutating loci in the genome. Tools for accurately profiling STRs from high-throughput sequencing data have enabled genome-wide interrogation of more than a million STRs across hundreds of individuals. These catalogs have revealed that STRs are highly multiallelic and may contribute more de novo mutations than any other variant class. Recent studies have leveraged these catalogs to show that STRs play a widespread role in regulating gene expression and other molecular phenotypes. These analyses suggest that STRs are an underappreciated but rich reservoir of variation that likely makes significant contributions to Mendelian diseases, complex traits, and cancer. Copyright © 2017 Elsevier Ltd. All rights reserved.
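To make the notion of an STR concrete, the sketch below scans a DNA string for perfect tandem repeats with a regular expression. It is only an illustration under simple assumptions (exact repeats, units of 1 to 6 bp, at least 3 copies); it is not one of the profiling tools the review refers to, and the function name and thresholds are hypothetical.

```python
import re


def find_strs(seq, min_unit=1, max_unit=6, min_copies=3):
    """Return (start, unit, copies) for each perfect tandem repeat in seq.

    The repeat unit is matched lazily, so the shortest unit that satisfies
    the copy threshold at a given position is the one reported.
    """
    if min_copies < 2:
        raise ValueError("min_copies must be at least 2")
    # One captured unit followed by at least (min_copies - 1) further copies.
    pattern = re.compile(
        r"([ACGT]{%d,%d}?)\1{%d,}" % (min_unit, max_unit, min_copies - 1)
    )
    hits = []
    for match in pattern.finditer(seq.upper()):
        unit = match.group(1)
        copies = len(match.group(0)) // len(unit)
        hits.append((match.start(), unit, copies))
    return hits


# A dinucleotide repeat (AC x 4) embedded in flanking sequence.
print(find_strs("TTACACACACGG"))
```

Real STR genotyping from sequencing reads also has to handle imperfect repeats, stutter errors and reads that do not span the locus, none of which this sketch attempts.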

July 7, 2019

A Clostridioides difficile bacteriophage genome encodes functional binary toxin-associated genes.

Pathogenic clostridia typically produce toxins as virulence factors which cause severe diseases in both humans and animals. Whereas many clostridia, such as Clostridium perfringens, Clostridium botulinum or Clostridium tetani, were shown to contain toxin-encoding plasmids, only toxin genes located on the chromosome have so far been detected in Clostridioides difficile. In this study, we determined, annotated, and analyzed the complete genome of the bacteriophage phiSemix9P1 using single-molecule real-time sequencing technology (SMRT). To our knowledge, this represents the first C. difficile-associated bacteriophage genome that carries a complete functional binary toxin locus in its genome. Copyright © 2017 Elsevier B.V. All rights reserved.

July 7, 2019

The complete genome sequence of the yogurt isolate Streptococcus thermophilus ACA-DC 2.

Streptococcus thermophilus ACA-DC 2 is a newly sequenced strain isolated from traditional Greek yogurt. Among the 14 fully sequenced strains of S. thermophilus currently deposited in the NCBI database, the ACA-DC 2 strain has the smallest chromosome, containing 1,731,838 bp. The annotation of its genome revealed the presence of 1,850 genes, including 1,556 protein-coding genes, 70 RNA genes and 224 potential pseudogenes. The large number of pseudogenes, together with the absence of pathogenic features, suggests evolution of strain ACA-DC 2 through genome decay processes, most probably due to adaptation to the milk ecosystem. Analysis revealed the existence of one complete lactose-galactose operon, several proteolytic enzymes, one exopolysaccharide cluster, stress response genes and four putative antimicrobial peptides. Interestingly, one CRISPR-cas system and one orphan CRISPR, both carrying only one spacer, were predicted, indicating low activity or inactivation of the cas proteins. Nevertheless, four putative restriction-modification systems were identified that may compensate for any deficiencies of the CRISPR-cas system. Furthermore, whole genome phylogeny indicated three distinct clades within S. thermophilus. Comparative analysis among selected strains representative of each clade, including strain ACA-DC 2, revealed a high degree of conservation at the genomic scale, but also strain-specific regions. Unique genes and genomic islands of strain ACA-DC 2 contained a number of genes potentially acquired through horizontal gene transfer events that could be related to important technological properties for dairy starters. Our study suggests genomic traits in strain ACA-DC 2 compatible with the production of dairy fermented foods.

July 7, 2019

Complete gene sequence of spider attachment silk protein (PySp1) reveals novel linker regions and extreme repeat homogenization.

Spiders use a myriad of silk types for daily survival, and each silk type has a unique suite of task-specific mechanical properties. Of all spider silk types, pyriform silk is distinct because it is a combination of a dry protein fiber and wet glue. Pyriform silk fibers are coated with wet cement and extruded into “attachment discs” that adhere silks to each other and to substrates. The mechanical properties of spider silk types are linked to the primary and higher-level structures of spider silk proteins (spidroins). Spidroins are often enormous molecules (>250 kDa) and have a lengthy repetitive region that is flanked by relatively short (~100 amino acids), non-repetitive amino- and carboxyl-terminal regions. The amino acid sequence motifs in the repetitive region vary greatly between spidroin types, while motif length and number underlie the remarkable mechanical properties of spider silk fibers. Existing knowledge of pyriform spidroins is fragmented, making it difficult to define links between the structure and function of pyriform spidroins. Here, we present the full-length sequence of the gene encoding pyriform spidroin 1 (PySp1) from the silver garden spider Argiope argentata. The predicted protein is similar to previously reported PySp1 sequences but the A. argentata PySp1 has a uniquely long and repetitive “linker”, which bridges the amino-terminal and repetitive regions. Predictions of the hydrophobicity and secondary structure of A. argentata PySp1 identify regions important to protein self-assembly. Analysis of the full complement of A. argentata PySp1 repeats reveals extreme intragenic homogenization, and comparison of A. argentata PySp1 repeats with other PySp1 sequences identifies variability in two sub-repetitive expansion regions. Overall, the full-length A. argentata PySp1 sequence provides new evidence for understanding how pyriform spidroins contribute to the properties of pyriform silk fibers. Copyright © 2017 The Authors. Published by Elsevier Ltd. All rights reserved.

July 7, 2019

First complete Providencia rettgeri genome sequence, the NDM-1-producing clinical strain RB151.

Providencia rettgeri is an opportunistic bacterial pathogen of clinical significance due to its association with urinary tract infections and multidrug resistance. Here, we report the first complete genome sequence of P. rettgeri. The genome of strain RB151 consists of a 4.8-Mbp chromosome and a 108-kbp blaNDM-1-positive plasmid. Copyright © 2017 Marquez-Ortiz et al.

July 7, 2019

Complete genome sequence of Akkermansia glycaniphila strain PytT, a mucin-degrading specialist of the reticulated python gut.

Akkermansia glycaniphila is a novel Akkermansia species that was isolated from the intestine of the reticulated python and shares the capacity to degrade mucin with the human strain Akkermansia muciniphila Muc(T). Here, we report the complete genome sequence of strain Pyt(T) of 3,074,121 bp. The genomic analysis reveals genes for mucin degradation and aerobic respiration. Copyright © 2017 Ouwerkerk et al.

July 7, 2019

Genetic adaptation of porcine circovirus type 1 to cultured porcine kidney cells revealed by single-molecule long-read sequencing technology.

Porcine circovirus type 1 (PCV1) is a nonpathogenic circovirus and a contaminant of the porcine kidney (PK-15) cell line. We present the complete and annotated genome sequence of strain Szeged of PCV1, determined using the Pacific Biosciences RSII long-read sequencing platform. Copyright © 2017 Tombácz et al.

July 7, 2019

A pipeline for local assembly of minisatellite alleles from single-molecule sequencing data.

The advent of Next Generation Sequencing (NGS) has led to the generation of enormous volumes of short read sequence data, cheaply and in reasonable time scales. Nevertheless, the quality of genome assemblies generated using NGS technologies has been greatly affected, compared to those generated using Sanger DNA sequencing. This is largely due to the inability of short read sequence data to scaffold repetitive structures, creating gaps, inversions and rearrangements and resulting in assemblies that are, at best, draft forms. Third generation single-molecule sequencing (SMS) technologies (e.g. Pacific Biosciences Single Molecule Real Time (SMRT) system) address this challenge by generating sequences with increased read lengths, offering the prospect to better recover these complex repetitive structures, concomitantly improving assembly quality. Here, we evaluate the ability of SMS data (specifically human genome Pacific Biosciences SMRT data) to recover poorly represented repetitive sequences (specifically, GC-rich human minisatellites). To do this we designed a pipeline for the collection, processing and local assembly of single-molecule sequence data to form accurate contiguous local reconstructions. Our results show the recovery of an allele of the non-coding minisatellite MS1 (located on chromosome 1 at 1p33-35) at greater than 97% identity to reference (GRCh38) from the unprocessed sequence data of a haploid complete hydatidiform mole (CHM1) cell line. Furthermore, our assembly revealed an allele of over 500 repeat units; much larger than the reference (GRCh38), but consistent in structure with naturally occurring alleles that are segregating in human populations. This local assembly’s reconstruction was validated with the release of the whole genome assemblies GCA_001297185.1 and GCA_000772585.3, where this allele occurs. Additionally, application of this pipeline to coding minisatellites in the PRDM9 and ZNF93 genes enabled recovery of high identity allele structures for these sequence regions whose length was confirmed by PCR from cell line genomic DNA. The internal repeat structure of the PRDM9 allele recovered was consistent with common human-specific alleles. Code available at https://github.com/ndliberial/smrt_pipeline. Contact: dno2@le.ac.uk. © The Author 2016. Published by Oxford University Press.

July 7, 2019

An antimicrobial peptide-resistant minor subpopulation of Photorhabdus luminescens is responsible for virulence.

Some of the bacterial cells in isogenic populations behave differently from others. We describe here how a new type of phenotypic heterogeneity relating to resistance to cationic antimicrobial peptides (CAMPs) is determinant for the pathogenic infection process of the entomopathogenic bacterium Photorhabdus luminescens. We demonstrate that the resistant subpopulation, which accounts for only 0.5% of the wild-type population, causes septicemia in insects. Bacterial heterogeneity is driven by the PhoPQ two-component regulatory system and expression of pbgPE, an operon encoding proteins involved in lipopolysaccharide (LPS) modifications. We also report the characterization of a core regulon controlled by the DNA-binding PhoP protein, which governs virulence in P. luminescens. Comparative RNAseq analysis revealed an upregulation of marker genes for resistance, virulence and bacterial antagonism in the pre-existing resistant subpopulation, suggesting a greater ability to infect insect prey and to survive in cadavers. Finally, we suggest that the infection process of P. luminescens is based on a bet-hedging strategy to cope with the diverse environmental conditions experienced during the lifecycle.

July 7, 2019

Analysis of the complete genome sequence of Nocardia seriolae UTF1, the causative agent of fish nocardiosis: The first reference genome sequence of the fish pathogenic Nocardia species.

Nocardiosis caused by Nocardia seriolae is one of the major threats in the aquaculture of Seriola species (yellowtail; S. quinqueradiata, amberjack; S. dumerili and kingfish; S. lalandi) in Japan. Here, we report the complete nucleotide genome sequence of N. seriolae UTF1, isolated from a cultured yellowtail. The genome is a circular chromosome of 8,121,733 bp with a G+C content of 68.1% that encodes 7,697 predicted proteins. In the N. seriolae UTF1 predicted genes, we found orthologs of virulence factors of pathogenic mycobacteria and human clinical Nocardia isolates involved in host cell invasion, modulation of phagocyte function and survival inside the macrophages. The virulence factor candidates provide an essential basis for understanding their pathogenic mechanisms at the molecular level by the fish nocardiosis research community in future studies. We also found many potential antibiotic resistance genes on the N. seriolae UTF1 chromosome. Comparative analysis with the four existing complete genomes, N. farcinica IFM 10152, N. brasiliensis HUJEG-1, N. cyriacigeorgica GUH-2 and N. nova SH22a, revealed that 2,745 orthologous genes were present in all five Nocardia genomes (core genes) and 1,982 genes were unique to N. seriolae UTF1. In particular, the N. seriolae UTF1 genome contains a greater number of mobile elements and genes of unknown function that comprise the differences in structure and gene content from the other Nocardia genomes. In addition, many of the N. seriolae UTF1-specific genes were assigned to the ABC transport system. Because of limited resources in ocean environments, these N. seriolae UTF1-specific ABC transporters might facilitate adaptation strategies essential for marine environment survival. Thus, the availability of the complete N. seriolae UTF1 genome sequence will provide a valuable resource for comparative genomic studies of N. seriolae isolates, as well as provide new insights into the ecological and functional diversity of the genus Nocardia.
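The core and strain-specific gene counts in this abstract come down to set operations on ortholog assignments. A minimal sketch follows, assuming each genome has already been reduced to a set of ortholog group identifiers; the group IDs, genome labels and function name are hypothetical, and the ortholog clustering step itself is not shown.

```python
def core_and_unique(groups_by_genome, focal):
    """Return (core, unique): ortholog groups present in every genome, and
    groups present in the focal genome but in none of the others."""
    sets = {name: set(groups) for name, groups in groups_by_genome.items()}
    core = set.intersection(*sets.values())
    others = set().union(*(s for name, s in sets.items() if name != focal))
    unique = sets[focal] - others
    return core, unique


# Toy input with made-up ortholog group IDs for three genomes.
toy = {
    "focal_strain": {"og1", "og2", "og3"},
    "genome_b": {"og1", "og2"},
    "genome_c": {"og1", "og4"},
}
core, unique = core_and_unique(toy, "focal_strain")
print(sorted(core), sorted(unique))
```

With real data, the sizes of the two returned sets would correspond to the core gene count and the strain-specific gene count reported for a comparison like this one.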

July 7, 2019

Complete genome sequence of Enterobacter sp. strain ODB01, a bacterium that degrades crude oil.

Enterobacter sp. strain ODB01, which was isolated from the Changqing oil field, can degrade crude oil efficiently and use crude oil as its sole source of carbon and energy. We report the complete genome sequence of ODB01. The results promote its application in the remediation of petroleum contaminants. Copyright © 2017 Lan et al.

July 7, 2019

Structural alteration of OmpR as a source of ertapenem resistance in a CTX-M-15-producing Escherichia coli O25b:H4 sequence type 131 clinical isolate.

In this study, an ertapenem-nonsusceptible Escherichia coli isolate was investigated to determine the genetic basis for its carbapenem resistance phenotype. This clinical strain was recovered from a patient who had received ertapenem 1 year previously to treat cholangitis due to a carbapenem-susceptible extended-spectrum-β-lactamase (ESBL)-producing E. coli isolate. Whole-genome sequencing of these strains was performed using Illumina and single-molecule real-time sequencing technologies. It revealed that they belonged to the ST131 clonal group, had the predicted O25b:H4 serotype, and produced the CTX-M-15 and TEM-1 β-lactamases. One nucleotide substitution was identified between these strains. It affected the ompR gene, which codes for a regulatory protein involved in the control of OmpC/OmpF porin expression, creating a Gly-63-Val substitution. The role of OmpR alteration was confirmed by a complementation experiment that fully restored the susceptibility to ertapenem of the clinical isolate. A modeling study showed that the Gly-63-Val change displaced the histidine-kinase phosphorylation site. SDS-PAGE analysis revealed that the ertapenem-nonsusceptible E. coli strain had a decreased expression of OmpC/OmpF porins. No significant defect in the growth rate or in the resistance to Dictyostelium discoideum amoeba phagocytosis was found in the ertapenem-nonsusceptible E. coli isolate compared to its susceptible parental strain. Our report demonstrates for the first time that ertapenem resistance may emerge clinically from ESBL-producing E. coli due to mutations that modulate the OmpR activity. Copyright © 2017 American Society for Microbiology.

July 7, 2019

Identification of IncA/C plasmid replication and maintenance genes and development of a plasmid multilocus sequence typing scheme.

Plasmids of incompatibility group A/C (IncA/C) are becoming increasingly prevalent within pathogenic Enterobacteriaceae. They are associated with the dissemination of multiple clinically relevant resistance genes, including blaCMY and blaNDM. Current typing methods for IncA/C plasmids offer limited resolution. In this study, we present the complete sequence of a blaNDM-1-positive IncA/C plasmid, pMS6198A, isolated from a multidrug-resistant uropathogenic Escherichia coli strain. Hypersaturated transposon mutagenesis, coupled with transposon-directed insertion site sequencing (TraDIS), was employed to identify conserved genetic elements required for replication and maintenance of pMS6198A. Our analysis of TraDIS data identified roles for the replicon, including repA, a toxin-antitoxin system; two putative partitioning genes, parAB; and a putative gene, 053. Construction of mini-IncA/C plasmids and examination of their stability within E. coli confirmed that the region encompassing 053 contributes to the stable maintenance of IncA/C plasmids. Subsequently, the four major maintenance genes (repA, parAB, and 053) were used to construct a new plasmid multilocus sequence typing (PMLST) scheme for IncA/C plasmids. Application of this scheme to a database of 82 IncA/C plasmids identified 11 unique sequence types (STs), with two dominant STs. The majority of blaNDM-positive plasmids examined (15/17; 88%) fall into ST1, suggesting acquisition and subsequent expansion of this blaNDM-containing plasmid lineage. The IncA/C PMLST scheme represents a standardized tool to identify, track, and analyze the dissemination of important IncA/C plasmid lineages, particularly in the context of epidemiological studies. Copyright © 2017 American Society for Microbiology.
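The core idea of a multilocus sequence typing scheme is that each distinct combination of alleles across the chosen loci defines one sequence type. The sketch below illustrates that idea for four loci, assuming allele numbers have already been called for each plasmid; the plasmid names and allele numbers are made up, and numbering STs in order of first appearance is a simplification (a curated scheme assigns ST numbers from a reference database).

```python
from collections import Counter


def assign_sequence_types(profiles):
    """Map each plasmid to a sequence type (ST) number.

    profiles: dict of plasmid name -> allele numbers at the typed loci,
    in a fixed locus order. Identical allele profiles share one ST; STs
    are numbered from 1 in order of first appearance.
    """
    st_of_profile = {}
    assigned = {}
    for plasmid, alleles in profiles.items():
        key = tuple(alleles)
        if key not in st_of_profile:
            st_of_profile[key] = len(st_of_profile) + 1
        assigned[plasmid] = st_of_profile[key]
    return assigned


# Hypothetical allele profiles in the locus order (repA, parA, parB, 053).
toy_profiles = {
    "plasmid_1": (1, 1, 1, 1),
    "plasmid_2": (1, 2, 1, 1),
    "plasmid_3": (1, 1, 1, 1),
}
sts = assign_sequence_types(toy_profiles)
print(sts)
# Tally how many plasmids fall into each ST to spot dominant lineages.
print(Counter(sts.values()).most_common())
```

Tallying the assigned STs, as in the last line, is the kind of summary behind statements such as a scheme yielding a few dominant sequence types.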
