Bioinformatics Archives - Page 189 of 267

July 7, 2019

Whole genome sequence of the heterozygous clinical isolate Candida krusei 81-B-5.

Candida krusei is a diploid, heterozygous yeast that is an opportunistic fungal pathogen in immunocompromised patients. This species is also used to ferment cocoa beans during chocolate production. One major concern in the clinical setting is the innate resistance of this species to the most commonly used antifungal drug, fluconazole. Here we report a high-quality genome sequence and assembly for the first clinical isolate of C. krusei, strain 81-B-5, into 11 scaffolds generated with PacBio sequencing technology. Gene annotation and comparative analysis revealed a unique profile of transporters that could play a role in drug resistance or adaptation to different environments. In addition, we show that while 82% of the genome is highly heterozygous, a 2.0 Mb region of the largest scaffold has undergone loss of heterozygosity. This genome will serve as a reference for further genetic studies of this pathogen. Copyright © 2017 Author et al.

July 7, 2019

ConcatSeq: A method for increasing throughput of single molecule sequencing by concatenating short DNA fragments.

Single molecule sequencing (SMS) platforms enable base sequences to be read directly from individual strands of DNA in real-time. Though capable of long read lengths, SMS platforms currently suffer from low throughput compared to competing short-read sequencing technologies. Here, we present a novel strategy for sequencing library preparation, dubbed ConcatSeq, which increases the throughput of SMS platforms by generating long concatenated templates from pools of short DNA molecules. We demonstrate adaptation of this technique to two target enrichment workflows, commonly used for oncology applications, and feasibility using PacBio single molecule real-time (SMRT) technology. Our approach is capable of increasing the sequencing throughput of the PacBio RSII platform by more than five-fold, while maintaining the ability to correctly call allele frequencies of known single nucleotide variants. ConcatSeq provides a versatile new sample preparation tool for long-read sequencing technologies.

July 7, 2019

Plasmid composition in Aeromonas salmonicida subsp. salmonicida 01-B526 unravels unsuspected type three secretion system loss patterns.

Aeromonas salmonicida subsp. salmonicida is a ubiquitous psychrophilic waterborne bacterium and a fish pathogen. The numerous mobile elements, especially insertion sequences (IS), in its genome promote rearrangements that impact its phenotype. One of the main virulence factors of this bacterium, its type three secretion system (TTSS), is affected by these rearrangements. In Aeromonas salmonicida subsp. salmonicida, most of the TTSS genes are encoded in a single locus on a large plasmid called pAsa5, and may be lost when the bacterium is cultivated at a higher temperature (25 °C), producing non-virulent mutants. In a previous study, pAsa5-rearranged strains that lacked the TTSS locus on pAsa5 were produced using parental strains, including 01-B526. Some of the generated deletions were explained by homologous recombination between ISs found on pAsa5, whereas the others remained unresolved. To investigate those rearrangements, short- and long-read high-throughput sequencing technologies were used on the A. salmonicida subsp. salmonicida 01-B526 whole genome. Whole genome sequencing of the 01-B526 strain revealed that its pAsa5 has an additional IS copy, an ISAS5, compared to the reference strain (A449) sequence, which allowed for a previously unknown rearrangement to occur. It also appeared that 01-B526 bears a second large plasmid, named pAsa9, which shares 40 kbp of highly similar sequences with pAsa5. Following these discoveries, previously unexplained deletions were elucidated by genotyping. Furthermore, in one of the derived strains a fusion of pAsa5 and pAsa9, involving the newly discovered ISAS5 copy, was observed. The loss of TTSS and hence virulence is explained by one consistent mechanism: IS-driven homologous recombination. The similarities between pAsa9 and pAsa5 also provide another example of genetic diversity driven by ISs.

July 7, 2019

Discovery and genotyping of novel sequence insertions in many sequenced individuals

Motivation: Despite recent advances in algorithm design to characterize structural variation using high-throughput short read sequencing (HTS) data, characterization of novel sequence insertions longer than the average read length remains a challenging task. This is mainly due to both computational difficulties and the complexities imposed by genomic repeats in generating reliable assemblies to accurately detect both the sequence content and the exact location of such insertions. Additionally, de novo genome assembly algorithms typically require a very high depth of coverage, which may be a limiting factor for most genome studies. Therefore, characterization of novel sequence insertions is not a routine part of most sequencing projects. Only a handful of algorithms have been developed specifically for novel sequence insertion discovery that can bypass the need for whole genome de novo assembly. Still, most such algorithms rely on high depth of coverage, and to our knowledge there is only one method (PopIns) that can use multi-sample data to “collectively” obtain a very high coverage dataset to accurately find insertions common in a given population. Results: Here, we present Pamir, a new algorithm to efficiently and accurately discover and genotype novel sequence insertions using either single or multiple genome sequencing datasets. Pamir is able to detect breakpoint locations of the insertions and calculate their zygosity (i.e. heterozygous versus homozygous) by analyzing multiple sequence signatures, matching one-end-anchored sequences to small-scale de novo assemblies of unmapped reads, and conducting strand-aware local assembly. We test the efficacy of Pamir on both simulated and real data, and demonstrate its potential use in accurate and routine identification of novel sequence insertions in genome projects. Availability and implementation: Pamir is available at https://github.com/vpc-ccg/pamir.
Contact: fhach@sfu.ca, prostatecentre.com or calkan@cs.bilkent.edu.tr. Supplementary information: Supplementary data are available at Bioinformatics online.

July 7, 2019

The third restriction-modification system from Thermus aquaticus YT-1: solving the riddle of two TaqII specificities.

Two restriction-modification systems have been previously discovered in Thermus aquaticus YT-1. TaqI is a 263-amino acid (aa) Type IIP restriction enzyme that recognizes and cleaves within the symmetric sequence 5′-TCGA-3′. TaqII, in contrast, is a 1105-aa Type IIC restriction-and-modification enzyme, one of a family of Thermus homologs. TaqII was originally reported to recognize two different asymmetric sequences: 5′-GACCGA-3′ and 5′-CACCCA-3′. We previously cloned the taqIIRM gene, purified the recombinant protein from Escherichia coli, and showed that TaqII recognizes the 5′-GACCGA-3′ sequence only. Here, we report the discovery, isolation, and characterization of TaqIII, the third R-M system from T. aquaticus YT-1. TaqIII is a 1101-aa Type IIC/IIL enzyme and recognizes the 5′-CACCCA-3′ sequence previously attributed to TaqII. The cleavage site is 11/9 nucleotides downstream of the A residue. The enzyme exhibits striking biochemical similarity to TaqII. The 93% identity between their aa sequences suggests that they have a common evolutionary origin. The genes are located on two separate plasmids, and are probably paralogs or pseudoparalogs. Putative positions and aa that specify DNA recognition were identified and recognition motifs for 6 uncharacterized Thermus-family enzymes were predicted. © The Author(s) 2017. Published by Oxford University Press on behalf of Nucleic Acids Research.
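To make the difference between the symmetric TaqI site and the asymmetric TaqII and TaqIII sites concrete, here is a minimal Python sketch that scans a DNA string for a recognition motif on both strands. The three motifs come from the abstract above; the function names and the test sequence are hypothetical, and the sketch only locates recognition sites without modeling the 11/9 cleavage offset.

```python
def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an unambiguous DNA string."""
    complement = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(complement[base] for base in reversed(seq.upper()))


def find_sites(seq: str, motif: str) -> list:
    """List (position, strand) for each motif occurrence.

    Positions are 0-based and refer to the leftmost base of the match on
    the top strand. A "-" hit means the motif reads 5'->3' on the bottom
    strand. Palindromic motifs (such as TCGA) are reported once, as "+".
    """
    seq = seq.upper()
    motif = motif.upper()
    rc_motif = reverse_complement(motif)
    hits = []
    for i in range(len(seq) - len(motif) + 1):
        window = seq[i:i + len(motif)]
        if window == motif:
            hits.append((i, "+"))
        if window == rc_motif and rc_motif != motif:
            hits.append((i, "-"))
    return hits


# Hypothetical test sequence, not taken from any genome in the abstract.
example = "AATCGATTGACCGAAATCGGTCAA"
taqi_hits = find_sites(example, "TCGA")      # symmetric TaqI site
taqii_hits = find_sites(example, "GACCGA")   # asymmetric TaqII site
taqiii_hits = find_sites(example, "CACCCA")  # asymmetric TaqIII site
```

Because GACCGA and CACCCA are not their own reverse complements, both strands have to be searched to find every site, whereas a single pass suffices for TCGA.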

July 7, 2019

Unravelling the complete genome of Archangium gephyra DSM 2261T and evolutionary insights into myxobacterial chitinases.

Family Cystobacteraceae is a group of eubacteria within order Myxococcales and class Deltaproteobacteria that includes more than 20 species belonging to 6 genera, namely Angiococcus, Archangium, Cystobacter, Hyalangium, Melittangium, and Stigmatella. These members were earlier classified by chitin-degrading efficiency: Cystobacter fuscus and Stigmatella aurantiaca are efficient chitin degraders, C. violaceus is a partial chitin degrader, and Archangium gephyra is a chitin nondegrader. Here we report the 12.5 Mbp complete genome of A. gephyra DSM 2261T and compare it with four available genomes within the family Cystobacteraceae. Phylogeny and DNA-DNA hybridization studies reveal that A. gephyra is closest to Angiococcus disciformis, C. violaceus and C. ferrugineus, which are partial chitin degraders of the family Cystobacteraceae. Homology studies reveal the conservation of approximately half of the proteins in these genomes, with about 15% unique proteins in each genome. Analysis of the total carbohydrate-active enzymes (CAZome) reveals the presence of one GH18 chitinase in the A. gephyra genome, whereas eight copies are present in C. fuscus and S. aurantiaca. Evolutionary studies of myxobacterial GH18 chitinases reveal that most of them are likely related to Terrabacteria and Proteobacteria, whereas the Archangium GH18 homolog shares maximum similarity with those of chitin nondegrading Acidobacteria. © The Author(s) 2017. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.

July 7, 2019

Complete genome sequence of Leuconostoc suionicum DSM 20241(T) provides insights into its functional and metabolic features.

The genome of Leuconostoc suionicum DSM 20241(T) (=ATCC 9135(T) = LMG 8159(T) = NCIMB 6992(T)) was completely sequenced and its fermentative metabolic pathways were reconstructed to investigate the fermentative properties and metabolites of strain DSM 20241(T) during fermentation. The genome of L. suionicum DSM 20241(T) consists of a circular chromosome (2026.8 Kb) and a circular plasmid (21.9 Kb) with 37.58% G + C content, encoding 997 proteins, 12 rRNAs, and 72 tRNAs. Analysis of the metabolic pathways of L. suionicum DSM 20241(T) revealed that strain DSM 20241(T) performs heterolactic acid fermentation and can metabolize diverse organic compounds including glucose, fructose, galactose, cellobiose, mannose, sucrose, trehalose, arbutin, salicin, xylose, arabinose and ribose.
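The G + C content quoted above is a single percentage for a genome made of two replicons. As a rough illustration of how such a pooled figure can be computed, the Python sketch below counts G and C bases across several sequences. The function name and toy sequences are hypothetical, and skipping ambiguous bases such as N is an assumption of this sketch rather than something stated in the abstract.

```python
def gc_percent(sequences) -> float:
    """Pooled G + C percentage over one or more DNA sequences.

    Only unambiguous bases (A, C, G, T) are counted in the denominator,
    so runs of N do not dilute the result.
    """
    gc_count = 0
    total = 0
    for seq in sequences:
        s = seq.upper()
        gc_count += s.count("G") + s.count("C")
        total += sum(s.count(base) for base in "ACGT")
    if total == 0:
        raise ValueError("no unambiguous bases found")
    return 100.0 * gc_count / total


# Toy stand-ins for a chromosome and a plasmid (hypothetical sequences).
toy_chromosome = "ATGC"
toy_plasmid = "GG"
pooled = gc_percent([toy_chromosome, toy_plasmid])
```

Pooling the counts before dividing weights each replicon by its length, so a small plasmid shifts the genome-wide value only slightly.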

July 7, 2019

High-quality genome sequence of the highly resistant bacterium Staphylococcus haemolyticus, isolated from a neonatal bloodstream infection.

Using Illumina HiSeq and PacBio technologies, we sequenced the genome of the multidrug-resistant bacterium Staphylococcus haemolyticus, originating from a bloodstream infection in a neonate. The sequence data can be used as an accurate reference sequence. Copyright © 2017 Hosseinkhani et al.

July 7, 2019

A novel inversion in the chloroplast genome of marama (Tylosema esculentum).

Tylosema esculentum (marama bean) is being developed as a possible crop for resource-poor farmers in arid regions of Southern Africa. As part of the molecular characterization of this species, the chloroplast genome has been assembled from next-generation sequencing using both Illumina and PacBio data. The genome is of typical organization with a large single-copy region and a small single-copy region separated by a pair of inverted repeats and covers 161537 bp. It contains a unique inversion not present in any other legumes, even in the closest relatives for which the complete chloroplast genome is available, and two complete copies of the ycf1 gene. These data extend the range of variability of legume chloroplast genomes. The sequencing of multiple individuals has identified two different chloroplast genomes which were geographically separated. The current sampling is limited so that the extent of the intraspecific variation is still to be determined, leaving open the question of legume chloroplast genomes adapted to particular arid environments. © The Author 2017. Published by Oxford University Press on behalf of the Society for Experimental Biology.

July 7, 2019

Comparative genomic analysis reveals genetic features related to the virulence of Bacillus cereus FORC_013.

Bacillus cereus is well known as a gastrointestinal pathogen that causes food-borne illness. In the present study, we sequenced the complete genome of B. cereus FORC_013 isolated from fried eel in South Korea. To extend our understanding of the genomic characteristics of FORC_013, we conducted a comparative analysis with the published genomes of other B. cereus strains. We fully assembled the single circular chromosome (5,418,913 bp) and one plasmid (259,749 bp); 5511 open reading frames (ORFs) and 283 ORFs were predicted for the chromosome and plasmid, respectively. Moreover, we detected that the enterotoxins (NHE, HBL, CytK) induce food-borne illness with diarrheal symptoms, and that the pleiotropic regulator, along with other virulence factors, plays a role in survival and biofilm formation. Through comparative analysis using the complete genome sequence of B. cereus FORC_013, we identified both positively selected genes related to virulence regulation and 224 strain-specific genes of FORC_013. Through genome analysis of B. cereus FORC_013, we identified multiple virulence factors that may contribute to pathogenicity. These results will provide insight for further studies of B. cereus pathogenesis mechanisms at the genomic level.

July 7, 2019

Draft genome sequence of Streptomyces scabrisporus NF3, an endophyte isolated from Amphipterygium adstringens.

We report the draft genome sequence of Streptomyces scabrisporus NF3, an endophyte isolated from Amphipterygium adstringens in Chiapas, Mexico. This strain produces a new modified linaridin peptide. The genome harbors at least 50 gene clusters for synthases of polyketide and nonribosomal peptides, suggesting a prospective production of various secondary metabolites. Copyright © 2017 Vazquez-Hernandez et al.

July 7, 2019

Evidence for the evolutionary steps leading to mecA-mediated β-lactam resistance in staphylococci.

The epidemiologically most important mechanism of antibiotic resistance in Staphylococcus aureus is associated with mecA, an acquired gene encoding an extra penicillin-binding protein (PBP2a) with low affinity to virtually all β-lactams. The introduction of mecA into the S. aureus chromosome has led to the emergence of methicillin-resistant S. aureus (MRSA) pandemics, responsible for high rates of mortality worldwide. Nonetheless, little is known regarding the origin and evolution of mecA. Different mecA homologues have been identified in species belonging to the Staphylococcus sciuri group representing the most primitive staphylococci. In this study we aimed to identify evolutionary steps linking these mecA precursors to the β-lactam resistance gene mecA and the resistance phenotype. We sequenced genomes of 106 S. sciuri, S. vitulinus and S. fleurettii strains and determined their oxacillin susceptibility profiles. Single-nucleotide polymorphism (SNP) analysis of the core genome was performed to assess the genetic relatedness of the isolates. Phylogenetic analysis of the mecA gene homologues and promoters was achieved through nucleotide/amino acid sequence alignments and mutation rates were estimated using a Bayesian analysis. Furthermore, the predicted structures of mecA homologue-encoded PBPs of oxacillin-susceptible and -resistant strains were compared. We showed for the first time that oxacillin resistance in the S. sciuri group has emerged multiple times and by a variety of different mechanisms. Development of resistance occurred through several steps including structural diversification of the non-binding domain of native PBPs; changes in the promoters of mecA homologues; acquisition of SCCmec and adaptation of the bacterial genetic background. Moreover, our results suggest that it was exposure to β-lactams in human-created environments that has driven evolution of native PBPs towards a resistance determinant.
The evolution of β-lactam resistance in staphylococci highlights the numerous resources available to bacteria to adapt to the selective pressure of antibiotics.

July 7, 2019

Updated reference genome sequence and annotation of Mycobacterium bovis AF2122/97.

We report here an update to the reference genome sequence of the bovine tuberculosis bacillus Mycobacterium bovis AF2122/97, generated using an integrative multiomics approach. The update includes 42 new coding sequences (CDSs), 14 modified annotations, 26 single-nucleotide polymorphism (SNP) corrections, and disclosure that the RD900 locus, previously described as absent from the genome, is in fact present. Copyright © 2017 Malone et al.

July 7, 2019

Whole-genome restriction mapping by “subhaploid”-based RAD sequencing: An efficient and flexible approach for physical mapping and genome scaffolding.

Assembly of complex genomes using short reads remains a major challenge, which usually yields highly fragmented assemblies. Generation of ultradense linkage maps is promising for anchoring such assemblies, but traditional linkage mapping methods are hindered by the infrequency and unevenness of meiotic recombination that limit attainable map resolution. Here we develop a sequencing-based “in vitro” linkage mapping approach (called RadMap), where chromosome breakage and segregation are realized by generating hundreds of “subhaploid” fosmid/bacterial-artificial-chromosome clone pools, and by restriction site-associated DNA sequencing of these clone pools to produce an ultradense whole-genome restriction map to facilitate genome scaffolding. A bootstrap-based minimum spanning tree algorithm is developed for grouping and ordering of genome-wide markers and is implemented in a user-friendly, integrated software package (AMMO). We perform extensive analyses to validate the power and accuracy of our approach in the model plant Arabidopsis thaliana and human. We also demonstrate the utility of RadMap for enhancing the contiguity of a variety of whole-genome shotgun assemblies generated using either short Illumina reads (300 bp) or long PacBio reads (6-14 kb), with up to 15-fold improvement of N50 (~816 kb-3.7 Mb) and high scaffolding accuracy (98.1-98.5%). RadMap outperforms BioNano and Hi-C when input assembly is highly fragmented (contig N50 = 54 kb). RadMap can capture wide-range contiguity information and provide an efficient and flexible tool for high-resolution physical mapping and scaffolding of highly fragmented assemblies. Copyright © 2017 Dou et al.
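The contiguity gains above are reported as N50, the length of the shortest sequence in the smallest set of longest sequences that together cover at least half of the assembly. A minimal Python sketch of that standard definition follows; the function name and the toy length lists are hypothetical and are not taken from the RadMap or AMMO software.

```python
def n50(lengths) -> int:
    """Return the N50 of a collection of contig or scaffold lengths.

    Sequences are visited from longest to shortest, and the length at
    which the running total first reaches half of the assembly size is
    returned.
    """
    if not lengths:
        raise ValueError("at least one length is required")
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if running * 2 >= total:
            return length


# Toy example: joining contigs into one scaffold raises the N50.
fragmented = [2, 3, 4, 5, 6, 7, 8, 9, 10]
scaffolded = [2 + 3 + 4 + 5 + 6, 7, 8, 9, 10]
n50_before = n50(fragmented)
n50_after = n50(scaffolded)
```

The toy lists show why scaffolding raises N50 even when the total assembly size is unchanged: merging short contigs moves more of the bases into longer sequences.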

July 7, 2019

Complete genome sequence of Burkholderia stabilis FERMP-21014.

Cholesterol esterase (EC 3.1.1.13) was identified in a bacterium, Burkholderia stabilis strain FERMP-21014. Here, we report the complete genome sequence of B. stabilis FERMP-21014, which has been used in the commercial production of cholesterol esterase. The genome sequence information may be useful for improving production levels of cholesterol esterase. Copyright © 2017 Konishi et al.
