De novo assembly Archives - Page 263 of 324

July 7, 2019

Complete genome sequence of probiotic Lactobacillus reuteri ZLR003 isolated from healthy weaned pig.

Lactobacillus reuteri ZLR003, which displays probiotic properties, was isolated in our laboratory from the caecum mucosa of healthy weaned pigs. Here, we present the complete genome sequence of L. reuteri ZLR003, which consists of a circular 2,234,097 bp chromosome (G+C content of 38.66%). Such information will provide insights into the molecular mechanism of its probiotic activity and facilitate its application in animal production. Copyright © 2016. Published by Elsevier B.V.

July 7, 2019

Complete genome sequence of the novel thermophilic polyhydroxyalkanoates producer Aneurinibacillus sp. XH2 isolated from Gudao oilfield in China.

Aneurinibacillus sp. XH2 (CGMCC 1.15535) was isolated from Gudao oilfield in China. It is able to use simple carbon sources to accumulate polyhydroxyalkanoates (PHAs) in a thermophilic fashion. Here, we describe the genomic features of this strain. The total genome size of Aneurinibacillus sp. XH2 is 3,664,835 bp, and the genome contains 3441 coding sequences and 114 tRNAs. The annotated genome sequence of this strain provides the genetic basis for revealing its role as a thermophilic PHA-producing bacterium. Copyright © 2016 Elsevier B.V. All rights reserved.

July 7, 2019

Complete genome sequence of Vibrio parahaemolyticus FORC_023 isolated from raw fish storage water.

Vibrio parahaemolyticus is a Gram-negative halophilic bacterium that causes food-borne gastroenteritis in humans who consume V. parahaemolyticus-contaminated seafood. The FORC_023 strain was isolated from raw fish storage water, containing live fish at a sashimi restaurant. Here, we aimed to sequence and characterize the genome of the FORC_023 strain. The genome of the FORC_023 strain showed two circular chromosomes, which contained 4227 open reading frames (ORFs), 131 tRNA genes and 37 rRNA genes. Although the genome of FORC_023 did not include major virulence genes, such as genes encoding thermostable direct hemolysin (TDH) and TDH-related hemolysin (TRH), it contained genes encoding other hemolysins, secretion systems, iron uptake-related proteins and several V. parahaemolyticus islands. The highest average nucleotide identity value was obtained between the FORC_023 strain and UCM-V493 (CP007004-6). Comparative genomic analysis of FORC_023 with UCM-V493 revealed that FORC_023 carried an additional genomic region encoding virulence factors, such as repeats-in-toxin and type II secretion factors. Furthermore, in vitro cytotoxicity testing showed that FORC_023 exhibited a high level of cytotoxicity toward INT-407 human epithelial cells. These results suggested that the FORC_023 strain may be a food-borne pathogen. © FEMS 2016. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.

July 7, 2019

Complete genome sequence of Streptomyces venezuelae ATCC 15439, producer of the methymycin/pikromycin family of macrolide antibiotics, using PacBio technology.

Here, we report the complete genome sequence of Streptomyces venezuelae ATCC 15439, a producer of the methymycin/pikromycin family of macrolide antibiotics and a model host for natural product studies, obtained exclusively using PacBio sequencing technology. The 9.03-Mbp genome harbors 8,775 genes and 11 polyketide and nonribosomal peptide natural product gene clusters. Copyright © 2016 He et al.

July 7, 2019

Analysis of the genome sequence of the medicinal plant Salvia miltiorrhiza.

Salvia miltiorrhiza Bunge (Danshen) is a medicinal plant of the Lamiaceae family, and its dried roots have long been used in traditional Chinese medicine with hydrophilic phenolic acids and tanshinones as pharmaceutically active components (Zhang et al., 2014; Xu et al., 2016). The first step of tanshinone biosynthesis is bicyclization of the general diterpene precursor (E,E,E)-geranylgeranyl diphosphate (GGPP) to copalyl diphosphate (CPP) by CPP synthases (CPSs), which is followed by a cyclization or rearrangement reaction catalyzed by kaurene synthase-like enzymes (KSL).

July 7, 2019

Genome sequence and analysis of a stress-tolerant, wild-derived strain of Saccharomyces cerevisiae used in biofuels research

The genome sequences of more than 100 strains of the yeast Saccharomyces cerevisiae have been published. Unfortunately, most of these genome assemblies contain dozens to hundreds of gaps at repetitive sequences, including transposable elements, tRNAs, and subtelomeric regions, which is where novel genes generally reside. Relatively few strains have been chosen for genome sequencing based on their biofuel production potential, leaving an additional knowledge gap. Here, we describe the nearly complete genome sequence of GLBRCY22-3 (Y22-3), a strain of S. cerevisiae derived from the stress-tolerant wild strain NRRL YB-210 and subsequently engineered for xylose metabolism. After benchmarking several genome assembly approaches, we developed a pipeline to integrate Pacific Biosciences (PacBio) and Illumina sequencing data and achieved one of the highest quality genome assemblies for any S. cerevisiae strain. Specifically, the contig N50 is 693 kbp, and the sequences of most chromosomes, the mitochondrial genome, and the 2-micron plasmid are complete. Our annotation predicts 92 genes that are not present in the reference genome of the laboratory strain S288c, over 70% of which were expressed. We predicted functions for 43 of these genes, 28 of which were previously uncharacterized and unnamed. Remarkably, many of these genes are predicted to be involved in stress tolerance and carbon metabolism and are shared with a Brazilian bioethanol production strain, even though the strains differ dramatically at most genetic loci. The Y22-3 genome sequence provides an exceptionally high-quality resource for basic and applied research in bioenergy and genetics. Copyright © 2016 McIlwain et al.

July 7, 2019

An improved genome assembly of Azadirachta indica A. Juss.

Neem (Azadirachta indica A. Juss.), an evergreen tree of the Meliaceae family, is known for its medicinal, cosmetic, pesticidal and insecticidal properties. We had previously sequenced and published the draft genome of the plant, using mainly short read sequencing data. In this report, we present an improved genome assembly generated using additional short reads from Illumina and long reads from the Pacific Biosciences SMRT sequencer. We assembled short reads and error-corrected long reads using Platanus, an assembler designed to perform well for heterozygous genomes. Compared with the earlier assembly (v1.0), the updated genome assembly (v2.0) yielded 3- and 3.5-fold increases in N50 and N75, respectively; a 2.6-fold decrease in the total number of scaffolds; a 1.25-fold increase in the number of valid transcriptome alignments; 13.4-fold fewer mis-assemblies; and a 1.85-fold increase in the percentage of repeats. The current assembly also maps better to the genes known to be involved in the terpenoid biosynthesis pathway. Together, the data represent an improved assembly of the A. indica genome. The raw data described in this manuscript are submitted to the NCBI Short Read Archive under the accession numbers SRX1074131, SRX1074132, SRX1074133, and SRX1074134 (SRP013453). Copyright © 2016 Author et al.
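For readers unfamiliar with the N50 and N75 statistics quoted in the abstract above, the following is a minimal sketch of how such values are commonly computed from a list of scaffold lengths. The function name `nx` and the toy lengths are hypothetical illustrations, not data or code from the neem assembly.

```python
def nx(lengths, fraction):
    """Return the Nx statistic for a collection of sequence lengths.

    Nx is the length L such that sequences of length >= L together
    cover at least `fraction` of the total assembly length.
    (Hypothetical helper for illustration only.)
    """
    if not lengths:
        raise ValueError("lengths must not be empty")
    ordered = sorted(lengths, reverse=True)  # longest first
    threshold = fraction * sum(ordered)
    running = 0
    for length in ordered:
        running += length
        if running >= threshold:
            return length
    return ordered[-1]


# Toy scaffold lengths (made up; not taken from any real assembly).
toy_scaffolds = [100, 80, 60, 40, 20]

n50 = nx(toy_scaffolds, 0.50)
n75 = nx(toy_scaffolds, 0.75)
print(f"N50 = {n50}, N75 = {n75}")
```

Because both statistics depend on the full length distribution, merging many short scaffolds into fewer long ones raises N50 and N75 together, which is consistent with the direction of the changes the abstract reports.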

July 7, 2019

Diverse CRISPR-Cas responses and dramatic cellular DNA changes and cell death in pKEF9-conjugated Sulfolobus species.

The Sulfolobales host a unique family of crenarchaeal conjugative plasmids, some of which undergo complex rearrangements intracellularly. Here we examined the conjugation cycle of pKEF9 in the recipient strain Sulfolobus islandicus REY15A. The plasmid conjugated and replicated rapidly, generating high average copy numbers, which led to strong growth retardation that was coincident with activation of CRISPR-Cas adaptation. Simultaneously, intracellular DNA was extensively degraded, and this also occurred in a conjugated Δcas6 mutant lacking a CRISPR-Cas immune response. Furthermore, the integrated forms of pKEF9 in the donor Sulfolobus solfataricus P1 and recipient host were specifically corrupted by transposable orfB elements, indicative of a dual mechanism for inactivating free and integrated forms of the plasmid. In addition, the CRISPR locus of pKEF9 was progressively deleted when conjugated into the recipient strain. Factors influencing activation of CRISPR-Cas adaptation in the recipient strain are considered, including the first evidence for a possible priming effect in Sulfolobus. The 3-Mbp genome sequence of the donor P1 strain is presented. © The Author(s) 2016. Published by Oxford University Press on behalf of Nucleic Acids Research.

July 7, 2019

Biosynthetic genes for the tetrodecamycin antibiotics.

We recently described 13-deoxytetrodecamycin, a new member of the tetrodecamycin family of antibiotics. A defining feature of these molecules is the presence of a five-membered lactone called a tetronate ring. By sequencing the genome of a producer strain, Streptomyces sp. strain WAC04657, and searching for a gene previously implicated in tetronate ring formation, we identified the biosynthetic genes responsible for producing 13-deoxytetrodecamycin (the ted genes). Using the ted cluster in WAC04657 as a reference, we found related clusters in three other organisms: Streptomyces atroolivaceus ATCC 19725, Streptomyces globisporus NRRL B-2293, and Streptomyces sp. strain LaPpAH-202. Comparing the four clusters allowed us to identify the cluster boundaries. Genetic manipulation of the cluster confirmed the involvement of the ted genes in 13-deoxytetrodecamycin biosynthesis and revealed several additional molecules produced through the ted biosynthetic pathway, including tetrodecamycin, dihydrotetrodecamycin, and another, W5.9, a novel molecule. Comparison of the bioactivities of these four molecules suggests that they may act through the covalent modification of their target(s). The tetrodecamycins are a distinct subgroup of the tetronate family of secondary metabolites. Little is known about their biosynthesis or mechanisms of action, making them an attractive subject for investigation. In this paper we present the biosynthetic gene cluster for 13-deoxytetrodecamycin in Streptomyces sp. strain WAC04657. We identify related clusters in several other organisms and show that they produce related molecules. Copyright © 2016, American Society for Microbiology. All Rights Reserved.

July 7, 2019

Complete chloroplast genome sequences of Eucommia ulmoides: genome structure and evolution.

Eucommia ulmoides is an important traditional medicinal plant that is used for the production of locative Eucommia rubber. In this study, the complete chloroplast (cp) genome sequence of E. ulmoides was obtained by total DNA sequencing; this is the first cp genome sequence of the order Garryales. The cp genome of E. ulmoides was 163,341 bp long and included a pair of inverted repeat (IR) regions (31,300 bp each), one large single copy (LSC) region (86,592 bp), and one small single copy (SSC) region (14,149 bp). The genome structure and GC content were similar to those of typical angiosperm cp genomes, and the genome contained 115 unique genes, including 80 protein-coding genes, 31 transfer RNAs (tRNAs), and four ribosomal RNAs (rRNAs). Compared with the entire cp genome sequence, three unique genome rearrangements were observed in the LSC region. Moreover, compared with the Sesamum and Nicotiana cp genomes, E. ulmoides contained no indels in the IR regions, and variable regions were identified in noncoding regions. The E. ulmoides cp genome showed extreme expansion at the IR/SSC boundary owing to the integration of an additional complete gene, ycf1. Twenty-nine simple sequence repeats (SSRs) were identified in the E. ulmoides cp genome. In addition, 36 protein-coding genes were used for phylogenetic inference, supporting a sister relationship between E. ulmoides and Aucuba, which belongs to Euasterids I. In summary, we described the complete cp genome sequence of E. ulmoides; this information will be useful for phylogenetic and evolutionary studies.
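The region sizes quoted in the abstract above can be checked against the reported total with simple arithmetic, assuming the usual quadripartite chloroplast layout in which the IR region is present twice. The sketch below uses only the numbers stated in the abstract; the function and variable names are hypothetical.

```python
# Region lengths in bp, as reported in the abstract.
LSC = 86_592             # large single copy region
SSC = 14_149             # small single copy region
IR = 31_300              # one inverted repeat; assumed present in two copies
REPORTED_TOTAL = 163_341


def quadripartite_length(lsc, ssc, ir):
    """Total length of a circular genome with LSC, SSC and two IR copies."""
    return lsc + ssc + 2 * ir


total = quadripartite_length(LSC, SSC, IR)

# Gene categories as reported; their sum should match the 115 unique genes.
genes = {"protein_coding": 80, "tRNA": 31, "rRNA": 4}

print(total == REPORTED_TOTAL)
print(sum(genes.values()))
```

Under that assumption the four regions add up exactly to the reported genome length, and the three gene categories add up to the reported 115 unique genes.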

July 7, 2019

Isolation and complete genome sequence of the thermophilic Geobacillus sp. 12AMOR1 from an Arctic deep-sea hydrothermal vent site.

Members of the genus Geobacillus have been isolated from a wide variety of habitats worldwide and are the subject for targeted enzyme utilization in various industrial applications. Here we report the isolation and complete genome sequence of the thermophilic starch-degrading Geobacillus sp. 12AMOR1. The strain 12AMOR1 was isolated from deep-sea hot sediment at the Jan Mayen hydrothermal Vent Site. Geobacillus sp. 12AMOR1 consists of a 3,410,035 bp circular chromosome and a 32,689 bp plasmid with a G + C content of 52 % and 47 %, respectively. The genome comprises 3323 protein-coding genes, 88 tRNA species and 10 rRNA operons. The isolate grows on a suite of sugars, complex polysaccharides and proteinous carbon sources. Accordingly, a versatility of genes encoding carbohydrate-active enzymes (CAZy) and peptidases were identified in the genome. Expression, purification and characterization of an enzyme of the glycoside hydrolase family 13 revealed a starch-degrading capacity and high thermal stability with a melting temperature of 76.4 °C. Altogether, the data obtained point to a new isolate from a marine hydrothermal vent with a large bioprospecting potential.

July 7, 2019

Alpha-CENTAURI: assessing novel centromeric repeat sequence variation with long read sequencing.

Long arrays of near-identical tandem repeats are a common feature of centromeric and subtelomeric regions in complex genomes. These sequences present a source of repeat structure diversity that is commonly ignored by standard genomic tools. Unlike reads shorter than the underlying repeat structure that rely on indirect inference methods, e.g. assembly, long reads allow direct inference of satellite higher order repeat structure. To automate characterization of local centromeric tandem repeat sequence variation we have designed Alpha-CENTAURI (ALPHA satellite CENTromeric AUtomated Repeat Identification), which takes advantage of Pacific Biosciences long reads from whole-genome sequencing datasets. By operating on reads prior to assembly, our approach provides a more comprehensive set of repeat-structure variants and is not impacted by rearrangements or sequence underrepresentation due to misassembly. We demonstrate the utility of Alpha-CENTAURI in characterizing repeat structure for alpha satellite containing reads in the hydatidiform mole (CHM1, haploid-like) genome. The pipeline is designed to report local repeat organization summaries for each read, thereby monitoring rearrangements in repeat units, shifts in repeat orientation and sites of array transition into non-satellite DNA, typically defined by transposable element insertion. We validate the method by showing consistency with existing centromere high order repeat references. Alpha-CENTAURI can, in principle, run on any sequence data, offering a method to generate a sequence repeat resolution that could be readily performed using consensus sequences available for other satellite families in genomes without high-quality reference assemblies. Documentation and source code for Alpha-CENTAURI are freely available at http://github.com/volkansevim/alpha-CENTAURI. Contact: ali.bashir@mssm.edu. Supplementary information: Supplementary data are available at Bioinformatics online. © The Author 2016. Published by Oxford University Press.

July 7, 2019

Antibiotic resistance mechanisms of Myroides sp.

Bacteria of the genus Myroides (Myroides spp.) are rare opportunistic pathogens. Myroides sp. infections have been reported mainly in China. Myroides sp. is highly resistant to most available antibiotics, but the resistance mechanisms are not fully elucidated. Current strain identification methods based on biochemical traits are unable to identify strains accurately at the species level. While 16S ribosomal RNA (rRNA) gene sequencing can accurately achieve this, it fails to give information on the status and mechanisms of antibiotic resistance, because the 16S rRNA sequence contains no information on resistance genes, resistance islands or enzymes. We hypothesized that obtaining the whole genome sequence of Myroides sp., using next generation sequencing methods, would help to clarify the mechanisms of pathogenesis and antibiotic resistance, and guide antibiotic selection to treat Myroides sp. infections. As Myroides sp. can survive in hospitals and the environment, there is a risk of nosocomial infections and pandemics. For better management of Myroides sp. infections, it is imperative to apply next generation sequencing technologies to clarify the antibiotic resistance mechanisms in these bacteria.

July 7, 2019

PEPR: pipelines for evaluating prokaryotic references.

The rapid adoption of microbial whole genome sequencing in public health, clinical testing, and forensic laboratories requires the use of validated measurement processes. Well-characterized, homogeneous, and stable microbial genomic reference materials can be used to evaluate measurement processes, improving confidence in microbial whole genome sequencing results. We have developed a reproducible and transparent bioinformatics tool, PEPR, Pipelines for Evaluating Prokaryotic References, for characterizing the reference genome of prokaryotic genomic materials. PEPR evaluates the quality, purity, and homogeneity of the reference material genome, and purity of the genomic material. The quality of the genome is evaluated using high coverage paired-end sequence data; coverage, paired-end read size and direction, as well as soft-clipping rates, are used to identify mis-assemblies. The homogeneity and purity of the material relative to the reference genome are characterized by comparing base calls from replicate datasets generated using multiple sequencing technologies. Genomic purity of the material is assessed by checking for DNA contaminants. We demonstrate the tool and its output using sequencing data while developing a Staphylococcus aureus candidate genomic reference material. PEPR is open source and available at https://github.com/usnistgov/pepr.

July 7, 2019

Dynamics of mutations during development of resistance by Pseudomonas aeruginosa against five antibiotics.

Pseudomonas aeruginosa is an opportunistic pathogen that causes considerable morbidity and mortality, particularly in intensive care. Antibiotic-resistant variants of this organism are more difficult to treat and cause substantial extra costs compared to susceptible strains. In the laboratory, P. aeruginosa rapidly developed resistance against five medically relevant antibiotics upon exposure to step-wise increasing concentrations. At several time points during the acquisition of resistance, samples were taken for whole genome sequencing. The increase of MIC for ciprofloxacin was linked to specific mutations in gyrA, parC and gyrB, appearing sequentially. In the case of tobramycin, mutations were induced in fusA, HP02880, rplB and capD. The MIC for the beta-lactam compounds meropenem, ceftazidime and the combination piperacillin/tazobactam correlated linearly with the beta-lactamase activity, but not always with individual mutations. The genes that were mutated during development of beta-lactam resistance differed for each antibiotic. A quantitative relationship between the frequency of mutations and the increase in resistance could not be established for any of the antibiotics. When the adapted strains were grown in the absence of the antibiotic, some mutations remained and others were reverted, but this reversal did not necessarily lower the MIC. The increased MIC came at the cost of moderately reduced cellular functions, or somewhat lower growth rate. In all cases except ciprofloxacin, the increase of resistance seems to be the result of a complex interaction between several cellular systems, rather than individual mutations. Copyright © 2016, American Society for Microbiology. All Rights Reserved.

