Draft genome Archives - Page 65 of 106

July 7, 2019

Genome sequence of the thermotolerant foodborne pathogen Salmonella enterica serovar Senftenberg ATCC 43845 and phylogenetic analysis of loci encoding increased protein quality control mechanisms.

Salmonella enterica subsp. enterica bacteria are important foodborne pathogens with major economic impact. Some isolates exhibit increased heat tolerance, a concern for food safety. Analysis of a finished-quality genome sequence of an isolate commonly used in heat resistance studies, S. enterica subsp. enterica serovar Senftenberg 775W (ATCC 43845), revealed that this strain contains not just one, but two horizontally acquired thermotolerance locus homologs. These two loci reside on a large 341.3-kbp plasmid that is similar to the well-studied IncHI2 R478 plasmid but lacks any antibiotic resistance genes found on R478 or other IncHI2 plasmids. As this historical Salmonella isolate has been in use since 1941, comparative analysis of the plasmid and of the thermotolerance loci contained on the plasmid will provide insight into the evolution of heat resistance loci as well as acquisition of resistance determinants in IncHI2 plasmids.

IMPORTANCE: Thermal interventions are commonly used in the food industry as a means of mitigating pathogen contamination in food products. Concern over heat-resistant food contaminants has recently increased, with the identification of a conserved locus shown to confer heat resistance in disparate lineages of Gram-negative bacteria. Complete sequence analysis of a historical isolate of Salmonella enterica serovar Senftenberg, used in numerous studies because of its novel heat resistance, revealed that this important strain possesses two distinct copies of this conserved thermotolerance locus, residing on a multireplicon IncHI2/IncHI2A plasmid. Phylogenetic analysis of these loci in comparison with homologs identified in various bacterial genera provides an opportunity to examine the evolution and distribution of loci conferring resistance to environmental stressors, such as heat and desiccation.

July 7, 2019

Simultaneous emergence of multidrug-resistant Candida auris on 3 continents confirmed by whole-genome sequencing and epidemiological analyses.

Candida auris, a multidrug-resistant yeast that causes invasive infections, was first described in 2009 in Japan and has since been reported from several countries. To understand the global emergence and epidemiology of C. auris, we obtained isolates from 54 patients with C. auris infection from Pakistan, India, South Africa, and Venezuela during 2012-2015 and the type specimen from Japan. Patient information was available for 41 of the isolates. We conducted antifungal susceptibility testing and whole-genome sequencing (WGS). Available clinical information revealed that 41% of patients had diabetes mellitus, 51% had undergone recent surgery, 73% had a central venous catheter, and 41% were receiving systemic antifungal therapy when C. auris was isolated. The median time from admission to infection was 19 days (interquartile range, 9-36 days), 61% of patients had bloodstream infection, and 59% died. Using stringent break points, 93% of isolates were resistant to fluconazole, 35% to amphotericin B, and 7% to echinocandins; 41% were resistant to 2 antifungal classes and 4% were resistant to 3 classes. WGS demonstrated that isolates were grouped into unique clades by geographic region. Clades were separated by thousands of single-nucleotide polymorphisms, but within each clade isolates were clonal. Different mutations in ERG11 were associated with azole resistance in each geographic clade. C. auris is an emerging healthcare-associated pathogen associated with high mortality. Treatment options are limited, due to antifungal resistance. WGS analysis suggests nearly simultaneous, and recent, independent emergence of different clonal populations on 3 continents. Risk factors and transmission mechanisms need to be elucidated to guide control measures. Published by Oxford University Press for the Infectious Diseases Society of America 2016. This work is written by (a) US Government employee(s) and is in the public domain in the US.

July 7, 2019

Single-Molecule sequencing of the Drosophila serrata genome.

Long-read sequencing technology promises to greatly enhance de novo assembly of genomes for nonmodel species. Although the error rates of long reads have been a stumbling block, sequencing at high coverage permits the self-correction of many errors. Here, we sequence and de novo assemble the genome of Drosophila serrata, a species from the montium subgroup that has been well-studied for latitudinal clines, sexual selection, and gene expression, but which lacks a reference genome. Using 11 PacBio single-molecule real-time (SMRT) cells, we generated 12 Gbp of raw sequence data comprising ~65× whole-genome coverage. Read lengths averaged 8940 bp (NRead50 12,200) with the longest read at 53 kbp. We self-corrected reads using the PBDagCon algorithm and assembled the genome using the MHAP algorithm within the PBcR assembler. Total genome length was 198 Mbp with an N50 just under 1 Mbp. Contigs displayed a high degree of chromosome arm-level conservation with the D. melanogaster genome and many could be sensibly placed on the D. serrata physical map. We also provide an initial annotation for this genome using in silico gene predictions that were supported by RNA-seq data. Copyright © 2017 Allen et al.

July 7, 2019

Genomic innovation for crop improvement.

Crop production needs to increase to secure future food supplies, while reducing its impact on ecosystems. Detailed characterization of plant genomes and genetic diversity is crucial for meeting these challenges. Advances in genome sequencing and assembly are being used to access the large and complex genomes of crops and their wild relatives. These have helped to identify a wide spectrum of genetic variation and permitted the association of genetic diversity with diverse agronomic phenotypes. In combination with improved and automated phenotyping assays and functional genomic studies, genomics is providing new foundations for crop-breeding systems.

July 7, 2019

The histidine decarboxylase gene cluster of Lactobacillus parabuchneri was gained by horizontal gene transfer and is mobile within the species.

Histamine in food can cause intolerance reactions in consumers. Lactobacillus parabuchneri (L. parabuchneri) is one of the major causes of elevated histamine levels in cheese. Despite its significant economic impact and negative influence on human health, no genomic study has been published so far. We sequenced and analyzed 18 L. parabuchneri strains of which 12 were histamine positive and 6 were histamine negative. We determined the complete genome of the histamine positive strain FAM21731 with PacBio as well as Illumina and the genomes of the remaining 17 strains using the Illumina technology. We developed the synteny aware ortholog finding algorithm SynOrf to compare the genomes and we show that the histidine decarboxylase (HDC) gene cluster is located in a genomic island. It is very likely that the HDC gene cluster was transferred from other lactobacilli, as it is highly conserved within several lactobacilli species. Furthermore, we have evidence that the HDC gene cluster was transferred within the L. parabuchneri species.

July 7, 2019

Genomic changes associated with the evolutionary transition of an insect gut symbiont into a blood-borne pathogen.

The genus Bartonella comprises facultative intracellular bacteria with a unique lifestyle. After transmission by blood-sucking arthropods they colonize the erythrocytes of mammalian hosts causing acute and chronic infectious diseases. Although the pathogen-host interaction is well understood, little is known about the evolutionary origin of the infection strategy manifested by Bartonella species. Here we analyzed six genomes of Bartonella apis, a honey bee gut symbiont that to date represents the closest relative of pathogenic Bartonella species. Comparative genomics revealed that B. apis encodes a large set of vertically inherited genes for amino acid and cofactor biosynthesis and nitrogen metabolism. Most pathogenic bartonellae have lost these ancestral functions, but acquired specific virulence factors and expanded a vertically inherited gene family for harvesting cofactors from the blood. However, the deeply rooted pathogen Bartonella tamiae has retained many of the ancestral genome characteristics reflecting an evolutionary intermediate state toward a host-restricted intraerythrocytic lifestyle. Our findings suggest that the ancestor of the pathogen Bartonella was a gut symbiont of insects and that the adaptation to blood-feeding insects facilitated colonization of the mammalian bloodstream. This study highlights the importance of comparative genomics among pathogens and non-pathogenic relatives to understand disease emergence within an evolutionary-ecological framework.

July 7, 2019

The hidden perils of read mapping as a quality assessment tool in genome sequencing.

This article provides a comparative analysis of the various methods of genome sequencing focusing on verification of the assembly quality. The results of a comparative assessment of various de novo assembly tools, as well as sequencing technologies, are presented using a recently completed sequence of the genome of Lactobacillus fermentum 3872. In particular, quality of assemblies is assessed by using CLC Genomics Workbench read mapping and Optical mapping developed by OpGen. Over-extension of contigs without prior knowledge of contig location can lead to misassembled contigs, even when commonly used quality indicators such as read mapping suggest that a contig is well assembled. Precautions must also be undertaken when using long read sequencing technology, which may also lead to misassembled contigs.

July 7, 2019

Genomic analysis of ST88 community-acquired methicillin resistant Staphylococcus aureus in Ghana.

The emergence and evolution of community-acquired methicillin resistant Staphylococcus aureus (CA-MRSA) strains in Africa is poorly understood. However, one particular MRSA lineage, called ST88, appears to be rapidly establishing itself as an “African” CA-MRSA clone. In this study, we employed whole genome sequencing to provide more information on the genetic background of ST88 CA-MRSA isolates from Ghana and to describe in detail ST88 CA-MRSA isolates in comparison with other MRSA lineages worldwide. We first established a complete ST88 reference genome (AUS0325) using PacBio SMRT sequencing. We then used comparative genomics to assess relatedness among 17 ST88 CA-MRSA isolates recovered from patients attending Buruli ulcer treatment centres in Ghana, three non-African ST88s and 15 other MRSA lineages. We show that Ghanaian ST88 forms a discrete MRSA lineage (harbouring SCCmec-IV [2B]). Gene content analysis identified five distinct genomic regions enriched among ST88 isolates compared with the other S. aureus lineages. The Ghanaian ST88 isolates had only 658 core genome SNPs and there was no correlation between phylogeny and geography, suggesting the recent spread of this clone. The lineage was also resistant to multiple classes of antibiotics including β-lactams, tetracycline and chloramphenicol. This study reveals that S. aureus ST88-IV is a recently emerging and rapidly spreading CA-MRSA clone in Ghana. The study highlights the capacity of small snapshot genomic studies to provide actionable public health information in resource limited settings. To our knowledge this is the first genomic assessment of the ST88 CA-MRSA clone.

July 7, 2019

AidP, a novel N-Acyl homoserine lactonase gene from Antarctic Planococcus sp.

Planococcus is a Gram-positive halotolerant bacterial genus in the phylum Firmicutes, commonly found in various habitats in Antarctica. Quorum quenching (QQ) is the disruption of bacterial cell-to-cell communication (known as quorum sensing), which has previously been described in mesophilic bacteria. This study demonstrated the QQ activity of a psychrotolerant strain, Planococcus versutus strain L10.15(T), isolated from a soil sample obtained near an elephant seal wallow in Antarctica. Whole genome analysis of this bacterial strain revealed the presence of an N-acyl homoserine lactonase, an enzyme that hydrolyzes the ester bond of the homoserine lactone ring of N-acyl homoserine lactones (AHLs). Heterologous gene expression in E. coli confirmed its function in the hydrolysis of AHLs, and the gene was designated as aidP (autoinducer degrading gene from Planococcus sp.). The low temperature activity of this enzyme suggested that it belongs to a novel and uncharacterized class of AHL lactonase. This study is the first report on QQ activity of bacteria isolated from the polar regions.

July 7, 2019

Combination of short-read, long-read and optical mapping assemblies reveals large-scale tandem repeat arrays with population genetic implications.

Accurate and contiguous genome assembly is key to a comprehensive understanding of the processes shaping genomic diversity and evolution. Yet, it is frequently constrained by constitutive heterochromatin, usually characterized by highly repetitive DNA. As a key feature of genome architecture associated with centromeric and telomeric regions it influences meiotic recombination. In this study, we assess the impact of large tandem repeat arrays on the recombination rate landscape in an avian speciation model, the Eurasian crow. We assembled two high-quality genome references using single-molecule real-time sequencing (long-read assembly, LR) and single-molecule restriction maps (optical map assembly, OM). A three-way comparison including the published short-read assembly (SR) constructed for the same individual allowed assessing assembly properties and pinpointing mis-assemblies. Combining information from all three assemblies, we characterized 36 previously unidentified large repetitive regions in the proximity of sequence assembly breakpoints, the majority of which contained complex arrays of a 14-kb satellite repeat or its 1.2-kb subunit. Using genome-wide population re-sequencing data, we estimated the population-scaled recombination rate (ρ) and found it to be significantly reduced in these regions. These findings are consistent with an effect of low recombination in regions adjacent to centromeric or subtelomeric heterochromatin, and add to our understanding of the processes generating widespread heterogeneity in genetic diversity and differentiation along the genome. By combining three independent technologies, our results highlight the importance of adding a layer of information on genome structure inaccessible to each approach independently. Published by Cold Spring Harbor Laboratory Press.

July 7, 2019

Genome sequencing and analysis of Talaromyces pinophilus provide insights into biotechnological applications.

Species from the genus Talaromyces produce useful biomass-degrading enzymes and secondary metabolites. However, these enzymes and secondary metabolites are still poorly understood and have not been explored in depth because of a lack of comprehensive genetic information. Here, we report a 36.51-megabase genome assembly of Talaromyces pinophilus strain 1-95, with coverage of nine scaffolds of eight chromosomes with telomeric repeats at their ends and circular mitochondrial DNA. In total, 13,472 protein-coding genes were predicted. Of these, 803 were annotated to encode enzymes that act on carbohydrates, including 39 cellulose-degrading and 24 starch-degrading enzymes. In addition, 68 secondary metabolism gene clusters were identified, mainly including T1 polyketide synthase genes and nonribosomal peptide synthase genes. Comparative genomic analyses revealed that T. pinophilus 1-95 harbors more biomass-degrading enzymes and secondary metabolites than other related filamentous fungi. The prediction of the T. pinophilus 1-95 secretome indicated that approximately 50% of the biomass-degrading enzymes are secreted into the extracellular environment. These results expanded our genetic knowledge of the biomass-degrading enzyme system of T. pinophilus and its biosynthesis of secondary metabolites, facilitating the cultivation of T. pinophilus for high production of useful products.

July 7, 2019

Complete genome sequence and comparative genomics of the probiotic yeast Saccharomyces boulardii.

The probiotic yeast Saccharomyces boulardii (Sb) is known to be effective against many gastrointestinal disorders and antibiotic-associated diarrhea. To understand the molecular basis of the probiotic properties ascribed to Sb, we determined the complete genomes of two strains of Sb, i.e. Biocodex and unique28, and the draft genomes of three other Sb strains that are marketed as probiotics in India. We compared these genomes with 145 strains of S. cerevisiae (Sc) to understand genome-level similarities and differences between these yeasts. A feature distinguishing Sb from other Sc is the absence of the Ty elements Ty1, Ty3, Ty4 and associated LTR. However, we could identify complete Ty2 and Ty5 elements in Sb. The genes for hexose transporters HXT11 and HXT9, and asparagine-utilization are absent in all Sb strains. We find differences in repeat periods and copy numbers of repeats in flocculin genes that are likely related to the differential adhesion of Sb as compared to Sc. Core-proteome based taxonomy places Sb strains along with wine strains of Sc. We find the introgression of five genes from Z. bailii into the chromosome IV of Sb and wine strains of Sc. Intriguingly, genes involved in conferring known probiotic properties to Sb are conserved in most Sc strains.

July 7, 2019

Genome sequence of Plasmopara viticola and insight into the pathogenic mechanism.

Plasmopara viticola causes downy mildew disease of grapevine, which is one of the most devastating diseases of viticulture worldwide. Here we report a 101.3 Mb whole genome sequence of P. viticola isolate ‘JL-7-2’ obtained by a combination of Illumina and PacBio sequencing technologies. The P. viticola genome contains 17,014 putative protein-coding genes and has ~26% repetitive sequences. A total of 1,301 putative secreted proteins, including 100 putative RXLR effectors and 90 CRN effectors, were identified in this genome. In the secretome, 261 potential pathogenicity genes and 95 carbohydrate-active enzymes were predicted. Transcriptional analysis revealed that most of the RXLR effectors, pathogenicity genes and carbohydrate-active enzymes were significantly up-regulated during infection. Comparative genomic analysis revealed that P. viticola evolved independently from the Arabidopsis downy mildew pathogen Hyaloperonospora arabidopsidis. The availability of the P. viticola genome not only provides a valuable resource for comparative genomic analysis and evolutionary studies among oomycetes, but also enhances our knowledge of the mechanism of interactions between this biotrophic pathogen and its host.

July 7, 2019

Terpene synthases from Cannabis sativa.

Cannabis (Cannabis sativa) plants produce and accumulate a terpene-rich resin in glandular trichomes, which are abundant on the surface of the female inflorescence. Bouquets of different monoterpenes and sesquiterpenes are important components of cannabis resin as they define some of the unique organoleptic properties and may also influence medicinal qualities of different cannabis strains and varieties. Transcriptome analysis of trichomes of the cannabis hemp variety ‘Finola’ revealed sequences of all stages of terpene biosynthesis. Nine cannabis terpene synthases (CsTPS) were identified in subfamilies TPS-a and TPS-b. Functional characterization identified mono- and sesqui-TPS, whose products collectively comprise most of the terpenes of ‘Finola’ resin, including major compounds such as β-myrcene, (E)-β-ocimene, (-)-limonene, (+)-α-pinene, β-caryophyllene, and α-humulene. Transcripts associated with terpene biosynthesis are highly expressed in trichomes compared to non-resin producing tissues. Knowledge of the CsTPS gene family may offer opportunities for selection and improvement of terpene profiles of interest in different cannabis strains and varieties.

July 7, 2019

An improved assembly of the loblolly pine mega-genome using long-read single-molecule sequencing.

The 22-gigabase genome of loblolly pine (Pinus taeda) is one of the largest ever sequenced. The draft assembly published in 2014 was built entirely from short Illumina reads, with lengths ranging from 100 to 250 base pairs (bp). The assembly was quite fragmented, containing over 11 million contigs whose weighted average (N50) size was 8206 bp. To improve this result, we generated approximately 12-fold coverage in long reads using the Single Molecule Real Time sequencing technology developed at Pacific Biosciences. We assembled the long and short reads together using the MaSuRCA mega-reads assembly algorithm, which produced a substantially better assembly, P. taeda version 2.0. The new assembly has an N50 contig size of 25,361 bp, more than three times as large as achieved in the original assembly, and an N50 scaffold size of 107,821 bp, 61% larger than the previous assembly. © The Author 2017. Published by Oxford University Press.
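Several of the abstracts on this page report N50 as a measure of assembly contiguity. As a minimal sketch, assuming the commonly used definition (the contig length L such that contigs of length L or longer together contain at least half of the assembled bases), the statistic can be computed as shown below. The function name `n50` and the toy contig lengths are illustrative assumptions and are not taken from any of the listed papers.

```python
def n50(lengths):
    """Return N50: the largest length L such that contigs of length >= L
    together cover at least half of the total assembled bases."""
    if not lengths:
        raise ValueError("need at least one contig length")
    half_total = sum(lengths) / 2
    running = 0
    # Walk from the longest contig downward, accumulating bases.
    for length in sorted(lengths, reverse=True):
        running += length
        if running >= half_total:
            return length


# Toy contig lengths (hypothetical, not from any real assembly).
toy_contigs = [2, 3, 4, 5, 6, 7, 8, 9, 10]
print(n50(toy_contigs))
```

Because the statistic is weighted by bases rather than by contig count, a few long contigs can raise N50 sharply even when most contigs remain short.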
