Genome assembly Archives

September 22, 2019

Characterization and genomic analyses of Pseudomonas aeruginosa podovirus TC6: establishment of genus Pa11virus.

Phages have attracted renewed interest as an alternative to chemical antibiotics. Although the number of phages is 10-fold higher than that of bacteria, the number of genomically characterized phages is far lower than that of bacteria. In this study, phage TC6, a novel lytic virus of Pseudomonas aeruginosa, was isolated and characterized. TC6 consists of an icosahedral head with a diameter of approximately 54 nm and a short tail with a length of about 17 nm, which are characteristics of the family Podoviridae. TC6 can lyse 86 out of 233 clinically isolated P. aeruginosa strains, thus showing potential for application in phage therapy. The linear double-stranded genomic DNA of TC6 consisted of 49796 base pairs and was predicted to contain 71 protein-coding genes. A total of 11 TC6 structural proteins were identified by mass spectrometry. Comparative analysis revealed that the P. aeruginosa phages TC6, O4, PA11, and IME180 shared high similarity at DNA sequence and proteome levels, among which PA11 was the first phage discovered and published. Meanwhile, these phages contain 54 core genes and have very close phylogenetic relationships, which distinguish them from other known phage genera. We therefore proposed that these four phages can be classified as Pa11virus, comprising a new phage genus of Podoviridae that infects Pseudomonas spp. The results of this work promoted our understanding of phage biology, classification, and diversity.

September 22, 2019

A complete Cannabis chromosome assembly and adaptive admixture for elevated cannabidiol (CBD) content

Cannabis has been cultivated for millennia with distinct cultivars providing either fiber and grain or tetrahydrocannabinol. Recent demand for cannabidiol rather than tetrahydrocannabinol has favored the breeding of admixed cultivars with extremely high cannabidiol content. Despite several draft Cannabis genomes, the genomic structure of cannabinoid synthase loci has remained elusive. A genetic map derived from a tetrahydrocannabinol/cannabidiol segregating population and a complete chromosome assembly from a high-cannabidiol cultivar together resolve the linkage of cannabidiolic and tetrahydrocannabinolic acid synthase gene clusters which are associated with transposable elements. High-cannabidiol cultivars appear to have been generated by integrating hemp-type cannabidiolic acid synthase gene clusters into a background of marijuana-type cannabis. Quantitative trait locus mapping suggests that overall drug potency, however, is associated with other genomic regions needing additional study.

September 22, 2019

SKA: Split Kmer Analysis Toolkit for Bacterial Genomic Epidemiology

Genome sequencing is revolutionising infectious disease epidemiology, providing a huge step forward in sensitivity and specificity over more traditional molecular typing techniques. However, the complexity of genome data often means that its analysis and interpretation requires high-performance compute infrastructure and dedicated bioinformatics support. Furthermore, current methods have limitations that can differ between analyses and are often opaque to the user, and their reliance on multiple external dependencies makes reproducibility difficult. Here I introduce SKA, a toolkit for analysis of genome sequence data from closely-related, small, haploid genomes. SKA uses split kmers to rapidly identify variation between genome sequences, making it possible to analyse hundreds of genomes on a standard home computer. Tests on publicly available simulated and real-life data show that SKA is both faster and more efficient than the gold standard methods used today while retaining similar levels of accuracy for epidemiological purposes. SKA can take raw read data or genome assemblies as input and calculate pairwise distances, create single linkage clusters and align genomes to a reference genome or using a reference-free approach. SKA requires few decisions to be made by the user, which, along with its computational efficiency, allows genome analysis to become accessible to those with only basic bioinformatics training. The limitations of SKA are also far more transparent than for current approaches, and future improvements to mitigate these limitations are possible. Overall, SKA is a powerful addition to the armoury of the genomic epidemiologist. SKA source code is available from Github (https://github.com/simonrharris/SKA).
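The split kmer idea in the abstract above can be illustrated with a small sketch. It assumes that a split kmer is a pair of flanking kmers around a single middle base, and that two sequences differ wherever they share the flanks but not the middle base. The function names, the tiny k, and the toy sequences are hypothetical and are not taken from the SKA source code; reverse complements and read-level filtering are left out.

```python
def split_kmers(seq, k=3):
    """Map each (left flank, right flank) pair to the base between them.

    Flank pairs seen with more than one middle base are dropped as ambiguous.
    """
    seq = seq.upper()
    found = {}
    ambiguous = set()
    # Each window is k bases, one middle base, then k bases.
    for i in range(len(seq) - 2 * k):
        left = seq[i:i + k]
        middle = seq[i + k]
        right = seq[i + k + 1:i + 2 * k + 1]
        key = (left, right)
        if key in ambiguous:
            continue
        if key in found and found[key] != middle:
            # Same flanks, different middle base within one sequence.
            del found[key]
            ambiguous.add(key)
        else:
            found[key] = middle
    return found


def snp_distance(seq_a, seq_b, k=3):
    """Count shared flank pairs whose middle base differs between sequences."""
    a = split_kmers(seq_a, k)
    b = split_kmers(seq_b, k)
    shared = a.keys() & b.keys()
    return sum(1 for key in shared if a[key] != b[key])


if __name__ == "__main__":
    # Toy sequences that differ at a single position.
    print(snp_distance("ACGTACGGTCA", "ACGTACAGTCA", k=3))
```

Because only dictionary lookups on the flanks are needed, a comparison of this kind needs no alignment step, which is in line with the abstract's claim that such analyses can run on a standard computer.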

September 22, 2019

Physiological genomics of dietary adaptation in a marine herbivorous fish

Adopting a new diet is a significant evolutionary change and can profoundly affect an animal's physiology, biochemistry, ecology, and its genome. To study this evolutionary transition, we investigated the physiology and genomics of digestion of a derived herbivorous fish, the monkeyface prickleback (Cebidichthys violaceus). We sequenced and assembled its genome and digestive transcriptome and revealed the molecular changes related to important dietary enzymes, finding abundant evidence for adaptation at the molecular level. In this species, two gene families experienced expansion in copy number and adaptive amino acid substitutions. These families, amylase and bile salt activated lipase, are involved in digestion of carbohydrates and lipids, respectively. Both show elevated levels of gene expression and increased enzyme activity. Because carbohydrates are abundant in the prickleback's diet and lipids are rare, these findings suggest that such dietary specialization involves both exploiting abundant resources and scavenging rare ones, especially essential nutrients, like essential fatty acids.

September 22, 2019

Antimicrobial resistance profile of mcr-1 positive clinical isolates of Escherichia coli in China From 2013 to 2016.

Multidrug-resistant (MDR) Escherichia coli has posed a great challenge for public health in recent decades. Polymyxins have been reconsidered as a valuable therapeutic option for the treatment of infections caused by MDR E. coli. A plasmid-encoded colistin resistance gene mcr-1 encoding phosphoethanolamine transferase has been recently described in Enterobacteriaceae. In this study, a total of 123 E. coli isolates obtained from patients with diarrheal diseases in China were used for the genetic analysis of colistin resistance in clinical isolates. The antimicrobial resistance profiles for polymyxin B (PB) and 11 commonly used antimicrobial agents were determined. Among the 123 E. coli isolates, 9 isolates (7.3%) were resistant to PB and PCR screening showed that seven (5.7%) isolates carried the mcr-1 gene. A hybrid sequencing analysis using single-molecule, real-time (SMRT) sequencing and Illumina sequencing was then performed to resolve the genomes of the seven mcr-1 positive isolates. These seven isolates harbored multiple plasmids and are MDR, with six isolates carrying one mcr-1 positive plasmid and one isolate (14EC033) carrying two mcr-1 positive plasmids. These eight mcr-1 positive plasmids belonged to the IncX4, IncI2, and IncP1 types. In addition, the mcr-1 gene was the sole antibiotic resistance gene identified in the mcr-1 positive plasmids, while the rest of the antibiotic resistance genes were mostly clustered into one or two plasmids. Interestingly, one mcr-1 positive isolate (14EC047) was susceptible to PB, and we showed that the activity of MCR-1-mediated colistin resistance was not phenotypically expressed in the 14EC047 host strain. Furthermore, three isolates exhibited resistance to PB but did not carry previously reported mcr-related genes. Multilocus sequence typing (MLST) showed that these mcr-1 positive E. coli isolates belonged to five different STs, and three isolates belonged to ST301 which carried multiple virulence factors related to diarrhea. Additionally, the mcr-1 positive isolates were all susceptible to imipenem (IMP), suggesting that IMP could be used to treat infection caused by mcr-1 positive E. coli isolates. Collectively, this study showed a high occurrence of mcr-1 positive plasmids in patients with diarrheal diseases in Guangzhou, China, and the abolishment of the MCR-1 mediated colistin resistance in one E. coli isolate.

September 22, 2019

Bacterial virulence against an oceanic bloom-forming phytoplankter is mediated by algal DMSP

Emiliania huxleyi is a bloom-forming microalga that affects the global sulfur cycle by producing large amounts of dimethylsulfoniopropionate (DMSP) and its volatile metabolic product dimethyl sulfide. Top-down regulation of E. huxleyi blooms has been attributed to viruses and grazers; however, the possible involvement of algicidal bacteria in bloom demise has remained elusive. We demonstrate that a Roseobacter strain, Sulfitobacter D7, that we isolated from a North Atlantic E. huxleyi bloom, exhibited algicidal effects against E. huxleyi upon coculturing. Both the alga and the bacterium were found to co-occur during a natural E. huxleyi bloom, therefore establishing this host-pathogen system as an attractive, ecologically relevant model for studying algal-bacterial interactions in the oceans. During interaction, Sulfitobacter D7 consumed and metabolized algal DMSP to produce high amounts of methanethiol, an alternative product of DMSP catabolism. We revealed a unique strain-specific response, in which E. huxleyi strains that exuded higher amounts of DMSP were more susceptible to Sulfitobacter D7 infection. Intriguingly, exogenous application of DMSP enhanced bacterial virulence and induced susceptibility in an algal strain typically resistant to the bacterial pathogen. This enhanced virulence was highly specific to DMSP compared to addition of propionate and glycerol which had no effect on bacterial virulence. We propose a novel function for DMSP, in addition to its central role in mutualistic interactions among marine organisms, as a mediator of bacterial virulence that may regulate E. huxleyi blooms.

September 22, 2019

pYR4 from a Norwegian isolate of Yersinia ruckeri is a putative virulence plasmid encoding both a type IV pilus and a type IV secretion system

Enteric redmouth disease caused by the pathogen Yersinia ruckeri is a significant problem for fish farming around the world. Despite its importance, only a few virulence factors of Y. ruckeri have been identified and studied in detail. Here, we report and analyze the complete DNA sequence of pYR4, a plasmid from a highly pathogenic Norwegian Y. ruckeri isolate, sequenced using PacBio SMRT technology. Like the well-known pYV plasmid of human pathogenic Yersiniae, pYR4 is a member of the IncFII family. Thirty-one percent of the pYR4 sequence is unique compared to other Y. ruckeri plasmids. The unique regions contain, among other genes, a large number of mobile genetic elements and two partitioning systems. The G+C content of pYR4 is higher than that of the Y. ruckeri NVH_3758 genome, indicating its relatively recent horizontal acquisition. pYR4, as well as the related plasmid pYR3, comprises operons that encode type IV pili and a conjugation system (tra). In contrast to other Yersinia plasmids, pYR4 cannot be cured at elevated temperatures. Our study highlights the power of PacBio sequencing technology for identifying mis-assembled segments of genomic sequences. Comparative analysis of pYR4 and other Y. ruckeri plasmids and genomes, which were sequenced by second- and third-generation sequencing technologies, showed errors in second-generation sequencing assemblies. Specifically, in the Y. ruckeri 150 and Y. ruckeri ATCC29473 genome assemblies, we mapped the entire pYR3 plasmid sequence. Placing plasmid sequences on the chromosome can result in erroneous biological conclusions. Thus, PacBio sequencing or similar long-read methods should always be preferred for de novo genome sequencing. As the tra operons of pYR3, although misplaced on the chromosome during the genome assembly process, were demonstrated to have an effect on virulence, and type IV pili are virulence factors in many bacteria, we suggest that pYR4 directly contributes to Y. ruckeri virulence.
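The G+C comparison mentioned in the abstract above comes down to a simple base count. The following is a minimal sketch, assuming plain nucleotide strings as input; the function name and the toy sequences are hypothetical, and no real pYR4 or NVH_3758 sequence data is used.

```python
def gc_content(seq):
    """Return the fraction of G and C among the unambiguous bases of seq."""
    seq = seq.upper()
    # Ignore ambiguity codes such as N so they do not dilute the ratio.
    counted = [base for base in seq if base in "ACGT"]
    if not counted:
        return 0.0
    gc = sum(1 for base in counted if base in "GC")
    return gc / len(counted)


if __name__ == "__main__":
    # Hypothetical stand-ins for a plasmid and a chromosome sequence.
    toy_plasmid = "GGCCGCATGC"
    toy_chromosome = "ATATGCATTA"
    print(gc_content(toy_plasmid), gc_content(toy_chromosome))
```

A difference in this ratio between a plasmid and its host chromosome is the kind of signal the authors interpret as a sign of relatively recent horizontal acquisition.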

September 22, 2019

A continuous genome assembly of the corkwing wrasse (Symphodus melops).

The wrasses (Labridae) are one of the most successful and species-rich families of the Perciformes order of teleost fish. Its members display great morphological diversity, and occupy distinct trophic levels in coastal waters and coral reefs. The cleaning behaviour displayed by some wrasses, such as corkwing wrasse (Symphodus melops), is of particular interest for the salmon aquaculture industry to combat and control sea lice infestation as an alternative to chemicals and pharmaceuticals. There are still few genome assemblies available within this fish family for comparative and functional studies, despite the rapid increase in genome resources generated during the past years. Here, we present a highly continuous genome assembly of the corkwing wrasse using PacBio SMRT sequencing (x28.8) followed by error correction with paired-end Illumina data (x132.9). The present genome assembly consists of 5040 contigs (N50 = 461,652 bp) and a total size of 614 Mbp, of which 8.5% of the genome sequence encode known repeated elements. The genome assembly covers 94.21% of highly conserved genes across ray-finned fish species. We find evidence for increased copy numbers specific for corkwing wrasse possibly highlighting diversification and adaptive processes in gene families including N-linked glycosylation (ST8SIA6) and stress response kinases (HIPK1). By comparative analyses, we discover that de novo repeats, often not properly investigated during genome annotation, encode hundreds of immune-related genes. This new genomic resource, together with the ballan wrasse (Labrus bergylta), will allow for in-depth comparative genomics as well as population genetic analyses for the understudied wrasses. Copyright © 2018 Elsevier Inc. All rights reserved.
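The contig N50 reported in the abstract above is a standard continuity statistic: the length of the contig at which the running total, taken from the longest contig downward, first reaches half of the assembly size. A minimal sketch follows; the function name and the toy contig lengths are hypothetical and are not the wrasse assembly's actual contigs.

```python
def n50(lengths):
    """Return the N50 of a collection of contig lengths."""
    if not lengths:
        raise ValueError("at least one contig length is required")
    total = sum(lengths)
    running = 0
    # Walk from the longest contig down until half the assembly is covered.
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length


if __name__ == "__main__":
    toy_contigs = [2, 3, 4, 5, 6, 7, 8, 9, 10]
    print(n50(toy_contigs))
```

A higher N50 for the same total size means that more of the assembly sits in long contigs, which is what "highly continuous" refers to here.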

September 22, 2019

The Butanol Producing Microbe Clostridium beijerinckii NCIMB 14988 Manipulated Using Forward and Reverse Genetic Tools.

The solventogenic anaerobe Clostridium beijerinckii has potential for use in the sustainable bioconversion of plant-derived carbohydrates into solvents, such as butanol or acetone. However, relatively few strains have been extensively characterised either at the genomic level or through exemplification of a complete genetic toolkit. To remedy this situation, a new strain of C. beijerinckii, NCIMB 14988, is selected from among a total of 55 new clostridial isolates capable of growth on hexose and pentose sugars. Chosen on the basis of its favorable properties, the complete genome sequence of NCIMB 14988 is determined and a high-efficiency plasmid transformation protocol devised. The developed DNA transfer procedure allowed demonstration in NCIMB 14988 of the forward and reverse genetic techniques of transposon mutagenesis and gene knockout, respectively. The latter is accomplished through the successful deployment of both group II intron retargeting (ClosTron) and allelic exchange. In addition to gene inactivation, the developed allelic exchange procedure is used to create point mutations in the chromosome, allowing for the effect of amino acid changes in enzymes involved in primary metabolism to be characterized. ClosTron mediated disruption of the currently unannotated non-coding region between genes LF65_05915 and LF65_05920 is found to result in a non-sporulating phenotype. © 2018 The Authors. Biotechnology Journal Published by Wiley-VCH Verlag GmbH & Co. KGaA.

September 22, 2019

Comparative genomics of Staphylococcus reveals determinants of speciation and diversification of antimicrobial defense.

The bacterial genus Staphylococcus comprises diverse species with most being described as colonizers of human and animal skin. A relational analysis of features that discriminate its species and contribute to niche adaptation and survival remains to be fully described. In this study, an interspecies, whole-genome comparative analysis of 21 Staphylococcus species was performed based on their orthologues. Three well-defined multi-species groups were identified: group A (including aureus/epidermidis); group B (including saprophyticus/xylosus) and group C (including pseudintermedius/delphini). The machine learning algorithm Random Forest was applied to prioritize orthologs that drive formation of the Staphylococcus species groups A-C. Orthologues driving staphylococcal intrageneric diversity comprised regulatory, metabolic and antimicrobial resistance proteins. Notably, the BraSR (NsaRS) two-component system (TCS) and its associated BraDE transporters that regulate antimicrobial resistance showed limited distribution in the genus and their presence was most closely associated with a subset of Staphylococcus species dominated by those that colonize human skin. Divergence of BraSR and GraSR antimicrobial peptide survival TCS and their associated transporters was observed across the staphylococci, likely reflecting niche specific evolution of these TCS/transporters and their specificities for AMPs. Experimental evolution, with selection for resistance to the lantibiotic nisin, revealed multiple routes to resistance and differences in the selection outcomes of the BraSR-positive species S. hominis and S. aureus. Selection supported a role for GraSR in nisin survival responses of the BraSR-negative species S. saprophyticus. Our study reveals diversification of antimicrobial-sensing TCS across the staphylococci and hints at differential relationships between GraSR and BraSR in those species positive for both TCS.

September 22, 2019

Thermosipho spp. immune system differences affect variation in genome size and geographical distributions.

Thermosipho species inhabit thermal environments such as marine hydrothermal vents, petroleum reservoirs, and terrestrial hot springs. A 16S rRNA phylogeny of available Thermosipho spp. sequences suggested habitat specialists adapted to living in hydrothermal vents only, and habitat generalists inhabiting oil reservoirs, hydrothermal vents, and hot springs. Comparative genomics of 15 Thermosipho genomes separated them into three distinct species with different habitat distributions: the widely distributed T. africanus and the more specialized T. melanesiensis and T. affectus. Moreover, the species can be differentiated on the basis of genome size (GS), genome content, and immune system composition. For instance, the T. africanus genomes are the largest and contain the most carbohydrate metabolism genes, which could explain why these isolates were obtained from ecologically more divergent habitats. Nonetheless, all the Thermosipho genomes, like other Thermotogae genomes, show evidence of genome streamlining. GS differences between the species could further be correlated to differences in defense capacities against foreign DNA, which influence recombination via HGT. The smallest genomes are found in T. affectus, which contain both CRISPR-cas Type I and III systems, but no RM system genes. We suggest that this has caused these genomes to be almost devoid of mobile elements, in contrast to the genomes of the two other species, which contain a higher abundance of mobile elements combined with different immune system configurations. Taken together, the comparative genomic analyses of Thermosipho spp. revealed genetic variation allowing habitat differentiation within the genus as well as differentiation with respect to invading mobile DNA.

September 22, 2019

Functional genomic analysis of phthalate acid ester (PAE) catabolism genes in the versatile PAE-mineralising bacterium Rhodococcus sp. 2G.

Microbial degradation is considered the most promising method for removing phthalate acid esters (PAEs) from polluted environments; however, a comprehensive genomic understanding of the entire PAE catabolic process is still lacking. In this study, the repertoire of PAE catabolism genes in the metabolically versatile bacterium Rhodococcus sp. 2G was examined using genomic, metabolic, and bioinformatic analyses. A total of 4930 coding genes were identified from the 5.6 Mb genome of the 2G strain, including 337 esterase/hydrolase genes and 48 transferase and decarboxylase genes that were involved in hydrolysing PAEs into phthalate acid (PA) and decarboxylating PA into benzoic acid (BA). One gene cluster (xyl) responsible for transforming BA into catechol and two catechol-catabolism gene clusters controlling the ortho (cat) and meta (xyl & mhp) cleavage pathways were also identified. The proposed PAE catabolism pathway and some key degradation genes were validated by intermediate-utilising tests and real-time quantitative polymerase chain reaction. Our results provide novel insight into the mechanisms of PAE biodegradation at the molecular level and useful information on gene resources for future studies. Copyright © 2018 Elsevier B.V. All rights reserved.

September 22, 2019

Understanding explosive diversification through cichlid fish genomics.

Owing to their taxonomic, phenotypic, ecological and behavioural diversity and propensity for explosive diversification, the assemblages of cichlid fish in the East African Great Lakes Victoria, Malawi and Tanganyika are important role models in evolutionary biology. With the release of five reference genomes and many additional genomic resources, as well as the establishment of functional genomic tools, the cichlid system has fully entered the genomic era. The in-depth genomic exploration of the East African cichlid fauna – in combination with the examination of their ecology, morphology and behaviour – permits novel insights into the way organisms diversify.

September 22, 2019

The unique evolution of the pig LRC, a single KIR but expansion of LILR and a novel Ig receptor family.

The leukocyte receptor complex (LRC) encodes numerous immunoglobulin (Ig)-like receptors involved in innate immunity. These include the killer-cell Ig-like receptors (KIR) and the leukocyte Ig-like receptors (LILR) which can be polymorphic and vary greatly in number between species. Using the recent long-read genome assembly, Sscrofa11.1, we have characterized the porcine LRC on chromosome 6. We identified a ~197-kb region containing numerous LILR genes that were missing in previous assemblies. Out of 17 such LILR genes and fragments, six encode functional proteins, of which three are inhibitory and three are activating, while the majority of pseudogenes had the potential to encode activating receptors. Elsewhere in the LRC, between FCAR and GP6, we identified a novel gene that encodes two Ig-like domains and a long inhibitory intracellular tail. Comparison with two other porcine assemblies revealed a second, nearly identical, non-functional gene encoding a short intracellular tail with ambiguous function. These novel genes were found in a diverse range of mammalian species, including a pseudogene in humans, and typically consist of a single long-tailed receptor and a variable number of short-tailed receptors. Using porcine transcriptome data, both the novel inhibitory gene and the LILR were highly expressed in peripheral blood, while the single KIR gene, KIR2DL1, was either very poorly expressed or not at all. These observations are a prerequisite for improved understanding of immune cell functions in the pig and other species.

September 22, 2019

How complete are “complete” genome assemblies?-An avian perspective.

The genomics revolution has led to the sequencing of a large variety of nonmodel organisms often referred to as “whole” or “complete” genome assemblies. But how complete are these, really? Here, we use birds as an example for nonmodel vertebrates and find that, although suitable in principle for genomic studies, the current standard of short-read assemblies misses a significant proportion of the expected genome size (7% to 42%; mean 20 ± 9%). In particular, regions with strongly deviating nucleotide composition (e.g., guanine-cytosine-[GC]-rich) and regions highly enriched in repetitive DNA (e.g., transposable elements and satellite DNA) are usually underrepresented in assemblies. However, long-read sequencing technologies successfully characterize many of these underrepresented GC-rich or repeat-rich regions in several bird genomes. For instance, only ~2% of the expected total base pairs are missing in the last chicken reference (galGal5). These assemblies still contain thousands of gaps (i.e., fragmented sequences) because some chromosomal structures (e.g., centromeres) likely contain arrays of repetitive DNA that are too long to bridge with currently available technologies. We discuss how to minimize the number of assembly gaps by combining the latest available technologies with complementary strengths. Finally, we emphasize the importance of knowing the location, size and potential content of assembly gaps when making population genetic inferences about adjacent genomic regions. © 2018 The Authors. Molecular Ecology Resources Published by John Wiley & Sons Ltd.
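The "missing proportion" figures in the abstract above relate an assembly's size to the expected genome size. The following is a minimal sketch of that arithmetic, assuming both sizes are given in base pairs; the function name and the example sizes are hypothetical and are not taken from the paper.

```python
def missing_fraction(assembly_bp, expected_bp):
    """Return the share of the expected genome size absent from an assembly."""
    if expected_bp <= 0:
        raise ValueError("expected genome size must be positive")
    # Clamp at zero in case an assembly exceeds the size estimate.
    return max(0.0, 1.0 - assembly_bp / expected_bp)


if __name__ == "__main__":
    # Hypothetical sizes: a 1.00 Gb assembly against a 1.25 Gb estimate.
    print(round(missing_fraction(1_000_000_000, 1_250_000_000), 3))
```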
