Arctic charr have a circumpolar distribution, persevere under extreme environmental conditions, and reach ages unknown to most other salmonids. The Salvelinus genus is primarily composed of species whose genomes are structured more like the ancestral salmonid genome than those of most species in the sister genera Oncorhynchus and Salmo. It is thought that this aspect of the genome may be important for local adaptation (due to increased recombination) and anadromy (the migration of fish from saltwater to freshwater). In this study, we describe the generation of a new genetic map and the sequencing and assembly of the Arctic charr genome (GenBank accession: GCF_002910315.2) using the newly created genetic map and a previous genetic map, and we present several analyses of the Arctic charr genes and genome assembly. The newly generated genetic map consists of 8,574 unique genetic markers and is similar to previous genetic maps with the exception of three major structural differences. The N50, identified BUSCOs, repetitive DNA content, and total size of the Arctic charr assembled genome are all comparable to other assembled salmonid genomes. An analysis to identify orthologous genes revealed that a large number of orthologs could be identified between salmonids, and many appear to have highly conserved gene expression profiles between species. Comparing orthologous gene expression profiles may give us better insight into which genes are more likely to influence species-specific phenotypes.
The wrasses (Labridae) are one of the most successful and species-rich families of the Perciformes order of teleost fish. Its members display great morphological diversity and occupy distinct trophic levels in coastal waters and coral reefs. The cleaning behaviour displayed by some wrasses, such as corkwing wrasse (Symphodus melops), is of particular interest to the salmon aquaculture industry for combating and controlling sea lice infestation as an alternative to chemicals and pharmaceuticals. There are still few genome assemblies available within this fish family for comparative and functional studies, despite the rapid increase in genome resources generated in recent years. Here, we present a highly continuous genome assembly of the corkwing wrasse using PacBio SMRT sequencing (x28.8) followed by error correction with paired-end Illumina data (x132.9). The present genome assembly consists of 5040 contigs (N50 = 461,652 bp) and has a total size of 614 Mbp, of which 8.5% encodes known repeated elements. The genome assembly covers 94.21% of highly conserved genes across ray-finned fish species. We find evidence for increased copy numbers specific to corkwing wrasse, possibly highlighting diversification and adaptive processes in gene families including N-linked glycosylation (ST8SIA6) and stress response kinases (HIPK1). By comparative analyses, we discover that de novo repeats, often not properly investigated during genome annotation, encode hundreds of immune-related genes. This new genomic resource, together with that of the ballan wrasse (Labrus bergylta), will allow for in-depth comparative genomics as well as population genetic analyses for the understudied wrasses. Copyright © 2018 Elsevier Inc. All rights reserved.
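The contig N50 quoted in the abstract above is a standard contiguity statistic: the length of the contig at which the running total, taken from the longest contig downwards, first reaches half of the assembly size. A minimal sketch of that calculation follows. The function name `n50` and the example lengths are illustrative assumptions, not values or code from the study.

```python
def n50(lengths):
    """Return the N50 of a collection of contig lengths (in bp)."""
    if not lengths:
        raise ValueError("need at least one contig length")
    # Walk from the longest contig down, accumulating length.
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        # The first contig that pushes the running sum to at least
        # half of the total assembly size defines the N50.
        if running >= half_total:
            return length


# Hypothetical contig lengths in bp (not taken from the paper).
example_lengths = [100, 200, 300, 400, 500]
print(n50(example_lengths))
```

In practice the lengths would be read from the assembly FASTA file; they are hard-coded here only to keep the sketch self-contained.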
Thermosipho spp. immune system differences affect variation in genome size and geographical distributions.
Thermosipho species inhabit thermal environments such as marine hydrothermal vents, petroleum reservoirs, and terrestrial hot springs. A 16S rRNA phylogeny of available Thermosipho spp. sequences suggested habitat specialists adapted to living in hydrothermal vents only, and habitat generalists inhabiting oil reservoirs, hydrothermal vents, and hot springs. Comparative genomics of 15 Thermosipho genomes separated them into three distinct species with different habitat distributions: the widely distributed T. africanus and the more specialized T. melanesiensis and T. affectus. Moreover, the species can be differentiated on the basis of genome size (GS), genome content, and immune system composition. For instance, the T. africanus genomes are the largest and contain the most carbohydrate metabolism genes, which could explain why these isolates were obtained from ecologically more divergent habitats. Nonetheless, all the Thermosipho genomes, like other Thermotogae genomes, show evidence of genome streamlining. GS differences between the species could further be correlated to differences in defense capacities against foreign DNA, which influence recombination via HGT. The smallest genomes are found in T. affectus; they contain both CRISPR-cas Type I and III systems, but no RM system genes. We suggest that this has caused these genomes to be almost devoid of mobile elements, in contrast to the genomes of the other two species, which contain a higher abundance of mobile elements combined with different immune system configurations. Taken together, the comparative genomic analyses of Thermosipho spp. revealed genetic variation allowing habitat differentiation within the genus as well as differentiation with respect to invading mobile DNA.
Edible bananas result from interspecific hybridization between Musa acuminata and Musa balbisiana, as well as among subspecies in M. acuminata. Four particular M. acuminata subspecies have been proposed as the main contributors of edible bananas, all of which radiated in a short period of time in southeastern Asia. Clarifying the evolution of these lineages at a whole-genome scale is therefore an important step toward understanding the domestication and diversification of this crop. This study reports the de novo genome assembly and gene annotation of a representative genotype from three different subspecies of M. acuminata. These data are combined with the previously published genome of the fourth subspecies to investigate phylogenetic relationships. Analyses of shared and unique gene families reveal that the four subspecies are quite homogeneous, with a core genome representing at least 50% of all genes and very few M. acuminata species-specific gene families. Multiple alignments indicate high sequence identity between homologous single-copy genes, supporting the close relationships of these lineages. Interestingly, phylogenomic analyses demonstrate high levels of gene tree discordance, due to both incomplete lineage sorting and introgression. This pattern suggests rapid radiation within Musa acuminata subspecies that occurred after the divergence from M. balbisiana. Introgression between M. a. ssp. malaccensis and M. a. ssp. burmannica was detected across the genome, though multiple approaches to resolve the subspecies tree converged on the same topology. To support evolutionary and functional analyses, we introduce the PanMusa database, which enables researchers to explore individual gene families and trees.
Complete genome sequence of Tessaracoccus sp. strain T2.5-30 isolated from 139.5 meters deep on the subsurface of the Iberian Pyritic Belt.
Here, we report the complete genome sequence of Tessaracoccus sp. strain T2.5-30, which consists of a chromosome with 3.2 Mbp, 70.4% G+C content, and 3,005 coding DNA sequences. The strain was isolated from a rock core retrieved at a depth of 139.5 m in the subsurface of the Iberian Pyritic Belt (Spain). Copyright © 2017 Leandro et al.
Complete genome sequence of Vibrio anguillarum strain NB10, a virulent isolate from the Gulf of Bothnia.
Vibrio anguillarum causes a fatal hemorrhagic septicemia in marine fish that leads to great economic losses in aquaculture worldwide. Vibrio anguillarum strain NB10 serotype O1 is a Gram-negative, motile, curved rod-shaped bacterium, isolated from a diseased fish on the Swedish coast of the Gulf of Bothnia, and is slightly halophilic. Strain NB10 is a virulent isolate that readily colonizes fish skin and intestinal tissues. Here, the features of this bacterium are described and the annotation and analysis of its complete genome sequence are presented. The genome is 4,373,835 bp in size, consists of two circular chromosomes and one plasmid, and contains 3,783 protein-coding genes and 129 RNA genes.
Genome sequences of Corynebacterium pseudotuberculosis strains 48252 (human, pneumonia), CS_10 (lab strain), Ft_2193/67 (goat, pus), and CCUG 27541.
Here we report the genome sequences of four Corynebacterium pseudotuberculosis strains. These include a strain isolated from a patient with C. pseudotuberculosis pneumonia (48252), a strain isolated from pus in a goat (Ft_2193/67), a laboratory strain originating from strain Ft_2193/67 (CS_10), and the draft genome of an equine reference strain, CCUG 27541. Copyright © 2014 Håvelsrud et al.
Seeking the source of Pseudomonas aeruginosa infections in a recently opened hospital: an observational study using whole-genome sequencing.
Pseudomonas aeruginosa is a common nosocomial pathogen responsible for significant morbidity and mortality internationally. Patients may become colonised or infected with P. aeruginosa after exposure to contaminated sources within the hospital environment. The aim of this study was to determine whether whole-genome sequencing (WGS) can be used to determine the source in a cohort of burns patients at high risk of P. aeruginosa acquisition. An observational prospective cohort study. Burns care ward and critical care ward in the UK. Patients with >7% total burns by surface area were recruited into the study. All patients were screened for P. aeruginosa on admission and samples taken from their immediate environment, including water. Screening patients who subsequently developed a positive P. aeruginosa microbiology result were subject to enhanced environmental surveillance. All isolates of P. aeruginosa were genome sequenced. Sequence analysis looked at similarity and relatedness between isolates. WGS for 141 P. aeruginosa isolates were obtained from patients, hospital water and the ward environment. Phylogenetic analysis revealed eight distinct clades, with a single clade representing the majority of environmental isolates in the burns unit. Isolates from three patients had identical genotypes compared with water isolates from the same room. There was clear clustering of water isolates by room and outlet, allowing the source of acquisitions to be unambiguously identified. Whole-genome shotgun sequencing of biofilm DNA extracted from a thermostatic mixer valve revealed this was the source of a P. aeruginosa subpopulation previously detected in water. In the remaining two cases there was no clear link to the hospital environment. This study reveals that WGS can be used for source tracking of P. aeruginosa in a hospital setting, and that acquisitions can be traced to a specific source within a hospital ward. Published by the BMJ Publishing Group Limited.
Dissemination of cephalosporin resistance genes between Escherichia coli strains from farm animals and humans by specific plasmid lineages.
Third-generation cephalosporins are a class of β-lactam antibiotics that are often used for the treatment of human infections caused by Gram-negative bacteria, especially Escherichia coli. Worryingly, the incidence of human infections caused by third-generation cephalosporin-resistant E. coli is increasing worldwide. Recent studies have suggested that these E. coli strains, and their antibiotic resistance genes, can spread from food-producing animals, via the food-chain, to humans. However, these studies used traditional typing methods, which may not have provided sufficient resolution to reliably assess the relatedness of these strains. We therefore used whole-genome sequencing (WGS) to study the relatedness of cephalosporin-resistant E. coli from humans, chicken meat, poultry and pigs. One strain collection included pairs of human and poultry-associated strains that had previously been considered to be identical based on Multi-Locus Sequence Typing, plasmid typing and antibiotic resistance gene sequencing. The second collection included isolates from farmers and their pigs. WGS analysis revealed considerable heterogeneity between human and poultry-associated isolates. The most closely related pairs of strains from both sources carried 1263 Single-Nucleotide Polymorphisms (SNPs) per Mbp core genome. In contrast, epidemiologically linked strains from humans and pigs differed by only 1.8 SNPs per Mbp core genome. WGS-based plasmid reconstructions revealed three distinct plasmid lineages (IncI1- and IncK-type) that carried cephalosporin resistance genes of the Extended-Spectrum Beta-Lactamase (ESBL)- and AmpC-types. The plasmid backbones within each lineage were virtually identical and were shared by genetically unrelated human and animal isolates. Plasmid reconstructions from short-read sequencing data were validated by long-read DNA sequencing for two strains. Our findings failed to demonstrate evidence for recent clonal transmission of cephalosporin-resistant E. coli strains from poultry to humans, as has been suggested based on traditional, low-resolution typing methods. Instead, our data suggest that cephalosporin resistance genes are mainly disseminated in animals and humans via distinct plasmids.
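The relatedness figures in the abstract above are expressed as SNPs per Mbp of core genome, that is, a SNP count normalized by the number of compared positions. The sketch below shows one way such a density could be computed from two aligned core-genome sequences. It is an illustrative assumption, not the pipeline used in the study: the function name `snps_per_mbp` and the toy sequences are hypothetical, and positions with gaps or ambiguous bases are simply skipped.

```python
def snps_per_mbp(seq_a, seq_b):
    """SNP density (per Mbp) between two aligned sequences of equal length."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    valid = set("ACGT")
    compared = 0
    snps = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        # Only count positions where both strains have an unambiguous base.
        if a in valid and b in valid:
            compared += 1
            if a != b:
                snps += 1
    if compared == 0:
        raise ValueError("no comparable positions")
    # Normalize the raw SNP count to one million compared bases.
    return snps * 1_000_000 / compared


# Toy alignment (hypothetical): one mismatch over ten comparable positions.
print(snps_per_mbp("ACGTACGTAC", "ACGTACGTAT"))
```

Normalizing by compared positions rather than by total alignment length keeps the density comparable between strain pairs whose core genomes contain different amounts of missing data.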
Complete genome sequence of Lutibacter profundi LP1T isolated from an Arctic deep-sea hydrothermal vent system.
Lutibacter profundi LP1T within the family Flavobacteriaceae was isolated from a biofilm growing on the surface of a black smoker chimney at the Loki’s Castle vent field, located on the Arctic Mid-Ocean Ridge. The complete genome of L. profundi LP1T is the first genome to be published within the genus Lutibacter. The genome of L. profundi LP1T consists of a single 2,966,978 bp circular chromosome with a GC content of 29.8%. The genome comprises 2,537 protein-coding genes, 40 tRNA species and 2 rRNA operons. The microaerophilic, organotrophic isolate contains genes for all central carbohydrate metabolic pathways. However, genes for the oxidative branch of the pentose phosphate pathway, the glyoxylate shunt of the tricarboxylic acid cycle and the ATP citrate lyase for reverse TCA are not present. L. profundi LP1T utilizes starch, sucrose and diverse proteinaceous carbon sources. Accordingly, the genome harbours 130 proteases and 104 carbohydrate-active enzymes, indicating a specialization in degrading organic matter. Among a small arsenal of 24 glycosyl hydrolases, which offer the possibility to hydrolyse diverse poly- and oligosaccharides, a starch utilization cluster was identified. Furthermore, a variety of enzymes may be secreted via T9SS and contribute to the hydrolytic variety of the microorganism. Genes for gliding motility are present, which may enable the bacteria to move within the biofilm. A substantial number of genes encoding extracellular polysaccharide synthesis pathways, curli fibres and attachment to surfaces could mediate adhesion in the biofilm and may contribute to biofilm formation. In addition to aerobic respiration, the complete denitrification pathway and genes for sulphide oxidation, e.g. sulphide:quinone reductase, are present in the genome. Sulphide:quinone reductase and denitrification may serve as detoxification systems allowing L. profundi LP1T to thrive in a sulphide- and nitrate-enriched environment. The information gained from the genome gives greater insight into the functional role of L. profundi LP1T in the biofilm and its adaptation strategy in an extreme environment.
The first Atlantic cod (Gadus morhua) genome assembly published in 2011 was one of the early genome assemblies exclusively based on high-throughput 454 pyrosequencing. Since then, rapid advances in sequencing technologies have led to a multitude of assemblies generated for complex genomes, although many of these are of a fragmented nature with a significant fraction of bases in gaps. The development of long-read sequencing and improved software now enables the generation of more contiguous genome assemblies. By combining data from Illumina, 454 and the longer PacBio sequencing technologies, as well as integrating the results of multiple assembly programs, we have created a substantially improved version of the Atlantic cod genome assembly. The sequence contiguity of this assembly is increased fifty-fold and the proportion of gap-bases has been reduced fifteen-fold. Compared to other vertebrates, the assembly contains an unusually high density of tandem repeats (TRs). Indeed, retrospective analyses reveal that gaps in the first genome assembly were largely associated with these TRs. We show that 21% of the TRs across the assembly, 19% in the promoter regions and 12% in the coding sequences are heterozygous in the sequenced individual. The inclusion of PacBio reads combined with the use of multiple assembly programs drastically improved the Atlantic cod genome assembly by successfully resolving long TRs. The high frequency of heterozygous TRs within or in the vicinity of genes in the genome indicates a considerable standing genomic variation in Atlantic cod populations, which is likely of evolutionary importance.
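The "proportion of gap-bases" mentioned in the abstract above can be illustrated with a short calculation. Gap bases in assembled sequences are conventionally written as N, so the proportion is the number of N characters divided by the total assembly length. The sketch below assumes that convention; the function name `gap_fraction` and the toy sequences are hypothetical and do not come from the study.

```python
def gap_fraction(sequences):
    """Fraction of bases that are gaps (N) across a set of scaffold sequences."""
    total = 0
    gaps = 0
    for seq in sequences:
        total += len(seq)
        # Gap bases are assumed to be encoded as N (case-insensitive).
        gaps += seq.upper().count("N")
    if total == 0:
        raise ValueError("no sequence data")
    return gaps / total


# Toy scaffolds (hypothetical): an older, gappier assembly and a newer one.
old_assembly = ["ACGTNNNNNNACGT", "ACNNNNGT"]
new_assembly = ["ACGTACGTNACGTA", "ACGTACGT"]

old_frac = gap_fraction(old_assembly)
new_frac = gap_fraction(new_assembly)
# Fold reduction in gap proportion between the two toy assemblies.
print(old_frac, new_frac, old_frac / new_frac)
```

A fold reduction such as the fifteen-fold figure reported in the abstract corresponds to the ratio of the two fractions, as in the last line of the sketch.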
Lost in plasmids: next generation sequencing and the complex genome of the tick-borne pathogen Borrelia burgdorferi.
Borrelia (B.) burgdorferi sensu lato, including the tick-transmitted agents of human Lyme borreliosis, have particularly complex genomes, consisting of a linear main chromosome and numerous linear and circular plasmids. The number and structure of plasmids are variable even in strains within a single genospecies. Genes on these plasmids are known to play essential roles in virulence and pathogenicity as well as host and vector associations. For this reason, it is essential to explore methods for rapid and reliable characterisation of molecular level changes on plasmids. In this study we used three strains: a low passage isolate of B. burgdorferi sensu stricto strain B31(-NRZ) and two closely related strains (PAli and PAbe) that were isolated from human patients. Sequences of these strains were compared to the previously sequenced reference strain B31 (available in GenBank) to obtain proof-of-principle information on the suitability of next generation sequencing (NGS) library construction and sequencing methods for the assembly of bacterial plasmids. We tested the effectiveness of different short read assemblers on Illumina sequences, and of long read generation methods on sequence data from Pacific Biosciences single-molecule real-time (SMRT) and nanopore (Oxford Nanopore Technologies) sequencing technology. Inclusion of mate pair library reads improved the assembly in some plasmids, as did prior enrichment of plasmids. While cp32 plasmids remained refractory to assembly using only short reads, they were effectively assembled by long read sequencing methods. The long read SMRT and nanopore sequences came, however, at the cost of indels (insertions or deletions) appearing in an unpredictable manner. Using long and short read technologies together allowed us to show that the three B. burgdorferi s.s. strains investigated here, whilst having similar plasmid structures to each other (apart from fusion of cp32 plasmids), differed significantly from the reference strain B31-GB, especially in the case of cp32 plasmids. Short read methods are sufficient to assemble the main chromosome and many of the plasmids in B. burgdorferi. However, a combination of short and long read sequencing methods is essential for proper assembly of all plasmids including cp32 and thus for gaining an understanding of host or vector adaptations. An important conclusion from our work is that the evolution of Borrelia plasmids appears to be dynamic. This has important implications for the development of useful research strategies to monitor the risk of Lyme disease occurrence and how to medically manage it.
Complete genome sequence of a multidrug-resistant, blaNDM-1-expressing Klebsiella pneumoniae K66-45 clinical isolate from Norway.
Multidrug-resistant Klebsiella pneumoniae is a major cause of hospital-acquired infections. Here, we report the complete genome sequence of the multidrug-resistant, blaNDM-1-positive strain K. pneumoniae K66-45, isolated from a hospitalized Norwegian patient. Copyright © 2017 Heikal et al.
Completed genome sequences of Borrelia burgdorferi sensu stricto B31(NRZ) and closely related patient isolates from Europe.
Borrelia burgdorferi sensu stricto is a causative agent of human Lyme borreliosis in the United States and Europe. We report here the completed genome sequences of strain B31 isolated from a tick in the United States and two closely related strains from Europe, PAli and PAbe, which were isolated from patients with erythema migrans and neuroborreliosis, respectively. Copyright © 2017 Margos et al.
In 1885, Theodor Escherich first described the Bacillus coli commune, which was subsequently renamed Escherichia coli. We report the complete genome sequence of this original strain (NCTC 86). The 5 144 392 bp circular chromosome encodes the genes for 4805 proteins, which include antigens, virulence factors, antimicrobial-resistance factors and secretion systems, of a commensal organism from the pre-antibiotic era. It is located in the E. coli A subgroup and is closely related to E. coli K-12 MG1655. E. coli strain NCTC 86 and the non-pathogenic K-12, C, B and HS strains share a common backbone that is largely co-linear. The exception is a large 2 803 932 bp inversion that spans the replication terminus from gmhB to clpB. Comparison with E. coli K-12 reveals 41 regions of difference (577 351 bp) distributed across the chromosome. For example, and contrary to current dogma, E. coli NCTC 86 includes a nine-gene sil locus that encodes a silver-resistance efflux pump acquired before the current widespread use of silver nanoparticles as an antibacterial agent, possibly resulting from the widespread use of silver utensils and currency in Germany in the 1800s. In summary, phylogenetic comparisons with other E. coli strains confirmed that the original strain isolated by Escherich is most closely related to the non-pathogenic commensal strains. It is more distant from the root than the pathogenic organisms E. coli 042 and O157:H7; therefore, it is not an ancestral state for the species.