July 7, 2019

Ten steps to get started in Genome Assembly and Annotation.

As a part of the ELIXIR-EXCELERATE efforts in capacity building, we present here 10 steps to facilitate researchers getting started in genome assembly and genome annotation. The guidelines given are broadly applicable, intended to be stable over time, and cover all aspects from start to finish of a general assembly and annotation project. Intrinsic properties of genomes are discussed, as is the importance of using high quality DNA. Different sequencing technologies and generally applicable workflows for genome assembly are also detailed. We cover structural and functional annotation and encourage readers to also annotate transposable elements, something that is often omitted from annotation workflows. The importance of data management is stressed, and we give advice on where to submit data and how to make your results Findable, Accessible, Interoperable, and Reusable (FAIR).


July 7, 2019

FMLRC: Hybrid long read error correction using an FM-index.

Long read sequencing is changing the landscape of genomic research, especially de novo assembly. Despite the high error rate inherent to long read technologies, increased read lengths dramatically improve the continuity and accuracy of genome assemblies. However, the cost and throughput of these technologies limit their application to complex genomes. One solution is to decrease the cost and time to assemble novel genomes by leveraging “hybrid” assemblies that use long reads for scaffolding and short reads for accuracy. We describe a novel method leveraging a multi-string Burrows-Wheeler Transform with auxiliary FM-index to correct errors in long read sequences using a set of complementary short reads. We demonstrate that our method efficiently produces significantly more high quality corrected sequence than existing hybrid error-correction methods. We also show that our method produces more contiguous assemblies, in many cases, than existing state-of-the-art hybrid and long-read only de novo assembly methods. Our method accurately corrects long read sequence data using complementary short reads. We demonstrate higher total throughput of corrected long reads and a corresponding increase in contiguity of the resulting de novo assemblies. Higher throughput and better computational efficiency than existing methods will help make more economical use of emerging long read sequencing technologies.
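To make the FM-index idea concrete, here is a minimal sketch of the kind of query such an index supports: counting how often a pattern (for example a k-mer) occurs in a text by backward search over its Burrows-Wheeler Transform. This is a toy, single-string illustration, not the FMLRC implementation; the function names `build_fm_index` and `count_occurrences` are hypothetical, and the naive suffix sort would not scale to real read sets.

```python
from collections import Counter


def build_fm_index(text):
    """Build a toy FM-index (C table and occurrence table) for one string."""
    text += "$"  # unique sentinel; sorts before A, C, G, T in ASCII
    # Naive suffix array: acceptable for a toy example only.
    suffix_array = sorted(range(len(text)), key=lambda i: text[i:])
    # BWT: the character preceding each sorted suffix (wraps to "$" for i == 0).
    bwt = "".join(text[i - 1] for i in suffix_array)

    # c_table[ch]: number of characters in the text strictly smaller than ch.
    counts = Counter(text)
    c_table, total = {}, 0
    for ch in sorted(counts):
        c_table[ch] = total
        total += counts[ch]

    # occ[ch][i]: number of occurrences of ch in bwt[:i].
    occ = {ch: [0] * (len(bwt) + 1) for ch in counts}
    for i, ch in enumerate(bwt):
        for key in occ:
            occ[key][i + 1] = occ[key][i] + (1 if key == ch else 0)
    return c_table, occ, len(bwt)


def count_occurrences(pattern, index):
    """Count exact occurrences of pattern using backward search."""
    c_table, occ, n = index
    lo, hi = 0, n  # half-open interval of matching rows in the sorted suffixes
    for ch in reversed(pattern):
        if ch not in c_table:
            return 0
        lo = c_table[ch] + occ[ch][lo]
        hi = c_table[ch] + occ[ch][hi]
        if lo >= hi:
            return 0
    return hi - lo


index = build_fm_index("ACGTACGTTACG")
print(count_occurrences("ACG", index))
```

The query cost depends on the pattern length rather than the size of the indexed text, which is what makes this structure attractive for looking up many short sequences against a large short-read set.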


July 7, 2019

The odyssey of the ancestral Escherich strain through culture collections: an example of allopatric diversification.

More than a century ago, Theodor Escherich isolated the bacterium that was to become Escherichia coli, one of the most studied organisms. Not long after, the strain began an odyssey and landed in many laboratories across the world. As laboratory culture conditions could be responsible for major changes in bacterial strains, we conducted a genome analysis of isolates of this emblematic strain from different culture collections (England, France, the United States, Germany). Strikingly, many discrepancies between the isolates were observed, as revealed by multilocus sequence typing (MLST), the presence of virulence-associated genes, core genome MLST, and single nucleotide polymorphism/indel analyses. These differences are correlated with the phylogeographic history of the strain and were due to an unprecedented number of mutations in genes coding for DNA repair functions such as mismatch repair (MutL) and oxidized guanine nucleotide pool cleaning (MutT), conferring a specific mutational spectrum and leading to a mutator phenotype. The mutator phenotype was probably acquired during subculturing and corresponded to second-order selection. Furthermore, all of the isolates exhibited hypersusceptibility to antibiotics due to mutations in efflux pump- and porin-encoding genes, as well as a specific mutation in the sigma factor-encoding gene rpoS. These defects reflect a self-preservation and nutritional competence tradeoff allowing survival under the starvation conditions imposed by storage. From a clinical point of view, dealing with such mutator strains can lead microbiologists to draw false conclusions about isolate relatedness and may impact therapeutic effectiveness.

IMPORTANCE Mutator phenotypes have been described in laboratory-evolved bacteria, as well as in natural isolates. Several genes can be impacted, each of them being associated with a typical mutational spectrum. By studying one of the oldest strains available, the ancestral Escherich strain, we were able to identify its mutator status leading to tremendous genetic diversity among the isolates from various collections and allowing us to reconstruct the phylogeographic history of the strain. This mutator phenotype was probably acquired during the storage of the strain, promoting adaptation to a specific environment. Other mutations in rpoS and efflux pump- and porin-encoding genes highlight the acclimatization of the strain through self-preservation and nutritional competence regulation. This strain history can be viewed as unintentional experimental evolution in culture collections all over the world since 1885, mimicking the long-term experimental evolution of E. coli of Lenski et al. (O. Tenaillon, J. E. Barrick, N. Ribeck, D. E. Deatherage, J. L. Blanchard, A. Dasgupta, G. C. Wu, S. Wielgoss, S. Cruveiller, C. Médigue, D. Schneider, and R. E. Lenski, Nature 536:165-170, 2016, https://doi.org/10.1038/nature18959) that shares numerous molecular features.


July 7, 2019

Complete genome sequence of uropathogenic Escherichia coli isolate UPEC 26-1.

Urinary tract infections (UTIs) are among the most common infections in humans, predominantly caused by uropathogenic Escherichia coli (UPEC). The diverse genomes of UPEC strains mostly impede disease prevention and control measures. In this study, we comparatively analyzed the whole genome sequence of a highly virulent UPEC strain, namely UPEC 26-1, which was isolated from a urine sample of a patient suffering from UTI in Korea. Whole genome analysis showed that the genome consists of one circular chromosome of 5,329,753 bp, comprising 5,064 protein-coding genes, 122 RNA genes (94 tRNA, 22 rRNA and 6 ncRNA genes), and 100 pseudogenes, with an average G+C content of 50.56%. In addition, we identified 8 prophage regions (5 intact, 2 incomplete and 1 questionable) and 63 genomic islands (GIs), suggesting the possibility of horizontal gene transfer in this strain. Comparative genome analysis of UPEC 26-1 with the UPEC strain CFT073 revealed an average nucleotide identity of 99.7%. The genome comparison with CFT073 reveals major differences in the genome of UPEC 26-1 that would explain its increased virulence and biofilm formation. Nineteen of the GIs were unique to UPEC 26-1 compared to CFT073, and nine of them harbored unique genes that are involved in virulence, multidrug resistance, biofilm formation and bacterial pathogenesis. The data from this study will assist in future studies of UPEC strains to develop effective control measures.
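As an illustration of how a G+C content figure like the one above is typically derived from a sequence, the following is a minimal sketch. The function name `gc_content` is hypothetical, and the choice to ignore ambiguous bases such as N is an assumption; the abstract does not state how its value was computed.

```python
def gc_content(sequence):
    """Return G+C content as a percentage of unambiguous (A/C/G/T) bases."""
    seq = sequence.upper()
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T")
    if gc + at == 0:
        raise ValueError("sequence contains no A, C, G or T bases")
    # Ambiguous bases (e.g. N) are excluded from the denominator by assumption.
    return 100.0 * gc / (gc + at)


print(gc_content("ATGCGCNN"))
```

For a complete genome, the same calculation would be applied to the concatenated replicon sequences read from a FASTA file.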


July 7, 2019

Paucibacter aquatile sp. nov. isolated from freshwater of the Nakdong River, Republic of Korea.

A Gram-negative, aerobic, motile, and rod-shaped bacterial strain designated CR182T was isolated from freshwater of the Nakdong River, Republic of Korea. Optimal growth conditions for this novel strain were found to be: 25-30 °C, pH 6.5-8.5, and 3% (w/v) NaCl. Phylogenetic analysis based on the 16S rRNA gene sequence indicates that strain CR182T groups with type strains of the genus Paucibacter. Strain CR182T showed 98.0% 16S rRNA gene sequence similarity with Paucibacter oligotrophus CHU3T and formed a robust phylogenetic clade with this species. The average nucleotide identity value between strain CR182T and P. oligotrophus CHU3T was 78.4% and the genome-to-genome distance was 22.2% on average. The genomic DNA G+C content calculated from the genome sequence was 66.3 mol%. Predominant cellular fatty acids of strain CR182T were summed feature 3 (C16:1 ω7c and/or C16:1 ω6c) (31.2%) and C16:0 (16.0%). Its major respiratory quinone was ubiquinone Q-8. Its polar lipids consisted of diphosphatidylglycerol, phosphatidylethanolamine, and two unidentified phospholipids. Based on data obtained from this polyphasic taxonomic study, strain CR182T represents a novel species belonging to the genus Paucibacter, for which the name P. aquatile sp. nov. is proposed. The type strain is CR182T (= KCCM 90284T = NBRC 113032T).
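The genomic evidence for a novel species can be read against commonly cited cutoffs. The sketch below assumes the widely used boundaries of roughly 95% average nucleotide identity and 70% digital DNA-DNA hybridization, and it assumes that the 22.2% "genome-to-genome distance" reported above is a digital DNA-DNA hybridization estimate; neither the thresholds nor that interpretation is stated in the abstract, and the function name is hypothetical.

```python
# Commonly cited species boundaries (assumptions, not taken from the abstract).
ANI_SPECIES_THRESHOLD = 95.0   # percent; often quoted as a 95-96% range
DDDH_SPECIES_THRESHOLD = 70.0  # percent


def supports_distinct_species(ani_percent, dddh_percent):
    """Return True if both values fall below the assumed species boundaries."""
    return (ani_percent < ANI_SPECIES_THRESHOLD
            and dddh_percent < DDDH_SPECIES_THRESHOLD)


# Values reported for strain CR182T versus P. oligotrophus CHU3T.
print(supports_distinct_species(78.4, 22.2))
```

Under these assumed cutoffs, both reported values sit well below the boundaries, which is consistent with the proposal of a new species.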


July 7, 2019

The ‘gifted’ actinomycete Streptomyces leeuwenhoekii.

Streptomyces leeuwenhoekii strains C34T, C38, C58 and C79 were isolated from a soil sample collected from the Chaxa Lagoon, located in the Salar de Atacama in northern Chile. These streptomycetes produce a variety of new specialised metabolites with antibiotic, anti-cancer and anti-inflammatory activities. Moreover, genome mining performed on two of these strains has revealed the presence of biosynthetic gene clusters with the potential to produce new specialised metabolites. This review focusses on this new clade of Streptomyces strains, summarises the literature and presents new information on strain C34T.


July 7, 2019

Complete genome sequence of a heavy metal resistant bacterium Maribacter cobaltidurans B1T, isolated from the deep-sea sediment of the South Atlantic Ocean

Many bacteria in the environment have adapted to the presence of toxic heavy metals. Here we present the complete genome sequence of a heavy metal resistant bacterium, Maribacter cobaltidurans B1T (= CGMCC 1.15508T = KCTC 52882T = MCCC 1K03318T), which was isolated from a deep-sea sediment sample collected from the South Atlantic Ocean. Strain B1T is able to resist high concentrations of Co2+ (10.0 mM) in Marine Agar 2216. The genome of strain B1T comprises 4,639,957 bp in a circular chromosome with a G+C content of 39.7 mol%. Resistance to Co2+ is mainly based on efflux systems encoded in the genome of strain B1T, including czcCBA operons, czcD genes, corC genes, etc. Compared with the closely related species M. orientalis DSM 16471T, the genome of B1T harbors twenty more copies of genes in czcCBA operons and two copies of the czcD genes related to Co2+ efflux. The function of these genes may contribute to the high level of cobalt resistance, revealing its potential application in the biotechnology industry.


July 7, 2019

The complete genome sequence of Colwellia sp. NB097-1 reveals evidence for the potential genetic basis for its adaptation to cold environment

Colwellia sp. NB097-1, isolated from a marine sediment sample from the Bering Sea, is a psychrophilic bacterium whose optimal and maximal growth temperatures were 13 and 25 °C, respectively. Here, we present the complete genome of Colwellia sp. NB097-1, which was 4,661,274 bp in length with a GC content of 38.5%. The genome provided evidence for the potential genetic basis for its adaptation to a cold environment, such as producing compatible solutes and cold-shock proteins, increasing membrane fluidity and synthesizing glycogen. Some cold-adaptive proteases were also detected in the genome of Colwellia sp. NB097-1. Protease activity analysis further showed that extracellular proteases of Colwellia sp. NB097-1 remained active at low temperatures. The complete genome sequence may be helpful to reveal how this strain survives at low temperature and to find cold-adaptive proteases that may be useful to industry.


July 7, 2019

Complete genome of Halomonas aestuarii Hb3, isolated from tidal flat

Halomonas aestuarii Hb3, a moderately halophilic bacterium belonging to the class Gammaproteobacteria, was isolated from a tidal flat. Herein, we report the complete genome sequence of strain Hb3. Its size is estimated at 3.54 Mbp with a mean G+C content of 67.9%. The genome includes 3,238 open reading frames, 65 transfer RNAs, and four ribosomal RNA gene operons. Genes related to the degradation of monoaromatic compounds, detoxification of arsenic, and production of polymers were identified. These features indicate that this strain may be important for ecological and industrial applications.


July 7, 2019

Complete genome sequence of Siansivirga zeaxanthinifaciens CC-SAMT-1T, a flavobacterium isolated from coastal surface seawater

Here we present the complete genome sequence of Siansivirga zeaxanthinifaciens CC-SAMT-1T, a flavobacterium isolated from coastal surface seawater. The 3.3 Mb genome revealed remarkable specialization of this bacterium, particularly in the degradation of sulfated polysaccharides available as detritus or in the dissolved phase. Besides utilizing high molecular weight organic biopolymers, this strain appears to accomplish assimilatory sulfate reduction, sulfide oxidation, and acquisition and inter-conversion of inorganic carbon. Genes encoding zeaxanthin biosynthesis and three different kinds of DNA photolyase/cryptochrome (which sense blue light) were present, while genes that code for blue light sensing BLUF domain proteins and red/far-red light sensing phytochromes were absent. Furthermore, CC-SAMT-1T lacked the rhodopsin photosystem and all other genes that confer any other known forms of phototrophy. The genomic data revealed that CC-SAMT-1T is highly adapted to sulfur-rich coastal environments, where it most likely contributes to marine carbon and sulfur cycles by metabolizing sulfated polysaccharides as well as inorganic sulfur.


July 7, 2019

Complete genomes of the marine flavobacterium Nonlabens strains YIK11 and MIC269

Here, we report the complete genome sequences of two strains that were isolated from sediment samples collected in Korea and Micronesia; both were classified as members of Nonlabens spp. The complete genome sequence of Nonlabens sp. strain YIK11 consists of 3,260,677 bp in two contigs, while that of strain MIC269 consists of 2,884,293 bp in one contig, without a plasmid. The genomes of YIK11 and MIC269 contain three and two genes encoding rhodopsins of different types, respectively.

