Genome assembly Archives

July 7, 2019

Complete genome sequence of Escherichia coli 81009, a representative of the sequence type 131 C1-M27 clade with a multidrug-resistant phenotype.

The sequence type 131 (ST131)-H30 clone is responsible for a significant proportion of multidrug-resistant extraintestinal Escherichia coli infections. Recently, the C1-M27 clade of ST131-H30, associated with blaCTX-M-27, has emerged. The complete genome sequence of E. coli isolate 81009 belonging to this clone, previously used during the development of ST131-specific monoclonal antibodies, is reported here. Copyright © 2018 Mutti et al.

July 7, 2019

Complete genome sequences of two porcine enterotoxigenic Escherichia coli strains.

Enterotoxigenic Escherichia coli (ETEC) is one of the main causes of illness and death in neonatal and recently weaned pigs. Here, we sequenced the genomes of two ETEC strains that were previously used as inactivated vaccines in China.

July 7, 2019

High-quality complete genome sequences of three bovine Shiga toxin-producing Escherichia coli O177:H- (fliCH25) isolates harboring virulent stx2 and multiple plasmids.

Shiga toxin-producing Escherichia coli (STEC) bacteria are zoonotic pathogens. We report here the high-quality complete genome sequences of three STEC O177:H- (fliCH25) strains, SMN152SH1, SMN013SH2, and SMN197SH3. The assembled genomes consisted of one optical map-verified circular chromosome for each strain, plus two plasmids for SMN013SH2 and three plasmids each for SMN152SH1 and SMN197SH3. Copyright © 2018 Sheng et al.

July 7, 2019

Complete genome sequence of the model oleaginous alga Nannochloropsis gaditana CCMP1894.

The model oleaginous alga Nannochloropsis gaditana was completely sequenced using a combination of optical mapping and next-generation sequencing technologies to generate one of the most complete eukaryotic genomes published to date. The assembled genome is 30.7 Mb long.

July 7, 2019

Draft genome sequence of an active heterotrophic nitrifier-denitrifier, Cupriavidus pauculus UM1.

Here, we present the draft genome sequence of Cupriavidus pauculus UM1, a metal-resistant heterotrophic nitrifier-denitrifier capable of synthesizing nitrite from pyruvic oxime. The size of the genome is 7,402,815 bp with a GC content of 64.8%. This draft assembly consists of 38 scaffolds. Copyright © 2018 Putonti et al.

July 7, 2019

Draft genome sequences of three strains of a novel Rhizobiales species isolated from forest soil.

Three strains of a novel Rhizobiales species were isolated from temperate deciduous forest soil in central Massachusetts. Their genomes consist of 9.09 to 10.29 Mb over 3 to 4 scaffolds each and indicate that diverse nitrogenous compounds are used by these organisms. Copyright © 2018 Pold et al.

July 7, 2019

Draft genome sequence of Rhodococcus sp. strain M8, which can degrade a broad range of nitriles.

Rhodococcus sp. strain M8 is a nitrile-degrading bacterium isolated from acrylonitrile-contaminated sites. This strain produces large amounts of the enzymes for sequential nitrile degradation, a cobalt-type nitrile hydratase and an amidase. Its draft genome sequence, announced here, has an estimated size of 6.3 Mbp.

July 7, 2019

Draft genome sequence of Cyanobacterium sp. strain HL-69, isolated from a benthic microbial mat from a magnesium sulfate-dominated hypersaline lake.

The complete genome sequence of Cyanobacterium sp. strain HL-69 consists of 3,155,247 bp and contains 2,897 predicted genes comprising a chromosome and two plasmids. The genome is consistent with a halophilic nondiazotrophic phototrophic lifestyle, and this organism is able to synthesize most B vitamins and produces several secondary metabolites. Copyright © 2018 Mobberley et al.

July 7, 2019

Complete genome sequence of a new halophilic archaeon, Haloarcula taiwanensis, isolated from a solar saltern in southern Taiwan.

We report here the completion of the genome sequence of a new species of haloarchaea, Haloarcula taiwanensis, isolated in southern Taiwan. The 3,721,706-bp genome consisted of chromosome I (2,966,258 bp, 63.6% GC content), chromosome II (525,233 bp, 59.6% GC content), plasmid pNYT1 (129,893 bp, 55.3% GC content), and plasmid pNYT2 (100,322 bp, 55.7% GC content).
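As a quick consistency check on the figures quoted above, the four replicon lengths can be summed and a length-weighted GC content derived from them. The sketch below is illustrative only: the names `replicons`, `genome_size`, and `weighted_gc` are made up for this example, and the overall GC value is an approximation computed from the rounded per-replicon percentages, not a figure reported by the authors.

```python
# Replicon lengths (bp) and GC percentages as quoted in the abstract above.
replicons = {
    "chromosome I": (2_966_258, 63.6),
    "chromosome II": (525_233, 59.6),
    "plasmid pNYT1": (129_893, 55.3),
    "plasmid pNYT2": (100_322, 55.7),
}


def genome_size(parts):
    """Total genome length in bp: the sum of all replicon lengths."""
    return sum(length for length, _gc in parts.values())


def weighted_gc(parts):
    """Length-weighted mean GC percentage across replicons."""
    total = genome_size(parts)
    return sum(length * gc for length, gc in parts.values()) / total


total_bp = genome_size(replicons)
overall_gc = weighted_gc(replicons)
print(f"{total_bp:,} bp, overall GC approx. {overall_gc:.1f}%")
```

The summed length agrees with the 3,721,706-bp total stated in the abstract.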

July 7, 2019

Complete genome sequence of Pseudomonas sp. strain NC02, isolated from soil.

We report here the complete genome sequence of Pseudomonas sp. strain NC02, isolated from soil in eastern Massachusetts. We assembled PacBio reads into a single closed contig with 132× mean coverage and then polished this contig using Illumina MiSeq reads, yielding a 6,890,566-bp sequence with 61.1% GC content. Copyright © 2018 Cerra et al.

July 7, 2019

Complete genome sequence of Escherichia coli ML35.

We report here the complete genome sequence of Escherichia coli strain ML35. We assembled PacBio reads into a single closed contig with 169× mean coverage and then polished this contig using Illumina MiSeq reads, yielding a 4,918,774-bp sequence with 50.8% GC content. Copyright © 2018 Casale et al.

July 7, 2019

De novo genome assembly of a Plasmodium falciparum NF54 clone using Single-Molecule Real-Time Sequencing.

Plasmodium falciparum is the species of human malaria parasite that causes the most severe form of the disease. Here, we used single-molecule real-time (SMRT) sequencing technology from Pacific Biosciences (PacBio) to sequence, assemble de novo, and annotate the genome of a P. falciparum NF54 clone. Copyright © 2018 Bryant et al.

July 7, 2019

An empirical evaluation of error correction methods and tools for next generation sequencing data

…research. However, data produced by NGS are affected by different errors such as substitutions, deletions, or insertions. For accurate downstream analysis, it is essential to differentiate between true biological variants and alterations that occur due to errors. Many methods and tools have been developed for NGS error correction. Some of these methods correct only substitution errors, whereas others correct multiple types of errors. This article presents a comprehensive evaluation of three types of methods (k-spectrum based, multisequence alignment based, and hybrid based), as implemented in different tools. Experiments were conducted to compare performance in terms of runtime and error correction rate, and two different computing platforms were used to evaluate their effect on both measures. The aim of this comparative evaluation is to provide recommendations for selecting suitable tools for the specific needs of users and practitioners. The k-mer spectrum based methodology generated superior results compared to the other methods. Among all the tools evaluated, Racer showed outstanding performance in terms of error correction rate and execution time for both small and large data sets. Among the multisequence alignment based tools, Karect shows an excellent error correction rate, whereas Coral shows better execution time for all data sets. Among the hybrid based tools, Jabba shows a better error correction rate and execution time than Brownie. Computing platforms mostly affect execution time but have no general effect on error correction rate.
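To make the k-mer spectrum idea mentioned above more concrete, here is a minimal Python sketch. It assumes that k-mers seen fewer than a chosen number of times across the read set are likely to contain a sequencing error, and it tries single-base substitutions until a read contains only frequently seen k-mers. This is a toy illustration, not the algorithm of Racer or of any other tool named in the abstract; the function names, the value of k, and the threshold are all assumptions made for the example.

```python
from collections import Counter


def kmer_counts(reads, k):
    """Count every k-mer across all reads (the 'k-mer spectrum')."""
    counts = Counter()
    for read in reads:
        for i in range(len(read) - k + 1):
            counts[read[i:i + k]] += 1
    return counts


def weak_positions(read, counts, k, min_count):
    """Start positions of k-mers seen fewer than min_count times."""
    return [i for i in range(len(read) - k + 1)
            if counts[read[i:i + k]] < min_count]


def correct_read(read, counts, k, min_count):
    """Try each single-base substitution and return the first candidate
    whose k-mers are all frequent; otherwise return the read unchanged."""
    if not weak_positions(read, counts, k, min_count):
        return read
    for pos in range(len(read)):
        for base in "ACGT":
            if base == read[pos]:
                continue
            candidate = read[:pos] + base + read[pos + 1:]
            if not weak_positions(candidate, counts, k, min_count):
                return candidate
    return read


# Hypothetical data: five error-free reads and one read with a G->C error.
reads = ["ACGTACGTGA"] * 5 + ["ACGTACCTGA"]
counts = kmer_counts(reads, k=4)
corrected = correct_read("ACGTACCTGA", counts, k=4, min_count=2)
print(corrected)
```

Because this sketch only considers substitutions, it corresponds to the simpler class of methods the abstract describes; handling insertions and deletions would need a different search.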

July 7, 2019

Ten steps to get started in Genome Assembly and Annotation.

As a part of the ELIXIR-EXCELERATE efforts in capacity building, we present here 10 steps to facilitate researchers getting started in genome assembly and genome annotation. The guidelines given are broadly applicable, intended to be stable over time, and cover all aspects from start to finish of a general assembly and annotation project. Intrinsic properties of genomes are discussed, as is the importance of using high quality DNA. Different sequencing technologies and generally applicable workflows for genome assembly are also detailed. We cover structural and functional annotation and encourage readers to also annotate transposable elements, something that is often omitted from annotation workflows. The importance of data management is stressed, and we give advice on where to submit data and how to make your results Findable, Accessible, Interoperable, and Reusable (FAIR).

July 7, 2019

FMLRC: Hybrid long read error correction using an FM-index.

Long read sequencing is changing the landscape of genomic research, especially de novo assembly. Despite the high error rate inherent to long read technologies, increased read lengths dramatically improve the continuity and accuracy of genome assemblies. However, the cost and throughput of these technologies limit their application to complex genomes. One solution is to decrease the cost and time to assemble novel genomes by leveraging “hybrid” assemblies that use long reads for scaffolding and short reads for accuracy. We describe a novel method leveraging a multi-string Burrows-Wheeler Transform with auxiliary FM-index to correct errors in long read sequences using a set of complementary short reads. We demonstrate that our method efficiently produces significantly more high quality corrected sequence than existing hybrid error-correction methods. We also show that our method produces more contiguous assemblies, in many cases, than existing state-of-the-art hybrid and long-read only de novo assembly methods. Our method accurately corrects long read sequence data using complementary short reads. We demonstrate higher total throughput of corrected long reads and a corresponding increase in contiguity of the resulting de novo assemblies. Improved throughput and computational efficiency over existing methods will help make more economical use of emerging long read sequencing technologies.
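The abstract does not spell out how the FM-index is queried, so the following is only a generic sketch of the underlying data structure, not FMLRC's implementation. It builds a Burrows-Wheeler Transform of a single string (the paper describes a multi-string BWT of short reads) and uses standard backward search to count how often a pattern occurs, which is the kind of lookup a k-mer based corrector could rely on. All function names are hypothetical, and the naive rotation sort is suitable for tiny inputs only.

```python
def bwt(text):
    """Burrows-Wheeler Transform via sorted rotations (small inputs only)."""
    text += "$"  # sentinel that sorts before A, C, G, T
    rotations = sorted(text[i:] + text[:i] for i in range(len(text)))
    return "".join(rotation[-1] for rotation in rotations)


def build_fm_index(bwt_str):
    """Return (first, occ): first[c] is the number of characters smaller
    than c; occ[c][i] is the number of c's in bwt_str[:i]."""
    alphabet = sorted(set(bwt_str))
    first, total = {}, 0
    for c in alphabet:
        first[c] = total
        total += bwt_str.count(c)
    occ = {c: [0] * (len(bwt_str) + 1) for c in alphabet}
    for i, ch in enumerate(bwt_str):
        for c in alphabet:
            occ[c][i + 1] = occ[c][i] + (1 if ch == c else 0)
    return first, occ


def count_occurrences(pattern, first, occ, n):
    """Backward search: number of times pattern occurs in the text."""
    lo, hi = 0, n  # half-open range of matching rows
    for ch in reversed(pattern):
        if ch not in first:
            return 0
        lo = first[ch] + occ[ch][lo]
        hi = first[ch] + occ[ch][hi]
        if lo >= hi:
            return 0
    return hi - lo


# Hypothetical example sequence standing in for short-read data.
transformed = bwt("ACGTACGTGA")
first, occ = build_fm_index(transformed)
n = len(transformed)
print(count_occurrences("ACGT", first, occ, n))
```

Each query takes time proportional to the pattern length, independent of the size of the indexed text, which is one reason this structure suits repeated k-mer lookups.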
