Genome assembly Archives

July 7, 2019

Identification and resolution of microdiversity through metagenomic sequencing of parallel consortia.

To gain a predictive understanding of the interspecies interactions within microbial communities that govern community function, the genomic complement of every member population must be determined. Although metagenomic sequencing has enabled the de novo reconstruction of some microbial genomes from environmental communities, microdiversity confounds current genome reconstruction techniques. To overcome this issue, we performed short-read metagenomic sequencing on parallel consortia, defined as consortia cultivated under the same conditions from the same natural community with overlapping species composition. The differences in species abundance between the two consortia allowed reconstruction of near-complete (at an estimated >85% of gene complement) genome sequences for 17 of the 20 detected member species. Two Halomonas spp. indistinguishable by amplicon analysis were found to be present within the community. In addition, comparison of metagenomic reads against the consensus scaffolds revealed within-species variation for one of the Halomonas populations, one of the Rhodobacteraceae populations, and the Rhizobiales population. Genomic comparison of these representative instances of inter- and intraspecies microdiversity suggests differences in functional potential that may result in the expression of distinct roles in the community. In addition, isolation and complete genome sequence determination of six member species allowed an investigation into the sensitivity and specificity of genome reconstruction processes, demonstrating robustness across a wide range of sequence coverage (9× to 2,700×) within the metagenomic data set. Copyright © 2015, American Society for Microbiology. All Rights Reserved.

July 7, 2019

Genomic resources and their influence on the detection of the signal of positive selection in genome scans.

Genome scans represent powerful approaches to investigate the action of natural selection on the genetic variation of natural populations and to better understand local adaptation. This is very useful, for example, in the field of conservation biology and evolutionary biology. Thanks to Next Generation Sequencing, genomic resources are growing exponentially, improving genome scan analyses in non-model species. Thousands of SNPs called using Reduced Representation Sequencing are increasingly used in genome scans. Besides, genome sequences are also becoming increasingly available, allowing better processing of short-read data, offering physical localization of variants, and improving haplotype reconstruction and data imputation. Ultimately, genome sequences are also becoming the raw material for selection inferences. Here, we discuss how the increasing availability of such genomic resources, notably genome sequences, influences the detection of signals of selection. Mainly, increasing data density and having the information of physical linkage data expand genome scans by (i) improving the overall quality of the data, (ii) helping the reconstruction of demographic history for the population studied to decrease false-positive rates and (iii) improving the statistical power of methods to detect the signal of selection. Of particular importance, the availability of a high-quality reference genome can improve the detection of the signal of selection by (i) allowing matching the potential candidate loci to linked coding regions under selection, (ii) rapidly moving the investigation to the gene and function and (iii) ensuring that the highly variable regions of the genomes that include functional genes are also investigated. For all those reasons, using reference genomes in genome scan analyses is highly recommended. © 2015 John Wiley & Sons Ltd.

July 7, 2019

Complete genome sequence of Pandoraea thiooxydans DSM 25325(T), a thiosulfate-oxidizing bacterium.

Pandoraea thiooxydans DSM 25325(T) is a thiosulfate-oxidizing bacterium isolated from rhizosphere soils of a sesame plant. Here, we present the first complete genome of P. thiooxydans DSM 25325(T). Several genes involved in thiosulfate oxidation and biodegradation of aromatic compounds were identified. Copyright © 2015 Elsevier B.V. All rights reserved.

July 7, 2019

Genome mining of astaxanthin biosynthetic genes from Sphingomonas sp. ATCC 55669 for heterologous overproduction in Escherichia coli.

As a highly valued keto-carotenoid, astaxanthin is widely used in nutritional supplements and pharmaceuticals. Therefore, the demand for biosynthetic astaxanthin and improved efficiency of astaxanthin biosynthesis has driven the investigation of metabolic engineering of native astaxanthin producers and heterologous hosts. However, microbial resources for astaxanthin are limited. In this study, we found that the α-Proteobacterium Sphingomonas sp. ATCC 55669 could produce astaxanthin naturally. We used whole-genome sequencing to identify the astaxanthin biosynthetic pathway using a combined PacBio-Illumina approach. The putative astaxanthin biosynthetic pathway in Sphingomonas sp. ATCC 55669 was predicted. For further confirmation, a high-efficiency targeted engineering carotenoid synthesis platform was constructed in E. coli for identifying the functional roles of candidate genes. All genes involved in astaxanthin biosynthesis showed discrete distributions on the chromosome. Moreover, the overexpression of exogenous E. coli idi in Sphingomonas sp. ATCC 55669 increased astaxanthin production by 5.4-fold. This study described a new astaxanthin producer and provided more biosynthesis components for bioengineering of astaxanthin in the future. © 2015 The Authors. Biotechnology Journal published by WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.

July 7, 2019

Integrating mass spectrometry and genomics for cyanobacterial metabolite discovery.

Filamentous marine cyanobacteria produce bioactive natural products with both potential therapeutic value and capacity to be harmful to human health. Genome sequencing has revealed that cyanobacteria have the capacity to produce many more secondary metabolites than have been characterized. The biosynthetic pathways that encode cyanobacterial natural products are mostly uncharacterized, and lack of cyanobacterial genetic tools has largely prevented their heterologous expression. Hence, a combination of cutting edge and traditional techniques has been required to elucidate their secondary metabolite biosynthetic pathways. Here, we review the discovery and refined biochemical understanding of the olefin synthase and fatty acid ACP reductase/aldehyde deformylating oxygenase pathways to hydrocarbons, and the curacin A, jamaicamide A, lyngbyabellin, columbamide, and a trans-acyltransferase macrolactone pathway encoding phormidolide. We integrate into this discussion the use of genomics, mass spectrometric networking, biochemical characterization, and isolation and structure elucidation techniques.

July 7, 2019

Complete genome sequence of the Variibacter gotjawalensis GJW-30(T) from soil of lava forest, Gotjawal.

Variibacter gotjawalensis GJW-30(T) is a Gram-negative, strictly aerobic, pleomorphic bacterium. Here we present the 4.5-Mb genome sequence of the type strain, V. gotjawalensis GJW-30(T), which consists of a chromosome totaling 4,586,237 bp with a G+C content of 62.2 mol%. This is the first report of the full genome sequence of a species of the novel genus Variibacter isolated from Gotjawal, a unique area in Jeju, Republic of Korea. Copyright © 2015 Elsevier B.V. All rights reserved.

July 7, 2019

Complete genome of Pseudoalteromonas phenolica KCTC 12086(T) (= O-BC30(T)), a marine bacterium producing polybrominated aromatic compounds.

Pseudoalteromonas phenolica is a Gram-negative, rod-shaped, flagellated, aerobic, antibiotic-producing bacterium that was isolated from seawater off Ogasawara Island, Japan. Here, we report the complete genome of P. phenolica KCTC 12086(T) (= O-BC30(T)), which consists of 4,868,993 bp (G+C content of 40.6%) with two chromosomes, 4168 protein-coding genes, 113 tRNAs and 9 rRNA operons. In addition, several genes related to phenolic anti-methicillin-resistant Staphylococcus aureus substances were detected in the genome suggesting that biosynthesis of industrially important polybrominated aromatic compounds could be better understood with the availability of genome data of P. phenolica. Copyright © 2015 Elsevier B.V. All rights reserved.

July 7, 2019

hybridSPAdes: an algorithm for hybrid assembly of short and long reads.

Recent advances in single molecule real-time (SMRT) and nanopore sequencing technologies have enabled high-quality assemblies from long and inaccurate reads. However, these approaches require high coverage by long reads and remain expensive. On the other hand, inexpensive short-read technologies produce accurate but fragmented assemblies. Thus, a hybrid approach that assembles long reads (with low coverage) and short reads has the potential to generate high-quality assemblies at reduced cost. We describe the hybridSPAdes algorithm for assembling short and long reads and benchmark it on a variety of bacterial assembly projects. Our results demonstrate that hybridSPAdes generates accurate assemblies (even in projects with relatively low coverage by long reads), thus reducing the overall cost of genome sequencing. We further present the first complete assembly of a genome from single cells using SMRT reads. hybridSPAdes is implemented in C++ as a part of the SPAdes genome assembler and is publicly available at http://bioinf.spbau.ru/en/spades. Contact: d.antipov@spbu.ru. Supplementary information: supplementary data are available at Bioinformatics online. © The Author 2015. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.
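As a rough illustration of how short and long reads are supplied together in a hybrid run, the sketch below builds a SPAdes command line in Python. It assumes a local SPAdes installation that exposes `spades.py` with its `-1`/`-2` (paired short reads), `--pacbio` or `--nanopore` (long reads) and `-o` (output directory) options. The helper name `build_hybrid_spades_command` and all file names are hypothetical and do not come from the abstract.

```python
def build_hybrid_spades_command(short_r1, short_r2, long_reads, out_dir,
                                long_read_type="pacbio"):
    """Return an argument list for a hybrid SPAdes run (hypothetical helper).

    short_r1, short_r2: paired-end short-read FASTQ files.
    long_reads: long-read file (PacBio or nanopore).
    long_read_type: selects the --pacbio or --nanopore option.
    """
    if long_read_type not in ("pacbio", "nanopore"):
        raise ValueError("long_read_type must be 'pacbio' or 'nanopore'")
    return [
        "spades.py",
        "-1", short_r1,                       # forward short reads
        "-2", short_r2,                       # reverse short reads
        f"--{long_read_type}", long_reads,    # low-coverage long reads
        "-o", out_dir,                        # assembly output directory
    ]


# Example with placeholder file names; the list can be passed to
# subprocess.run(cmd, check=True) on a machine where SPAdes is installed.
cmd = build_hybrid_spades_command(
    "illumina_R1.fastq", "illumina_R2.fastq", "pacbio_subreads.fastq", "asm_out"
)
```

Building the command as a list rather than a single string avoids shell-quoting problems with file paths.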

July 7, 2019

Complete genome sequence of an aromatic compound degrader Arthrobacter sp. YC-RL1.

Arthrobacter sp. YC-RL1, isolated from a petroleum-contaminated soil, is capable of degrading and utilizing a wide range of aromatic compounds for growth. Here we report the complete genome sequence of strain YC-RL1, which may facilitate the investigation of environmental bioremediation and provide new gene resources for biotechnology and gene engineering. Copyright © 2015 Elsevier B.V. All rights reserved.

July 7, 2019

Complete genome sequence of Celeribacter marinus IMCC12053(T), the host strain of marine bacteriophage P12053L.

Isolated from coastal seawater from the Yellow Sea of Korea, Celeribacter marinus IMCC12053 was used as the host bacterium for bacteriophage P12053L. Here we report the complete genome sequence of strain IMCC12053 for further study of the functional genes of marine bacteriophage P12053L. Single molecule real-time technology (PacBio RSII) was used to sequence the single circular chromosome, which is 3,096,705 base pairs in length with a GC content of 56.24%. It contains 3155 ORFs, with 45 tRNA and 6 rRNA genes. N(6)-methyladenosine patterns were also investigated for 32 unmethylated genes and intergenic regions that covered many regulators and phage genes as well as ribosomal RNA genes and tRNA genes. A cryptic N(4)-methylcytosine pattern was investigated to speculate on GpC methylase activity throughout the genome. Comparative genomics with other Celeribacter genomes was carried out for polyaromatic hydrocarbon degradation, but no aromatic ring oxygenases were found in IMCC12053, in contrast to Celeribacter indicus P73. Copyright © 2015 Elsevier B.V. All rights reserved.

July 7, 2019

The complete genome sequence of a lactic acid bacterium Leuconostoc mesenteroides ssp. dextranicum strain DSM 20484(T).

Leuconostoc species are widespread in the natural environment and play an important role in several types of industrial and food fermentation processes. Here, we report the 1,854,727-bp complete genome sequence of the Leuconostoc mesenteroides ssp. dextranicum strain DSM 20484(T). Copyright © 2015 Elsevier B.V. All rights reserved.

July 7, 2019

Complete genome sequence of Agarivorans gilvus WH0801(T), an agarase-producing bacterium isolated from seaweed.

Agarivorans gilvus WH0801(T), an agarase-producing bacterium, was isolated from the surface of seaweed. Here, we present the complete genome sequence, which consists of one circular chromosome of 4,416,600bp with a GC content of 45.9%. This genetic information will provide insight into biotechnological applications of producing agar for food and industry. Copyright © 2015 Elsevier B.V. All rights reserved.

July 7, 2019

Complete genome sequence of Acinetobacter baumannii XH386 (ST208), a multi-drug resistant bacteria isolated from pediatric hospital in China.

Acinetobacter baumannii has emerged as a significant nosocomial pathogen worldwide. Its rise is due to multi-drug resistance (MDR), and multi-drug resistant A. baumannii is difficult to treat with antibiotics, especially in pediatric patients, for whom the therapeutic options are quite limited. A. baumannii ST208 has been identified as the predominant sequence type of carbapenem-resistant A. baumannii in the United States and China. To our knowledge, no complete genome sequence has been reported for A. baumannii ST208, although several whole-genome shotgun sequences have been reported. Here, we sequenced the 4087-kilobase (kb) chromosome and 112-kb plasmid of A. baumannii XH386 (ST208), which was isolated from a pediatric hospital in China. The genome of A. baumannii XH386 contains 3968 protein-coding genes and 94 RNA-only encoding genes. Genomic analysis and a minimum inhibitory concentration assay showed that A. baumannii XH386 is a multi-drug resistant strain, resistant to most antibiotics except tigecycline. The data may be accessed via GenBank accession numbers CP010779 and CP010780.

July 7, 2019

MuffinEc: Error correction for de novo assembly via greedy partitioning and sequence alignment

Error correction is typically the first step of de novo genome assembly from NGS data. This step has an important impact on the quality and speed of the assembly process. However, the majority of available stand-alone error correction solutions can only detect and correct mismatches. Therefore, these solutions only support correcting reads generated by Illumina sequencers. Several solutions support insertions and deletions (indels) and are capable of working with multiple technologies. However, these solutions are limited by correction performance and resource consumption. In this paper, we introduce MuffinEc, an indel-aware multi-technology correction method for NGS data. This method uses a greedy approach to create groups of reads and subsequently corrects them using their consensus. MuffinEc surpasses existing solutions by offering better correction ratios for multiple technologies. This method also exploits parallel processing via OpenMP and uses less computational resources than similar programs, thereby being capable of handling large datasets. MuffinEc is open source and freely available at http://muffinec.sourceforge.net.
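The general idea of grouping reads greedily and correcting each group toward its consensus can be sketched as follows. This is a simplified illustration, not MuffinEc's actual algorithm: it assumes equal-length reads, compares them by Hamming distance, and therefore handles mismatches only, whereas the abstract describes an indel-aware method based on sequence alignment. All function names and the `max_mismatches` parameter are hypothetical.

```python
from collections import Counter


def hamming(a, b):
    """Number of mismatching positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def greedy_groups(reads, max_mismatches=1):
    """Greedily assign each read to the first group whose seed read
    (the group's first member) is within max_mismatches of it.
    Returns a list of groups, each a list of read indices."""
    groups = []
    for i, read in enumerate(reads):
        for group in groups:
            seed = reads[group[0]]
            if len(seed) == len(read) and hamming(seed, read) <= max_mismatches:
                group.append(i)
                break
        else:
            groups.append([i])  # no group matched: this read seeds a new one
    return groups


def consensus(group_reads):
    """Column-wise majority base across equal-length reads."""
    return "".join(Counter(col).most_common(1)[0][0] for col in zip(*group_reads))


def correct_reads(reads, max_mismatches=1):
    """Replace every read with the consensus of its group, keeping input order."""
    corrected = list(reads)
    for group in greedy_groups(reads, max_mismatches):
        cons = consensus([reads[i] for i in group])
        for i in group:
            corrected[i] = cons
    return corrected


reads = ["ACGTACGT", "ACGTACGT", "ACGAACGT", "TTTTGGGG", "TTTTGGGG", "TTTTGGGC"]
corrected = correct_reads(reads)
```

In this toy input the third and sixth reads each carry one substitution, and both are replaced by the majority sequence of their group. A greedy pass depends on read order, since the first read of each group acts as its seed.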

July 7, 2019

Long read and single molecule DNA sequencing simplifies genome assembly and TAL effector gene analysis of Xanthomonas translucens.

The species Xanthomonas translucens encompasses a complex of bacterial strains that cause diseases and yield loss on grass species including important cereal crops. Three pathovars, X. translucens pv. undulosa, X. translucens pv. translucens and X. translucens pv. cerealis, have been described as pathogens of wheat, barley, and oats. However, no complete genome sequence for a strain of this complex is currently available. A complete genome sequence of X. translucens pv. undulosa strain XT4699 was obtained by using PacBio long read, single molecule, real time (SMRT) DNA sequences and Illumina sequences. Draft genome sequences of nineteen additional X. translucens strains, which were collected from wheat or barley in different regions and at different times, were generated by Illumina sequencing. Phylogenetic relationships among different Xanthomonas strains indicate that X. translucens strains are members of a clade distinct from the so-called group 2 xanthomonads, and that three pathovars of this species, undulosa, translucens and cerealis, represent distinct subclades in the group 1 clade. Knockout mutation of the type III secretion system of XT4699 eliminated the ability to cause water-soaking symptoms on wheat and barley and resulted in a reduction in populations on wheat in comparison to the wild type strain. Sequence comparison of X. translucens strains revealed genetic variation in type III effector repertoires among different pathovars or within one pathovar. The full genome sequence of XT4699 reveals the presence of eight members of the Transcription-Activator Like (TAL) effector genes, which are phylogenetically distant from previously known TAL effector genes of group 2 xanthomonads. Microarray and qRT-PCR analyses revealed TAL effector-specific wheat gene expression modulation. PacBio long read sequencing facilitates the assembly of Xanthomonas genomes and the multiple TAL effector genes, which are difficult to assemble from short read platforms.
The complete genome sequence of X. translucens pv. undulosa strain XT4699 and draft genome sequences of nineteen additional X. translucens strains provide a resource for further genetic analyses of pathogenic diversity and host range of the X. translucens species complex. TAL effectors of strain XT4699 play roles in modulating wheat host gene expression.
