Bioinformatics Archives - Page 255 of 267

July 7, 2019

Draft genome sequence of the phytopathogenic fungus Ganoderma boninense, the causal agent of basal stem rot disease on oil palm.

Ganoderma boninense is the dominant fungal pathogen of basal stem rot (BSR) disease on Elaeis guineensis. We sequenced the nuclear genome of mycelia using both Illumina and Pacific Biosciences platforms for assembly of scaffolds. The draft genome comprised 79.24 Mb, 495 scaffolds, and 26,226 predicted coding sequences. Copyright © 2018 Utomo et al.

July 7, 2019

Strategies for high-altitude adaptation revealed from high-quality draft genome of non-violacein producing Janthinobacterium lividum ERGS5:01.

A light pink coloured bacterial strain, ERGS5:01, isolated from glacial stream water of Sikkim Himalaya was affiliated to Janthinobacterium lividum based on 16S rRNA gene sequence identity and phylogenetic clustering. Whole genome sequencing was performed for the strain to confirm its taxonomy, as it lacked the typical violet pigmentation of the genus, and also to decipher its survival strategy in the aquatic ecosystem of high elevation. PacBio RSII sequencing generated a genome of 5,168,928 bp with 4575 protein-coding genes and 118 RNA genes. Whole genome-based multilocus sequence analysis clustering, an in silico DDH similarity value of 95.1%, and an ANI value of 99.25% established the identity of strain ERGS5:01 (MCC 2953) as a non-violacein producing J. lividum. Genome comparisons across the genus Janthinobacterium revealed an open pan-genome with scope for the addition of new orthologous clusters to complete the genomic inventory. The genomic insight provided the genetic basis of tolerance to freezing and frequent freeze-thaw cycles, and of industrially important enzymes. Extended insight into the genome provided clues to crucial genes associated with adaptation to the harsh aquatic ecosystem of high altitude.

July 7, 2019

A fast approximate algorithm for mapping long reads to large reference databases.

Emerging single-molecule sequencing technologies from Pacific Biosciences and Oxford Nanopore have revived interest in long-read mapping algorithms. Alignment-based seed-and-extend methods demonstrate good accuracy, but face limited scalability, while faster alignment-free methods typically trade decreased precision for efficiency. In this article, we combine a fast approximate read mapping algorithm based on minimizers with a novel MinHash identity estimation technique to achieve both scalability and precision. In contrast to prior methods, we develop a mathematical framework that defines the types of mapping targets we uncover, establish probabilistic estimates of p-value and sensitivity, and demonstrate tolerance for alignment error rates up to 20%. With this framework, our algorithm automatically adapts to different minimum length and identity requirements and provides both positional and identity estimates for each mapping reported. For mapping human PacBio reads to the hg38 reference, our method is 290× faster than Burrows-Wheeler Aligner-MEM with a lower memory footprint and recall rate of 96%. We further demonstrate the scalability of our method by mapping noisy PacBio reads (each ≥5 kbp in length) to the complete NCBI RefSeq database containing 838 Gbp of sequence and >60,000 genomes.
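The two ingredients named in this abstract, minimizer sampling and MinHash-based identity estimation, can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names, the choice of hash, and the parameters k, w and s are assumptions made for illustration, and the identity formula is the Mash-style estimator under a simple random-mutation model, which may differ from the estimator derived in the paper.

```python
import hashlib
import math


def kmer_hash(kmer: str) -> int:
    """Map a k-mer to a 64-bit integer (hash choice is an assumption)."""
    return int.from_bytes(hashlib.blake2b(kmer.encode(), digest_size=8).digest(), "big")


def minimizers(seq: str, k: int, w: int) -> set:
    """Return (hash, position) of the smallest hashed k-mer in every
    window of w consecutive k-mers."""
    hashes = [kmer_hash(seq[i:i + k]) for i in range(len(seq) - k + 1)]
    picked = set()
    for start in range(len(hashes) - w + 1):
        window = hashes[start:start + w]
        smallest = min(window)
        picked.add((smallest, start + window.index(smallest)))
    return picked


def minhash_sketch(seq: str, k: int, s: int) -> list:
    """Bottom-s sketch: the s smallest distinct k-mer hashes of seq."""
    hashes = {kmer_hash(seq[i:i + k]) for i in range(len(seq) - k + 1)}
    return sorted(hashes)[:s]


def jaccard_estimate(sketch_a: list, sketch_b: list, s: int) -> float:
    """Estimate Jaccard similarity from two bottom-s sketches."""
    merged = sorted(set(sketch_a) | set(sketch_b))[:s]
    if not merged:
        return 0.0
    shared = set(sketch_a) & set(sketch_b)
    return sum(1 for h in merged if h in shared) / len(merged)


def identity_from_jaccard(j: float, k: int) -> float:
    """Convert a Jaccard estimate to an identity estimate using the
    Mash-style distance d = -ln(2j / (1 + j)) / k (an assumed model)."""
    if j <= 0:
        return 0.0
    distance = -math.log(2 * j / (1 + j)) / k
    return max(0.0, 1.0 - distance)


if __name__ == "__main__":
    read = "ACGTACGTTGCAAGCTTAGGCTAACG" * 3
    ref = read[:40] + "T" + read[41:]  # introduce one substitution
    j = jaccard_estimate(minhash_sketch(read, 5, 20), minhash_sketch(ref, 5, 20), 20)
    print(len(minimizers(ref, 5, 4)), j, identity_from_jaccard(j, 5))
```

In a mapper of this kind, the minimizers would typically be used to index the reference and find candidate regions, while the sketch comparison scores each candidate; the toy sequence and k = 5 here are far smaller than realistic settings.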

July 7, 2019

Satellite DNA evolution: old ideas, new approaches.

A substantial portion of the genomes of most multicellular eukaryotes consists of large arrays of tandemly repeated sequence, collectively called satellite DNA. The processes generating and maintaining different satellite DNA abundances across lineages are important to understand as satellites have been linked to chromosome mis-segregation, disease phenotypes, and reproductive isolation between species. While much theory has been developed to describe satellite evolution, empirical tests of these models have fallen short because of the challenges in assessing satellite repeat regions of the genome. Advances in computational tools and sequencing technologies now enable identification and quantification of satellite sequences genome-wide. Here, we describe some of these tools and how their applications are furthering our knowledge of satellite evolution and function. Copyright © 2018 Elsevier Ltd. All rights reserved.

July 7, 2019

Tigmint: correcting assembly errors using linked reads from large molecules.

Genome sequencing yields the sequence of many short snippets of DNA (reads) from a genome. Genome assembly attempts to reconstruct the original genome from which these reads were derived. This task is difficult due to gaps and errors in the sequencing data, repetitive sequence in the underlying genome, and heterozygosity. As a result, assembly errors are common. In the absence of a reference genome, these misassemblies may be identified by comparing the sequencing data to the assembly and looking for discrepancies between the two. Once identified, these misassemblies may be corrected, improving the quality of the assembled sequence. Although tools exist to identify and correct misassemblies using Illumina paired-end and mate-pair sequencing, no such tool yet exists that makes use of the long distance information of the large molecules provided by linked reads, such as those offered by the 10x Genomics Chromium platform. We have developed the tool Tigmint to address this gap. To demonstrate the effectiveness of Tigmint, we applied it to assemblies of a human genome using short reads assembled with ABySS 2.0 and other assemblers. Tigmint reduced the number of misassemblies identified by QUAST in the ABySS assembly by 216 (27%). While scaffolding with ARCS alone more than doubled the scaffold NGA50 of the assembly from 3 to 8 Mbp, the combination of Tigmint and ARCS improved the scaffold NGA50 of the assembly over five-fold to 16.4 Mbp. This notable improvement in contiguity highlights the utility of assembly correction in refining assemblies. We demonstrate the utility of Tigmint in correcting the assemblies of multiple tools, as well as in using Chromium reads to correct and scaffold assemblies of long single-molecule sequencing. Scaffolding an assembly that has been corrected with Tigmint yields a final assembly that is both more correct and substantially more contiguous than an assembly that has not been corrected.
Using single-molecule sequencing in combination with linked reads enables a genome sequence assembly that achieves both a high sequence contiguity as well as high scaffold contiguity, a feat not currently achievable with either technology alone.
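The contiguity figures quoted above are NGA50 values. As a rough illustration of the underlying length statistic only, the sketch below computes an NG50-style value in Python: the length L such that blocks of length at least L together cover half of a given genome size. It assumes the block lengths have already been obtained; for NGA50 those would be aligned blocks split at misassemblies, as produced by an evaluation tool such as QUAST, and that alignment step is not shown. The function name `ng50` is hypothetical.

```python
from typing import Iterable, Optional


def ng50(block_lengths: Iterable[int], genome_size: int) -> Optional[int]:
    """Return the largest length L such that blocks of length >= L
    sum to at least half of genome_size, or None if they never do."""
    half = genome_size / 2
    covered = 0
    for length in sorted(block_lengths, reverse=True):
        covered += length
        if covered >= half:
            return length
    return None


if __name__ == "__main__":
    # Toy numbers, not taken from the paper.
    print(ng50([8, 5, 3, 2, 2], genome_size=20))
```

Because the statistic is computed against a fixed genome size rather than the assembly's own total length, breaking a scaffold at a misassembly can only lower it, which is why correcting errors before scaffolding matters for the reported values.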

July 7, 2019

PlasmidTron: assembling the cause of phenotypes and genotypes from NGS data.

Increasingly rich metadata are now being linked to samples that have been whole-genome sequenced. However, much of this information is ignored. This is because linking this metadata to genes, or regions of the genome, usually relies on knowing the gene sequence(s) responsible for the particular trait being measured and looking for its presence or absence in that genome. Examples of this would be the spread of antimicrobial resistance genes carried on mobile genetic elements (MGEs). However, although it is possible to routinely identify the resistance gene, identifying the unknown MGE upon which it is carried can be much more difficult if the starting point is short-read whole-genome sequence data. The reason for this is that MGEs are often full of repeats and so assemble poorly, leading to fragmented consensus sequences. Since mobile DNA, which can carry many clinically and ecologically important genes, has a different evolutionary history from the host, its distribution across the host population will, by definition, be independent of the host phylogeny. It is possible to use this phenomenon in a genome-wide association study to identify both the genes associated with the specific trait and also the DNA linked to that gene, for example the flanking sequence of the plasmid vector on which it is encoded, which follows the same patterns of distribution as the marker gene/sequence itself. We present PlasmidTron, which utilizes the phenotypic data normally available in bacterial population studies, such as antibiograms, virulence factors, or geographical information, to identify traits that are likely to be present on DNA that can randomly reassort across defined bacterial populations. It is also possible to use this methodology to associate unknown genes/sequences (e.g. plasmid backbones) with a specific molecular signature or marker (e.g. resistance gene presence or absence) using PlasmidTron. 
PlasmidTron uses a k-mer-based approach to identify reads associated with a phylogenetically unlinked phenotype. These reads are then assembled de novo to produce contigs in a fast and scalable manner. PlasmidTron is written in Python 3 and is available under the open source licence GNU GPL3 from https://github.com/sanger-pathogens/plasmidtron.
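The general idea of a k-mer-based read selection can be sketched in a few lines of Python. This is an illustrative toy, not PlasmidTron's code: the function names, the `min_fraction` threshold, and the in-memory sets are all assumptions, reverse complements are ignored, and the subsequent de novo assembly step is not shown.

```python
def kmers(seq: str, k: int) -> set:
    """All distinct k-mers of a sequence."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def trait_specific_kmers(trait_reads, nontrait_reads, k: int) -> set:
    """K-mers seen in samples with the trait but never in samples without it."""
    with_trait = set().union(*(kmers(r, k) for r in trait_reads))
    without_trait = set().union(*(kmers(r, k) for r in nontrait_reads))
    return with_trait - without_trait


def select_reads(reads, specific: set, k: int, min_fraction: float = 0.5) -> list:
    """Keep reads in which at least min_fraction of k-mers are trait-specific
    (the threshold is a hypothetical parameter)."""
    selected = []
    for read in reads:
        read_kmers = kmers(read, k)
        if read_kmers and len(read_kmers & specific) / len(read_kmers) >= min_fraction:
            selected.append(read)
    return selected


if __name__ == "__main__":
    trait = ["ACGTACGTAC", "TTTTGGGGCC"]   # reads from samples with the phenotype
    nontrait = ["ACGTACGTAC"]              # reads from samples without it
    specific = trait_specific_kmers(trait, nontrait, k=4)
    print(select_reads(trait, specific, k=4))
```

The selected reads would then be passed to a de novo assembler; at realistic data volumes a dedicated k-mer counter would replace the Python sets used here.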

July 7, 2019

Complete genome sequence of Microcystis aeruginosa NIES-2481 and common genomic features of group G M. aeruginosa.

Microcystis aeruginosa is a freshwater bloom-forming cyanobacterium that is distributed worldwide. M. aeruginosa can be divided into at least 8 phylogenetic groups (A-G and X) at the intraspecific level. Here, we report the complete genome sequence of M. aeruginosa NIES-2481, which was isolated from Lake Kasumigaura, Japan, and is assigned to group G. The complete genome sequence of M. aeruginosa NIES-2481 comprises a 4.29-Mbp circular chromosome and a 147,539-bp plasmid; the circular chromosome and the plasmid contain 4,332 and 167 protein-coding genes, respectively. Comparative analysis with the complete genome of M. aeruginosa NIES-2549, which belongs to the same group as NIES-2481, showed that the genome sizes are among the smallest of previously sequenced M. aeruginosa strains, and that neither genome contains a microcystin biosynthetic gene cluster. Synteny analysis revealed only small-scale rearrangements between the two genomes.

July 7, 2019

Complete genome sequence of the poly-γ-glutamate-synthesizing bacterium Bacillus subtilis Bs-115.

Bacillus subtilis Bs-115 was isolated from the soil of a corn field in Yutai County, Jinan City, Shandong Province, People’s Republic of China, and is characterized by the efficient synthesis of poly-γ-glutamate (γ-PGA), with corn saccharification liquid as the sole energy and carbon source during the process of γ-PGA formation. Here, we report the complete genome sequence of Bacillus subtilis Bs-115 and the genes associated with poly-γ-glutamate synthesis. Copyright © 2018 Wang et al.

July 7, 2019

Reevaluation of the complete genome sequence of Magnetospirillum gryphiswaldense MSR-1 with Single-Molecule Real-Time Sequencing data.

Magnetospirillum gryphiswaldense is a key organism for understanding magnetosome formation and magnetotaxis. As earlier studies suggested a high genomic plasticity, we (re)sequenced the type strain MSR-1 and the laboratory strain R3/S1. Both sequences differ by only 11 point mutations, but organization of the magnetosome island deviates from that of previous genome sequences. Copyright © 2018 Uebe et al.

July 7, 2019

Complete genome sequence of the heavy-metal-tolerant endophytic type strain of Salinicola tamaricis.

The first complete genome sequence of a recently described Salinicola tamaricis species was determined for the strain F01T (=CCTCC AB 2015304T =KCTC 42855T). The strain was isolated from the leaves of wetland plant Tamarix chinensis Lour and shows a high tolerance to heavy metals, such as manganese, nickel, lead, and copper ions. Copyright © 2018 Shang et al.

July 7, 2019

Complete genome sequence of Methylomonas denitrificans strain FJG1, an obligate aerobic methanotroph that can couple methane oxidation with denitrification.

Methylomonas denitrificans strain FJG1 is a member of the gammaproteobacterial methanotrophs. The sequenced genome of FJG1 reveals the presence of genes that encode methane, methanol, formaldehyde, and formate oxidation. It also contains genes that encode enzymes for nitrate reduction to nitrous oxide, consistent with the ability of FJG1 to couple denitrification with methane oxidation. Copyright © 2018 Orata et al.

July 7, 2019

Complete genome sequence of Lactobacillus plantarum subsp. plantarum strain LB1-2, isolated from the hindgut of European honeybees, Apis mellifera L., from the Philippines.

Lactobacillus plantarum subsp. plantarum strain LB1-2, isolated from the hindgut of European honeybees in the Philippines, is active against Paenibacillus larvae and has broad activity against several Gram-positive and Gram-negative bacteria. The complete genome sequence reported herein contains gene clusters for multiple bacteriocins and extensive gene inventories for carbohydrate metabolism. Copyright © 2018 Ilagan-Cruzada et al.

July 7, 2019

Draft genome sequence of Paucibacter aquatile CR182T, a strain with antimicrobial activity isolated from freshwater of Nakdong River in South Korea.

This report details a draft genome sequence of Paucibacter aquatile CR182T, isolated from river water, which contains 5,523,543 bp, has a G+C content of 66.3%, and harbors 4,544 protein-coding genes in 4 contigs. These genome data provide insights into the genetic basis of this strain’s antibacterial activity and adaptive mechanisms. Copyright © 2018 Chung et al.

July 7, 2019

The sequenced angiosperm genomes and genome databases.

Angiosperms, the flowering plants, provide the essential resources for human life, such as food, energy, oxygen, and materials. They also promoted the evolution of humans, animals, and the planet Earth. Despite the numerous advances in genome reports and sequencing technologies, no review covers all the released angiosperm genomes and the genome databases for data sharing. Based on the rapid advances and innovations in database reconstruction in the last few years, here we provide a comprehensive review of three major types of angiosperm genome databases: databases for a single species, for a specific angiosperm clade, and for multiple angiosperm species. The scope, tools, and data of each type of database and their features are concisely discussed. The genome databases for a single species or a clade of species are especially popular with specific groups of researchers, while a timely-updated comprehensive database is more powerful for addressing major scientific mysteries at the genome scale. Considering the low coverage of flowering plants in any available database, we propose construction of a comprehensive database to facilitate large-scale comparative studies of angiosperm genomes and to promote collaborative studies of important questions in plant biology.

July 7, 2019

Complete genome sequence of Loktanella vestfoldensis strain SMR4r, a novel strain isolated from a culture of the chain-forming diatom Skeletonema marinoi.

We report here the genome sequence of Loktanella vestfoldensis strain SMR4r, isolated from the marine diatom Skeletonema marinoi strain RO5AC. Its 3,987,360-bp genome consists of a circular chromosome and two circular plasmids, one of which appears to be shared with an S. marinoi-associated Roseovarius species. Copyright © 2018 Töpel et al.

