July 7, 2019

Draft genome assembly of the sheep scab mite, Psoroptes ovis.

Sheep scab, caused by infestation with Psoroptes ovis, is highly contagious, results in intense pruritus, and represents a major welfare and economic concern. Here, we report the first draft genome assembly and gene prediction of P. ovis based on PacBio de novo sequencing. The ~63.2-Mb genome encodes 12,041 protein-coding genes. Copyright © 2018 Burgess et al.


July 7, 2019

Complete genome sequence of multiple-antibiotic-resistant Streptococcus parauberis strain SPOF3K, isolated from diseased olive flounder (Paralichthys olivaceus).

Here, we report the complete genome sequence of multiple-antibiotic-resistant Streptococcus parauberis strain SPOF3K, isolated from the kidney of a diseased olive flounder in South Korea in 2013. Sequencing using a PacBio platform yielded a circular chromosome of 2,128,740 bp and a plasmid of 23,538 bp, harboring 2,123 and 24 protein-coding genes, respectively. Copyright © 2018 Lee et al.


July 7, 2019

Draft genome sequence of the phytopathogenic fungus Ganoderma boninense, the causal agent of basal stem rot disease on oil palm.

Ganoderma boninense is the dominant fungal pathogen of basal stem rot (BSR) disease on Elaeis guineensis. We sequenced the nuclear genome of mycelia using both Illumina and Pacific Biosciences platforms for assembly of scaffolds. The draft genome comprised 79.24 Mb, 495 scaffolds, and 26,226 predicted coding sequences. Copyright © 2018 Utomo et al.


July 7, 2019

Strategies for high-altitude adaptation revealed from high-quality draft genome of non-violacein producing Janthinobacterium lividum ERGS5:01.

A light pink coloured bacterial strain, ERGS5:01, isolated from glacial stream water of Sikkim Himalaya was affiliated to Janthinobacterium lividum based on 16S rRNA gene sequence identity and phylogenetic clustering. Whole genome sequencing was performed for the strain to confirm its taxonomy, as it lacked the typical violet pigmentation of the genus, and also to decipher its survival strategy in the high-elevation aquatic ecosystem. PacBio RSII sequencing generated a genome of 5,168,928 bp with 4,575 protein-coding genes and 118 RNA genes. Whole genome-based multilocus sequence analysis clustering, an in silico DDH similarity value of 95.1%, and an ANI value of 99.25% established the identity of strain ERGS5:01 (MCC 2953) as a non-violacein producing J. lividum. Genome comparisons across the genus Janthinobacterium revealed an open pan-genome, with scope for the addition of new orthologous clusters to complete the genomic inventory. The genomic insight provided the genetic basis of tolerance to freezing and frequent freeze-thaw cycles, and of industrially important enzymes. Extended insight into the genome provided clues to crucial genes associated with adaptation to the harsh aquatic ecosystem of high altitude.


July 7, 2019

A fast approximate algorithm for mapping long reads to large reference databases.

Emerging single-molecule sequencing technologies from Pacific Biosciences and Oxford Nanopore have revived interest in long-read mapping algorithms. Alignment-based seed-and-extend methods demonstrate good accuracy, but face limited scalability, while faster alignment-free methods typically trade decreased precision for efficiency. In this article, we combine a fast approximate read mapping algorithm based on minimizers with a novel MinHash identity estimation technique to achieve both scalability and precision. In contrast to prior methods, we develop a mathematical framework that defines the types of mapping targets we uncover, establish probabilistic estimates of p-value and sensitivity, and demonstrate tolerance for alignment error rates up to 20%. With this framework, our algorithm automatically adapts to different minimum length and identity requirements and provides both positional and identity estimates for each mapping reported. For mapping human PacBio reads to the hg38 reference, our method is 290× faster than Burrows-Wheeler Aligner-MEM with a lower memory footprint and recall rate of 96%. We further demonstrate the scalability of our method by mapping noisy PacBio reads (each ≥5 kbp in length) to the complete NCBI RefSeq database containing 838 Gbp of sequence and >60,000 genomes.
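
The two ingredients named in this abstract, minimizer sampling and MinHash-based identity estimation, can be illustrated with a minimal Python sketch. This is not the authors' implementation: the hash function, the function names, and the parameters k, w, and s are assumptions chosen for illustration, and the conversion from Jaccard similarity to identity uses the Mash distance formula, which may differ from the estimator developed in the paper.

```python
import hashlib
import math


def kmer_hash(kmer: str) -> int:
    """Stable 64-bit hash of a k-mer (the hash choice is an assumption of this sketch)."""
    digest = hashlib.blake2b(kmer.encode("ascii"), digest_size=8).digest()
    return int.from_bytes(digest, "big")


def minimizers(seq: str, k: int, w: int) -> set:
    """Hashes of the minimizer k-mers: the smallest hash in each window of w consecutive k-mers."""
    hashes = [kmer_hash(seq[i:i + k]) for i in range(len(seq) - k + 1)]
    if not hashes:
        return set()
    n_windows = max(1, len(hashes) - w + 1)
    return {min(hashes[i:i + w]) for i in range(n_windows)}


def minhash_jaccard(a: set, b: set, s: int) -> float:
    """Estimate Jaccard similarity from the s smallest hashes of the union of two hash sets."""
    sketch = sorted(a | b)[:s]
    if not sketch:
        return 0.0
    shared = sum(1 for h in sketch if h in a and h in b)
    return shared / len(sketch)


def identity_from_jaccard(j: float, k: int) -> float:
    """Convert Jaccard similarity j to an identity estimate via the Mash distance formula."""
    if j <= 0.0:
        return 0.0
    return max(0.0, 1.0 + math.log(2.0 * j / (1.0 + j)) / k)


# Toy usage: compare a read against a candidate reference window (made-up sequences).
read = "ACGTTGCATGCATGACCTGAAGTCCATGGATCCA"
ref_window = "ACGTTGCATGCATGACCTGTAGTCCATGGATCCA"
j = minhash_jaccard(minimizers(read, 8, 4), minimizers(ref_window, 8, 4), s=16)
print(identity_from_jaccard(j, 8))
```

Sampling only minimizers keeps the index small, and estimating identity from a fixed-size sketch avoids computing a base-level alignment, which is the general trade-off the abstract describes.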


July 7, 2019

Satellite DNA evolution: old ideas, new approaches.

A substantial portion of the genomes of most multicellular eukaryotes consists of large arrays of tandemly repeated sequence, collectively called satellite DNA. The processes generating and maintaining different satellite DNA abundances across lineages are important to understand as satellites have been linked to chromosome mis-segregation, disease phenotypes, and reproductive isolation between species. While much theory has been developed to describe satellite evolution, empirical tests of these models have fallen short because of the challenges in assessing satellite repeat regions of the genome. Advances in computational tools and sequencing technologies now enable identification and quantification of satellite sequences genome-wide. Here, we describe some of these tools and how their applications are furthering our knowledge of satellite evolution and function. Copyright © 2018 Elsevier Ltd. All rights reserved.


July 7, 2019

Tigmint: correcting assembly errors using linked reads from large molecules.

Genome sequencing yields the sequence of many short snippets of DNA (reads) from a genome. Genome assembly attempts to reconstruct the original genome from which these reads were derived. This task is difficult due to gaps and errors in the sequencing data, repetitive sequence in the underlying genome, and heterozygosity. As a result, assembly errors are common. In the absence of a reference genome, these misassemblies may be identified by comparing the sequencing data to the assembly and looking for discrepancies between the two. Once identified, these misassemblies may be corrected, improving the quality of the assembled sequence. Although tools exist to identify and correct misassemblies using Illumina paired-end and mate-pair sequencing, no such tool yet exists that makes use of the long distance information of the large molecules provided by linked reads, such as those offered by the 10x Genomics Chromium platform. We have developed the tool Tigmint to address this gap. To demonstrate the effectiveness of Tigmint, we applied it to assemblies of a human genome using short reads assembled with ABySS 2.0 and other assemblers. Tigmint reduced the number of misassemblies identified by QUAST in the ABySS assembly by 216 (27%). While scaffolding with ARCS alone more than doubled the scaffold NGA50 of the assembly from 3 to 8 Mbp, the combination of Tigmint and ARCS improved the scaffold NGA50 of the assembly over five-fold to 16.4 Mbp. This notable improvement in contiguity highlights the utility of assembly correction in refining assemblies. We demonstrate the utility of Tigmint in correcting the assemblies of multiple tools, as well as in using Chromium reads to correct and scaffold assemblies of long single-molecule sequencing. Scaffolding an assembly that has been corrected with Tigmint yields a final assembly that is both more correct and substantially more contiguous than an assembly that has not been corrected. Using single-molecule sequencing in combination with linked reads enables a genome sequence assembly that achieves both high sequence contiguity and high scaffold contiguity, a feat not currently achievable with either technology alone.
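
The contiguity figures quoted above are N50-style statistics. As a rough illustration only, the Python sketch below computes N50 and NG50 from a list of piece lengths; the function name `nx` and the toy lengths are hypothetical. NGA50 as reported by QUAST additionally requires aligning the assembly to a reference and breaking pieces at misassemblies, which this sketch does not attempt.

```python
def nx(lengths, fraction=0.5, total=None):
    """Largest length L such that pieces of length >= L cover at least
    `fraction` of `total` (defaults to the summed piece lengths, giving N50;
    passing the genome size as `total` gives NG50). Returns 0 if the pieces
    never reach the target."""
    if total is None:
        total = sum(lengths)
    target = fraction * total
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if running >= target:
            return length
    return 0


# Toy scaffold lengths in Mbp (illustrative values, not from the paper).
scaffolds = [8, 5, 3, 2, 2]
print(nx(scaffolds))            # N50  -> 5
print(nx(scaffolds, total=30))  # NG50 for an assumed 30 Mbp genome -> 3
```

Cutting a misassembled scaffold lowers these statistics at first (splitting the 8 Mbp piece into two 4 Mbp pieces drops the toy N50 from 5 to 4), which is consistent with the abstract reporting contiguity after re-scaffolding the corrected assembly.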


July 7, 2019

To B or not to B: a tale of unorthodox chromosomes.

Highlights
• B chromosomes are dispensable parts of the karyotype of many eukaryotes.
• Deemed genome parasites in plants and animals, provide advantage to pathogenic fungi.
• Often enriched in repeats and in fast evolving pathogenicity-related genes.
• B chromosomes are not a uniform class, share certain features with core chromosomes.


July 7, 2019

Rhodobacter sp. Rb3, an aerobic anoxygenic phototroph which thrives in the polyextreme ecosystem of the Salar de Huasco, in the Chilean Altiplano.

The Salar de Huasco is an evaporitic basin located in the Chilean Altiplano, which presents extreme environmental conditions for life, i.e. high altitude (3800 m.a.s.l.), negative water balance, a wide salinity range, high daily temperature changes and the occurrence of the highest registered solar radiation on the planet (>1200 W m⁻²). This ecosystem is considered as a natural laboratory to understand different adaptations of microorganisms to extreme conditions. Rhodobacter, an anoxygenic aerobic phototrophic bacterial genus, represents one of the most abundant groups reported based on taxonomic diversity surveys in this ecosystem. The bacterial mat isolate Rhodobacter sp. strain Rb3 was used to study adaptation mechanisms to stress-inducing factors potentially explaining its success in a polyextreme ecosystem. We found that the Rhodobacter sp. Rb3 genome was characterized by a high abundance of genes involved in stress tolerance and adaptation strategies, among which DNA repair and oxidative stress were the most conspicuous. Moreover, many other molecular mechanisms associated with oxidative stress, photooxidation and antioxidants; DNA repair and protection; motility, chemotaxis and biofilm synthesis; osmotic stress, metal, metalloid and toxic anions resistance; antimicrobial resistance and multidrug pumps; sporulation; cold shock and heat shock stress; mobile genetic elements and toxin-antitoxin system were detected and identified as potential survival mechanism features in Rhodobacter sp. Rb3. In total, these results reveal a wide set of strategies used by the isolate to adapt and thrive under environmental stress conditions as a model of polyextreme environmental resistome.


July 7, 2019

Correction of persistent errors in Arabidopsis reference mitochondrial genomes.

Arabidopsis thaliana remains the foremost model system for plant genetics and genomics, and researchers rely on the accuracy of its genomic resources. The first completely sequenced angiosperm mitochondrial genome was obtained from Arabidopsis C24 (Unseld et al., 1997), and more recent efforts have produced additional Arabidopsis reference genomes, including one for Col-0, the most widely used ecotype in plant genetic research (Davila et al., 2011). These studies were based on older DNA sequencing methods, making them subject to errors associated with lower levels of sequencing coverage or the extremely short read lengths produced by early-generation Illumina technologies. Indeed, although the more recently published Arabidopsis mitochondrial reference genome sequences made substantial progress in improving upon earlier versions, they still have high error rates. By comparing publicly available Illumina sequence data to the Arabidopsis Col-0 reference genome, we found that it contains a sequence error every 2.4 kb on average, including 57 single-nucleotide polymorphisms (SNPs), 96 indels (up to 901 bp in size), and a large repeat-mediated rearrangement. Most of these errors appear to have been carried over from the original Arabidopsis mitochondrial genome sequence by reference-based assembly approaches, which has misled subsequent studies of plant mitochondrial mutation and molecular evolution by giving the false impression that the errors are naturally occurring variants present in multiple ecotypes. Building on the progress made by previous researchers, we provide a corrected reference sequence that we hope will serve as a useful community resource for future investigations in the field of plant mitochondrial genetics.


July 7, 2019

PlasmidTron: assembling the cause of phenotypes and genotypes from NGS data.

Increasingly rich metadata are now being linked to samples that have been whole-genome sequenced. However, much of this information is ignored. This is because linking this metadata to genes, or regions of the genome, usually relies on knowing the gene sequence(s) responsible for the particular trait being measured and looking for its presence or absence in that genome. Examples of this would be the spread of antimicrobial resistance genes carried on mobile genetic elements (MGEs). However, although it is possible to routinely identify the resistance gene, identifying the unknown MGE upon which it is carried can be much more difficult if the starting point is short-read whole-genome sequence data. The reason for this is that MGEs are often full of repeats and so assemble poorly, leading to fragmented consensus sequences. Since mobile DNA, which can carry many clinically and ecologically important genes, has a different evolutionary history from the host, its distribution across the host population will, by definition, be independent of the host phylogeny. It is possible to use this phenomenon in a genome-wide association study to identify both the genes associated with the specific trait and also the DNA linked to that gene, for example the flanking sequence of the plasmid vector on which it is encoded, which follows the same patterns of distribution as the marker gene/sequence itself. We present PlasmidTron, which utilizes the phenotypic data normally available in bacterial population studies, such as antibiograms, virulence factors, or geographical information, to identify traits that are likely to be present on DNA that can randomly reassort across defined bacterial populations. It is also possible to use this methodology to associate unknown genes/sequences (e.g. plasmid backbones) with a specific molecular signature or marker (e.g. resistance gene presence or absence) using PlasmidTron. 
PlasmidTron uses a k-mer-based approach to identify reads associated with a phylogenetically unlinked phenotype. These reads are then assembled de novo to produce contigs in a fast and scalable manner. PlasmidTron is written in Python 3 and is available under the open source licence GNU GPL3 from https://github.com/sanger-pathogens/plasmidtron.
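
The general idea of selecting reads by trait-associated k-mers can be sketched in a few lines of Python. This is an illustration of the concept only, not PlasmidTron's actual code or interface: the function names, the `min_fraction` threshold, and the toy reads are all hypothetical, and the de novo assembly step that follows read selection is omitted.

```python
def kmers(seq, k):
    """Set of all k-mers in a sequence."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def trait_specific_kmers(trait_reads, nontrait_reads, k):
    """k-mers seen in samples with the trait but never in samples without it."""
    with_trait = set().union(*(kmers(r, k) for r in trait_reads))
    without_trait = set().union(*(kmers(r, k) for r in nontrait_reads))
    return with_trait - without_trait


def select_reads(reads, specific, k, min_fraction=0.5):
    """Keep reads whose k-mers are mostly trait-specific (threshold is an assumption)."""
    selected = []
    for read in reads:
        read_kmers = kmers(read, k)
        if read_kmers and len(read_kmers & specific) / len(read_kmers) >= min_fraction:
            selected.append(read)
    return selected


# Toy data: one read is shared with the non-trait sample, one is unique to the trait sample.
trait_reads = ["ACGTACGTTT", "GGGGCCCCAA"]
nontrait_reads = ["GGGGCCCCAA"]
specific = trait_specific_kmers(trait_reads, nontrait_reads, k=4)
print(select_reads(trait_reads, specific, k=4))  # ['ACGTACGTTT']
```

In this sketch, the selected reads would then be handed to a de novo assembler, so that only sequence whose presence tracks the phenotype is assembled.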


July 7, 2019

Complete genome sequence of Microcystis aeruginosa NIES-2481 and common genomic features of group G M. aeruginosa.

Microcystis aeruginosa is a freshwater bloom-forming cyanobacterium that is distributed worldwide. M. aeruginosa can be divided into at least 8 phylogenetic groups (A-G and X) at the intraspecific level. Here, we report the complete genome sequence of M. aeruginosa NIES-2481, which was isolated from Lake Kasumigaura, Japan, and is assigned to group G. The complete genome sequence of M. aeruginosa NIES-2481 comprises a 4.29-Mbp circular chromosome and a 147,539-bp plasmid; the circular chromosome and the plasmid contain 4,332 and 167 protein-coding genes, respectively. Comparative analysis with the complete genome of M. aeruginosa NIES-2549, which belongs to the same group as NIES-2481, showed that the genome sizes are the smallest among previously sequenced M. aeruginosa strains and that both genomes lack a microcystin biosynthetic gene cluster. Synteny analysis revealed only small-scale rearrangements between the two genomes.


July 7, 2019

Complete genome sequences of two Bacillus pumilus strains from Cuatrociénegas, Coahuila, Mexico.

We assembled the complete genome sequences of Bacillus pumilus strains 145 and 150a from Cuatrociénegas, Mexico. We detected genes coding for proteins potentially involved in antagonism (bacteriocins) and defense mechanisms (abortive infection bacteriophage proteins and 4-azaleucine resistance). Both strains harbored prophage sequences. Our results provide insights into understanding the establishment of microbial interactions. Copyright © 2018 Zarza et al.


July 7, 2019

Complete genome sequence of the poly-γ-glutamate-synthesizing bacterium Bacillus subtilis Bs-115.

Bacillus subtilis Bs-115 was isolated from the soil of a corn field in Yutai County, Jinan City, Shandong Province, People’s Republic of China, and is characterized by the efficient synthesis of poly-γ-glutamate (γ-PGA), with corn saccharification liquid as the sole energy and carbon source during the process of γ-PGA formation. Here, we report the complete genome sequence of Bacillus subtilis Bs-115 and the genes associated with poly-γ-glutamate synthesis. Copyright © 2018 Wang et al.

