April 21, 2020

LR_Gapcloser: a tiling path-based gap closer that uses long reads to complete genome assembly.

Completing a genome is an important goal of genome assembly. However, many assemblies, including reference assemblies, are unfinished and have a number of gaps. Long reads obtained from third-generation sequencing (TGS) platforms can help close these gaps and improve assembly contiguity. However, current gap-closure approaches using long reads require extensive runtime and high memory usage. Thus, a fast and memory-efficient approach using long reads is needed to obtain complete genomes. We developed LR_Gapcloser to rapidly and efficiently close the gaps in genome assemblies. This tool utilizes long reads generated from TGS platforms. Tested on de novo assembled gaps, repeat-derived gaps, and real gaps, LR_Gapcloser closed a higher number of gaps faster, with a lower error rate and much lower memory usage, than two existing state-of-the-art tools. The tool filled more gaps when using raw reads than when using error-corrected reads. It is applicable to gaps in assemblies produced by different approaches and to large and complex genomes. After performing gap closure using this tool, the contig N50 size of the human CHM1 genome was improved from 143 kb to 19 Mb, a 132-fold increase. We also closed the gaps in the Triticum urartu genome, a large genome rich in repeats; the contig N50 size was increased by 40%. Further, we evaluated the contiguity and correctness of six hybrid assembly strategies by combining the optimal TGS-based and next-generation sequencing-based assemblers with LR_Gapcloser. A proposed and optimal hybrid strategy generated a new human CHM1 genome assembly with marked contiguity. The contig N50 value was greater than 28 Mb, which is larger than previous non-reference assemblies of the diploid human genome. LR_Gapcloser is a fast and efficient tool that can be used to close gaps and improve the contiguity of genome assemblies. A proposed hybrid assembly including this tool promises reference-grade assemblies.
The software is available at http://www.fishbrowser.org/software/LR_Gapcloser/.
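
The contig N50 values quoted in this abstract follow the standard definition: the length of the shortest contig in the smallest set of longest contigs that together cover at least half of the total assembly length. The sketch below is a minimal illustration of that definition, not code from LR_Gapcloser; the function name `n50` and the toy contig lengths are assumptions made for the example.

```python
def n50(lengths):
    """Return the N50 of a collection of contig lengths.

    N50 is the length L such that contigs of length >= L together
    cover at least half of the total assembly length.
    """
    if not lengths:
        raise ValueError("no contig lengths given")
    total = sum(lengths)
    running = 0
    # Walk from the longest contig downwards until half the assembly is covered.
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length


# Toy example (hypothetical lengths, not data from the paper):
# closing the gap between the two longest contigs merges them into one.
before = [40, 30, 20, 10]
after = [70, 20, 10]  # gap sequence itself ignored for simplicity
print(n50(before))  # 30
print(n50(after))   # 70
```

In the toy case, joining two contigs more than doubles N50 even though the total length is unchanged, which is why gap closure can produce large fold increases in this metric.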


April 21, 2020

De novo assembly of the goldfish (Carassius auratus) genome and the evolution of genes after whole-genome duplication.

For over a thousand years, the common goldfish (Carassius auratus) was raised throughout Asia for food and as an ornamental pet. As a very close relative of the common carp (Cyprinus carpio), goldfish share the recent genome duplication that occurred approximately 14 million years ago in their common ancestor. The combination of centuries of breeding and a wide array of interesting body morphologies provides an exciting opportunity to link genotype to phenotype and to understand the dynamics of genome evolution and speciation. We generated a high-quality draft sequence and gene annotations of a “Wakin” goldfish using 71X PacBio long reads. The two subgenomes in goldfish retained extensive synteny and collinearity between goldfish and zebrafish. However, genes were lost quickly after the carp whole-genome duplication, and the expression of 30% of the retained duplicated genes diverged substantially across seven tissues sampled. Loss of sequence identity and/or exons determined the divergence of the expression levels across all tissues, while loss of conserved noncoding elements determined expression variance between different tissues. This assembly provides an important resource for comparative genomics and understanding the causes of goldfish variants.


April 21, 2020

Long-read amplicon denoising.

Long-read next-generation amplicon sequencing shows promise for studying complete genes or genomes from complex and diverse populations. Current long-read sequencing technologies have challenging error profiles, hindering data processing and incorporation into downstream analyses. Here we consider the problem of how to reconstruct, free of sequencing error, the true sequence variants and their associated frequencies from PacBio reads. Called ‘amplicon denoising’, this problem has been extensively studied for short-read sequencing technologies, but current solutions do not always successfully generalize to long reads with high indel error rates. We introduce two methods: one that runs nearly instantly and is very accurate for medium-length reads and high template coverage, and another, slower method that is more robust when reads are very long or coverage is lower. On two Mock Virus Community datasets with ground truth, each sequenced on a different PacBio instrument, and on a number of simulated datasets, we compare our two approaches to each other and to existing algorithms. We outperform all tested methods in accuracy, with competitive run times even for our slower method, successfully discriminating templates that differ by just a single nucleotide. Julia implementations of Fast Amplicon Denoising (FAD) and Robust Amplicon Denoising (RAD), and a webserver interface, are freely available. © The Author(s) 2019. Published by Oxford University Press on behalf of Nucleic Acids Research.


April 21, 2020

Chromulinavorax destructans, a pathogen of microzooplankton that provides a window into the enigmatic candidate phylum Dependentiae.

Members of the major candidate phylum Dependentiae (a.k.a. TM6) are widespread across diverse environments from showerheads to peat bogs; yet, with the exception of two isolates infecting amoebae, they are only known from metagenomic data. The limited knowledge of their biology indicates that they have a long evolutionary history of parasitism. Here, we present Chromulinavorax destructans (Strain SeV1) the first isolate of this phylum to infect a representative from a widespread and ecologically significant group of heterotrophic flagellates, the microzooplankter Spumella elongata (Strain CCAP 955/1). Chromulinavorax destructans has a reduced 1.2 Mb genome that is so specialized for infection that it shows no evidence of complete metabolic pathways, but encodes an extensive transporter system for importing nutrients and energy in the form of ATP from the host. Its replication causes extensive reorganization and expansion of the mitochondrion, effectively surrounding the pathogen, consistent with its dependency on the host for energy. Nearly half (44%) of the inferred proteins contain signal sequences for secretion, including many without recognizable similarity to proteins of known function, as well as 98 copies of proteins with an ankyrin-repeat domain; ankyrin-repeats are known effectors of host modulation, suggesting the presence of an extensive host-manipulation apparatus. These observations help to cement members of this phylum as widespread and diverse parasites infecting a broad range of eukaryotic microbes.


April 21, 2020

Trophic specialization results in genomic reduction in free-living marine Idiomarina bacteria.

The streamlining hypothesis is generally used to explain the genomic reduction events related to the small genome size of free-living bacteria like marine bacteria SAR11. However, our current understanding of the correlation between bacterial genome size and environmental adaptation relies on too few species. It is still unclear whether there are other paths leading to genomic reduction in free-living bacteria. The genome size of marine free-living bacteria of the genus Idiomarina belonging to the order Alteromonadales (Gammaproteobacteria) is much smaller than the size of related genomes from bacteria in the same order. Comparative genomic and physiological analyses showed that the genomic reduction pattern in this genus is different from that of the classical SAR11 lineage. Genomic reduction reconstruction and substrate utilization profile showed that Idiomarina spp. lost a large number of genes related to carbohydrate utilization, and instead they specialized on using proteinaceous resources. Here we propose a new hypothesis to explain genomic reduction in this genus; we propose that trophic specialization increasing the metabolic efficiency for using one kind of substrate but reducing the substrate utilization spectrum could result in bacterial genomic reduction, which would be not uncommon in nature. This hypothesis was further tested in another free-living genus, Kangiella, which also shows dramatic genomic reduction. These findings highlight that trophic specialization is potentially an important path leading to genomic reduction in some marine free-living bacteria, which is distinct from the classical lineages like SAR11.
IMPORTANCE: The streamlining hypothesis is usually used to explain the genomic reduction events in free-living bacteria like SAR11. However, we find that the genomic reduction phenomenon in the bacterial genus Idiomarina is different from that in SAR11. Therefore, we propose a new hypothesis to explain genomic reduction in this genus based on trophic specialization that could result in genomic reduction, which would be not uncommon in nature. Not only can the trophic specialization hypothesis explain the genomic reduction in the genus Idiomarina, but it also sheds new light on our understanding of the genomic reduction processes in other free-living bacterial lineages. Copyright © 2019 Qin et al.


April 21, 2020

Sensory receptor repertoire in cyprid antennules of the barnacle Balanus improvisus.

Barnacle settlement involves sensing of a variety of exogenous cues. A pair of antennules is the main sensory organ that the cyprid larva uses to explore the surface. Antennules are equipped with a number of setae that have both chemo- and mechanosensing function. The current study explores the repertoire of sensory receptors in Balanus improvisus cyprid antennules with the goal to better understand sensory systems involved in the settling behavior of this species. We carried out transcriptome sequencing of dissected B. improvisus cyprid antennules. The generated transcriptome assembly was used to search for sensory receptors using HMM models. Among potential chemosensory genes, we identified the ionotropic receptors IR25a, IR8a and IR93a, and several divergent IR candidates to be expressed in the cyprid antennules. We found one gustatory-like receptor but no odorant receptors, chemosensory or odorant-binding proteins. Apart from chemosensory receptors, we also identified 13 potential mechanosensory genes represented by several transient receptor potential channels (TRP) subfamilies. Furthermore, we analyzed changes in expression profiles of IRs and TRPs during the B. improvisus settling process. Several of the sensory genes were differentially expressed during the course of larval settlement. This study gives expanded knowledge about the sensory systems present in barnacles, a taxonomic group for which only limited information about receptors is currently available. It furthermore serves as a starting point for more in depth studies of how sensory signaling affects settling behavior in barnacles with implications for preventing biofouling.


April 21, 2020

Complete Genome Sequence of an N-Acyl Homoserine Lactone Producer, Breoghania sp. Strain L-A4, Isolated from Rhizosphere of Phragmites australis in a Coastal Wetland.

Breoghania sp. strain L-A4 was isolated from the rhizosphere of Phragmites australis in the Qinhuangdao coastal wetland in China. Here, we present the complete genome sequence of strain L-A4, which consists of a chromosome of 5,029,620 bp with a G+C content of 64.53% and 4,964 coding DNA sequences. This strain is the first member of this genus found to produce N-acyl homoserine lactone (AHL) signals.
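
The G+C content reported for a genome is the percentage of bases that are G or C. The sketch below shows one common way to compute it; it is illustrative only and not taken from the study. The function name `gc_content`, the choice to ignore ambiguous bases such as N, and the toy sequence are all assumptions.

```python
def gc_content(seq):
    """Return the G+C percentage of a DNA sequence.

    Only unambiguous bases (A, C, G, T) are counted in the denominator;
    ignoring N and other ambiguity codes is an assumption of this sketch.
    """
    seq = seq.upper()
    counted = sum(seq.count(base) for base in "ACGT")
    if counted == 0:
        raise ValueError("sequence contains no A, C, G or T bases")
    gc = seq.count("G") + seq.count("C")
    return 100.0 * gc / counted


# Toy sequence (hypothetical, not from the L-A4 genome): 4 of 6 bases are G or C.
print(round(gc_content("GGCCAT"), 2))  # 66.67
```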


April 21, 2020

Characteristics of crude oil-degrading bacteria Gordonia iterans isolated from marine coastal in Taean sediment.

Crude oil is a major pollutant of marine and coastal ecosystems and causes increasingly serious environmental problems. Its ultimate and complete degradation is believed to be accomplished mainly by microorganisms. In this study, we searched for bacterial strains with a high ability to degrade crude oil. From sediments contaminated by the 2007 petroleum spill in Taean, South Korea, we isolated thirty-one bacterial strains in total with potential application in the remediation of crude oil contamination. In terms of removal percentage after 7 days, one of the strains, Co17, showed the highest removal efficiency, removing 84.2% of crude oil in Bushnell-Haas media. The Co17 strain also exhibited an outstanding ability to remove crude oil at a high salt concentration. Whole-genome sequencing and annotation revealed many genes related to n-alkane degradation in the genome of Gordonia sp. Co17, including alkane-1-monooxygenase, alcohol dehydrogenase, and Baeyer-Villiger monooxygenase. In particular, as gene-level confirmation, the alkB gene encoding alkane hydroxylase (alkane-1-monooxygenase) was found in strain Co17. Expression of alkB was upregulated 125-fold after 18 hr, accompanied by the removal of 48.9% of n-alkanes. We therefore propose that Gordonia iterans strain Co17, isolated from crude oil-contaminated marine sediment, could offer a new strategy for high-efficiency bioremediation. © 2018 The Authors. MicrobiologyOpen published by John Wiley & Sons Ltd.


April 21, 2020

Genomic characterization of Nocardia seriolae strains isolated from diseased fish.

Members of the genus Nocardia are widespread in diverse environments; a wide range of Nocardia species are known to cause nocardiosis in several animals, including cats, dogs, fish, and humans. Of the pathogenic Nocardia species, N. seriolae is known to cause disease in cultured fish, resulting in major economic loss. We isolated two N. seriolae strains, CK-14008 and EM15050, from diseased fish and sequenced their genomes using the PacBio sequencing platform. To identify their genomic features, we compared their genomes with those of other Nocardia species. Phylogenetic analysis showed that N. seriolae shares a common ancestor with a putative human pathogenic Nocardia species. Moreover, N. seriolae strains were phylogenetically divided into four clusters according to host fish families. Through genome comparison, we observed that the putative pathogenic Nocardia strains had additional genes for iron acquisition. Dozens of antibiotic resistance genes were detected in the genomes of N. seriolae strains; most of the antibiotics were involved in the inhibition of the biosynthesis of proteins or cell walls. Our results demonstrated the virulence features and antibiotic resistance of fish pathogenic N. seriolae strains at the genomic level. These results may be useful to develop strategies for the prevention of fish nocardiosis. © 2018 The Authors. MicrobiologyOpen published by John Wiley & Sons Ltd.


April 21, 2020

Complete Genome Sequence of “Candidatus Thioglobus sp.” Strain NP1, an Open-Ocean Isolate from the SUP05 Clade of Marine Gammaproteobacteria

“Candidatus Thioglobus sp.” strain NP1 is an open-ocean isolate from the SUP05 clade of Gammaproteobacteria. Whole-genome comparisons of strain NP1 to other sequenced isolates from the SUP05 clade indicate that it represents a new species of SUP05 that lacks the ability to fix inorganic carbon using the Calvin-Benson-Bassham cycle.


April 21, 2020

Characterization and Complete Genome Analysis of the Carbazomycin B-Producing Strain Streptomyces luteoverticillatus SZJ61.

Members of marine Actinobacteria have been highly regarded as potentially important sources of antimicrobial compounds. Here, we isolated a strain of Actinobacteria, SZJ61, and showed that it inhibits the in vitro growth of fungi pathogenic to plants. This new isolate was identified as Streptomyces luteoverticillatus by morphological, biochemical and genetic analyses. Antifungal compounds were isolated from S. luteoverticillatus strain SZJ61 and characterized as carbazomycin B by nuclear magnetic resonance spectra. We then sequenced the genome of the S. luteoverticillatus SZJ61 strain, which consists of only one 7,367,863 bp linear chromosome that has a G+C content of 72.05%. Thirty-five putative biosynthetic gene clusters for secondary metabolites, including a variety of bioactive products, were found. Mining of the genome sequence information revealed the putative biosynthetic gene cluster of carbazomycin B. This genomic information is valuable for interpreting the biosynthetic mechanisms of diverse bioactive compounds that have potential applications in the pharmaceutical industry.

