long-read sequencing Archives

July 7, 2019

Supergene evolution triggered by the introgression of a chromosomal inversion.

Supergenes are groups of tightly linked loci whose variation is inherited as a single Mendelian locus and are a common genetic architecture for complex traits under balancing selection [1-8]. Supergene alleles are long-range haplotypes with numerous mutations underlying distinct adaptive strategies, often maintained in linkage disequilibrium through the suppression of recombination by chromosomal rearrangements [1, 5, 7-9]. However, the mechanism governing the formation of supergenes is not well understood and poses the paradox of establishing divergent functional haplotypes in the face of recombination. Here, we show that the formation of the supergene alleles encoding mimicry polymorphism in the butterfly Heliconius numata is associated with the introgression of a divergent, inverted chromosomal segment. Haplotype divergence and linkage disequilibrium indicate that supergene alleles, each allowing precise wing-pattern resemblance to distinct butterfly models, originate from over a million years of independent chromosomal evolution in separate lineages. These “superalleles” have evolved from a chromosomal inversion captured by introgression and maintained in balanced polymorphism, triggering supergene inheritance. This mode of evolution involving the introgression of a chromosomal rearrangement is likely to be a common feature of complex structural polymorphisms associated with the coexistence of distinct adaptive syndromes. This shows that the reticulation of genealogies may have a powerful influence on the evolution of genetic architectures in nature. Copyright © 2018 Elsevier Ltd. All rights reserved.

July 7, 2019

Hercules: a profile HMM-based hybrid error correction algorithm for long reads.

Choosing whether to use second or third generation sequencing platforms can lead to trade-offs between accuracy and read length. Several types of studies require long and accurate reads. In such cases researchers often combine both technologies and the erroneous long reads are corrected using the short reads. Current approaches rely on various graph or alignment based techniques and do not take the error profile of the underlying technology into account. Efficient machine learning algorithms that address these shortcomings have the potential to achieve more accurate integration of these two technologies. We propose Hercules, the first machine learning-based long read error correction algorithm. Hercules models every long read as a profile Hidden Markov Model with respect to the underlying platform’s error profile. The algorithm learns a posterior transition/emission probability distribution for each long read to correct errors in these reads. We show on two DNA-seq BAC clones (CH17-157L1 and CH17-227A2) that Hercules-corrected reads have the highest mapping rate among all competing algorithms and have the highest accuracy when the breadth of coverage is high. On a large human CHM1 cell line WGS data set, Hercules is one of the few scalable algorithms; and among those, it achieves the highest accuracy.

July 7, 2019

Smooth q-Gram, and its applications to detection of overlaps among long, error-prone sequencing reads

We propose smooth q-gram, the first variant of q-gram that captures q-gram pairs within a small edit distance. We apply smooth q-gram to the problem of detecting overlapping pairs of error-prone reads produced by single molecule real time sequencing (SMRT), which is the first and most critical step of the de novo fragment assembly of SMRT reads. We have implemented and tested our algorithm on a set of real world benchmarks. Our empirical results demonstrated the significant superiority of our algorithm over the existing q-gram based algorithms in accuracy.
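For orientation, the exact q-gram baseline that the abstract compares against can be sketched as below. This is not the smooth q-gram algorithm itself, only a minimal illustration of overlap detection by shared exact q-grams; the function names, the toy reads, and the `min_shared` threshold are all assumptions made for this example.

```python
from collections import defaultdict
from itertools import combinations


def qgrams(seq, q):
    """Return the set of all length-q substrings (q-grams) of seq."""
    return {seq[i:i + q] for i in range(len(seq) - q + 1)}


def candidate_overlaps(reads, q, min_shared):
    """Report read pairs that share at least min_shared exact q-grams.

    reads is a dict mapping a (hypothetical) read ID to its sequence.
    """
    # Inverted index: q-gram -> IDs of the reads that contain it.
    index = defaultdict(set)
    for read_id, seq in reads.items():
        for gram in qgrams(seq, q):
            index[gram].add(read_id)

    # Count, for every pair of reads, how many distinct q-grams they share.
    shared = defaultdict(int)
    for ids in index.values():
        for a, b in combinations(sorted(ids), 2):
            shared[(a, b)] += 1

    return {pair: n for pair, n in shared.items() if n >= min_shared}


# Toy reads: the suffix of r1 matches the prefix of r2; r3 is unrelated.
reads = {
    "r1": "ACGTACGTTAGC",
    "r2": "CGTTAGCAATGG",
    "r3": "TTTTTTTTTTTT",
}
overlaps = candidate_overlaps(reads, q=4, min_shared=3)
print(overlaps)
```

Because matching here is exact, a single sequencing error inside a q-gram removes it from the shared count, which is the limitation that matching q-grams within a small edit distance is meant to address.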

July 7, 2019

Gapless genome assembly of the potato and tomato early blight pathogen Alternaria solani.

The Alternaria genus consists of saprophytic fungi as well as plant-pathogenic species that have significant economic impact. To date, the genomes of multiple Alternaria species have been sequenced. These studies have yielded valuable data for molecular studies on Alternaria fungi. However, most of the current Alternaria genome assemblies are highly fragmented, thereby hampering the identification of genes that are involved in causing disease. Here, we report a gapless genome assembly of A. solani, the causal agent of early blight in tomato and potato. The genome assembly is a significant step toward a better understanding of pathogenicity of A. solani.

July 7, 2019

Complete genome sequence of a heavy metal resistant bacterium Maribacter cobaltidurans B1T, isolated from the deep-sea sediment of the South Atlantic Ocean

Many bacteria in the environment have adapted to the presence of toxic heavy metals. Here we present the complete genome sequence of a heavy metal resistant bacterium, Maribacter cobaltidurans B1T (=CGMCC 1.15508T=KCTC 52882T=MCCC 1K03318T), which was isolated from a deep-sea sediment sample collected from the South Atlantic Ocean. Strain B1T is able to resist high concentrations of Co2+ (10.0 mM) in Marine Agar 2216. The genome of strain B1T comprises 4,639,957 bp in a circular chromosome with a G+C content of 39.7 mol%. Resistance to Co2+ is mainly based on efflux systems encoded in the genome of strain B1T, including czcCBA operons, czcD genes, corC genes, etc. Compared with the closely related species M. orientalis DSM 16471T, the genome of B1T harbors twenty more copies of genes in the czcCBA operon and two copies of the czcD genes related to Co2+ efflux. The function of these genes may contribute to the high level of cobalt resistance, revealing its potential application in the biotechnology industry.

July 7, 2019

The complete genome sequence of Colwellia sp. NB097-1 reveals evidence for the potential genetic basis for its adaptation to cold environment

Colwellia sp. NB097-1, isolated from a marine sediment sample from the Bering Sea, is a psychrophilic bacterium whose optimal and maximal growth temperatures were 13 and 25°C, respectively. Here, we present the complete genome of Colwellia sp. NB097-1, which was 4,661,274bp in length with a GC content of 38.5%. The genome provided evidence for the potential genetic basis for its adaptation to a cold environment, such as producing compatible solutes and cold-shock proteins, increasing membrane fluidity and synthesizing glycogen. Some cold-adaptive proteases were also detected in the genome of Colwellia sp. NB097-1. Protease activity analysis further showed that extracellular proteases of Colwellia sp. NB097-1 remained active at low temperatures. The complete genome sequence may be helpful to reveal how this strain survives at low temperature and to find cold-adaptive proteases that may be useful to industry.

July 7, 2019

Complete genome of Halomonas aestuarii Hb3, isolated from tidal flat

Halomonas aestuarii Hb3, a moderately halophilic bacterium belonging to the class Gammaproteobacteria, was isolated from a tidal flat. Herein, we report the complete genome sequence of its strain Hb3. Its size is estimated at 3.54Mbp with a mean G+C content of 67.9%. The genome includes 3238 open reading frames, 65 transfer RNAs, and four ribosomal RNA gene operons. Genes related to the degradation of monoaromatic compounds, detoxification of arsenic, and production of polymers were identified. These features indicate that this strain may be important for ecological and industrial application.
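As a reminder of how a G+C figure such as the one above is defined, here is a minimal sketch that computes the percentage of G and C bases in a sequence. The function name and the toy sequences are assumptions for illustration only and are not taken from the Hb3 genome.

```python
def gc_content(seq):
    """Return the G+C content of a DNA sequence as a percentage.

    Only A, C, G and T are counted, so ambiguous bases such as N
    are excluded from both the numerator and the denominator.
    """
    seq = seq.upper()
    gc = seq.count("G") + seq.count("C")
    total = gc + seq.count("A") + seq.count("T")
    if total == 0:
        raise ValueError("sequence contains no A, C, G or T bases")
    return 100.0 * gc / total


# Toy sequence with 4 G/C bases out of 8.
print(gc_content("GCGCATAT"))
```

For a real assembly, the same calculation would be run over the concatenated chromosome sequence read from a FASTA file.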

July 7, 2019

Complete genome sequence of Siansivirga zeaxanthinifaciens CC-SAMT-1T, a flavobacterium isolated from coastal surface seawater

Here we present the complete genome sequence of Siansivirga zeaxanthinifaciens CC-SAMT-1T, a flavobacterium isolated from coastal surface seawater. A 3.3Mb genome revealed remarkable specialization of this bacterium particularly in the degradation of sulfated polysaccharides available as detritus or in dissolved phase. Besides utilizing high molecular weight organic biopolymers, this strain appears to accomplish assimilatory sulfate reduction, sulfide oxidation, and acquisition and inter-conversion of inorganic carbon. Genes encoding zeaxanthin and three different kinds of DNA photolyase/cryptochrome (senses blue light) were present, while genes that code for blue light sensing BLUF domain proteins and red/far-red light sensing phytochromes were absent. Furthermore, CC-SAMT-1T lacked the rhodopsin photosystem and all other genes that confer any other known forms of phototrophy. The genomic data revealed that CC-SAMT-1T is highly adapted to sulfur-rich coastal environments, where it most likely contributes to marine carbon and sulfur cycles by metabolizing sulfated polysaccharides as well as inorganic sulfur.

July 7, 2019

Complete genomes of the marine flavobacterium Nonlabens strains YIK11 and MIC269

Here, we report the complete genome sequences of two strains, which were isolated from sediment samples collected in Korea and Micronesia, and both were classified as members of Nonlabens spp. The complete genome sequence of Nonlabens sp. strain YIK11 consists of 3,260,677bp in two contigs while the one from strain MIC269 consists of 2,884,293bp in one contig, without plasmid. The genomes of YIK11 and MIC269 contain three and two genes encoding rhodopsins of different types, respectively.

July 7, 2019

Genome sequencing to develop Paenibacillus donghaensis strain JH8T (KCTC 13049T=LMG 23780T) as a microbial fertilizer and correlation to its plant growth-promoting phenotype

Paenibacillus donghaensis JH8T (KCTC 13049T=LMG 23780T) is a Gram-positive, mesophilic, endospore-forming bacterium isolated from East Sea sediment at a depth of 500 m in Korea. The strain exhibited plant cell wall hydrolytic and plant growth-promoting abilities. The complete genome of P. donghaensis strain JH8T contains 7602 protein-coding sequences and has an average GC content of 49.7% in its chromosome (8.54 Mbp). Genes encoding proteins related to the degradation of plant cell walls, nitrogen fixation, phosphate solubilization, and siderophore synthesis were present in the P. donghaensis strain JH8T genome, indicating that this strain can be used as an eco-friendly microbial agent for increasing agricultural productivity.

July 7, 2019

Complete genome sequence of Granulosicoccus antarcticus type strain IMCC3135T, a marine gammaproteobacterium with a putative dimethylsulfoniopropionate demethylase gene

Granulosicoccus, the only genus of the family Granulosicoccaceae, occupies a distinct phylogenetic position within the order Chromatiales of the Gammaproteobacteria. The genus has been found in various marine regions, especially associated with diverse marine macroalgae. No genomes have been reported for the genus Granulosicoccus thus far, hampering studies on physiology and lifestyles of this genus. Here we report the complete genome sequence of strain IMCC3135T, the type strain of Granulosicoccus antarcticus isolated from Antarctic coastal seawater. The genome was 7.78Mbp long and harbored many genes involved in sulfur metabolism. In particular, a gene for dimethylsulfoniopropionate (DMSP) demethylase was found in the genome, rendering strain IMCC3135T one of the few marine gammaproteobacteria equipped with the potential for DMSP demethylation.

July 7, 2019

Complete genome sequence of the halophile bacterium Kushneria marisflavi KCCM 80003T, isolated from seawater in Korea

We present the genome sequence of Kushneria marisflavi KCCM 80003T, isolated from the Yellow Sea in Korea. The complete genome of KCCM 80003T consisted of a single, circular chromosome of 3,667,185 bp, with an average G+C content of 59.05%, and 3287 coding sequences, 12 rRNAs, and 66 tRNAs. Kushneria marisflavi KCCM 80003T, belonging to the family Halomonadaceae, exhibited resistance to high salt concentrations and possessed potassium metabolism- or osmotic stress-related coding sequences, including potassium homeostasis, ectoine biosynthesis and regulation, choline and betaine uptake, and betaine biosynthesis features in the genome. These results provide a basis for understanding resistance strategies to osmotic stress at the genetic level and accordingly have implications for genetic engineering and biotechnology.

July 7, 2019

Complete genome sequence of Tsukamurella sp. MH1: A wide-chain length alkane-degrading actinomycete.

Tsukamurella sp. strain MH1, capable of using a wide range of n-alkanes as the only carbon source, was isolated from petroleum-contaminated soil (Pitești, Romania) and its complete genome was sequenced. The 4,922,396 bp genome contains only one circular chromosome with a G + C content of 71.12%, much higher than the type strains of this genus (68.4%). Based on 16S rRNA gene sequence similarity, strain MH1 was taxonomically identified as Tsukamurella carboxydivorans. Genome analyses revealed that strain MH1 harbors only one gene encoding an alkB-like hydroxylase, arranged in a complete alkane monooxygenase operon. This is the first complete genome of the species T. carboxydivorans, which will provide insights into the potential of Tsukamurella sp. MH1 and related strains for bioremediation of petroleum hydrocarbon-contaminated sites and into the environmental role of these bacteria. Copyright © 2017. Published by Elsevier B.V.

July 7, 2019

Complete genome sequence of the marine Rhodococcus sp. H-CA8f isolated from Comau fjord in Northern Patagonia, Chile

Rhodococcus sp. H-CA8f was isolated from marine sediments obtained from the Comau fjord, located in Northern Chilean Patagonia. Whole-genome sequencing on the PacBio RS II platform yielded one closed, complete chromosome of 6.19 Mbp with a 62.45% G + C content. The chromosome harbours several metabolic pathways providing a wide catabolic potential, among which the upper biphenyl route is described. Rhodococcus sp. H-CA8f also bears one linear mega-plasmid of 301 kbp with a 62.34% G + C content; genomic analyses demonstrated that it consists mostly of putative ORFs with unknown functions, representing a novel genetic feature. These genetic characteristics provide relevant insights regarding Chilean marine actinobacterial strains.

July 7, 2019

Synthetic biology, genome mining, and combinatorial biosynthesis of NRPS-derived antibiotics: a perspective.

Combinatorial biosynthesis of novel secondary metabolites derived from nonribosomal peptide synthetases (NRPSs) has been in slow development for about a quarter of a century. Progress has been hampered by the complexity of the giant multimodular multienzymes. More recently, advances have been made on understanding the chemical and structural biology of these complex megaenzymes, and on learning the design rules for engineering functional hybrid enzymes. In this perspective, I address what has been learned about successful engineering of complex lipopeptides related to daptomycin, and discuss how synthetic biology and microbial genome mining can converge to broaden the scope and enhance the speed and robustness of combinatorial biosynthesis of NRPS-derived natural products for drug discovery.
