July 7, 2019

Genome sequences of Shewanella baltica and Shewanella morhuae strains isolated from the gastrointestinal tract of freshwater fish.

We present here the genome sequences of Shewanella baltica strain CW2 and Shewanella morhuae strain CW7, isolated from the gastrointestinal tract of Salvelinus namaycush (lean lake trout) and Coregonus clupeaformis (whitefish), respectively. These genome sequences provide insights into the niche adaptation of these specific species in freshwater systems. Copyright © 2018 Castillo et al.


July 7, 2019

ARKS: chromosome-scale scaffolding of human genome drafts with linked read kmers.

The long-range sequencing information captured by linked reads, such as those available from 10× Genomics (10xG), helps resolve genome sequence repeats, and yields accurate and contiguous draft genome assemblies. We introduce ARKS, an alignment-free linked read genome scaffolding methodology that uses linked reads to organize genome assemblies further into contiguous drafts. Our approach departs from other read alignment-dependent linked read scaffolders, including our own (ARCS), and uses a kmer-based mapping approach. The kmer mapping strategy has several advantages over read alignment methods, including better usability and faster processing, as it precludes the need for input sequence formatting and draft sequence assembly indexing. The reliance on kmers instead of read alignments for pairing sequences relaxes the workflow requirements, and drastically reduces the run time. Here, we show how linked reads, when used in conjunction with Hi-C data for scaffolding, improve a draft human genome assembly of PacBio long-read data five-fold (baseline vs. ARKS NG50 = 4.6 vs. 23.1 Mbp, respectively). We also demonstrate how the method provides further improvements of a megabase-scale Supernova human genome assembly (NG50 = 14.74 Mbp vs. 25.94 Mbp before and after ARKS), which itself exclusively uses linked read data for assembly, with an execution speed six to nine times faster than competitive linked read scaffolders (~10.5 h compared to 75.7 h, on average). Following ARKS scaffolding of a human genome 10xG Supernova assembly (of cell line NA12878), fewer than 9 scaffolds cover each chromosome, except the largest (chromosome 1, n = 13). ARKS uses a kmer mapping strategy instead of linked read alignments to record and associate the barcode information needed to order and orient draft assembly sequences. The simplified workflow, when compared to that of our initial implementation, ARCS, markedly improves run time performances on experimental human genome datasets.
Furthermore, the novel distance estimator in ARKS utilizes barcoding information from linked reads to estimate gap sizes. It accomplishes this by modeling the relationship between known distances of a region within contigs and calculating associated Jaccard indices. ARKS has the potential to provide correct, chromosome-scale genome assemblies, promptly. We expect ARKS to have broad utility in helping refine draft genomes.
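
As a rough illustration of the Jaccard index mentioned above, the following sketch computes the index between the barcode sets observed at two contig ends, then looks up a distance from within-contig calibration pairs. The barcode names, the calibration values, and the nearest-match lookup in `estimate_gap` are all hypothetical; this is not the ARKS model, only a minimal example of relating barcode sharing to distance.

```python
def jaccard(barcodes_a, barcodes_b):
    """Jaccard index: size of the intersection divided by size of the union."""
    a, b = set(barcodes_a), set(barcodes_b)
    union = a | b
    if not union:
        return 0.0  # two empty sets share nothing; avoid dividing by zero
    return len(a & b) / len(union)


def estimate_gap(calibration, observed_index):
    """Return the known distance whose Jaccard index is closest to the observed one.

    `calibration` is a list of (distance_bp, jaccard_index) pairs measured
    between regions inside contigs, where the true distance is known.
    The nearest-match rule is an assumption made for this sketch.
    """
    distance, _ = min(calibration, key=lambda pair: abs(pair[1] - observed_index))
    return distance


# Hypothetical barcodes seen at the ends of two contigs.
end_a = {"BX1", "BX2", "BX3"}
end_b = {"BX2", "BX3", "BX4"}
shared = jaccard(end_a, end_b)

# Hypothetical calibration pairs: barcode sharing drops as distance grows.
calibration = [(1000, 0.6), (5000, 0.4), (20000, 0.1)]
gap = estimate_gap(calibration, 0.35)
```

A real estimator would more plausibly fit a curve to many calibration pairs than pick the single closest one; the lookup here only shows the direction of the inference.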


July 7, 2019

The challenge of analyzing the sugarcane genome.

Reference genome sequences have become key platforms for genetics and breeding of the major crop species. Sugarcane is probably the largest crop produced in the world (in weight of crop harvested) but lacks a reference genome sequence. Sugarcane has one of the most complex genomes in crop plants due to the extreme level of polyploidy. The genome of modern sugarcane hybrids includes sub-genomes from two progenitors, Saccharum officinarum and S. spontaneum, with some chromosomes resulting from recombination between these sub-genomes. Advancing DNA sequencing technologies and strategies for genome assembly are making the sugarcane genome more tractable. Advances in long read sequencing have allowed the generation of a more complete set of sugarcane gene transcripts, which is supporting transcript profiling in genetic research. The progenitor genomes are being sequenced. A monoploid coverage of the hybrid genome has been obtained by sequencing BAC clones that cover the gene space of the closely related sorghum genome. The complete polyploid genome is now being sequenced and assembled. The emerging genome will allow comparison of related genomes and increase understanding of the functioning of this polyploid system. Sugarcane breeding for traditional sugar and new energy and biomaterial uses will be enhanced by the availability of these genomic resources.


July 7, 2019

Analysis of resistance genes of clinical Pannonibacter phragmitetus strain 31801 by complete genome sequencing.

To clarify the resistance mechanisms of Pannonibacter phragmitetus 31801, isolated from the blood of a liver abscess patient, at the genomic level, we performed whole genomic sequencing using a PacBio RS II single-molecule real-time long-read sequencer. Bioinformatic analysis of the resulting sequence was then carried out to identify any possible resistance genes. Analyses included Basic Local Alignment Search Tool searches against the Antibiotic Resistance Genes Database, ResFinder analysis of the genome sequence, and Resistance Gene Identifier analysis within the Comprehensive Antibiotic Resistance Database. Prophages, clustered regularly interspaced short palindromic repeats (CRISPR), and other putative virulence factors were also identified using PHAST, CRISPRfinder, and the Virulence Factors Database, respectively. The circular chromosome and single plasmid of P. phragmitetus 31801 contained multiple antibiotic resistance genes, including those coding for three different types of β-lactamase [NPS β-lactamase (EC 3.5.2.6), β-lactamase class C, and a metal-dependent hydrolase of β-lactamase superfamily I]. In addition, genes coding for subunits of several multidrug-resistance efflux pumps were identified, including those targeting macrolides (adeJ, cmeB), tetracycline (acrB, adeAB), fluoroquinolones (acrF, ceoB), and aminoglycosides (acrD, amrB, ceoB, mexY, smeB). However, apart from the tripartite macrolide efflux pump macAB-tolC, the genome did not appear to contain the complete complement of subunit genes required for production of most of the major multidrug-resistance efflux pumps.


July 7, 2019

Activation of the mismatch-specific endonuclease EndoMS/NucS by the replication clamp is required for high fidelity DNA replication.

The mismatch repair (MMR) system, exemplified by the MutS/MutL proteins, is widespread in Bacteria and Eukarya. However, the molecular mechanisms by which numerous archaea and bacteria lacking the mutS/mutL genes maintain high replication fidelity and genome stability have remained elusive. EndoMS is a recently discovered hyperthermophilic mismatch-specific endonuclease encoded by nucS in Thermococcales. We deleted nucS from the actinobacterium Corynebacterium glutamicum and demonstrated a drastic increase of spontaneous transition mutations in the nucS deletion strain. The observed spectra of these mutations were consistent with the enzymatic properties of EndoMS in vitro. Robust mismatch-specific endonuclease activity was detected with the purified C. glutamicum EndoMS protein, but only in the presence of the β-clamp (DnaN). Our biochemical and genetic data suggest that the frequently occurring G/T mismatch is efficiently repaired by the bacterial EndoMS-β-clamp complex formed via a carboxy-terminal sequence motif of EndoMS proteins. Our study thus has great implications for understanding how the activity of the novel MMR system is coordinated with the replisome and provides new mechanistic insight into genetic diversity and mutational patterns in industrially and clinically (e.g. Mycobacteria) important archaeal and bacterial phyla previously thought to be devoid of the MMR system.


July 7, 2019

Complete genome sequence of oyster isolate Vibrio vulnificus Env1.

Vibrio vulnificus, a ubiquitous inhabitant of coastal marine environments, has been isolated from a variety of sources. It is an opportunistic pathogen of both marine animals and humans. Here, the genome sequence of V. vulnificus Env1, an environmental isolate resistant to predation by the ciliate Tetrahymena pyriformis, is reported. Copyright © 2018 Noorian et al.


July 7, 2019

Complete genome sequence of Klebsiella quasipneumoniae strain S05, a fouling-causing bacterium isolated from a membrane bioreactor.

We report here the complete genome sequence of Klebsiella quasipneumoniae strain S05, a bacterium capable of producing membrane fouling-causing soluble substances and capable of respiring on oxygen, nitrate, and an anodic electrode. The genomic information of strain S05 should help predict metabolic pathways associated with these unique biological properties of this bacterium. Copyright © 2018 Kitajima et al.


July 7, 2019

Complete genome sequence of Achromobacter spanius type strain DSM 23806T, a pathogen isolated from human blood.

Achromobacter spanius is a newly described, non-fermenting, Gram-negative, coccoid pathogen isolated from human blood. Whole-genome sequencing of the A. spanius type strain was performed to investigate the mechanism of pathogenesis of this strain at a genomic level. The complete genome of A. spanius type strain DSM 23806T was sequenced using single-molecule real-time (SMRT) DNA sequencing. The complete genome of DSM 23806T consists of one circular DNA chromosome of 6,425,783 bp with a G+C content of 64.26%. The entire genome contains 5804 predicted coding sequences (CDS) and 55 tRNAs. Genomic island (GI) analysis showed that this strain encodes several important pathogenesis- and resistance-related genes. These results strongly suggest that GIs provide some fitness advantages in A. spanius type strain DSM 23806T. This report provides an extensive understanding of A. spanius at a genomic level as well as an understanding of the evolution of A. spanius. Copyright © 2018 International Society for Chemotherapy of Infection and Cancer. Published by Elsevier Ltd. All rights reserved.
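
For readers unfamiliar with the G+C statistic quoted above, the following minimal sketch shows how such a percentage is commonly computed from a nucleotide string. The function name and the choice to ignore ambiguous bases such as N are assumptions made for illustration, and the short sequence is a toy input, not data from the A. spanius genome.

```python
def gc_content(sequence):
    """Percentage of G and C among the unambiguous bases (A, C, G, T)."""
    seq = sequence.upper()
    counted = sum(seq.count(base) for base in "ACGT")
    if counted == 0:
        raise ValueError("sequence contains no A, C, G or T bases")
    gc = seq.count("G") + seq.count("C")
    return 100.0 * gc / counted


# Toy sequence: 4 of the 6 unambiguous bases are G or C; the N is ignored.
toy_percent = gc_content("GGCCATN")
```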


July 7, 2019

GtTR: Bayesian estimation of absolute tandem repeat copy number using sequence capture and high throughput sequencing.

Tandem repeats comprise a significant proportion of the human genome, including coding and regulatory regions. They are highly prone to repeat number variation and nucleotide mutation due to their repetitive and unstable nature, making them a major source of genomic variation between individuals. Despite recent advances in high throughput sequencing, analysis of tandem repeats in the context of complex diseases is still hindered by technical limitations. We report a novel targeted sequencing approach, which allows simultaneous analysis of hundreds of repeats. We developed a Bayesian algorithm, GtTR, which combines information from a reference long-read dataset with a short read counting approach to genotype tandem repeats at population scale. PCR sizing analysis was used for validation. We used a PacBio long-read sequenced sample to generate a reference tandem repeat genotype dataset with on average 13% absolute deviation from PCR sizing results. Using this reference dataset, GtTR generated estimates of VNTR copy number with accuracy within 95% high posterior density (HPD) intervals of 68 and 83% for capture sequence data and 200X WGS data respectively, improving to 87 and 94% with use of a PCR reference. We show that the genotype resolution increases as a function of depth, such that the median 95% HPD interval lies within 25, 14, 12 and 8% of its midpoint copy number value for 30X, 200X WGS, 395X and 800X capture sequence data respectively. We validated nine targets by PCR sizing analysis, and genotype estimates from sequencing results correlated well with PCR results. The novel genotyping approach described here presents a new cost-effective method to explore a previously unrecognized class of repeat variation in GWAS studies of complex diseases at the population level. Further improvements in accuracy can be obtained by improving the accuracy of the reference dataset.
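
To make the idea of a read-count-based Bayesian copy number estimate with an HPD interval concrete, here is a minimal sketch. It assumes a Poisson read-count model, a flat prior over integer copy numbers, and a known expected read count per repeat copy; none of these choices is taken from the abstract, and the code is not the GtTR algorithm. The function names and input values are hypothetical.

```python
import math


def copy_number_posterior(read_count, reads_per_copy, max_copies=100):
    """Posterior over integer copy numbers 1..max_copies.

    Assumes read_count ~ Poisson(copies * reads_per_copy) and a flat prior.
    """
    copies = range(1, max_copies + 1)
    log_post = []
    for c in copies:
        lam = c * reads_per_copy
        log_post.append(read_count * math.log(lam) - lam - math.lgamma(read_count + 1))
    peak = max(log_post)  # subtract the peak before exponentiating, for stability
    weights = [math.exp(lp - peak) for lp in log_post]
    total = sum(weights)
    return {c: w / total for c, w in zip(copies, weights)}


def hpd_interval(posterior, mass=0.95):
    """Smallest set of most probable copy numbers reaching `mass`, as (low, high)."""
    ranked = sorted(posterior.items(), key=lambda item: item[1], reverse=True)
    kept, cumulative = [], 0.0
    for copies, prob in ranked:
        kept.append(copies)
        cumulative += prob
        if cumulative >= mass:
            break
    return min(kept), max(kept)


# Hypothetical input: 300 reads over the repeat, about 30 reads expected per copy.
posterior = copy_number_posterior(read_count=300, reads_per_copy=30.0)
best = max(posterior, key=posterior.get)
low, high = hpd_interval(posterior)
```

Under this toy model, a higher sequencing depth raises `reads_per_copy` and concentrates the posterior, which mirrors the abstract's observation that the HPD interval narrows as depth increases.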

