July 7, 2019

LoRDEC: accurate and efficient long read error correction.

PacBio single molecule real-time sequencing is a third-generation sequencing technique producing long reads, with comparatively lower throughput and higher error rate. Errors include numerous indels and complicate downstream analysis like mapping or de novo assembly. A hybrid strategy that takes advantage of the high accuracy of second-generation short reads has been proposed for correcting long reads. Mapping of short reads on long reads provides sufficient coverage to eliminate up to 99% of errors, however, at the expense of prohibitive running times and considerable amounts of disk and memory space. We present LoRDEC, a hybrid error correction method that builds a succinct de Bruijn graph representing the short reads, and seeks a corrective sequence for each erroneous region in the long reads by traversing chosen paths in the graph. In comparison, LoRDEC is at least six times faster and requires at least 93% less memory or disk space than available tools, while achieving comparable accuracy. Availability and implementation: LoRDEC is written in C++, tested on Linux platforms and freely available at http://atgc.lirmm.fr/lordec. © The Author 2014. Published by Oxford University Press.
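The idea of representing short reads as a de Bruijn graph and locating erroneous regions in a long read can be illustrated with a small sketch. This is not LoRDEC's implementation (which is written in C++ and uses a succinct graph structure); it is a toy Python version that assumes a plain set of k-mers stands in for the graph. The function names and the `min_count` threshold are hypothetical, chosen only for illustration.

```python
from collections import Counter

def kmers(seq, k):
    """Yield every k-mer of seq, left to right."""
    for i in range(len(seq) - k + 1):
        yield seq[i:i + k]

def solid_kmers(short_reads, k, min_count=2):
    """Return k-mers seen at least min_count times in the short reads.

    min_count is an assumed threshold for this sketch.
    """
    counts = Counter(km for read in short_reads for km in kmers(read, k))
    return {km for km, c in counts.items() if c >= min_count}

def successors(kmer, solid):
    """De Bruijn edges: solid k-mers that overlap kmer by k-1 bases."""
    return [kmer[1:] + b for b in "ACGT" if kmer[1:] + b in solid]

def weak_regions(long_read, solid, k):
    """Return (start, end) ranges of k-mer positions absent from the graph."""
    regions, start = [], None
    for i, km in enumerate(kmers(long_read, k)):
        if km not in solid:
            if start is None:
                start = i
        elif start is not None:
            regions.append((start, i))
            start = None
    if start is not None:
        regions.append((start, len(long_read) - k + 1))
    return regions
```

A corrector in this style would then search for a path of `successors` between the solid k-mers flanking each weak region and substitute the path's sequence; that search is omitted here.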


July 7, 2019

Complete genome of the switchgrass endophyte Enterobacter cloacae P101.

The Enterobacter cloacae complex is genetically very diverse. The increasing number of complete genomic sequences of E. cloacae is helping to determine the exact relationship among members of the complex. E. cloacae P101 is an endophyte of switchgrass (Panicum virgatum) and is closely related to other E. cloacae strains isolated from plants. The P101 genome consists of a 5,369,929 bp chromosome. The chromosome has 5,164 protein-coding regions, 100 tRNA sequences, and 8 rRNA operons.


July 7, 2019

SSPACE-LongRead: scaffolding bacterial draft genomes using long read sequence information.

The recent introduction of the Pacific Biosciences RS single molecule sequencing technology has opened new doors to scaffolding genome assemblies in a cost-effective manner. The long read sequence information is promised to enhance the quality of incomplete and inaccurate draft assemblies constructed from Next Generation Sequencing (NGS) data. Here we propose a novel hybrid assembly methodology that aims to scaffold pre-assembled contigs in an iterative manner using PacBio RS long read information as a backbone. On a test set comprising six bacterial draft genomes, assembled using either a single Illumina MiSeq or Roche 454 library, we show that even a 50× coverage of uncorrected PacBio RS long reads is sufficient to drastically reduce the number of contigs. Comparisons to the AHA scaffolder indicate our strategy is better capable of producing (nearly) complete bacterial genomes. The current work describes our SSPACE-LongRead software, which is designed to upgrade incomplete draft genomes using single molecule sequences. We conclude that the recent advances of the PacBio sequencing technology and chemistry, in combination with the limited computational resources required to run our program, allow genomes to be scaffolded in a fast and reliable manner.


July 7, 2019

Genome sequences of two carbapenemase-resistant Klebsiella pneumoniae ST258 isolates.

Klebsiella pneumoniae, an ESKAPE group (Enterococcus faecium, Staphylococcus aureus, Klebsiella pneumoniae, Acinetobacter baumannii, Pseudomonas aeruginosa, and Enterobacter species) pathogen, has acquired multiple antibiotic resistance genes and is becoming a serious public health threat. Here, we report the genome sequences of two representative strains of K. pneumoniae from the emerging K. pneumoniae carbapenemase (KPC) outbreak in northeast Ohio belonging to sequence type 258 (ST258) (isolates Kb140 and Kb677, which were isolated from blood and urine, respectively). Both isolates harbor a blaKPC gene, and strain Kb140 carries blaKPC-2, while Kb677 carries blaKPC-3. Copyright © 2014 Ramirez et al.


July 7, 2019

Type I restriction enzymes and their relatives.

Type I restriction enzymes (REases) are large pentameric proteins with separate restriction (R), methylation (M) and DNA sequence-recognition (S) subunits. They were the first REases to be discovered and purified, but unlike the enormously useful Type II REases, they have yet to find a place in the enzymatic toolbox of molecular biologists. Type I enzymes have been difficult to characterize, but this is changing as genome analysis reveals their genes, and methylome analysis reveals their recognition sequences. Several Type I REases have been studied in detail and what has been learned about them invites greater attention. In this article, we discuss aspects of the biochemistry, biology and regulation of Type I REases, and of the mechanisms that bacteriophages and plasmids have evolved to evade them. Type I REases have a remarkable ability to change sequence specificity by domain shuffling and rearrangements. We summarize the classic experiments and observations that led to this discovery, and we discuss how this ability depends on the modular organizations of the enzymes and of their S subunits. Finally, we describe examples of Type II restriction-modification systems that have features in common with Type I enzymes, with emphasis on the varied Type IIG enzymes.


July 7, 2019

Complete genome sequence of the sugar cane endophyte Pseudomonas aurantiaca PB-St2, a disease-suppressive bacterium with antifungal activity toward the plant pathogen Colletotrichum falcatum.

The endophytic bacterium Pseudomonas aurantiaca PB-St2 exhibits antifungal activity and represents a biocontrol agent to suppress red rot disease of sugar cane. Here, we report the completely sequenced 6.6-Mb genome of P. aurantiaca PB-St2. The sequence contains a repertoire of biosynthetic genes for secondary metabolites that putatively contribute to its antagonistic activity and its plant-microbe interactions.


July 7, 2019

Sequence alignment tools: one parallel pattern to rule them all?

In this paper, we advocate high-level programming methodology for next generation sequencers (NGS) alignment tools for both productivity and absolute performance. We analyse the problem of parallel alignment and review the parallelisation strategies of the most popular alignment tools, which can all be abstracted to a single parallel paradigm. We compare these tools to their porting onto the FastFlow pattern-based programming framework, which provides programmers with high-level parallel patterns. By using a high-level approach, programmers are liberated from all complex aspects of parallel programming, such as synchronisation protocols, and task scheduling, gaining more possibility for seamless performance tuning. In this work, we show some use cases in which, by using a high-level approach for parallelising NGS tools, it is possible to obtain comparable or even better absolute performance for all used datasets.
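The single parallel paradigm referred to above can be pictured as a task farm: independent reads are handed to a pool of workers, and results are collected in order. The sketch below is an assumption-laden illustration in Python using the standard `concurrent.futures` module, not FastFlow (a C++ framework) and not any real aligner; the `align` function is a hypothetical stand-in that performs exact substring search.

```python
from concurrent.futures import ThreadPoolExecutor

def align(read, reference):
    """Toy 'aligner': position of the first exact match, or -1 if none."""
    return read, reference.find(read)

def farm_align(reads, reference, workers=4):
    """Farm pattern: distribute reads over a worker pool, gather in order.

    The pool hides scheduling and synchronisation from the caller.
    """
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(lambda r: align(r, reference), reads))

if __name__ == "__main__":
    for read, pos in farm_align(["GTT", "TGC", "AAA"], "ACGTTGCA"):
        print(read, pos)
```

Because each read is aligned independently, the worker count can be tuned without touching the alignment logic, which is the kind of separation a pattern-based approach aims for.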


July 7, 2019

First complete genome sequence of Salmonella enterica subsp. enterica serovar Typhimurium strain ATCC 13311 (NCTC 74), a reference strain of multidrug resistance, as achieved by use of PacBio Single-Molecule Real-Time technology.

We report the first complete genomic sequence of Salmonella enterica subsp. enterica serovar Typhimurium strain ATCC 13311, the leading food-borne pathogen and a reference strain used in drug resistance studies. De novo assembly with PacBio sequencing completed its chromosome and one plasmid. They will accelerate the investigation into multidrug resistance in Salmonella Typhimurium. Copyright © 2014 Terabayashi et al.


July 7, 2019

Thirty-thousand-year-old distant relative of giant icosahedral DNA viruses with a pandoravirus morphology.

The largest known DNA viruses infect Acanthamoeba and belong to two markedly different families. The Megaviridae exhibit pseudo-icosahedral virions up to 0.7 µm in diameter and adenine-thymine (AT)-rich genomes of up to 1.25 Mb encoding a thousand proteins. Like their Mimivirus prototype discovered 10 y ago, they entirely replicate within cytoplasmic virion factories. In contrast, the recently discovered Pandoraviruses exhibit larger amphora-shaped virions 1 µm in length and guanine-cytosine-rich genomes up to 2.8 Mb long encoding up to 2,500 proteins. Their replication involves the host nucleus. Whereas the Megaviridae share some general features with the previously described icosahedral large DNA viruses, the Pandoraviruses appear unrelated to them. Here we report the discovery of a third type of giant virus combining an even larger pandoravirus-like particle 1.5 µm in length with a surprisingly smaller 600 kb AT-rich genome, a gene content more similar to Iridoviruses and Marseillevirus, and a fully cytoplasmic replication reminiscent of the Megaviridae. This suggests that pandoravirus-like particles may be associated with a variety of virus families more diverse than previously envisioned. This giant virus, named Pithovirus sibericum, was isolated from a >30,000-y-old radiocarbon-dated sample when we initiated a survey of the virome of Siberian permafrost. The revival of such an ancestral amoeba-infecting virus used as a safe indicator of the possible presence of pathogenic DNA viruses, suggests that the thawing of permafrost either from global warming or industrial exploitation of circumpolar regions might not be exempt from future threats to human or animal health.


July 7, 2019

Enterobacter asburiae strain L1: complete genome and whole genome optical mapping analysis of a quorum sensing bacterium.

Enterobacter asburiae L1 is a quorum sensing bacterium isolated from lettuce leaves. In this study, for the first time, the complete genome of E. asburiae L1 was sequenced using the single molecule real time sequencer (PacBio RSII) and the whole genome sequence was verified by using optical genome mapping (OpGen) technology. In our previous study, E. asburiae L1 has been reported to produce AHLs, suggesting the possibility of virulence factor regulation which is quorum sensing dependent. This evoked our interest to study the genome of this bacterium and here we present the complete genome of E. asburiae L1, which carries the virulence factor gene virK, the N-acyl homoserine lactone-based QS transcriptional regulator gene luxR and the N-acyl homoserine lactone synthase gene which we firstly named easI. The availability of the whole genome sequence of E. asburiae L1 will pave the way for the study of the QS-mediated gene expression in this bacterium. Hence, the importance and functions of these signaling molecules can be further studied in the hope of elucidating the mechanisms of QS-regulation in E. asburiae. To the best of our knowledge, this is the first documentation of both a complete genome sequence and the establishment of the molecular basis of QS properties of E. asburiae.


July 7, 2019

Surveillance of carbapenem-resistant Klebsiella pneumoniae: tracking molecular epidemiology and outcomes through a regional network.

Carbapenem resistance in Gram-negative bacteria is on the rise in the United States. A regional network was established to study microbiological and genetic determinants of clinical outcomes in hospitalized patients with carbapenem-resistant (CR) Klebsiella pneumoniae in a prospective, multicenter, observational study. To this end, predefined clinical characteristics and outcomes were recorded and K. pneumoniae isolates were analyzed for strain typing and resistance mechanism determination. In a 14-month period, 251 patients were included. While most of the patients were admitted from long-term care settings, 28% of them were admitted from home. Hospitalizations were prolonged and complicated. Nonsusceptibility to colistin and tigecycline occurred in isolates from 7 and 45% of the patients, respectively. Most of the CR K. pneumoniae isolates belonged to repetitive extragenic palindromic PCR (rep-PCR) types A and B (both sequence type 258) and carried either blaKPC-2 (48%) or blaKPC-3 (51%). One isolate tested positive for blaNDM-1, a sentinel discovery in this region. Important differences between strain types were noted; rep-PCR type B strains were associated with blaKPC-3 (odds ratio [OR], 294; 95% confidence interval [CI], 58 to 2,552; P < 0.001), gentamicin nonsusceptibility (OR, 24; 95% CI, 8.39 to 79.38; P < 0.001), amikacin susceptibility (OR, 11.0; 95% CI, 3.21 to 42.42; P < 0.001), tigecycline nonsusceptibility (OR, 5.34; 95% CI, 1.30 to 36.41; P = 0.018), a shorter length of stay (OR, 0.98; 95% CI, 0.95 to 1.00; P = 0.043), and admission from a skilled-nursing facility (OR, 3.09; 95% CI, 1.26 to 8.08; P = 0.013). Our analysis shows that (i) CR K. pneumoniae is seen primarily in the elderly long-term care population and that (ii) regional monitoring of CR K. pneumoniae reveals insights into molecular characteristics. This work highlights the crucial role of ongoing surveillance of carbapenem resistance determinants. 
Copyright © 2014, American Society for Microbiology. All Rights Reserved.
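For readers unfamiliar with the odds ratios and confidence intervals quoted above, the following sketch shows how such figures are commonly derived from a 2×2 table. It assumes a Wald (log-normal) interval; the abstract does not state which method the authors used, and the counts in the usage example are hypothetical, not taken from the study.

```python
import math

def odds_ratio(a, b, c, d, z=1.96):
    """Odds ratio and approximate 95% Wald CI for a 2x2 table.

    a: exposed with outcome      b: exposed without outcome
    c: unexposed with outcome    d: unexposed without outcome
    All four cells must be non-zero for this approximation.
    """
    estimate = (a * d) / (b * c)
    se_log = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(estimate) - z * se_log)
    upper = math.exp(math.log(estimate) + z * se_log)
    return estimate, lower, upper

if __name__ == "__main__":
    # Hypothetical counts for illustration only.
    print(odds_ratio(20, 10, 5, 25))
```

With small or zero cell counts, exact methods are usually preferred over this approximation.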


July 7, 2019

Detecting authorized and unauthorized genetically modified organisms containing vip3A by real-time PCR and next-generation sequencing.

The growing number of biotech crops with novel genetic elements increasingly complicates the detection of genetically modified organisms (GMOs) in food and feed samples using conventional screening methods. Unauthorized GMOs (UGMOs) in food and feed are currently identified through combining GMO element screening with sequencing the DNA flanking these elements. In this study, a specific and sensitive qPCR assay was developed for vip3A element detection based on the vip3Aa20 coding sequences of the recently marketed MIR162 maize and COT102 cotton. Furthermore, SiteFinding-PCR in combination with Sanger, Illumina or Pacific BioSciences (PacBio) sequencing was performed targeting the flanking DNA of the vip3Aa20 element in MIR162. De novo assembly and Basic Local Alignment Search Tool searches were used to mimic UGMO identification. PacBio data resulted in relatively long contigs in the upstream (1,326 nucleotides (nt); 95 % identity) and downstream (1,135 nt; 92 % identity) regions, whereas Illumina data resulted in two smaller contigs of 858 and 1,038 nt with higher sequence identity (>99 % identity). Both approaches outperformed Sanger sequencing, underlining the potential for next-generation sequencing in UGMO identification.

