July 7, 2019

Assembly of long error-prone reads using de Bruijn graphs.

The recent breakthroughs in assembling long error-prone reads were based on the overlap-layout-consensus (OLC) approach and did not utilize the strengths of the alternative de Bruijn graph approach to genome assembly. Moreover, these studies often assume that applications of the de Bruijn graph approach are limited to short and accurate reads and that the OLC approach is the only practical paradigm for assembling long error-prone reads. We show how to generalize de Bruijn graphs for assembling long error-prone reads and describe the ABruijn assembler, which combines the de Bruijn graph and the OLC approaches and results in accurate genome reconstructions.
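For readers unfamiliar with the data structure, the following is a minimal sketch of a textbook de Bruijn graph built from exact k-mers. It is not the ABruijn algorithm, whose generalization for error-prone reads is not detailed in this abstract; the function name `build_de_bruijn` and the toy reads are assumptions made for illustration.

```python
from collections import defaultdict


def build_de_bruijn(reads, k):
    """Build a textbook de Bruijn graph from exact k-mers.

    Nodes are (k-1)-mers; every k-mer in a read adds one edge from its
    prefix (first k-1 bases) to its suffix (last k-1 bases).
    """
    graph = defaultdict(list)
    for read in reads:
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            graph[kmer[:-1]].append(kmer[1:])
    return dict(graph)


# Toy, error-free reads (hypothetical data).
graph = build_de_bruijn(["ACGTC", "CGTCA"], k=3)
for node, successors in sorted(graph.items()):
    print(node, "->", successors)
```

Repeated edges (here `CG -> GT` appears twice) record k-mer multiplicity. With long error-prone reads most exact k-mers contain errors, which is the difficulty the abstract says the generalized graph is meant to address.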


July 7, 2019

The channel catfish genome sequence provides insights into the evolution of scale formation in teleosts.

Catfish represent 12% of teleost species, or 6.3% of all vertebrate species, and are of enormous economic value. Here we report a high-quality reference genome sequence of channel catfish (Ictalurus punctatus), the major aquaculture species in the US. The reference genome sequence was validated by genetic mapping of 54,000 SNPs, and annotated with 26,661 predicted protein-coding genes. Through comparative analysis of genomes and transcriptomes of scaled and scaleless fish and scale regeneration experiments, we address the genomic basis for the most striking physical characteristic of catfish, the evolutionary loss of scales, and provide evidence that lack of secretory calcium-binding phosphoproteins accounts for the evolutionary loss of scales in catfish. The channel catfish reference genome sequence, along with two additional genome sequences and transcriptomes of scaled catfishes, provides crucial resources for evolutionary and biological studies. This work also demonstrates the power of comparative subtraction of candidate genes for traits of structural significance.


July 7, 2019

Whole genome DNA sequence analysis of Salmonella subspecies enterica serotype Tennessee obtained from related peanut butter foodborne outbreaks.

Establishing an association between possible food sources and clinical isolates requires discriminating the suspected pathogen from an environmental background, and distinguishing it from other closely-related foodborne pathogens. We applied whole genome sequencing (WGS) to Salmonella subspecies enterica serotype Tennessee (S. Tennessee) to describe genomic diversity across the serovar as well as among and within outbreak clades of strains associated with contaminated peanut butter. We analyzed 71 isolates of S. Tennessee from disparate food, environmental, and clinical sources and 2 other closely-related Salmonella serovars as outgroups (S. Kentucky and S. Cubana), which were also shotgun sequenced. A whole genome single nucleotide polymorphism (SNP) analysis was performed using a maximum likelihood approach to infer phylogenetic relationships. Several monophyletic lineages of S. Tennessee with limited SNP variability were identified that recapitulated several food contamination events. S. Tennessee clades were separated from outgroup salmonellae by more than sixteen thousand SNPs. Intra-serovar diversity of S. Tennessee was small compared to the chosen outgroups (1,153 SNPs), suggesting recent divergence of some S. Tennessee clades. Analysis of all 1,153 SNPs structuring an S. Tennessee peanut butter outbreak cluster revealed that isolates from several food, plant, and clinical sources were very closely related, as they had only a few SNP differences between them. SNP-based cluster analyses linked specific food sources to several clinical S. Tennessee strains isolated in separate contamination events. Environmental and clinical isolates had very similar whole genome sequences; no markers were found that could be used to discriminate between these sources. Finally, we identified SNPs within variable S. Tennessee genes that may be useful markers for the development of rapid surveillance and typing methods, potentially aiding in traceback efforts during future outbreaks. Using WGS can delimit contamination sources for foodborne illnesses across multiple outbreaks and reveal otherwise undetected DNA sequence differences essential to the tracing of bacterial pathogens as they emerge.
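The idea of counting SNP differences between isolates can be sketched as follows. This is a simplified illustration that assumes the sequences are already aligned and of equal length; it is not the maximum likelihood analysis used in the study, and the isolate names and sequences are made up.

```python
from itertools import combinations


def snp_distance(seq_a, seq_b):
    """Count positions where two aligned sequences carry different bases.

    Positions with anything other than A, C, G or T (gaps, Ns) are skipped.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    bases = set("ACGT")
    return sum(
        1
        for a, b in zip(seq_a, seq_b)
        if a != b and a in bases and b in bases
    )


def pairwise_snp_distances(aligned):
    """Return {(name1, name2): SNP count} for every pair of isolates."""
    return {
        (n1, n2): snp_distance(aligned[n1], aligned[n2])
        for n1, n2 in combinations(sorted(aligned), 2)
    }


# Hypothetical toy alignment, not data from the study.
isolates = {
    "clinical_1": "ACGTACGT",
    "food_1": "ACGTACGA",
    "outgroup": "TCGAACTA",
}
distances = pairwise_snp_distances(isolates)
for pair, count in distances.items():
    print(pair, count)
```

In this toy alignment the clinical and food isolates differ at a single position while the outgroup differs at several, mirroring on a small scale the pattern the abstract describes.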


July 7, 2019

Atypical Salmonella enterica serovars in murine and human infection models: Is it time to reassess our approach to the study of salmonellosis?

Nontyphoidal Salmonella species are globally disseminated pathogens and the predominant cause of gastroenteritis. The pathogenesis of salmonellosis has been extensively studied using in vivo murine models and cell lines typically challenged with Salmonella Typhimurium. Although serovars Enteritidis and Typhimurium are responsible for most human infections reported to the CDC, several other serovars also contribute to clinical cases of salmonellosis. Despite their epidemiological importance, little is known about their infection phenotypes. Here, we report the virulence characteristics and genomes of 10 atypical S. enterica serovars linked to multistate foodborne outbreaks in the United States. We show that the murine RAW 264.7 macrophage model of infection is unsuitable for inferring human-relevant differences in nontyphoidal Salmonella infections, whereas differentiated human THP-1 macrophages allowed these isolates to be further characterised in a more relevant, human context.


July 7, 2019

Characterization of the first cultured representative of Verrucomicrobia subdivision 5 indicates the proposal of a novel phylum.

The recently isolated strain L21-Fru-AB(T) represents moderately halophilic, obligately anaerobic and saccharolytic bacteria that thrive in the suboxic transition zones of hypersaline microbial mats. Phylogenetic analyses based on 16S rRNA genes, RpoB proteins and gene content indicated that strain L21-Fru-AB(T) represents a novel species and genus affiliated with a distinct phylum-level lineage originally designated Verrucomicrobia subdivision 5. A survey of environmental 16S rRNA gene sequences revealed that members of this newly recognized phylum are wide-spread and ecologically important in various anoxic environments ranging from hypersaline sediments to wastewater and the intestine of animals. Characteristic phenotypic traits of the novel strain included the formation of extracellular polymeric substances, a Gram-negative cell wall containing peptidoglycan and the absence of odd-numbered cellular fatty acids. Unusual metabolic features deduced from analysis of the genome sequence were the production of sucrose as osmoprotectant, an atypical glycolytic pathway lacking pyruvate kinase and the synthesis of isoprenoids via mevalonate. On the basis of the analyses of phenotypic, genomic and environmental data, it is proposed that strain L21-Fru-AB(T) and related bacteria are specifically adapted to the utilization of sulfated glycopolymers produced in microbial mats or biofilms.


July 7, 2019

Complete genome sequence of Vibrio alginolyticus ATCC 33787(T) isolated from seawater with three native megaplasmids.

Vibrio alginolyticus, an opportunistic pathogen, is commonly associated with vibriosis in fish and shellfish and can also cause superficial and ear infections in humans. V. alginolyticus ATCC 33787(T) was originally isolated from seawater and has been used as one of the type strains for exploring the virulence factors of marine bacteria and for developing vaccines against vibriosis. Here we sequenced and assembled the whole genome of this strain, and identified three megaplasmids and three Type VI secretion systems, thus providing useful information for the study of virulence factors and for the development of vaccines against Vibrio. Copyright © 2016. Published by Elsevier B.V.


July 7, 2019

Pseudomonas cerasi sp. nov. (non Griffin, 1911) isolated from diseased tissue of cherry.

Eight isolates of Gram-negative fluorescent bacteria (58(T), 122, 374, 791, 963, 966, 970a and 1021) were obtained from diseased tissue of cherry trees from different regions of Poland. The symptoms resembled those of bacterial canker. Based on an analysis of 16S rDNA sequences, the isolates shared the highest similarity (over 99.9%) with Pseudomonas ficuserectae JCM 2400(T) and P. congelans DSM 14939(T). Phylogenetic analysis using housekeeping genes gyrB, rpoD and rpoB revealed that they form a separate cluster and confirmed their closest relation to P. syringae NCPPB 281(T) and P. congelans LMG 21466(T). DNA-DNA hybridization between the cherry isolate 58(T) and the type strains of these two closely related species revealed relatedness values of 58.2% and 41.9%, respectively. This was further supported by Average Nucleotide Identity (ANIb) and Genome-to-Genome Distance (GGDC) between the whole genome sequences of strain LMG 28609(T) and closely related Pseudomonas species. The major cellular fatty acids are 16:0 and summed feature 3 (16:1 ω7c/15:0 iso 2OH). Phenotypic characteristics differentiated the novel isolates from other closely related species. The G+C content of the genomic DNA of strain 58(T) was 59%. Diversity among the isolates was demonstrated by PCR MP and BOX PCR, eliminating the possibility that they constitute a clonal population. Based on the evidence of this polyphasic taxonomic study, the eight strains are considered to represent a novel species of the genus Pseudomonas for which the name P. cerasi sp. nov. (non Griffin, 1911) is proposed. The type strain of this species is 58(T) (=LMG 28609(T)=CFBP 8305(T)). Copyright © 2016 Elsevier GmbH. All rights reserved.


July 7, 2019

Evaluation of an optimal epidemiologic typing scheme for Legionella pneumophila with whole genome sequence data using validation guidelines.

Sequence-based typing (SBT), analogous to multi-locus sequence typing (MLST), is the current gold-standard typing method for investigation of legionellosis outbreaks caused by Legionella pneumophila. However, as common sequence types (STs) cause many infections, some investigations remain unresolved. Here, various whole genome sequencing (WGS)-based methods were evaluated according to published guidelines, including: i) single nucleotide polymorphism (SNP)-based; ii) extended multi-locus sequence typing (MLST) using different numbers of genes; iii) gene presence/absence, and iv) kmer-based. L. pneumophila serogroup 1 isolates (n=106) from the standard “typing panel”, previously used by the European Society for Clinical Microbiology Study Group on Legionella Infections (ESGLI), were tested together with another 229 isolates. Over 98% of isolates were considered typable using the mapping- and kmer-based methods. Percentages of isolates with complete extended MLST profiles ranged from 99.1% (50-gene) to 86.8% (1455-gene) whilst only 41.5% produced a full profile with the gene presence/absence scheme. Replicates demonstrated that all methods offer 100% reproducibility. Indices of discrimination range from 0.972 (ribosomal MLST) to 0.999 (SNP-based), and all values are higher than that achieved with SBT (0.940). Epidemiological concordance is generally inversely related to discriminatory power. We propose that an extended MLST scheme with ~50 genes provides optimal epidemiological concordance whilst substantially improving the discrimination offered by SBT, and can be used as part of a hierarchical typing scheme that should maintain backwards compatibility and increase discrimination where necessary. This analysis will be useful for the ESGLI to design a scheme that has the potential to become the new gold standard typing method for L. pneumophila. Copyright © 2016 David et al.
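The abstract does not state how the index of discrimination was computed. A common choice for typing schemes is Simpson's index of diversity in the form given by Hunter and Gaston, and the sketch below assumes that formula; the function name and the example type labels are illustrative only.

```python
from collections import Counter


def discrimination_index(type_assignments):
    """Simpson's index of diversity (Hunter-Gaston form, assumed here).

    D = 1 - sum(n_j * (n_j - 1)) / (N * (N - 1)), where n_j is the number
    of isolates assigned to type j and N is the total number of isolates.
    D is the probability that two isolates drawn at random without
    replacement fall into different types.
    """
    n_total = len(type_assignments)
    if n_total < 2:
        raise ValueError("need at least two isolates")
    counts = Counter(type_assignments).values()
    same_type_pairs = sum(c * (c - 1) for c in counts)
    return 1 - same_type_pairs / (n_total * (n_total - 1))


# Hypothetical assignments: one common type and two singletons.
coarse = ["ST1", "ST1", "ST1", "ST47", "ST62"]
# A finer scheme that splits every isolate into its own type.
fine = ["a", "b", "c", "d", "e"]
print(round(discrimination_index(coarse), 3))
print(round(discrimination_index(fine), 3))
```

A scheme that lumps many isolates into one common type scores lower than one that separates them, which is why the abstract compares the WGS-based values against SBT.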


July 7, 2019

Long single-molecule reads can resolve the complexity of the influenza virus composed of rare, closely related mutant variants

As a result of a high rate of mutations and recombination events, an RNA-virus exists as a heterogeneous “swarm” of mutant variants. The long read length offered by single-molecule sequencing technologies allows each mutant variant to be sequenced in a single pass. However, the high error rate limits the ability to reconstruct a heterogeneous viral population composed of rare, related mutant variants. In this paper, we present 2SNV, a method able to tolerate the high error rate of the single-molecule protocol and reconstruct mutant variants. 2SNV uses linkage between single nucleotide variations to efficiently distinguish them from read errors. To benchmark the sensitivity of 2SNV, we performed a single-molecule sequencing experiment on a sample containing a titrated level of known viral mutant variants. Our method is able to accurately reconstruct clones with a frequency of 0.2% and distinguish clones that differed in only two nucleotides distantly located on the genome. 2SNV outperforms existing methods for full-length viral mutant reconstruction. The open source implementation of 2SNV is freely available for download at http://alan.cs.gsu.edu/NGS/?q=content/2snv.


July 7, 2019

Complete and closed genome sequences of 10 Salmonella enterica subsp. enterica serovar Anatum isolates from human and bovine sources.

Salmonella enterica is an important pathogen transmitted by numerous vectors. Genomic comparisons of Salmonella strains from disparate hosts have the potential to further our understanding of mechanisms underlying host specificities and virulence. Here, we present the closed genome and plasmid sequences of 10 Salmonella enterica subsp. enterica serovar Anatum isolates from bovine and human sources. Copyright © 2016 Nguyen et al.


July 7, 2019

The draft genome of MD-2 pineapple using hybrid error correction of long reads.

The introduction of the elite pineapple variety, MD-2, has caused a significant market shift in the pineapple industry. Better productivity, an overall increase in fruit quality and taste, resilience to chilled storage and resistance to internal browning are among the key advantages of the MD-2 as compared with its predecessor, the Smooth Cayenne. Here, we present the genome sequence of the MD-2 pineapple (Ananas comosus (L.) Merr.) by using the hybrid sequencing technology from two highly reputable platforms, i.e. the PacBio long sequencing reads and the accurate Illumina short reads. Our draft genome achieved 99.6% genome coverage with 27,017 predicted protein-coding genes while 45.21% of the genome was identified as repetitive elements. Furthermore, differential expression of ripening RNASeq library of pineapple fruits revealed ethylene-related transcripts, believed to be involved in regulating the process of non-climacteric pineapple fruit ripening. The MD-2 pineapple draft genome serves as an example of how a complex heterozygous genome is amenable to whole genome sequencing by using a hybrid technology that is both economical and accurate. The genome will make genomic applications more feasible as a medium to understand complex biological processes specific to pineapple. © The Author 2016. Published by Oxford University Press on behalf of Kazusa DNA Research Institute.


July 7, 2019

Microsatellite length scoring by Single Molecule Real Time Sequencing – Effects of sequence structure and PCR regime.

Microsatellites are DNA sequences consisting of repeated, short (1-6 bp) sequence motifs that are highly mutable by enzymatic slippage during replication. Due to their high intrinsic variability, microsatellites have important applications in population genetics, forensics, genome mapping, as well as cancer diagnostics and prognosis. The current analytical standard for microsatellites is based on length scoring by high precision electrophoresis, but due to increasing efficiency, next-generation sequencing techniques may provide a viable alternative. Here, we evaluated single molecule real time (SMRT) sequencing, implemented in the PacBio series of sequencing apparatuses, as a means of microsatellite length scoring. To this end, we carried out multiplexed SMRT sequencing of plasmid-carried artificial microsatellites of varying structure under different pre-sequencing PCR regimes. For each repeat structure, reads corresponding to the target length dominated. We found that pre-sequencing amplification had large effects on scoring accuracy and error distribution relative to controls, but that the effects of the number of amplification cycles were generally weak. In line with expectations, enzymatic slippage decreased proportionally with microsatellite repeat unit length and increased with repetition number. Finally, we determined directional mutation trends, showing that PCR and SMRT sequencing introduced consistent but opposing error patterns in contraction and expansion of the microsatellites on the repeat motif and single nucleotide level.
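As a rough illustration of what length scoring from reads involves, the sketch below counts the longest uninterrupted run of a repeat motif in each read and tallies the scores. It is a simplified assumption of how such scoring could be done, not the pipeline used in the study; the motif, flanking sequences and reads are invented.

```python
import re
from collections import Counter


def longest_repeat_run(read, motif):
    """Return the repeat count of the longest uninterrupted run of motif."""
    pattern = re.compile(f"(?:{re.escape(motif)})+")
    run_lengths = [m.end() - m.start() for m in pattern.finditer(read)]
    return max(run_lengths, default=0) // len(motif)


def score_reads(reads, motif):
    """Tally how many reads support each repeat count."""
    return Counter(longest_repeat_run(read, motif) for read in reads)


# Hypothetical reads: two carry five CA repeats, one shows a contraction.
reads = [
    "GGAT" + "CA" * 5 + "TTG",
    "GGAT" + "CA" * 5 + "TTG",
    "GGAT" + "CA" * 4 + "TTG",
]
scores = score_reads(reads, "CA")
called_length = scores.most_common(1)[0][0]
print(scores, called_length)
```

Taking the most frequent repeat count as the call reflects the abstract's observation that reads corresponding to the target length dominated; minority counts would then correspond to slippage or sequencing error.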


July 7, 2019

Whole-genome sequence of Filimonas lacunae, a bacterium of the family Chitinophagaceae characterized by marked colony growth under a high-CO2 atmosphere.

We report here the genome sequence of Filimonas lacunae, a bacterium of the family Chitinophagaceae characterized by high-CO2-dependent growth. The 7.81-Mb circular genome harbors many genes involved in carbohydrate degradation and related genetic regulation, suggesting the role of the bacterium as a carbohydrate degrader in diverse environments. Copyright © 2016 Shiratori-Takano et al.


July 7, 2019

Complete genome sequence of a Rhodococcus species isolated from the winter skate Leucoraja ocellata.

We report here a genome sequence for Rhodococcus sp. isolate UM008 isolated from the renal/interrenal tissue of the winter skate Leucoraja ocellata. Genome sequence analysis suggests that Rhodococcus bacteria may act in a novel mutualistic relationship with their elasmobranch host, serving as biocatalysts in the steroidogenic pathway of 1α-hydroxycorticosterone. Copyright © 2016 Wiens et al.

