Rapid Genome Assembly: Salmonella Outbreak Strain Sequenced and Closed in Less Than One Week
Wednesday, April 10, 2013
A newly reported Salmonella genome showcases the utility of single molecule, real-time (SMRT®) sequencing for characterizing a foodborne outbreak pathogen.
The outbreak strain, Salmonella enterica subsp. enterica serovar Javiana (S. Javiana), represents one of the top five most common forms of Salmonella associated with fresh-cut produce. It was sequenced and analyzed late last year, and its genome was published this month in Genome Announcements, a journal from the American Society for Microbiology. The study was led by the US Food & Drug Administration’s Center for Food Safety & Applied Nutrition. Scientists from Pacific Biosciences and New England Biolabs also participated in the study.
What’s notable about this particular genome sequencing effort is its combination of fast turnaround time and comprehensive analysis. A clinical isolate from an S. Javiana outbreak in October 2012 was prepared and sequenced with eight SMRT Cells in under two days. With the new HGAP de novo genome assembly and Quiver consensus algorithms, the genome was assembled into a single contig for the chromosome, along with two additional mobile elements. The team also analyzed the methylation data generated with the sequence information. PacBio’s base modification analysis suggested that the Javiana strain has unique methylation patterns, differentiating it from other Salmonella strains. The full analysis of this Javiana strain, including methylome data, was accomplished in less than one week.
This study highlights a few important aspects of sequencing for pathogen detection and identification while outbreaks are happening. In the genome announcement, the authors note that previously there was only one Javiana strain reported in GenBank, and that strain’s genome was not fully finished. In addition, certain DNA sequence regions in the mobile elements were novel, with no matches to any previous GenBank entries. Being able to perform a complete de novo assembly, rather than relying on alignments to a reference genome, was therefore critical to fully understanding this pathogen. Generating results in a matter of days is also pivotal for ongoing outbreaks, where accurate identification of the pathogen strain and knowledge of its genome and epigenetic traits may offer clues to the source of the outbreak and how to treat affected individuals. The paper’s authors write, “We believe that the availability of whole-genome sequences and a large reference database will provide the discriminatory power needed to facilitate outbreak cluster detection and source tracking.”
The study was part of the 100K Pathogen Genome Project, a public/private consortium including FDA and PacBio that is designed to sequence and characterize 100,000 pathogens in five years. The project will include finished genomes and epigenomes of many isolates to establish a high-quality reference database.