Korean service provider DNA Link has established strong expertise with the PacBio sequencing platform in response to high global demand for the technology.
Obtaining microbial genomes with the highest accuracy and contiguity is extremely important when exploring the functional impact of genetic and epigenetic variants on a genome-wide scale. A comprehensive view of the bacterial genome, including genes, regulatory regions, IS elements, phage integration sites, and base modifications, is vital to understanding key traits such as antibiotic resistance, virulence, and metabolism. SMRT Sequencing provides complete genomes, often assembled into a single contig. Our streamlined microbial multiplexing procedure for the Sequel System, from library preparation to genome assembly, can be completed with less than 8 hours of bench time. Starting with high-quality genomic DNA (gDNA), samples are sheared to a size distribution of approximately 12 kb, ligated with barcoded overhang adapters, pooled at equimolar representation, and sequenced. Demultiplexing of samples is automated, allowing for immediate genome assembly on our SMRT Link analysis software solution.
Our understanding of microbiology has evolved enormously over the last 150 years. Few institutions have witnessed our collective progress more closely than the National Collection of Type Cultures (NCTC). In fact, the collection itself is a record of the many milestones microbiologists have crossed, building on the discoveries of those who came before. To date, 60% of NCTC’s historic collection has a closed, finished reference genome, thanks to PacBio Single Molecule, Real-Time (SMRT) Sequencing. We are excited to be their partner in crossing this latest milestone on their quest to improve human and animal health by understanding the microscopic world.
The UK’s National Collection of Type Cultures (NCTC) is a unique collection of more than 5,000 expertly preserved and authenticated bacterial cultures, many of historical significance. Founded in 1920, NCTC is the longest established collection of its type anywhere in the world, with a history of its own that has reflected — and contributed to — the evolution of microbiology for more than 100 years.
Genome sequencing of endosymbiotic bacterium Streptomyces sp. from Antarctic lichen using Single Molecule Real-time Sequencing (SMRT) technology.
With the advent of next-generation sequencing (NGS) techniques, it has become possible to sequence a microbial genome very quickly and with high coverage. Recently, Pacific Biosciences developed single molecule, real-time (SMRT) sequencing technology, a third-generation sequencing platform that provides much longer reads (average read length: 1.5 kb) without PCR amplification. We performed de novo sequencing of Streptomyces sp. using the Illumina GAIIx, Roche 454, and PacBio RS systems and compared the data. The endosymbiotic bacterium Streptomyces sp. PAMC 26508 was isolated from the Antarctic lichen Psoroma sp., which grows attached to rocks on Barton Peninsula, King George Island, Antarctica (62°13'S, 58°47'W). With 4 SMRT Cells, we obtained more than 15x coverage of corrected sequence data for de novo assembly. Compared with the other sequencing platforms, the PacBio platform generated data for this high-GC organism in a manner similar to that for organisms with typical mid-level GC content. In conclusion, the PacBio RS system with SMRT technology shows better performance with high-GC-content organisms and is expected to be a new tool for improving de novo sequencing and assembly.
Genome sequencing of microbial genomes using Single Molecule Real-time sequencing (SMRT) technology.
In the last year, high-throughput sequencing technologies have progressed from proof-of-concept to production quality. Although each technology is able to produce vast quantities of sequence information, in every case the underlying chemistry limits reads to very short lengths. We present a comparison of de novo assemblies of bacterial genomes of varying size (from 3.1 Mb to 7.6 Mb) and G+C content (from 43% to 71%). We analyzed Solexa, 454, and PacBio RS reads from Streptomyces sp. (genome size, 7.6 Mb; G+C content, 71%), Psychrobacter sp. (genome size, 3.5 Mb; G+C content, 43%), Salinibacterium sp. (genome size, 3.1 Mb; G+C content, 61%), and Frigoribacterium sp. (genome size, 3.3 Mb; G+C content, 63%). We assembled each bacterial genome using Celera Assembler 7.0 with and without PacBio RS reads. We found that the assemblies that included PacBio RS reads had fewer contigs and scaffolds and better N50 values.
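For readers unfamiliar with the N50 metric used to compare these assemblies, the following is a minimal sketch of the usual definition: the length of the contig at which the running total, taken from longest to shortest, first reaches half of the total assembly length. The function name and the contig lengths are hypothetical illustrations and are not data from the study.

```python
def n50(lengths):
    """Return the N50 of a collection of contig (or scaffold) lengths.

    Contigs are visited from longest to shortest; the N50 is the length of
    the contig at which the running total first reaches half of the summed
    assembly length.
    """
    if not lengths:
        raise ValueError("need at least one contig length")
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length


# Hypothetical contig lengths (bp) for two assemblies of the same 3.0 Mb genome.
fragmented = [900_000, 700_000, 500_000, 400_000, 300_000, 200_000]
improved = [2_600_000, 400_000]

if __name__ == "__main__":
    print("fragmented assembly N50:", n50(fragmented))
    print("improved assembly N50:", n50(improved))
```

Fewer, longer contigs push the N50 up, which is why a lower contig count and a higher N50 tend to be reported together.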
Understanding the genetic basis of infectious diseases is critical to enacting effective treatments, and several large-scale sequencing initiatives are underway to collect this information. Sequencing bacterial samples is typically performed by mapping sequence reads against genomes of known reference strains. While such resequencing informs on the spectrum of single-nucleotide differences relative to the chosen reference, it can miss numerous other forms of variation known to influence pathogenicity: structural variations (duplications, inversions), acquisition of mobile elements (phages, plasmids), homonucleotide length variation causing phase variation, and epigenetic marks (methylation, phosphorothioation) that influence gene expression to switch bacteria from non-pathogenic to pathogenic states. Therefore, sequencing methods which provide complete, de novo genome assemblies and epigenomes are necessary to fully characterize infectious disease agents in an unbiased, hypothesis-free manner. Hybrid assembly methods have been described that combine long sequence reads from SMRT DNA Sequencing with short reads (SMRT circular consensus sequencing (CCS) or second-generation reads), wherein the short reads are used to error-correct the long reads, which are then used for assembly. We have developed a new paradigm for microbial de novo assemblies in which SMRT Sequencing reads from a single long insert library are used exclusively to close the genome through a hierarchical genome assembly process, thereby obviating the need for a second sample preparation, sequencing run, and data set. We have applied this method to achieve closed de novo genomes with accuracies exceeding QV50 (>99.999%) for numerous disease outbreak samples, including E. coli, Salmonella, Campylobacter, Listeria, Neisseria, and H. pylori. The kinetic information from the same SMRT Sequencing reads is utilized to determine epigenomes.
Approximately 70% of all methyltransferase specificities we have determined to date represent previously unknown bacterial epigenetic signatures. With relatively short sequencing run times and automated analysis pipelines, it is possible to go from an unknown DNA sample to its complete de novo genome and epigenome in about a day.
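The correspondence between QV50 and 99.999% accuracy quoted above follows from the standard Phred scaling, QV = -10 log10(error probability). The sketch below works through that arithmetic; the function names and the 5 Mb genome size are illustrative assumptions and do not come from the abstract.

```python
import math


def qv_to_accuracy(qv: float) -> float:
    """Convert a Phred-scaled quality value to per-base accuracy."""
    return 1.0 - 10.0 ** (-qv / 10.0)


def accuracy_to_qv(accuracy: float) -> float:
    """Convert per-base accuracy (0 <= accuracy < 1) to a Phred-scaled QV."""
    if not 0.0 <= accuracy < 1.0:
        raise ValueError("accuracy must be in [0, 1)")
    return -10.0 * math.log10(1.0 - accuracy)


def expected_errors(genome_size_bp: int, qv: float) -> float:
    """Expected number of erroneous bases in a consensus of the given size."""
    return genome_size_bp * 10.0 ** (-qv / 10.0)


if __name__ == "__main__":
    print(f"QV50 accuracy: {qv_to_accuracy(50):.5%}")
    # Hypothetical 5 Mb bacterial genome, chosen only for illustration.
    print(f"expected errors at QV50: {expected_errors(5_000_000, 50):.1f}")
```

Each additional 10 QV units corresponds to a tenfold drop in the expected error rate, so QV50 is one error per 100,000 bases on average.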
Understanding the genetic basis of infectious diseases is critical to enacting effective treatments, and several large-scale sequencing initiatives are underway to collect this information. Sequencing bacterial samples is typically performed by mapping sequence reads against genomes of known reference strains. While such resequencing informs on the spectrum of single-nucleotide differences relative to the chosen reference, it can miss numerous other forms of variation known to influence pathogenicity: structural variations (duplications, inversions), acquisition of mobile elements (phages, plasmids), homonucleotide length variation causing phase variation, and epigenetic marks (methylation, phosphorothioation) that influence gene expression to switch bacteria from non-pathogenic to pathogenic states. Therefore, sequencing methods which provide complete, de novo genome assemblies and epigenomes are necessary to fully characterize infectious disease agents in an unbiased, hypothesis-free manner. Hybrid assembly methods have been described that combine long sequence reads from SMRT DNA sequencing with short, high-accuracy reads (SMRT circular consensus sequencing (CCS) or second-generation reads) to generate long, highly accurate reads that are then used for assembly. We have developed a new paradigm for microbial de novo assemblies in which long SMRT sequencing reads (average read lengths >5,000 bases) are used exclusively to close the genome through a hierarchical genome assembly process, thereby obviating the need for a second sample preparation, sequencing run, and data set. We have applied this method to achieve closed de novo genomes with accuracies exceeding QV50 (>99.999%) for numerous disease outbreak samples, including E. coli, Salmonella, Campylobacter, Listeria, Neisseria, and H. pylori. The kinetic information from the same SMRT sequencing reads is utilized to determine epigenomes.
Approximately 70% of all methyltransferase specificities we have determined to date represent previously unknown bacterial epigenetic signatures. The process has been automated and requires less than 1 day from an unknown DNA sample to its complete de novo genome and epigenome.
PacBio RS II sequencing chemistries provide read lengths beyond 20 kb with high consensus accuracy. The long read lengths of P4-C2 chemistry and demonstrated consensus accuracy of 99.999% are ideal for applications such as de novo assembly, targeted sequencing and isoform sequencing. The recently launched P5-C3 chemistry generates even longer reads with N50 often >10,000 bp, making it the best choice for scaffolding and spanning structural rearrangements. With these chemistry advances, PacBio’s read length performance is now primarily determined by the SMRTbell library itself. Size selection of a high-quality, sheared 20 kb library using the BluePippin™ System has been demonstrated to increase the N50 read length by as much as 5 kb with C3 chemistry. BluePippin size selection or a more stringent AMPure® PB selection cutoff can be used to recover long fragments from degraded genomic material. The selection of chemistries, P4-C2 versus P5-C3, is highly dependent on the final size distribution of the SMRTbell library and experimental goals. PacBio’s long read lengths also allow for the sequencing of full-length cDNA libraries at single-molecule resolution. However, longer transcripts are difficult to detect due to lower abundance, amplification bias, and preferential loading of smaller SMRTbell constructs. Without size selection, most sequenced transcripts are 1-1.5 kb. Size selection dramatically increases the number of transcripts >1.5 kb, and is essential for >3 kb transcripts.
Using whole exome sequencing and bacterial pathogen sequencing to investigate the genetic basis of pulmonary non-tuberculous mycobacterial infections.
Pulmonary non-tuberculous mycobacterial (PNTM) infections occur in patients with chronic lung disease, but also in a distinct group of elderly women without lung defects who share a common body morphology: tall and lean with scoliosis, pectus excavatum, and mitral valve prolapse. In order to characterize the human host susceptibility to PNTM, we performed whole exome sequencing (WES) of 44 individuals in extended families of patients with active PNTM as well as 55 additional unrelated individuals with PNTM. This unique collection of familial cohorts in PNTM represents an important opportunity for a high yield search for genes that regulate mucosal immunity. An average of 58 million 100bp paired-end Illumina reads per exome were generated and mapped to the hg19 reference genome. Following variant detection and classification, we identified 58,422 potentially high-impact SNPs, 97.3% of which were missense mutations. Segregating variants using the family pedigrees as well as comparisons to the unrelated individuals identified multiple potential variants associated with PNTM. Validations of these candidate variants in a larger PNTM cohort are underway. In addition to WES, we sequenced the genomes of 52 mycobacterial isolates, including 9 from these PNTM patients, to integrate host PNTM susceptibility with mycobacterial genotypes and gain insights into the key factors involved in this devastating disease. These genomes were sequenced using a combination of 454, Illumina, and PacBio platforms and assembled using multiple genome assemblers. The resulting genome sequences were used to identify mycobacterial genotypes associated with virulence, invasion, and drug resistance.
The newer hierarchical genome assembly process (HGAP) performs de novo assembly using data from a single PacBio long insert library. To assess the benefits of this method, DNA from several Salmonella enterica serovars was isolated from a pure culture. Genome sequencing was performed using Pacific Biosciences RS sequencing technology. The HGAP process enabled us to close sixteen Salmonella enterica subsp. enterica genomes and their associated mobile elements. The ten serotypes are Salmonella enterica subsp. enterica serovar Enteritidis (S. Enteritidis), S. Bareilly, S. Heidelberg, S. Cubana, S. Javiana, S. Typhimurium, S. Newport, S. Montevideo, S. Agona, and S. Tennessee. In addition, we were able to detect novel methyltransferases (MTases) by using the Pacific Biosciences kinetic score distributions, which show that each serovar appears to have a novel methylation pattern. For example, while all Salmonella serovars examined so far have methylase activity specific for 5'-GATC-3'/3'-CTAG-5' and 5'-CAGAG-3'/3'-GTCTC-5' (underlined base indicates a modification), S. Heidelberg has methylase activity uniquely specific for 5'-ACCANCC-3'/3'-TGGTNGG-5' sites, while S. Typhimurium has methylase activity uniquely specific for 5'-GATCAG-3'/3'-CTAGTC-5' sites, among the samples examined so far. We believe that this may be due to the unique environments and phages that these serotypes have been exposed to. Furthermore, our analysis identified and closed a variety of plasmids such as mobilization plasmids, antimicrobial resistance plasmids, and IncX plasmids carrying a Type IV secretion system (T4SS). The VirB/D4 T4SS apparatus is important in that it assists with rapid dissemination of antibiotic resistance and virulence determinants. Presently, only limited information exists regarding the genotypic characterization of drug resistance in S. Heidelberg isolates derived from various host species. Here, we characterize two S. Heidelberg outbreak isolates from two different outbreaks.
Both isolates contain the IncX plasmid of approximately 35 kb and carry the genes virB1, virB2, virB3/4, virB5, virB6, virB7, virB8, virB9, virB10, virB11, virD2, and virD4, which are associated with the T4SS. In addition, the outbreak isolate associated with ground turkey carries a 4,473 bp mobilization plasmid and an incompatibility group (Inc) I1 antimicrobial resistance plasmid encoding resistance to gentamicin (aacC2), beta-lactam (bl2b_tem), streptomycin (aadAI), and tetracycline (tetA, tetR), while the outbreak isolate associated with chicken breast carries the IncI1 plasmid encoding resistance to gentamicin (aacC2), streptomycin (aadAI), and sulfisoxazole (sul1). Using this new technology, we explored the genetic elements present in resistant pathogens, which will help achieve a better understanding of the evolution of Salmonella.
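The double-stranded motif notation used in this abstract (for example 5'-ACCANCC-3'/3'-TGGTNGG-5') pairs each motif with its base-by-base complement, where N stands for any base. The sketch below shows how such a pair can be derived and how candidate motif sites might be located in a sequence by plain string matching. It does not reproduce the kinetic-score analysis the authors used to detect methylation; the function names and the test sequence are hypothetical.

```python
import re

# Complement table for the four bases plus the wildcard N (assumed alphabet).
_COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def complement(seq: str) -> str:
    """Base-by-base complement, i.e. the opposite strand written 3'->5'."""
    return seq.upper().translate(_COMPLEMENT)


def reverse_complement(seq: str) -> str:
    """Opposite strand written in the conventional 5'->3' direction."""
    return complement(seq)[::-1]


def find_sites(sequence: str, motif: str) -> list:
    """Return 0-based start positions of a motif, treating N as any base.

    A lookahead is used so that overlapping occurrences are all reported.
    """
    pattern = "".join(
        "[ACGT]" if base == "N" else re.escape(base) for base in motif.upper()
    )
    return [m.start() for m in re.finditer(f"(?=({pattern}))", sequence.upper())]


if __name__ == "__main__":
    for motif in ("GATC", "CAGAG", "ACCANCC", "GATCAG"):
        print(f"5'-{motif}-3' / 3'-{complement(motif)}-5'")
    # Hypothetical sequence containing one ACCANCC site and one GATCAG site.
    toy = "TTACCAGCCAAGATCAGTT"
    print(find_sites(toy, "ACCANCC"), find_sites(toy, "GATCAG"))
```

To search the opposite strand as well, the same function can be called with `reverse_complement(motif)`.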
SFAF 2014 Presentation Slides: James Gurtowski of Cold Spring Harbor Laboratory (CSHL) shared assembly results for a variety of eukaryotic genomes, including yeast, arabidopsis, and rice.
Lameness is a significant problem resulting in millions of dollars in lost revenue annually. In commercial broilers, the most common cause of lameness is bacterial chondronecrosis with osteomyelitis (BCO). We are using a wire flooring model to induce lameness attributable to BCO. We used 16S ribosomal DNA sequencing to determine that Staphylococcus spp. were the main species associated with BCO. Staphylococcus agnetis, which previously had not been isolated from poultry, was the principal species isolated from the majority of the bone lesion samples. Administering S. agnetis in the drinking water to broilers reared on wire flooring increased the incidence of BCO three-fold when compared with broilers drinking tap water (P = 0.001). We found that the minimum effective dose of S. agnetis to induce BCO in broilers grown on wire flooring is 10^5 cfu/ml. We used PacBio and Illumina sequencing to assemble a 2.4 Mbp contig representing the genome and a 34 kbp contig for the largest plasmid of S. agnetis. Annotation of this genome is underway through comparative genomics with other Staphylococcus genomes and identification of virulence factors. Our goal is to elucidate genetic diversity, toxins, and pathogenicity determinants for this poorly characterized species. Isolating pathogenic bacterial species, defining their likely route of transmission to broilers, and genomic analyses will contribute substantially to the development of measures for mitigating BCO losses in poultry.
Comparative genome analysis of Clavibacter michiganensis subsp. michiganensis strains provides insights into genetic diversity and virulence.
Clavibacter michiganensis subsp. michiganensis (Cmm) is a Gram-positive actinomycete that causes bacterial canker of tomato (Solanum lycopersicum), a disease that can cause significant losses in tomato production. In this study, we determined the complete genome sequence of 13 California Cmm strains and one saprophytic Clavibacter strain using a combination of Illumina and PacBio sequencing. The California Cmm strains have a genome size (3.2-3.3 Mb) similar to that of the reference strain NCPPB382 (3.3 Mb), with ≥98% sequence identity. Cmm strains from California share ≥92% of genes (8-10% are novel genes) with the reference Cmm strain NCPPB382. Despite this similarity, we detected significant alterations in California strains with respect to plasmid number, plasmid composition, and genomic island presence, indicating acquisition of unique mechanisms controlling virulence. Plasmids pCM1 and pCM2, which were previously demonstrated to be required for NCPPB382 virulence, also differ in their presence and gene content across Cmm strains. pCM2 is absent in some Cmm strains that still retain virulence in tomato. The saprophytic Clavibacter strain possesses a novel plasmid, pSCM, and lacks the majority of characterized virulence factors. Genome sequence information was also used to design specific and sensitive primer pairs for Cmm detection. A mechanistic understanding of how genomic changes have impacted Cmm virulence and survival across diverse strains will be necessary for developing robust disease control strategies for bacterial canker of tomato.