Pacific Biosciences

Explore scientific publications featuring PacBio long-read sequencing data

Dissecting the causal mechanism of X-linked Dystonia-Parkinsonism by integrating genome and transcriptome assembly.

Cell
172, 897-909.e21

2018

Abstract

X-linked Dystonia-Parkinsonism (XDP) is a Mendelian neurodegenerative disease that is endemic to the Philippines and is associated with a founder haplotype. We integrated multiple genome and transcriptome assembly technologies to narrow the causal mutation to the TAF1 locus, which included a SINE-VNTR-Alu (SVA) retrotransposition into intron 32 of the gene. Transcriptome analyses identified decreased expression of the canonical cTAF1 transcript among XDP probands, and de novo assembly across multiple pluripotent stem-cell-derived neuronal lineages discovered aberrant TAF1 transcription that involved alternative splicing and intron retention (IR) in proximity to the SVA that was anti-correlated with overall TAF1 expression. CRISPR/Cas9 excision of the SVA rescued this XDP-specific transcriptional signature and normalized TAF1 expression in probands. These data suggest an SVA-mediated aberrant transcriptional mechanism associated with XDP and may provide a roadmap for layered technologies and integrated assembly-based analyses for other unsolved Mendelian disorders. Copyright © 2018 Elsevier Inc. All rights reserved.

Single molecule, full-length transcript sequencing provides insight into the extreme metabolism of ruby-throated hummingbird Archilochus colubris

GigaScience

2018

Abstract

Hummingbirds oxidize ingested nectar sugars directly to fuel foraging but cannot sustain this fuel use during fasting periods, such as during the night or during long-distance migratory flights. Instead, fasting hummingbirds switch to oxidizing stored lipids, derived from ingested sugars. The hummingbird liver plays a key role in moderating energy homeostasis and this remarkable capacity for fuel switching. Additionally, liver is the principal location of de novo lipogenesis, which can occur at exceptionally high rates, such as during premigratory fattening. Yet understanding how this tissue and whole organism moderates energy turnover is hampered by a lack of information regarding how relevant enzymes differ in sequence, expression, and regulation. We generated a de novo transcriptome of the hummingbird liver using PacBio full-length cDNA sequencing (Iso-Seq), yielding a total of 8.6 Gb of sequencing data, or 2.6M reads from 4 different size fractions. We analyzed data using the SMRTAnalysis v3.1 Iso-Seq pipeline, then clustered isoforms into gene families to generate de novo gene contigs using Cogent. We performed orthology analysis to identify closely related sequences between our transcriptome and other avian and human gene sets. Finally, we closely examined homology of critical lipid metabolism genes between our transcriptome data and avian and human genomes. We confirmed high levels of sequence divergence within hummingbird lipogenic enzymes, suggesting a high probability of adaptive divergent function in the hepatic lipogenic pathways. Our results leverage cutting-edge technology and a novel bioinformatics pipeline to provide a first direct look at the transcriptome of this incredible organism.

A high-throughput approach for identification of nontuberculous mycobacteria in drinking water reveals relationship between water age and Mycobacterium avium.

mBio
9

2018

Abstract

Nontuberculous mycobacteria (NTM) frequently detected in drinking water (DW) include species associated with human infections, as well as species rarely linked to disease. Methods for improved recovery of NTM DNA and high-throughput identification of NTM are needed for risk assessment of NTM infection through DW exposure. In this study, different methods of recovering bacterial DNA from DW were compared, revealing that a phenol-chloroform DNA extraction method yielded two to four times as much total DNA and eight times as much NTM DNA as two commercial DNA extraction kits. This method, combined with high-throughput, single-molecule real-time sequencing of NTM rpoB genes, allowed the identification of NTM to the species, subspecies, and (in some cases) strain levels. This approach was applied to DW samples collected from 15 households serviced by a chloraminated distribution system, with homes located in areas representing short (<24 h) and long (>24 h) distribution system residence times. Multivariate statistical analysis revealed that greater water age (i.e., combined distribution system residence time and home plumbing stagnation time) was associated with a greater relative abundance of Mycobacterium avium subsp. avium, one of the most prevalent NTM causing infections in humans. DW from homes closer to the treatment plant (with a shorter water age) contained more diverse NTM species, including Mycobacterium abscessus and Mycobacterium chelonae. Overall, our approach allows NTM identification to the species and subspecies levels and can be used in future studies to assess the risk of waterborne infection by providing insight into the similarity between environmental and infection-associated NTM. IMPORTANCE: An extraction method for improved recovery of DNA from nontuberculous mycobacteria (NTM), combined with single-molecule real-time sequencing (PacBio) of NTM rpoB genes, was used for high-throughput characterization of NTM species and in some cases strains in drinking water (DW). The extraction procedure recovered, on average, eight times as much NTM DNA and three times as much total DNA from DW as two widely used commercial DNA extraction kits. The combined DNA extraction and sequencing approach allowed high-throughput screening of DW samples to identify NTM, revealing that the relative abundance of Mycobacterium avium subsp. avium increased with water age. Furthermore, the two-step barcoding approach developed as part of the PacBio sequencing method makes this procedure highly adaptable, allowing it to be used for other target genes and species. Copyright © 2018 Haig et al.

Highly sensitive detection of mutations in CHO cell recombinant DNA using multi-parallel single molecule real-time DNA sequencing.

Biotechnology and Bioengineering
ePub ahead of print

2018

Abstract

High-fidelity replication of biologic-encoding recombinant DNA sequences by engineered mammalian cell cultures is an essential pre-requisite for the development of stable cell lines for the production of biotherapeutics. However, immortalized mammalian cells characteristically exhibit an increased point mutation frequency compared to mammalian cells in vivo, both across their genomes and at specific loci (hotspots). Thus unforeseen mutations in recombinant DNA sequences can arise and be maintained within producer cell populations. These may affect both the stability of recombinant gene expression and give rise to protein sequence variants with variable bioactivity and immunogenicity. Rigorous quantitative assessment of recombinant DNA integrity should therefore form part of the cell line development process and be an essential quality assurance metric for instances where synthetic/multi-component assemblies are utilized to engineer mammalian cells, such as the assessment of recombinant DNA fidelity or the mutability of single-site integration target loci. Based on Pacific Biosciences single molecule real-time (SMRT™) circular consensus sequencing (CCS) technology we developed a rDNA sequence analysis tool to process the multi-parallel sequencing of ~40,000 single recombinant DNA molecules. After statistical filtering of raw sequencing data, we show that this analytical method is capable of detecting single point mutations in rDNA to a minimum single mutation frequency of 0.0042% (<1/24,000 bases). Using a stable CHO transfectant pool harboring a randomly integrated 5 kb plasmid construct encoding GFP we found that 28% of recombinant plasmid copies contained at least one low frequency (<0.3%) point mutation. These mutations were predominantly found in GC base pairs (85%), and there was no positional bias in mutation across the plasmid sequence. There was no discernible difference between the mutation frequencies of coding and non-coding DNA. The putative ratio of non-synonymous and synonymous changes within the open reading frames (ORFs) in the plasmid sequence indicates that natural selection does not impact upon the prevalence of these mutations. Here we have demonstrated the abundance of mutations that fall outside of the reported range of detection of next generation sequencing (NGS) and second generation sequencing (SGS) platforms, providing a methodology capable of being utilized in cell line development platforms to identify the fidelity of recombinant genes throughout the production process. This article is protected by copyright. All rights reserved.

Genomic analysis of hospital plumbing reveals diverse reservoir of bacterial plasmids conferring carbapenem resistance.

mBio
9, e02011-17

2018

Abstract

The hospital environment is a potential reservoir of bacteria with plasmids conferring carbapenem resistance. Our Hospital Epidemiology Service routinely performs extensive sampling of high-touch surfaces, sinks, and other locations in the hospital. Over a 2-year period, additional sampling was conducted at a broader range of locations, including housekeeping closets, wastewater from hospital internal pipes, and external manholes. We compared these data with previously collected information from 5 years of patient clinical and surveillance isolates. Whole-genome sequencing and analysis of 108 isolates provided comprehensive characterization of blaKPC/blaNDM-positive isolates, enabling an in-depth genetic comparison. Strikingly, despite a very low prevalence of patient infections with blaKPC-positive organisms, all samples from the intensive care unit pipe wastewater and external manholes contained carbapenemase-producing organisms (CPOs), suggesting a vast, resilient reservoir. We observed a diverse set of species and plasmids, and we noted species and susceptibility profile differences between environmental and patient populations of CPOs. However, there were plasmid backbones common to both populations, highlighting a potential environmental reservoir of mobile elements that may contribute to the spread of resistance genes. Clear associations between patient and environmental isolates were uncommon based on sequence analysis and epidemiology, suggesting reasonable infection control compliance at our institution. Nonetheless, a probable nosocomial transmission of Leclercia sp. from the housekeeping environment to a patient was detected by this extensive surveillance. These data and analyses further our understanding of CPOs in the hospital environment and are broadly relevant to the design of infection control strategies in many infrastructure settings. IMPORTANCE: Carbapenemase-producing organisms (CPOs) are a global concern because of the morbidity and mortality associated with these resistant Gram-negative bacteria. Horizontal plasmid transfer spreads the resistance mechanism to new bacteria, and understanding the plasmid ecology of the hospital environment can assist in the design of control strategies to prevent nosocomial infections. A 5-year genomic and epidemiological survey was undertaken to study the CPOs in the patient-accessible environment, as well as in the plumbing system removed from the patient. This comprehensive survey revealed a vast, unappreciated reservoir of CPOs in wastewater, which was in contrast to the low positivity rate in both the patient population and the patient-accessible environment. While there were few patient-environmental isolate associations, there were plasmid backbones common to both populations. These results are relevant to all hospitals for which CPO colonization may not yet be defined through extensive surveillance.

Sensitive detection of mitochondrial DNA variants for analysis of mitochondrial DNA-enriched extracts from frozen tumor tissue.

Scientific Reports
8, 2261

2018

Abstract

Large variation exists in mitochondrial DNA (mtDNA) not only between but also within individuals. Also in human cancer, tumor-specific mtDNA variation exists. In this work, we describe the comparison of four methods to extract mtDNA as pure as possible from frozen tumor tissue. Also, three state-of-the-art methods for sensitive detection of mtDNA variants were evaluated. The main aim was to develop a procedure to detect low-frequent single-nucleotide mtDNA-specific variants in frozen tumor tissue. We show that of the methods evaluated, DNA extracted from cytosol fractions following exonuclease treatment results in highest mtDNA yield and purity from frozen tumor tissue (270-fold mtDNA enrichment). Next, we demonstrate the sensitivity of detection of low-frequent single-nucleotide mtDNA variants (≤1% allele frequency) in breast cancer cell lines MDA-MB-231 and MCF-7 by single-molecule real-time (SMRT) sequencing, UltraSEEK chemistry based mass spectrometry, and digital PCR. We also show de novo detection and allelic phasing of variants by SMRT sequencing. We conclude that our sensitive procedure to detect low-frequent single-nucleotide mtDNA variants from frozen tumor tissue is based on extraction of DNA from cytosol fractions followed by exonuclease treatment to obtain high mtDNA purity, and subsequent SMRT sequencing for (de novo) detection and allelic phasing of variants.

Single molecule real-time (SMRT) sequencing comes of age: applications and utilities for medical diagnostics.

Nucleic Acids Research
ePub ahead of print

2018

Abstract

Short read massive parallel sequencing has emerged as a standard diagnostic tool in the medical setting. However, short read technologies have inherent limitations such as GC bias, difficulties mapping to repetitive elements, trouble discriminating paralogous sequences, and difficulties in phasing alleles. Long read single molecule sequencers resolve these obstacles. Moreover, they offer higher consensus accuracies and can detect epigenetic modifications from native DNA. The first commercially available long read single molecule platform was the RS system based on PacBio's single molecule real-time (SMRT) sequencing technology, which has since evolved into their RSII and Sequel systems. Here we capsulize how SMRT sequencing is revolutionizing constitutional, reproductive, cancer, microbial and viral genetic testing. © The Author(s) 2018. Published by Oxford University Press on behalf of Nucleic Acids Research.

The axolotl genome and the evolution of key tissue formation regulators.

Nature

2018

Abstract

Salamanders serve as important tetrapod models for developmental, regeneration and evolutionary studies. An extensive molecular toolkit makes the Mexican axolotl (Ambystoma mexicanum) a key representative salamander for molecular investigations. Here we report the sequencing and assembly of the 32-gigabase-pair axolotl genome using an approach that combined long-read sequencing, optical mapping and development of a new genome assembler (MARVEL). We observed a size expansion of introns and intergenic regions, largely attributable to multiplication of long terminal repeat retroelements. We provide evidence that intron size in developmental genes is under constraint and that species-restricted genes may contribute to limb regeneration. The axolotl genome assembly does not contain the essential developmental gene Pax3. However, mutation of the axolotl Pax3 paralogue Pax7 resulted in an axolotl phenotype that was similar to those seen in Pax3-/- and Pax7-/- mutant mice. The axolotl genome provides a rich biological resource for developmental and evolutionary studies.

Cytogenomic identification and long-read single molecule real-time (SMRT) sequencing of a Bardet-Biedl Syndrome 9 (BBS9) deletion.

NPJ Genomic Medicine
3, 3

2018

Abstract

Bardet-Biedl syndrome (BBS) is a recessive disorder characterized by heterogeneous clinical manifestations, including truncal obesity, rod-cone dystrophy, renal anomalies, postaxial polydactyly, and variable developmental delays. At least 20 genes have been implicated in BBS, and all are involved in primary cilia function. We report a 1-year-old male child from Guyana with obesity, postaxial polydactyly on his right foot, hypotonia, ophthalmologic abnormalities, and developmental delay, which together indicated a clinical diagnosis of BBS. Clinical chromosomal microarray (CMA) testing and high-throughput BBS gene panel sequencing detected a homozygous 7p14.3 deletion of exons 1-4 of BBS9 that was encompassed by a 17.5 Mb region of homozygosity at chromosome 7p14.2-p21.1. The precise breakpoints of the deletion were delineated to a 72.8 kb region in the proband and carrier parents by third-generation long-read single molecule real-time (SMRT) sequencing (Pacific Biosciences), which suggested non-homologous end joining as a likely mechanism of formation. Long-read SMRT sequencing of the deletion breakpoints also determined that the aberration included the neighboring RP9 gene implicated in retinitis pigmentosa; however, the clinical significance of this was considered uncertain given the paucity of reported cases with unambiguous RP9 mutations. Taken together, our study characterized a BBS9 deletion, and the identification of this shared haplotype in the parents suggests that this pathogenic aberration may be a BBS founder mutation in the Guyanese population. Importantly, this informative case also highlights the utility of long-read SMRT sequencing to map nucleotide breakpoints of clinically relevant structural variants.

Construction of Pará rubber tree genome and multi-transcriptome database accelerates rubber researches.

BMC Genomics
19, 922

2018

Abstract

Natural rubber is an economically important material. Currently the Pará rubber tree, Hevea brasiliensis, is the main commercial source. Little is known about rubber biosynthesis at the molecular level. Next-generation sequencing (NGS) technologies brought draft genomes of three rubber cultivars and a variety of RNA sequencing (RNA-seq) data. However, no current genome or transcriptome databases (DB) are organized by gene. A gene-oriented database is a valuable support for rubber research. Based on our original draft genome sequence of H. brasiliensis RRIM600, we constructed a rubber tree genome and transcriptome DB. Our DB provides genome information including gene functional annotations and multi-transcriptome data of RNA-seq, full-length cDNAs including PacBio Isoform sequencing (Iso-Seq), ESTs and genome wide transcription start sites (TSSs) derived from CAGE technology. Using our original and publicly available RNA-seq data, we calculated co-expressed genes for identifying functionally related gene sets and/or genes regulated by the same transcription factor (TF). Users can access multi-transcriptome data through both a gene-oriented web page and a genome browser. For the gene searching system, we provide keyword search, sequence homology search and gene expression search; users can also select their expression threshold easily. The rubber genome and transcriptome DB provides rubber tree genome sequence and multi-transcriptomics data. This DB is useful for comprehensive understanding of the rubber transcriptome. This will assist both industrial and academic researchers for rubber and economically important close relatives such as R. communis, M. esculenta and J. curcas. The Rubber Transcriptome DB release 2017.03 is accessible at http://matsui-lab.riken.jp/rubber/.

Linking secondary metabolites to gene clusters through genome sequencing of six diverse Aspergillus species.

Proceedings of the National Academy of Sciences of the United States of America
115, E753-E761

2018

Abstract

The fungal genus of Aspergillus is highly interesting, containing everything from industrial cell factories, model organisms, and human pathogens. In particular, this group has a prolific production of bioactive secondary metabolites (SMs). In this work, four diverse Aspergillus species (A. campestris, A. novofumigatus, A. ochraceoroseus, and A. steynii) have been whole-genome PacBio sequenced to provide genetic references in three Aspergillus sections. A. taichungensis and A. candidus also were sequenced for SM elucidation. Thirteen Aspergillus genomes were analyzed with comparative genomics to determine phylogeny and genetic diversity, showing that each presented genome contains 15-27% genes not found in other sequenced Aspergilli. In particular, A. novofumigatus was compared with the pathogenic species A. fumigatus. This suggests that A. novofumigatus can produce most of the same allergens, virulence, and pathogenicity factors as A. fumigatus, suggesting that A. novofumigatus could be as pathogenic as A. fumigatus. Furthermore, SMs were linked to gene clusters based on biological and chemical knowledge and analysis, genome sequences, and predictive algorithms. We thus identify putative SM clusters for aflatoxin, chlorflavonin, and ochrindol in A. ochraceoroseus, A. campestris, and A. steynii, respectively, and novofumigatonin, ent-cycloechinulin, and epi-aszonalenins in A. novofumigatus. Our study delivers six fungal genomes, showing the large diversity found in the Aspergillus genus; highlights the potential for discovery of beneficial or harmful SMs; and supports reports of A. novofumigatus pathogenicity. It also shows how biological, biochemical, and genomic information can be combined to identify genes involved in the biosynthesis of specific SMs.

Firefly genomes illuminate the origin and evolution of bioluminescence

bioRxiv
ePub ahead of print

2017

Abstract

Fireflies are among the best-studied of the bioluminescent organisms. Despite long-term interest in the biochemistry, neurobiology, and evolution of firefly flash signals and the widespread biotechnological applications of firefly luciferase, only a limited set of genes related to this complex trait have been described. To investigate the genetic basis of firefly bioluminescence, we generated a high-quality reference genome for the Big Dipper firefly Photinus pyralis, from which the first laboratory luciferase was cloned, using long-read (PacBio), short-read (Illumina), and Hi-C sequencing technologies. To facilitate comparative genomics, we also generated short-read genome assemblies for a Japanese firefly, Aquatica lateralis, and a bioluminescent click beetle, Ignelater luminosus. Analyses of these genomic datasets support at least two independent gains of luminescence in beetles, and provide new insights into the evolution of beetle bioluminescence and chemical defenses that likely co-evolved over their 100 million years of evolution.

Pacific Biosciences sequencing and IMGT/HighV-QUEST analysis of full-display combinatorial library.

Frontiers in Immunology
8, 1796

2017

Abstract

Phage-display selection of immunoglobulin (IG) or antibody single chain Fragment variable (scFv) from combinatorial libraries is widely used for identifying new antibodies for novel targets. Next-generation sequencing (NGS) has recently emerged as a new method for the high throughput characterization of IG and T cell receptor (TR) immune repertoires both in vivo and in vitro. However, challenges remain for the NGS sequencing of scFv from combinatorial libraries owing to the scFv length (>800 bp) and the presence of two variable domains [variable heavy (VH) and variable light (VL) for IG] associated by a peptide linker in a single chain. Here, we show that single-molecule real-time (SMRT) sequencing with the Pacific Biosciences RS II platform allows for the generation of full-length scFv reads obtained from an in vivo selection of scFv-phages in an animal model of atherosclerosis. We first amplified the DNA of the phagemid inserts from scFv-phages eluted from an aortic section at the third round of the in vivo selection. From this amplified DNA, 450,558 reads were obtained from 15 SMRT cells. Highly accurate circular consensus sequences from these reads were generated, filtered by quality and then analyzed by IMGT/HighV-QUEST with the functionality for scFv. Full-length scFv were identified and characterized in 348,659 reads. Full-length scFv sequencing is an absolute requirement for analyzing the associated VH and VL domains enriched during the in vivo panning rounds. In order to further validate the ability of SMRT sequencing to provide high quality, full-length scFv sequences, we tracked the reads of an scFv-phage clone P3 previously identified by biological assays and Sanger sequencing. Sixty P3 reads showed 100% identity with the full-length scFv of 767 bp, 53 of them covering the whole insert of 977 bp, which encompassed the primer sequences. The remaining seven reads were identical over a shortened length of 939 bp that excludes the vicinity of primers at both ends. Interestingly, these reads were obtained from each of the 15 SMRT cells. Thus, the SMRT sequencing method and the IMGT/HighV-QUEST functionality for scFv provide a straightforward protocol for characterization of full-length scFv from combinatorial phage libraries.

Metagenomic binning and association of plasmids with bacterial host genomes using DNA methylation.

Nature Biotechnology
ePub ahead of print

2017

Abstract

Shotgun metagenomics methods enable characterization of microbial communities in human microbiome and environmental samples. Assembly of metagenome sequences does not output whole genomes, so computational binning methods have been developed to cluster sequences into genome 'bins'. These methods exploit sequence composition, species abundance, or chromosome organization but cannot fully distinguish closely related species and strains. We present a binning method that incorporates bacterial DNA methylation signatures, which are detected using single-molecule real-time sequencing. Our method takes advantage of these endogenous epigenetic barcodes to resolve individual reads and assembled contigs into species- and strain-level bins. We validate our method using synthetic and real microbiome sequences. In addition to genome binning, we show that our method links plasmids and other mobile genetic elements to their host species in a real microbiome sample. Incorporation of DNA methylation information into shotgun metagenomics analyses will complement existing methods to enable more accurate sequence binning.

Comparative genomic analyses of Clavibacter michiganensis subsp. insidiosus and pathogenicity on Medicago truncatula.

Phytopathology
ePub ahead of print

2017

Abstract

Clavibacter michiganensis is the most economically important gram-positive bacterial plant pathogen with subspecies that cause serious diseases of maize, wheat, tomato, potato, and alfalfa. Much less is known about pathogenesis involving gram-positive plant pathogens than is known for gram-negative bacteria. Comparative genome analyses of C. michiganensis subspecies affecting tomato, potato, and maize have provided insights on pathogenicity. In this study, we identified strains of C. michiganensis subsp. insidiosus with contrasting pathogenicity on three accessions of the model legume Medicago truncatula. We generated complete genome sequences for two strains and compared these to a previously sequenced strain and genome sequences of four other subspecies. The three C. michiganensis subsp. insidiosus strains varied in gene content due to genome rearrangements, most likely facilitated by insertion elements, and plasmid number, which varied from one to three depending on strain. The core C. michiganensis genome consisted of 1,930 genes, with 401 genes unique to C. michiganensis subsp. insidiosus. An operon for synthesis of the extracellular blue pigment indigoidine, enzymes for pectin degradation, and an operon for inositol metabolism are among the unique features. Secreted serine proteases belonging to both the pat-1 and ppa families were present but highly diverged from those in other subspecies.