PacBio 2013 User Group Meeting Presentation Slides: Lisbeth Guethlein from Stanford University School of Medicine looked at highly repetitive and variable immune regions of the orangutan genome. Guethlein reported that “PacBio managed to accomplish in a week what I have been working on for a couple years” (with Sanger sequencing), and the results were concordant. “Long story short, I was a happy customer.”
Comparative genome analysis of Clavibacter michiganensis subsp. michiganensis strains provides insights into genetic diversity and virulence.
Clavibacter michiganensis subsp. michiganensis (Cmm) is a Gram-positive actinomycete that causes bacterial canker of tomato (Solanum lycopersicum), a disease that can cause significant losses in tomato production. In this study, we determined the complete genome sequences of 13 California Cmm strains and one saprophytic Clavibacter strain using a combination of Illumina and PacBio sequencing. The California Cmm strains have genome sizes (3.2-3.3 Mb) similar to that of the reference strain NCPPB382 (3.3 Mb), with ≥98% sequence identity. Cmm strains from California share ≥92% of their genes with the reference Cmm strain NCPPB382 (8-10% are novel genes). Despite this similarity, we detected significant alterations in California strains with respect to plasmid number, plasmid composition, and genomic island presence, indicating acquisition of unique mechanisms controlling virulence. Plasmids pCM1 and pCM2, which were previously demonstrated to be required for NCPPB382 virulence, also differ in their presence and gene content across Cmm strains. pCM2 is absent in some Cmm strains that still retain virulence in tomato. The saprophytic Clavibacter strain possesses a novel plasmid, pSCM, and lacks the majority of characterized virulence factors. Genome sequence information was also used to design specific and sensitive primer pairs for Cmm detection. A mechanistic understanding of how genomic changes have impacted Cmm virulence and survival across diverse strains will be necessary for developing robust disease control strategies for bacterial canker of tomato.
While advances in RNA sequencing methods have accelerated our understanding of the human transcriptome, isoform discovery remains a challenge because short read lengths require complicated assembly algorithms to infer the contiguity of full-length transcripts. With PacBio’s long reads, one can now sequence full-length transcript isoforms up to 10 kb. The PacBio Iso-Seq protocol produces reads that originate from independent observations of single molecules, meaning no assembly is needed. Here, we sequenced the transcriptome of the human MCF-7 breast cancer cell line using the Clontech SMARTer® cDNA preparation kit and the PacBio RS II. Using PacBio Iso-Seq bioinformatics software, we obtained 55,770 unique, full-length, high-quality transcript sequences that were subsequently mapped back to the human genome with ≥99% accuracy. In addition, we identified both known and novel fusion transcripts. To assess our results, we compared the predicted ORFs from the PacBio data against a published mass spectrometry dataset from the same cell line. 84% of the proteins identified with the UniProt protein database were recovered by the PacBio predictions. Notably, 251 peptides solely matched to the PacBio-generated ORFs and were entirely novel, including abundant cases of single amino acid polymorphisms, cassette exon splicing and potential alternative protein coding frames.
Early detection of colorectal cancer (CRC) and its precursor lesions (adenomas) is crucial to reduce mortality rates. The fecal immunochemical test (FIT) is a non-invasive CRC screening test that detects the blood-derived protein hemoglobin. However, FIT sensitivity is suboptimal especially in detection of CRC precursor lesions. As adenoma-to-carcinoma progression is accompanied by alternative splicing, tumor-specific proteins derived from alternatively spliced RNA transcripts might serve as candidate biomarkers for CRC detection.
ASMS Conference: Approaching the ‘perfect’ database – single-molecule, full-length transcript sequencing to create sample-specific, full-length protein databases
Recent advances in DNA sequencing technologies based on single-molecule detection now enable determination of full-length transcript sequences and, thus, all protein sequences in a sample. Utilizing data from this exciting…
AGBT Virtual Poster: Using the PacBio Iso-Seq method to search for novel colorectal cancer biomarkers
Early detection of colorectal cancer (CRC) and its precursor lesions (adenomas) is crucial to reduce mortality rates. The fecal immunochemical test (FIT) is a non-invasive CRC screening test that detects…
In this AGBT 2017 poster, the University of Helsinki’s Petri Auvinen reports on efforts to understand bacteria that grow on, and subsequently spoil, food. This analysis monitored DNA modifications and…
Tremendous flexibility is maintained in the human proteome via alternative splicing, and cancer genomes often subvert this flexibility to promote survival. Identification and annotation of cancer-specific mRNA isoforms is critical…
In this PacBio User Group Meeting presentation, PacBio scientist Kristin Mars speaks about recent updates, such as the single-day library prep that’s now possible with the Iso-Seq Express workflow. She…
Complete genome sequence of Paracoccus sp. Arc7-R13, a silver nanoparticles synthesizing bacterium isolated from Arctic Ocean sediments
Paracoccus sp. Arc7-R13, a silver nanoparticles (AgNPs) synthesizing bacterium, was isolated from Arctic Ocean sediment. Here we describe the complete genome of Paracoccus sp. Arc7-R13. The complete genome contains 4,040,012 bp with 66.66 mol% G+C content, including one circular chromosome of 3,231,929 bp (67.45 mol% G+C content), and eight plasmids with length ranging from 24,536 bp to 199,685 bp. The genome contains 3835 protein-coding genes (CDSs), 49 tRNA genes, as well as 3 rRNA operons as 16S-23S-5S rRNA. Based on the gene annotation and Swiss-Prot analysis, a total of 15 genes belonging to 11 kinds, including silver exporting P-type ATPase (SilP), alkaline phosphatase, nitroreductase, thioredoxin reductase, NADPH dehydrogenase and glutathione peroxidase, might be related to the synthesis of AgNPs. Meanwhile, many additional genes associated with synthesis of AgNPs such as protein-disulfide isomerase, c-type cytochrome, glutathione synthase and dehydrogenase reductase were also identified.
Forest tree species are increasingly subject to severe mortalities from exotic pests, diseases, and invasive organisms, accelerated by climate change. Forest health issues are threatening multiple species and ecosystem sustainability globally. While sources of resistance may be available in related species, or among surviving trees, introgression of resistance genes into threatened tree species in reasonable time frames requires genome-wide breeding tools. Asian species of chestnut (Castanea spp.) are being employed as donors of disease resistance genes to restore native chestnut species in North America and Europe. To aid in the restoration of threatened chestnut species, we present the assembly of a reference genome with chromosome-scale sequences for Chinese chestnut (C. mollissima), the disease-resistance donor for American chestnut restoration. We also demonstrate the value of the genome as a platform for research and species restoration, including new insights into the evolution of blight resistance in Asian chestnut species, the locations in the genome of ecologically important signatures of selection differentiating American chestnut from Chinese chestnut, the identification of candidate genes for disease resistance, and preliminary comparisons of genome organization with related species.
Full-length mRNA sequencing and gene expression profiling reveal broad involvement of natural antisense transcript gene pairs in pepper development and response to stresses.
Pepper is an important vegetable with great economic value and unique biological features. In the past few years, significant development has been made towards understanding the huge complex pepper genome; however, pepper functional genomics has not been well studied. To better understand the pepper gene structure and pepper gene regulation, we conducted full-length mRNA sequencing by PacBio sequencing and obtained 57862 high-quality full-length mRNA sequences derived from 18362 previously annotated and 5769 newly detected genes. New gene models were built that combined the full-length mRNA sequences and corrected approximately 500 fragmented gene models from previous annotations. Based on the full-length mRNA, we identified 4114 and 5880 pepper genes forming natural antisense transcript (NAT) genes in-cis and in-trans, respectively. Most of these genes accumulate small RNAs in their overlapping regions. By analyzing these NAT gene expression patterns in our transcriptome data, we identified many NAT pairs responsive to a variety of biological processes in pepper. Pepper formate dehydrogenase 1 (FDH1), which is required for R-gene-mediated disease resistance, may be regulated by nat-siRNAs and participate in a positive feedback loop in salicylic acid biosynthesis during resistance responses. Several cis-NAT pairs and subgroups of trans-NAT genes were responsive to pepper pericarp and placenta development, which may play roles in capsanthin and capsaicin biosynthesis. Using a comparative genomics approach, the evolutionary mechanisms of cis-NATs were investigated, and we found that an increase in intergenic sequences accounted for the loss of most cis-NATs, while transposon insertion contributed to the formation of most new cis-NATs.
The landscape of SNCA transcripts across synucleinopathies: New insights from long reads sequencing analysis
Dysregulation of alpha-synuclein expression has been implicated in the pathogenesis of synucleinopathies, in particular Parkinson's disease (PD) and dementia with Lewy bodies (DLB). Previous studies have shown that the alternatively spliced isoforms of the SNCA gene are differentially expressed in different parts of the brain for PD and DLB patients. Similarly, SNCA isoforms with skipped exons can have a functional impact on the protein domains. The large intronic region of the SNCA gene was also shown to harbor structural variants that affect transcriptional levels. Here we present the first study using long-read sequencing with targeted capture of both the gDNA and cDNA of the SNCA gene in brain tissues of PD, DLB, and control samples using the PacBio Sequel system. The targeted full-length cDNA (Iso-Seq) data confirmed complex usage of known alternative start sites and variable 3' UTR lengths, as well as novel 5' starts and 3' ends not previously described. The targeted gDNA data allowed phasing of up to 81% of the ~114 kb SNCA region, with the longest phased block exceeding 54 kb. We demonstrate that long gDNA and cDNA reads have the potential to reveal long-range information not previously accessible using traditional sequencing methods. This approach has a potential impact in studying disease risk genes such as SNCA, providing new insights into the genetic etiologies, including perturbations to the landscape of the gene transcripts, of human complex diseases such as synucleinopathies.
Chemical defense against predators is widespread in natural ecosystems. Occasionally, taxonomically distant organisms share the same defense chemical. Here, we describe an unusual tripartite marine symbiosis, in which an intracellular bacterial symbiont (“Candidatus Endobryopsis kahalalidefaciens”) uses a diverse array of biosynthetic enzymes to convert simple substrates into a library of complex molecules (the kahalalides) for chemical defense of the host, the alga Bryopsis sp., against predation. The kahalalides are subsequently hijacked by a third partner, the herbivorous mollusk Elysia rufescens, and employed similarly for defense. “Ca. E. kahalalidefaciens” has lost many essential traits for free living and acts as a factory for kahalalide production. This interaction between a bacterium, an alga, and an animal highlights the importance of chemical defense in the evolution of complex symbioses.
Rapid transcriptional responses to serum exposure are associated with sensitivity and resistance to antibody-mediated complement killing in invasive Salmonella Typhimurium ST313
Background: Salmonella Typhimurium ST313 exhibits signatures of adaptation to invasive human infection, including higher resistance to humoral immune responses than gastrointestinal isolates. Full resistance to antibody-mediated complement killing (serum resistance) among nontyphoidal Salmonellae is uncommon, but selection of highly resistant strains could compromise vaccine-induced antibody immunity. Here, we address the hypothesis that serum resistance is due to a distinct genotype or transcriptome response in S. Typhimurium ST313.