Explore scientific publications featuring PacBio long-read sequencing data

Improved assembly and variant detection of a haploid human genome using single-molecule, high-fidelity long reads

bioRxiv
Preprint

2019

The sequence and assembly of human genomes using long-read sequencing technologies has revolutionized our understanding of structural variation and genome organization. We compared the accuracy, continuity, and gene annotation of genome assemblies generated from either high-fidelity (HiFi) or continuous long-read (CLR) datasets from the same complete hydatidiform mole human genome, CHM13. We find that the HiFi sequence data assemble an additional 10% of duplicated regions and more accurately represent the structure of large tandem repeats, as validated with orthogonal analyses. Additionally, the HiFi genome assembly was generated in significantly less time with fewer computational resources than the CLR assembly. Although the HiFi assembly has significantly improved continuity and accuracy in many complex regions of the genome, it still falls short of the assembly of centromeric DNA and the largest regions of segmental duplication using existing assemblers. Our analysis also shows a slight excess of disrupted gene annotations, indicating further developments are needed to improve residual single-base-pair indel errors. Despite these shortcomings, our results suggest that HiFi may currently be the most effective stand-alone technology for de novo assembly of human genomes.

A High-Quality Genome Assembly from a Single, Field-collected Spotted Lanternfly (Lycorma delicatula) using the PacBio Sequel II System

bioRxiv
Preprint

2019

A high-quality reference genome is an essential tool for applied and basic research on arthropods. Long-read sequencing technologies may be used to generate more complete and contiguous genome assemblies than alternate technologies; however, long-read methods have historically had greater input DNA requirements and higher costs than next-generation sequencing, which are barriers to their use on many samples. Here, we present a 2.3 Gb de novo genome assembly of a field-collected adult female Spotted Lanternfly (Lycorma delicatula) using a single PacBio SMRT Cell. The Spotted Lanternfly is an invasive species recently discovered in the northeastern United States, threatening to damage economically important crop plants in the region. The DNA from one individual was used to make one standard, size-selected library with an average DNA fragment size of ~20 kb. The library was run on one Sequel II SMRT Cell 8M, generating a total of 132 Gb of long-read sequences, of which 82 Gb were from unique library molecules, representing approximately 36-fold coverage of the genome. The assembly had high contiguity (contig N50 length = 1.5 Mb), completeness, and sequence-level accuracy as estimated by conserved gene set analysis (96.8% of conserved genes both complete and without frame shift errors). Further, it was possible to segregate more than half of the diploid genome into the two separate haplotypes. The assembly also recovered two microbial symbiont genomes known to be associated with L. delicatula, each microbial genome being assembled into a single contig. We demonstrate that field-collected arthropods can be used for the rapid generation of high-quality genome assemblies, an attractive approach for projects on emerging invasive species, disease vectors, or conservation efforts of endangered species.
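
As a rough check on the coverage figure quoted in the abstract above, fold coverage can be approximated as sequenced bases divided by genome size. The sketch below is illustrative only: the function and variable names are hypothetical, and it assumes the 2.3 Gb assembly size is a reasonable stand-in for the true genome size, which the abstract does not state separately.

```python
# Illustrative sketch, not from the paper: approximate fold coverage
# as (sequenced bases) / (genome size).

GB = 1e9  # bases per gigabase


def fold_coverage(sequenced_bases: float, genome_size: float) -> float:
    """Return the average number of times each genome position is covered."""
    if genome_size <= 0:
        raise ValueError("genome_size must be positive")
    return sequenced_bases / genome_size


# Figures quoted in the abstract; assembly size is used as a proxy for
# genome size (an assumption).
assembly_size = 2.3 * GB
unique_molecule_bases = 82 * GB
total_bases = 132 * GB

unique_coverage = fold_coverage(unique_molecule_bases, assembly_size)
total_coverage = fold_coverage(total_bases, assembly_size)

print(f"Coverage from unique molecules: {unique_coverage:.1f}x")
print(f"Coverage from all reads: {total_coverage:.1f}x")
```

Under these assumptions the unique-molecule figure rounds to about 36-fold, matching the abstract; the larger number from all reads includes repeated passes over the same library molecules.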

Germline murine immunoglobulin IGHV genes in wild-derived and classical inbred strains: a comparison

bioRxiv
Preprint

2019

To better understand the subspecies origin of antibody genes in classical inbred mouse strains, the IGH gene loci of four wild-derived mouse strains were explored by analysis of VDJ gene rearrangements. A total of 341 unique IGHV gene sequences were inferred in the wild-derived strains, including 247 sequences that have not previously been reported. The genes of the Non-Obese Diabetic (NOD) strain were also documented, and all but one of the 84 inferred NOD IGHV genes have previously been observed in C57BL/6 mice. This is surprising because the Swiss mouse-derived NOD strain and the C57BL/6 strain have no known shared ancestry. The relationships between the genes of the wild-derived inbred strains and of the C57BL/6, NOD and BALB/c classical inbred strains were then explored. The IGH loci of the C57BL/6 and the MSM/MsJ strains share many sequences, but analysis showed that few sequences are shared with wild-derived strains representing the three major subspecies of the house mouse. There were also few IGHV sequences that were shared by the BALB/c strain and any of the four wild-derived strains. The origins of IGHV genes in the C57BL/6, MSM/MsJ and BALB/c strains therefore remain unclear. These unexpected similarities and differences highlight our lack of understanding of the antibody gene loci of the laboratory mouse, with implications for the interpretation of strain-specific differences in models of antibody-mediated diseases, and of Adaptive Immune Receptor Repertoire sequencing (AIRR-seq) data. These results also suggest that a position-based immunoglobulin gene nomenclature may be unworkable in the mouse.

Islands of retroelements are major components of Drosophila centromeres

PLOS Biology
17, 1-40

2019

Long-read sequencing, CENP-A ChIP, and chromatin fiber imaging reveal the composition and organization of Drosophila melanogaster centromeres, which have long remained elusive despite the high quality of this species’ genome assembly.

Long-Read Sequencing Emerging in Medical Genetics

Frontiers in Genetics
10, 426

2019

The wide implementation of next-generation sequencing (NGS) technologies has revolutionized the field of medical genetics. However, the short read lengths of currently used sequencing approaches pose a limitation for identification of structural variants, sequencing repetitive regions, phasing alleles and distinguishing highly homologous genomic regions. These limitations may significantly contribute to the diagnostic gap in patients with genetic disorders who have undergone standard NGS, like whole exome or even genome sequencing. Now, the emerging long-read sequencing (LRS) technologies may offer improvements in the characterization of genetic variation and regions that are difficult to assess with the currently prevailing NGS approaches. LRS has so far mainly been used to investigate genetic disorders with previously known or strongly suspected disease loci. While these targeted approaches already show the potential of LRS, it remains to be seen whether LRS technologies can soon enable true whole genome sequencing routinely. Ultimately, this could allow the de novo assembly of individual whole genomes used as a generic test for genetic disorders. In this article, we summarize the current LRS-based research on human genetic disorders and discuss the potential of these technologies to facilitate the next major advancements in medical genetics.

The genome of cultivated peanut provides insight into legume karyotypes, polyploid evolution and crop domestication.

Nature Genetics
51, 865-876

2019

High oil and protein content make tetraploid peanut a leading oil and food legume. Here we report a high-quality peanut genome sequence, comprising 2.54 Gb with 20 pseudomolecules and 83,709 protein-coding gene models. We characterize gene functional groups implicated in seed size evolution, seed oil content, disease resistance and symbiotic nitrogen fixation. The peanut B subgenome has more genes and general expression dominance, temporally associated with long-terminal-repeat expansion in the A subgenome that also raises questions about the A-genome progenitor. The polyploid genome provided insights into the evolution of Arachis hypogaea and other legume chromosomes. Resequencing of 52 accessions suggests that independent domestications formed peanut ecotypes. Whereas 0.42-0.47 million years ago (Ma) polyploidy constrained genetic variation, the peanut genome sequence aids mapping and candidate-gene discovery for traits such as seed size and color, foliar disease resistance and others, also providing a cornerstone for functional genomics and peanut improvement.

Three phylogenetic groups have driven the recent population expansion of Cryptococcus neoformans.

Nature Communications
10, 2035

2019

Cryptococcus neoformans (C. neoformans var. grubii) is an environmentally acquired pathogen causing 181,000 HIV-associated deaths each year. We sequenced 699 isolates, primarily C. neoformans from HIV-infected patients, from 5 countries in Asia and Africa. The phylogeny of C. neoformans reveals a recent exponential population expansion, consistent with the increase in the number of susceptible hosts. In our study population, this expansion has been driven by three sub-clades of the C. neoformans VNIa lineage; VNIa-4, VNIa-5 and VNIa-93. These three sub-clades account for 91% of clinical isolates sequenced in our study. Combining the genome data with clinical information, we find that the VNIa-93 sub-clade, the most common sub-clade in Uganda and Malawi, was associated with better outcomes than VNIa-4 and VNIa-5, which predominate in Southeast Asia. This study lays the foundation for further work investigating the dominance of VNIa-4, VNIa-5 and VNIa-93 and the association between lineage and clinical phenotype.

Extended haplotype phasing of de novo genome assemblies with FALCON-Phase

bioRxiv
Preprint

2019

Haplotype-resolved genome assemblies are important for understanding how combinations of variants impact phenotypes. These assemblies can be created in various ways, such as use of tissues that contain single-haplotype (haploid) genomes, or by co-sequencing of parental genomes, but these approaches can be impractical in many situations. We present FALCON-Phase, which integrates long-read sequencing data and ultra-long-range Hi-C chromatin interaction data of a diploid individual to create high-quality, phased diploid genome assemblies. The method was evaluated by application to three datasets, including human, cattle, and zebra finch, for which high-quality, fully haplotype resolved assemblies were available for benchmarking. Phasing algorithm accuracy was affected by heterozygosity of the individual sequenced, with higher accuracy for cattle and zebra finch (>97%) compared to human (82%). In addition, scaffolding with the same Hi-C chromatin contact data resulted in phased chromosome-scale scaffolds.

Full-length transcriptome analysis of Litopenaeus vannamei reveals transcript variants involved in the innate immune system.

Fish & Shellfish Immunology
87, 346-359

2019

To better understand the immune system of shrimp, this study combined PacBio isoform sequencing (Iso-Seq) and Illumina paired-end short-read sequencing to discover full-length immune-related molecules of the Pacific white shrimp, Litopenaeus vannamei. A total of 72,648 nonredundant full-length transcripts (unigenes) were generated with an average length of 2545 bp from five main tissues, including the hepatopancreas, cardiac stomach, heart, muscle, and pyloric stomach. These unigenes exhibited a high annotation rate (62,164, 85.57%) when compared against NR, NT, Swiss-Prot, Pfam, GO, KEGG and COG databases. A total of 7544 putative long noncoding RNAs (lncRNAs) were detected and 1164 nonredundant full-length transcripts (449 UniTransModels) participated in the alternative splicing (AS) events. Importantly, a total of 5279 nonredundant full-length unigenes were successfully identified, which were involved in the innate immune system, including 9 immune-related processes, 19 immune-related pathways and 10 other immune-related systems. We also found wide transcript variants, which increased the number and functional complexity of immune molecules; for example, toll-like receptors (TLRs) and interferon regulatory factors (IRFs). A total of 480 differentially expressed genes (DEGs) showed significantly higher or tissue-specific expression in the hepatopancreas compared with the other four tested tissues (FDR <0.05). Furthermore, the expression levels of six selected immune-related DEGs and putative IRFs were validated using real-time PCR technology, substantiating the reliability of the PacBio Iso-Seq results. In conclusion, our results provide new genetic resources of long-read full-length transcript data and information for identifying immune-related genes, which are an invaluable transcriptomic resource as a genomic reference, especially for further exploration of the innate immune and defense mechanisms of shrimp.

Persistence of Moraxella catarrhalis in Chronic Obstructive Pulmonary Disease and Regulation of the Hag/MID Adhesin.

The Journal of Infectious Diseases
219, 1448-1455

2019

Persistence of bacterial pathogens in the airways has profound consequences on the course and pathogenesis of chronic obstructive pulmonary disease (COPD). Patients with COPD continuously acquire and clear strains of Moraxella catarrhalis, a major pathogen in COPD. Some strains are cleared quickly and some persist for months to years. The mechanism of the variability in duration of persistence is unknown. Guided by genome sequences of selected strains, we studied the expression of Hag/MID, hag/mid gene sequences, adherence to human cells, and autoaggregation in longitudinally collected strains of M. catarrhalis from adults with COPD. Twenty-eight of 30 cleared strains of M. catarrhalis expressed Hag/MID whereas 17 of 30 persistent strains expressed Hag/MID upon acquisition by patients. All persistent strains ceased expression of Hag/MID during persistence. Expression of Hag/MID in human airways was regulated by slipped-strand mispairing. Virulence-associated phenotypes (adherence to human respiratory epithelial cells and autoaggregation) paralleled Hag/MID expression in airway isolates. Most strains of M. catarrhalis express Hag/MID upon acquisition by adults with COPD and all persistent strains shut off expression during persistence. These observations suggest that Hag/MID is important for initial colonization by M. catarrhalis and that cessation of expression facilitates persistence in COPD airways.

Precise therapeutic gene correction by a simple nuclease-induced double-stranded break.

Nature
ePub ahead of print

2019

Current programmable nuclease-based methods (for example, CRISPR-Cas9) for the precise correction of a disease-causing genetic mutation harness the homology-directed repair pathway. However, this repair process requires the co-delivery of an exogenous DNA donor to recode the sequence and can be inefficient in many cell types. Here we show that disease-causing frameshift mutations that result from microduplications can be efficiently reverted to the wild-type sequence simply by generating a DNA double-stranded break near the centre of the duplication. We demonstrate this in patient-derived cell lines for two diseases: limb-girdle muscular dystrophy type 2G (LGMD2G) and Hermansky-Pudlak syndrome type 1 (HPS1). Clonal analysis of inducible pluripotent stem (iPS) cells from the LGMD2G cell line, which contains a mutation in TCAP, treated with the Streptococcus pyogenes Cas9 (SpCas9) nuclease revealed that about 80% contained at least one wild-type TCAP allele; this correction also restored TCAP expression in LGMD2G iPS cell-derived myotubes. SpCas9 also efficiently corrected the genotype of an HPS1 patient-derived B-lymphoblastoid cell line. Inhibition of polyADP-ribose polymerase 1 (PARP-1) suppressed the nuclease-mediated collapse of the microduplication to the wild-type sequence, confirming that precise correction is mediated by the microhomology-mediated end joining (MMEJ) pathway. Analysis of editing by SpCas9 and Lachnospiraceae bacterium ND2006 Cas12a (LbCas12a) at non-pathogenic 4-36-base-pair microduplications within the genome indicates that the correction strategy is broadly applicable to a wide range of microduplication lengths and can be initiated by a variety of nucleases. The simplicity, reliability and efficacy of this MMEJ-based therapeutic strategy should permit the development of nuclease-based gene correction therapies for a variety of diseases that are associated with microduplications.

Reviving the Transcriptome Studies: An Insight into the Emergence of Single-molecule Transcriptome Sequencing

Frontiers in Genetics
10, 384

2019

Advances in transcriptomics have provided an exceptional opportunity to study functional implications of genetic variability. Technologies such as RNA-Seq have emerged as state-of-the-art techniques for transcriptome analysis that take advantage of high-throughput next-generation sequencing. However, similar to their predecessors, these approaches continue to impose major challenges on full-length transcript structure identification, primarily due to inherent limitations of read length. With the development of single-molecule sequencing (SMS) from PacBio, a growing number of studies on the transcriptome of different organisms have been reported. SMS has emerged as advantageous for comprehensive genome annotation including identification of novel genes/isoforms, long non-coding RNAs and fusion transcripts. This approach can be used across a broad spectrum of species to better interpret the coding information of the genome, and facilitate the study of biological function. We provide an overview of the SMS platform and its diverse applications in various biological studies, and our perspective on the challenges associated with transcriptome studies.

Unveiling novel targets of paclitaxel resistance by single molecule long-read RNA sequencing in breast cancer.

Scientific Reports
9, 6032

2019

RNA sequencing has become one of the most common technologies for studying transcriptomes in cancer, but its read length limits its application to alternative splicing (AS) events and novel isoforms. First, we applied single molecule long-read RNA sequencing (Iso-seq) and de novo assembly with short-read RNA sequencing (RNA-seq) in both the wild type (231-WT) and the paclitaxel resistant type (231-PTX) of the human breast cancer cell line MDA-MB-231. Together, the two sequencing technologies provide both accurate transcript sequences and deep transcript coverage. We then combined short-read and long-read RNA-seq to analyze alternative splicing events and novel isoforms. Finally, we selected BAK1 as our candidate target to verify our analysis. Our results imply that improved characterization of cancer genomic function may require the application of single molecule long-read RNA sequencing to get a deeper and more precise view at the transcriptional level.

Recipients receiving better HLA-matched hematopoietic cell transplantation grafts, uncovered by a novel HLA typing method, have superior survival: A retrospective study

Biology of Blood and Marrow Transplantation
25, 443-450

2019

HLA matching at an allelic-level resolution for volunteer unrelated donor (VUD) hematopoietic cell transplantation (HCT) results in improved survival and fewer post-transplant complications. Limitations in typing technologies used for the hyperpolymorphic HLA genes have meant that variations outside of the antigen recognition domain (ARD) have not been previously characterized in HCT. Our aim was to explore the extent of diversity outside of the ARD and determine the impact of this diversity on transplant outcome. Eight hundred ninety-one VUD-HCT donors and their recipients transplanted for a hematologic malignancy in the United Kingdom were retrospectively HLA typed at an ultra-high resolution (UHR) for HLA-A, -B, -C, -DRB1, -DQB1, and -DPB1 using next-generation sequencing technology. Matching was determined at full gene level for HLA class I and at a coding DNA sequence level for HLA class II genes. The HLA matching status changed in 29.1% of pairs after UHR HLA typing. The 12/12 UHR HLA matched patients had significantly improved 5-year overall survival when compared with those believed to be 12/12 HLA matches based on their original HLA typing but were found to be mismatched after UHR HLA typing (54.8% versus 30.1%, P = .022). Survival was also significantly better in 12/12 UHR HLA-matched patients when compared with those with any degree of mismatch at this level of resolution (55.1% versus 40.1%, P = .005). This study shows that better HLA matching, found when typing is done at UHR that includes exons outside of the ARD, introns, and untranslated regions, can significantly improve outcomes for recipients of a VUD-HCT for a hematologic malignancy and should be prospectively performed at donor selection.

A chromosomal-scale genome assembly of Tectona grandis reveals the importance of tandem gene duplication and enables discovery of genes in natural product biosynthetic pathways.

GigaScience
8

2019

Teak, a member of the Lamiaceae family, produces one of the most expensive hardwoods in the world. High demand coupled with deforestation have caused a decrease in natural teak forests, and future supplies will be reliant on teak plantations. Hence, selection of teak tree varieties for clonal propagation with superior growth performance is of great importance, and access to high-quality genetic and genomic resources can accelerate the selection process by identifying genes underlying desired traits. To facilitate teak research and variety improvement, we generated a highly contiguous, chromosomal-scale genome assembly using high-coverage Pacific Biosciences long reads coupled with high-throughput chromatin conformation capture. Of the 18 teak chromosomes, we generated 17 near-complete pseudomolecules with one chromosome present as two chromosome arm scaffolds. Genome annotation yielded 31,168 genes encoding 46,826 gene models, of which 39,930 and 41,155 had Pfam domain and expression evidence, respectively. We identified 14 clusters of tandem-duplicated terpene synthases (TPSs), genes central to the biosynthesis of terpenes, which are involved in plant defense and pollinator attraction. Transcriptome analysis revealed 10 TPSs highly expressed in woody tissues, of which 8 were in tandem, revealing the importance of resolving tandemly duplicated genes and the quality of the assembly and annotation. We also validated the enzymatic activity of four TPSs to demonstrate the function of key TPSs. In summary, this high-quality chromosomal-scale assembly and functional annotation of the teak genome will facilitate the discovery of candidate genes related to traits critical for sustainable production of teak and for anti-insecticidal natural products.
