A complete solution for full-length transcript sequencing using the PacBio Sequel II System

Advances in Genome Biology and Technology

2020

Long-read mRNA sequencing methods such as PacBio’s Iso-Seq method offer high-throughput transcriptome profiling in prokaryotic and eukaryotic cells. By avoiding the transcript assembly problem and instead sequencing full-length cDNA, Iso-Seq has emerged as the most reliable technology for annotating isoforms and, in turn, improving proteome predictions in a wide variety of organisms. Improvements in library preparation, sequencing throughput, and bioinformatics have enabled the Iso-Seq method to become a complete solution for transcript characterization. The Iso-Seq Express kit is a one-day library prep requiring 60-300 ng of total RNA. The PacBio Sequel II System produces 4-5 million full-length reads, sufficient to profile a whole human transcriptome. Finally, the SQANTI2 software is a powerful tool for categorizing complex isoforms against reference annotations, while also incorporating orthogonal information such as CAGE peak data, public RNA-seq junction data, and ORF predictions.

Amplification-free protocol for targeted enrichment of repeat expansion genomic regions and SMRT Sequencing

Advances in Genome Biology and Technology

2020

Many genetic disorders are associated with repeat sequence expansions. Obtaining accurate DNA sequence information from these regions will help researchers further establish the relationship between these genetic disorders and their underlying disease mechanisms. Moreover, repeat interruptions have been shown to act as phenotypic modifiers in some disorders. Targeted sequencing is an economical way to obtain sequence information from one or more defined regions in a genome. However, most targeted enrichment and sequencing methods require some form of DNA amplification. Amplifying large regions with extreme GC content, as seen in repeat expansion disorders, is challenging and prone to introducing sequence artifacts. DNA amplification also removes any epigenetic signatures present in native DNA. An amplification-free technique, by contrast, preserves native DNA molecules and so allows direct characterization of epigenetic signatures.

Copy-number variant detection with PacBio long reads

Advances in Genome Biology and Technology

2020

Long-read sequencing of diverse humans has revealed more than 20,000 insertion, deletion, and inversion structural variants spanning more than 12 Mb in a healthy human genome. Most of these variants are too large to detect with short reads and too small for array comparative genomic hybridization (aCGH). While the standard approaches to calling structural variants with long reads thrive in the 50 bp to 10 kb size range, they tend to miss exactly the large (>50 kb) copy-number variants that are called more readily with aCGH. Standard algorithms rely on reference-based mapping of reads that fully span a variant or on de novo assembly, but copy-number variants are often too large to be spanned by a single read and frequently involve segmentally duplicated sequence that is not yet included in most de novo assemblies. To comprehensively detect large variants in human genomes, we extended pbsv, a structural variant caller for long reads, to call copy-number variants (CNVs) from read-clipping and read-depth signatures. In human germline benchmark samples, we detect more than 300 CNVs spanning around 10 Mb, and we call hundreds of additional events in re-arranged cancer samples. Together with insertion, deletion, inversion, duplication, and translocation calling from spanning reads, this allows pbsv to comprehensively detect large variants from a single data type.
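The read-depth signal mentioned in this abstract can be illustrated with a toy calculation. The following is a minimal sketch and not pbsv's algorithm: the function name `estimate_copy_number`, the fixed-size bins, the diploid default, and the use of the median as the baseline are all assumptions made for illustration.

```python
from statistics import median


def estimate_copy_number(bin_depths, ploidy=2):
    """Estimate an integer copy number per bin from mean read depth.

    Hypothetical helper: assumes most bins are at the normal ploidy, so the
    median depth is a reasonable baseline for the expected copy number.
    """
    baseline = median(bin_depths)
    if baseline <= 0:
        raise ValueError("median depth must be positive")
    # Scale each bin's depth by the baseline and round to the nearest integer.
    return [round(ploidy * depth / baseline) for depth in bin_depths]


# Made-up depths: a run of elevated bins (gain) and a run of reduced bins (loss).
depths = [30, 31, 29, 30, 45, 46, 44, 30, 15, 14, 30]
print(estimate_copy_number(depths))
```

A real caller would also need to normalize for GC content and mappability and to combine depth with read-clipping evidence at the breakpoints; none of that is attempted here.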

New advances in SMRT Sequencing facilitate multiplexing for de novo and structural variant studies

Advances in Genome Biology and Technology

2020

The latest advancements in Sequel II SMRT Sequencing have increased average read lengths by up to 50% compared to Sequel II chemistry 1.0. In a single SMRT Cell 8M, this allows multiplexing of 2-3 small organisms (<500 Mb) such as insects and worms for reference-quality assemblies, structural variant calling for up to 2 samples with ~3 Gb genomes, analysis of 48 microbial genomes, and metagenomic profiling of up to 8 communities. With the improved processivity of the new Sequel II sequencing polymerase, more SMRTbell molecules reach rolling circle mode, resulting in longer overall read lengths and thus allowing efficient detection of barcodes (up to 80%) in the SMRTbell templates. Multiplexing of genomes larger than those of microbial organisms is now achievable. In collaboration with the Wellcome Sanger Institute, we have developed a workflow for multiplexing two individual Anopheles coluzzii using as little as 150 ng of genomic DNA per individual. The resulting assemblies had high contiguity (contig N50s over 3 Mb) and completeness (>98% of conserved genes) for both individuals. For microbial multiplexing, we multiplexed 48 microbes with varying complexities and sizes ranging from 1.6 to 8.0 Mb in a single SMRT Cell 8M. Using a new end-to-end analysis (Microbial Assembly Analysis, SMRT Link 8.0), assemblies resulted in complete circularized genomes (>200-fold coverage) and efficient detection of >3-200 kb plasmids. Finally, the long read lengths (>90 kb) allow detection of barcodes in large-insert SMRTbell templates (>15 kb), thus facilitating multiplexing of two human samples in 1 SMRT Cell 8M for detecting SVs, indels, and CNVs. Here, we present results and describe workflows for multiplexing samples for specific applications of SMRT Sequencing.
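Contig N50, the contiguity metric quoted in this abstract, is the length of the shortest contig in the smallest set of longest contigs that together cover at least half of the assembly. A minimal sketch of that standard definition follows; the function name `n50` and the example lengths are illustrative and do not come from the poster.

```python
def n50(contig_lengths):
    """Return the N50 of a list of contig lengths (in any consistent unit)."""
    if not contig_lengths:
        raise ValueError("need at least one contig length")
    total = sum(contig_lengths)
    running = 0
    # Walk contigs from longest to shortest until half the assembly is covered.
    for length in sorted(contig_lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length


# Illustrative lengths only: total is 32, and the two longest contigs reach 16.
print(n50([2, 2, 2, 3, 3, 4, 8, 8]))
```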

Unbiased characterization of metagenome composition and function using HiFi sequencing on the PacBio Sequel II System

Advances in Genome Biology and Technology

2020

Recent work comparing metagenomic sequencing methods indicates that a comprehensive picture of the taxonomic and functional diversity of complex communities will be difficult to achieve with one sequencing technology alone. While the lower cost of short reads has enabled greater sequencing depth, the greater contiguity of long-read assemblies and the lack of GC bias in SMRT Sequencing have enabled better gene finding. However, since long-read assembly typically requires high coverage for error correction, these benefits have in the past been lost for low-abundance species. The introduction of the Sequel II System has enabled a new, higher-throughput, assembly-optional data type that addresses these challenges: HiFi reads. HiFi reads combine QV20 accuracy with long read lengths, eliminating the need for assembly for most metagenome applications, including gene discovery and metabolic pathway reconstruction. In fact, the read lengths and accuracy of HiFi data match or outperform the quality metrics of most metagenome assemblies, enabling cost-effective recovery of intact genes and operons while omitting the resource-intensive and data-inefficient assembly step. Here we present the application of HiFi sequencing to both mock and human fecal samples using full-length 16S and shotgun methods. This proof-of-concept work demonstrates the unique strengths of the HiFi method. First, the high correspondence between the expected community composition, 16S, and shotgun profiling data reflects low context bias. In addition, every HiFi read yields ~5-8 predicted genes, without assembly, using standard tools. If assembly is desired, excellent results can be achieved with Canu and contig binning tools. In summary, HiFi sequencing is a new, cost-effective option for high-resolution functional profiling of metagenomes that complements existing short-read workflows.

A complete solution for high-quality genome annotation using the PacBio Iso-Seq method

Plant and Animal Genome XXVIII Conference

2020

The PacBio Iso-Seq method produces high-quality, full-length transcripts of up to 10 kb or longer and has been used to annotate many important plant and animal genomes. We describe here the full Iso-Seq ecosystem that enables researchers to achieve high-quality genome annotations. The Iso-Seq Express workflow is a 1-day protocol that requires only 60-300 ng of total RNA and supports multiplexing of different tissues. Sequencing on a single SMRT Cell 8M on the Sequel II System produces up to 4 million full-length reads, sufficient to exhaustively characterize a whole transcriptome on the order of 15,000-17,000 genes with 100,000 or more transcripts. Most importantly, the method is supported by a maturing suite of official and community-developed tools. The SMRT Link Iso-Seq application outputs high-quality (>99% accurate), full-length transcript sequences that can optionally be mapped to a reference genome, processing a single SMRT Cell's worth of data in 6-9 hours. The SQANTI2 tool classifies Iso-Seq transcripts against a reference annotation, filters potential library artifacts, and processes information from both long read-only and short read-based quantification. IsoPhase is a tool for identifying allele-specific isoform expression. Cogent has been used to process Iso-Seq transcripts in a genome-independent manner to assess genome assemblies. Finally, IsoAnnot is an up-and-coming tool for identifying differential isoform expression across different samples. We describe how these tools complement each other and provide guidelines for making the best use of Iso-Seq data to understand transcriptomes.

A high-quality PacBio insect genome from 5 ng of input DNA

Plant and Animal Genome XXVIII Conference

2020

High-quality insect genomes are essential resources for understanding insect biology and for combating insects as disease vectors and agricultural pests. It is desirable to sequence a single individual for a reference genome to avoid complications from multiple alleles during de novo assembly. However, the small body size of many insects poses a challenge for long-read sequencing technologies, which often have high DNA-input requirements. The previously described PacBio Low DNA Input Protocol starts with ~100 ng of DNA, allows for high-quality assemblies of single mosquitoes among others, and represents a significant step in reducing such requirements. Here, we describe a new library protocol with a further 20-fold reduction in the DNA input quantity. Starting with just 5 ng of high molecular weight DNA, we describe the successful sequencing and de novo genome assembly of a single male sandfly (Phlebotomus papatasi, the main vector of Old World cutaneous leishmaniasis), using HiFi data generated on the PacBio Sequel II System and assembled with FALCON. The assembly shows a high degree of completeness (>97% of BUSCO genes are complete), contiguity (contig N50 of 1 Mb), and sequence accuracy (>98% of BUSCO genes without frameshift errors). This workflow has general utility for small-bodied insects and other plant and animal species, either for focused research studies or in conjunction with large-scale genome projects.

Beyond Contiguity: Evaluating the accuracy of de novo genome assemblies

Plant and Animal Genome XXVIII Conference

2020

HiFi reads (>99% accurate, 15-20 kb) from the PacBio Sequel II System consistently provide complete and contiguous genome assemblies. In addition to completeness and contiguity, accuracy is of critical importance, as assembly errors complicate downstream analysis, particularly by disrupting gene frames. Metrics used to assess assembly accuracy include: 1) in-frame gene count, 2) k-mer consistency, and 3) concordance to a benchmark, where discordances are interpreted as assembly errors. Genome in a Bottle (GIAB) provides a benchmark for the human genome with an estimated accuracy of 99.9999% (Q60). Concordance for human HiFi assemblies exceeds Q50, which provides excellent genomes for downstream analysis but presents a challenge: any new benchmark must significantly exceed Q50, or the discordance will reflect the error rate of the benchmark rather than that of the assembly. To establish benchmarks for Oryza sativa and Drosophila melanogaster, we collected draft references, Illumina short reads, and PacBio HiFi reads. For each species, the benchmark was defined as regions of normal coverage that are not within 5 bp of a small variant or 50 bp of a structural variant. For both species, the benchmark regions span around 60% of the genome and HiFi assemblies achieve Q50 accuracy, which is notably more accurate than assemblies with other technologies and meets typical standards for a finished, reference-grade assembly. Here we present a protocol to generate benchmarks for any sample that rival the GIAB benchmark in accuracy. These benchmarks allow the comparison and improvement of genome assemblies and highlight the superior accuracy of assemblies generated with PacBio HiFi reads.
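The Q values quoted in this abstract are on the Phred scale, where Q = -10 log10(error rate), so Q50 corresponds to 99.999% accuracy and Q60 to 99.9999%. A short sketch of the conversion in both directions follows; the function names are illustrative.

```python
import math


def phred_to_accuracy(q):
    """Convert a Phred-scaled quality value to accuracy (fraction correct)."""
    return 1.0 - 10 ** (-q / 10)


def accuracy_to_phred(accuracy):
    """Convert accuracy (fraction correct, below 1.0) to a Phred-scaled value."""
    if not 0.0 <= accuracy < 1.0:
        raise ValueError("accuracy must be in [0, 1)")
    return -10 * math.log10(1.0 - accuracy)


# Print the accuracy for the quality values mentioned in the abstracts.
for q in (20, 40, 50, 60):
    print(f"Q{q}: {phred_to_accuracy(q):.6%}")
```

Because the scale is logarithmic, each 10-point increase is a tenfold drop in error rate, which is why a benchmark needs to sit well above Q50 before it can be used to score a Q50 assembly.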

Every species can be a model: Reference-quality PacBio genomes from single insects

Plant and Animal Genome XXVIII Conference

2020

A high-quality reference genome is an essential resource for primary and applied research across the tree of life. Genome projects for small-bodied, non-model organisms such as insects face several unique challenges, including limited DNA input quantities, high heterozygosity, and difficulty of culturing or inbreeding in the lab. Recent progress in PacBio library preparation protocols, sequencing throughput, and read accuracy addresses these challenges. We present several case studies, including the Red Admiral (Vanessa atalanta), Monarch Butterfly (Danaus plexippus), and Anopheles malaria mosquitoes, that highlight the benefits of sequencing single individuals for de novo genome assembly projects and the ease with which these projects can be conducted by individual research labs. Sampled individuals may originate from lab colonies of interest to the research community or be sourced from the wild to better capture natural variation in a focal population. Where genomic DNA quantities are limited, the PacBio Low DNA Input Protocol requires ~100 ng of input DNA. Low DNA input samples with a genome size of 500 Mb or less can be multiplexed on a single SMRT Cell 8M on the Sequel II System. For samples with more abundant DNA, size-selected libraries may be constructed to maximize sequencing yield. Both low DNA input and size-selected libraries can be used to generate HiFi reads, whose quality is Q20 or above (1% error or less) and whose lengths range from 10 to 25 kb. With HiFi reads, de novo assembly computation is greatly simplified relative to long-read methods due to smaller sequence file sizes and more rapid analysis, resulting in highly accurate, contiguous, complete, and haplotype-resolved assemblies.

Comprehensive structural and copy-number variant detection with long reads

69th Annual Meeting of the American Society of Human Genetics

2019

To comprehensively detect large variants in human genomes, we have extended pbsv, a structural variant caller for long reads, to call copy-number variants (CNVs) from read-clipping and read-depth signatures. In human germline benchmark samples, we detect more than 300 CNVs spanning around 10 Mb, and we call hundreds of additional events in re-arranged cancer samples. Long-read sequencing of diverse humans has revealed more than 20,000 insertion, deletion, and inversion structural variants spanning more than 12 Mb in a typical human genome. Most of these variants are too large to detect with short reads and too small for array comparative genomic hybridization (aCGH). While the standard approaches to calling structural variants with long reads thrive in the 50 bp to 10 kb size range, they tend to miss exactly the large (>50 kb) copy-number variants that are called more readily with aCGH and short reads. Standard algorithms rely on reference-based mapping of reads that fully span a variant or on de novo assembly, but copy-number variants are often too large to be spanned by a single read and frequently involve segmentally duplicated sequence that is not yet included in most de novo assemblies.

Detection and phasing of small variants in Genome in a Bottle samples with highly accurate long reads

69th Annual Meeting of the American Society of Human Genetics

2019

Introduction: Long-read PacBio SMRT Sequencing has been applied successfully to assemble genomes and detect structural variants. However, due to high raw read error rates of 10-15%, it has remained difficult to call small variants from long reads. Recent improvements in library preparation, sequencing chemistry, and instrument yield have increased the length, accuracy, and throughput of PacBio Circular Consensus Sequencing (CCS) reads, resulting in 10-20 kb “HiFi” reads with mean read quality above 99%. Materials and Methods: We sequenced 11 kb size-selected libraries from the Genome in a Bottle (GIAB) human reference samples HG001, HG002, and HG005 to approximately 30-fold coverage on the Sequel II System with six SMRT Cells 8M each. The CCS algorithm was used to generate highly accurate (average 99.8%) reads of mean length 10-11 kb, which were then mapped to the hs37d5 reference with pbmm2. We detected small variants using Google DeepVariant and compared these variant calls to GIAB benchmarks. Small variants were then phased with WhatsHap. Results: With these long, highly accurate CCS reads, DeepVariant achieves high SNP and indel accuracy against the GIAB benchmark truth set for all three reference samples. Using WhatsHap, small variants were phased into haplotype blocks with N50 from 82 to 146 kb. The improved mappability of long reads allows detection of variants in many medically relevant genes, such as CYP2D6 and PMS2, that have proven 'difficult-to-map' with short reads. We show that small variant precision and recall remain high down to 15-fold coverage. Conclusions: These highly accurate long reads combine the mappability of noisy long reads with the accuracy and small variant detection utility of short reads, which will allow the detection and phasing of variants in regions that have proven recalcitrant to short-read sequencing and variant detection.

Full-Length RNA-seq of Alzheimer brain on the PacBio Sequel II System

69th Annual Meeting of the American Society of Human Genetics

2019

The PacBio Iso-Seq method produces high-quality, full-length transcripts and can characterize a whole transcriptome with a single SMRT Cell 8M. We sequenced an Alzheimer whole brain sample on a single SMRT Cell 8M on the Sequel II System. Using the Iso-Seq bioinformatics pipeline followed by SQANTI2 analysis, we detected 162,290 transcripts for 17,670 genes up to 14 kb in length. More than 60% of the transcripts are novel isoforms, the vast majority of which have supporting CAGE peak data and polyadenylation signals, demonstrating the utility of long-read sequencing for human disease research.

High-quality human genomes achieved through HiFi sequence data and FALCON-Unzip assembly

69th Annual Meeting of the American Society of Human Genetics

2019

De novo assemblies of human genomes from accurate (85-90%), continuous long reads (CLR) now approach the human reference genome in contiguity, but the assembly base pair accuracy is typically below QV40 (99.99%), an order of magnitude lower than the standard for finished references. The base pair errors complicate downstream interpretation, particularly false positive indels that lead to false gene loss through frameshifts. PacBio HiFi sequence data, which are both long (>10 kb) and very accurate (>99.9%) at the individual sequence read level, enable a new paradigm in human genome assembly. Haploid human assemblies using HiFi data achieve similar contiguity to those using CLR data and are highly accurate at the base level [1]. Furthermore, HiFi assemblies resolve more high-identity sequences such as segmental duplications [2]. To enable HiFi assembly in diploid human samples, we have extended the FALCON-Unzip assembler to work directly with HiFi reads. Here we present phased human diploid genome assemblies from HiFi sequencing of HG002, HG005, and the Vertebrate Genome Project (VGP) mHomSap1 trio on the PacBio Sequel II System. The HiFi assemblies all exceed the VGP’s quality guidelines, approaching QV50 (99.999%) accuracy. For HG002, 60% of the genome was haplotype-resolved, with phase-block N50 of 143 kbp and phasing accuracy of 99.6%. The overall mean base accuracy of the assembly was QV49.7. In conclusion, HiFi data show great promise towards complete, contiguous, and accurate diploid human assemblies.

Structural variant in the RNA Binding Motif Protein, X-Linked 2 (RBMX2) gene found to be linked to bipolar disorder

69th Annual Meeting of the American Society of Human Genetics

2019

Bipolar disorder (BD) is a phenotypically and genetically complex neurological disorder that affects 1% of the worldwide population. There is compelling evidence from family, twin, and adoption studies supporting the involvement of a genetic predisposition, with estimated heritability up to ~80%. The risk in first-degree relatives is ten times higher than in the general population. Linkage and association studies have implicated multiple putative chromosomal loci for BD susceptibility; however, no disease genes have yet been identified. Here, we have fully characterized a ~12 Mb significantly linked (LOD score = 3.54) genomic region on chromosome Xq24-q27 in an extended family from a genetic isolate using long-read single molecule, real-time (SMRT) sequencing. The family segregates BD in at least 4 generations, with 16 individuals out of 61 affected. Thus, this family shows a highly elevated recurrence risk compared to the general population. Genetic complexity is expected to be reduced in isolated populations, even for genetically complex disorders such as BD, as in the case of this extended family. We selected 16 key individuals from the X-chromosomally linked family to be sequenced. These selected individuals either carried the disease haplotype, were non-carriers of the disease haplotype, or served as married-in controls. We designed a NimbleGen capture array enriching for 5-9 kb fragments spanning the entire 12 Mb region; the captured fragments were then sequenced using long-read SMRT sequencing to screen for causative structural variants (SVs) explaining the increased risk for BD in this extended family. Altogether, 192 SVs were detected in the critically linked region; however, most of these represented common variants that could be seen across many of the family members regardless of disease status. One SV stood out, showing perfect segregation among all affected individuals who were carriers of the disease haplotype. This was a 330 bp Alu deletion in intron 4 of the RNA Binding Motif Protein, X-Linked 2 (RBMX2) gene, which has previously been shown to play a central role in brain development and function. Moreover, Alu elements in general have previously been associated with at least 37 neurological and neurodegenerative disorders. To validate the finding and the functionality of the identified SV, further studies such as isoform characterization are warranted.

The value of long read amplicon sequencing for clinical applications

69th Annual Meeting of the American Society of Human Genetics

2019

NGS is commonly used for amplicon sequencing in clinical applications to study genetic disorders and detect disease-causing mutations. This approach can be hampered by a limited ability to phase sequence variants, and it makes interpretation of sequence data difficult when pseudogenes are present. Highly accurate long-read amplicon sequencing can provide very accurate, efficient, high-throughput (through multiplexing) sequences from single molecules, with read lengths largely limited by PCR. The data are easy to interpret: phased variants and breakpoints are present within individual high-fidelity reads. Here we show SMRT Sequencing of the PMS2 and OPN1 (MW and LW) genes using the Sequel System. Homologous regions make NGS and MLPA results for these genes very difficult to interpret.
