For microbial sequencing on the PacBio Sequel System, the current yield per SMRT Cell exceeds project requirements. Multiplexing offers a viable solution, greatly increasing throughput and efficiency while reducing cost per genome. This approach is achieved by incorporating a unique barcode for each microbial sample into the SMRTbell adapters and using a streamlined library preparation process. To demonstrate performance, 12 unique barcodes were assigned to B. subtilis and sequenced on a single SMRT Cell. To further demonstrate the applicability of this method, we multiplexed the genomes of 16 strains of H. pylori. Each DNA was sheared to 10 kb,…
Plant and animal whole genome sequencing has proven to be challenging, particularly due to genome size, high density of repetitive elements and heterozygosity. The Sequel System delivers long reads, high consensus accuracy and uniform coverage, enabling more complete, accurate, and contiguous assemblies of these large complex genomes. The latest Sequel chemistry increases yield up to 8 Gb per SMRT Cell for long insert libraries >20 kb and up to 10 Gb per SMRT Cell for libraries >40 kb. In addition, the recently released SMRTbell Express Template Prep Kit reduces the time (~3 hours) and DNA input (~3 µg), making the…
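The per-cell yields quoted above translate into a SMRT Cell count by simple arithmetic. The sketch below is illustrative only: the function name, the 1 Gb genome size, and the 50-fold target coverage are hypothetical values chosen for the example, and the 8 Gb and 10 Gb figures are the "up to" yields from the abstract, so real runs may need more cells.

```python
import math


def smrt_cells_needed(genome_size_gb: float,
                      target_coverage: float,
                      yield_per_cell_gb: float) -> int:
    """Estimate the SMRT Cells needed to reach a target fold coverage.

    Total bases required = genome size x coverage; divide by the
    per-cell yield and round up to a whole number of cells.
    """
    if genome_size_gb <= 0 or target_coverage <= 0 or yield_per_cell_gb <= 0:
        raise ValueError("all inputs must be positive")
    total_gb = genome_size_gb * target_coverage
    return math.ceil(total_gb / yield_per_cell_gb)


# Hypothetical 1 Gb genome at 50-fold coverage (assumed values).
cells_20kb = smrt_cells_needed(1.0, 50, 8.0)   # >20 kb libraries, up to 8 Gb per cell
cells_40kb = smrt_cells_needed(1.0, 50, 10.0)  # >40 kb libraries, up to 10 Gb per cell
print(cells_20kb, cells_40kb)
```

Because the quoted yields are upper bounds, the result is best read as a lower bound on the number of cells.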
Recent improvements in sequencing chemistry and instrument performance combine to create a new PacBio data type, Single Molecule High-Fidelity reads (HiFi reads). Increased read length and improvements in library construction enable average read lengths of 10-20 kb with average sequence identity greater than 99% from raw single molecule reads. The resulting reads have accuracy comparable to short-read NGS but with 50-100 times longer read length. Here we benchmark the performance of this data type by sequencing and genotyping the Genome in a Bottle (GIAB) HG002 human reference sample from the National Institute of Standards and Technology (NIST). We…
A high-quality reference genome is an essential tool for studying the genetics of traits and disease, organismal, comparative and conservation biology, and population genomics. PacBio Single Molecule, Real-Time (SMRT) Sequencing generates long reads with uniform coverage and high consensus accuracy, making it a powerful technology for de novo genome assembly. Improvements in throughput and concomitant reductions in cost have made PacBio an attractive core technology for many large genome initiatives. However, relatively high DNA input requirements (3 µg for standard library protocol) have placed PacBio out of reach for many projects on small organisms that may have lower DNA content…
De novo assemblies of human genomes from accurate (85-90%), continuous long reads (CLR) now approach the human reference genome in contiguity, but the assembly base pair accuracy is typically below QV40 (99.99%), an order of magnitude lower than the standard for finished references. The base pair errors complicate downstream interpretation, particularly false positive indels that lead to false gene loss through frameshifts. PacBio HiFi sequence data, which are both long (>10 kb) and very accurate (>99.9%) at the individual sequence read level, enable a new paradigm in human genome assembly. Haploid human assemblies using HiFi data achieve similar contiguity to those using…
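The QV figures above follow the standard Phred scale, QV = -10 · log10(error rate), so QV40 corresponds to 99.99% accuracy and each 10-point step is a tenfold change in error rate. A minimal sketch of the conversion follows; the function names are hypothetical, and the 3 Gb assembly length is a round illustrative figure for a human genome, not a value from the abstract.

```python
import math


def accuracy_to_qv(accuracy: float) -> float:
    """Convert per-base accuracy (e.g. 0.9999) to a Phred-scaled QV."""
    error_rate = 1.0 - accuracy
    if not 0.0 < error_rate <= 1.0:
        raise ValueError("accuracy must be in [0, 1)")
    return -10.0 * math.log10(error_rate)


def qv_to_accuracy(qv: float) -> float:
    """Convert a Phred-scaled QV back to per-base accuracy."""
    return 1.0 - 10.0 ** (-qv / 10.0)


def expected_errors(qv: float, assembly_length_bp: int) -> float:
    """Expected number of erroneous bases in an assembly at a given QV."""
    return assembly_length_bp * 10.0 ** (-qv / 10.0)


# Assumed round figure for a human assembly length (illustrative only).
HUMAN_ASSEMBLY_BP = 3_000_000_000

for qv in (40, 50):
    print(f"QV{qv}: accuracy {qv_to_accuracy(qv):.5%}, "
          f"about {expected_errors(qv, HUMAN_ASSEMBLY_BP):,.0f} expected errors")
```

Under these assumptions, moving from QV40 to QV50 cuts the expected error count by a factor of ten, which is the order-of-magnitude gap the abstract describes.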
To comprehensively detect large variants in human genomes, we have extended pbsv – a structural variant caller for long reads – to call copy-number variants (CNVs) from read-clipping and read-depth signatures. In human germline benchmark samples, we detect more than 300 CNVs spanning around 10 Mb, and we call hundreds of additional events in re-arranged cancer samples. Long-read sequencing of diverse humans has revealed more than 20,000 insertion, deletion, and inversion structural variants spanning more than 12 Mb in a typical human genome. Most of these variants are too large to detect with short reads and too small for array…
Long-read sequencing of diverse humans has revealed more than 20,000 insertion, deletion, and inversion structural variants spanning more than 12 Mb in a healthy human genome. Most of these variants are too large to detect with short reads and too small for array comparative genome hybridization (aCGH). While the standard approaches to calling structural variants with long reads thrive in the 50 bp to 10 kb size range, they tend to miss exactly the large (>50 kb) copy-number variants that are called more readily with aCGH. Standard algorithms rely on reference-based mapping of reads that fully span a variant or…
HiFi sequencing on the PacBio Sequel II System enables complete microbial community profiling of complex metagenomic samples using whole genome shotgun sequences. With HiFi sequencing, highly accurate long reads overcome the challenges posed by the presence of intergenic and extragenic repeat elements in microbial genomes, thus greatly improving phylogenetic profiling and sequence assembly. Recent improvements in library construction protocols enable HiFi sequencing starting from as little as 5 ng of input DNA. Here, we demonstrate comparative analyses of a control sample of known composition and a human fecal sample from varying amounts of input genomic DNA (1 µg, 200 ng,…
Structural variation accounts for much of the variation among human genomes. Structural variants of all types are known to cause Mendelian disease and contribute to complex disease. Learn how long-read sequencing is enabling detection of the full spectrum of structural variants to advance the study of human disease, evolution and genetic diversity.
The Sequel System, powered by Single Molecule, Real-Time (SMRT) Technology, delivers long reads, high consensus accuracy, uniform coverage and epigenetic characterization.
The Sequel II System, powered by Single Molecule, Real-Time (SMRT) Technology, delivers highly accurate long reads for a comprehensive view of genomes, transcriptomes and epigenomes.
Single Molecule, Real-Time (SMRT) Sequencing on the Sequel II System enables easy and affordable generation of high-quality de novo assemblies. With megabase-size contig N50s, accuracies >99.99%, and phased haplotypes, you can do more biology – capturing undetected SNVs, fully intact genes, and regulatory elements embedded in complex regions.
Discover the benefits of HiFi reads and learn how highly accurate long-read sequencing provides a single technology solution across a range of applications.
This tutorial provides an overview of the Hierarchical Genome Assembly Process (HGAP4) de novo assembly analysis application. HGAP4 generates accurate de novo assemblies using only PacBio data. HGAP4 is suitable for assembling a wide range of genome sizes and complexity. HGAP4 now includes some support for diploid-aware assembly. This tutorial covers features of SMRT Link v5.0.0.
This tutorial provides an overview of the Long Amplicon Analysis (LAA) application. The LAA algorithm generates highly accurate, phased and full-length consensus sequences from long amplicons. Applications of LAA include HLA typing, alternative haplotyping, and localized de novo assemblies of targeted genes. This tutorial covers features of SMRT Link v5.0.0.