Structural variation accounts for much of the variation among human genomes. Structural variants of all types are known to cause Mendelian disease and contribute to complex disease. Learn how long-read sequencing is enabling detection of the full spectrum of structural variants to advance the study of human disease, evolution and genetic diversity.
With SMRT Link you can unlock the power of PacBio Single Molecule, Real-Time (SMRT) Sequencing using our portfolio of software tools, designed to set up and monitor sequencing runs, review performance metrics, and analyze, visualize, and annotate your sequencing data.
Application Brochure: Scalable human whole genome HiFi sequencing for rare and inherited disease research
PacBio highly accurate long reads – HiFi reads – offer a single-platform solution for rare and inherited disease research, elucidating suspected genetic causes of disease in up to ~50% of cases that have not previously been explained using short-read exome or whole genome sequencing. PacBio offers an efficient workflow, developed in collaboration with Children’s Mercy Kansas City, which provides a scalable solution for sequencing 100s to 1000s of whole human genomes per year on the Sequel II and Sequel IIe Systems.
Learn how PacBio highly accurate long reads enable an improved approach to whole genome sequencing to understand the genetic origins of rare diseases.
With highly accurate long reads (HiFi reads) from the Sequel II or IIe Systems, you can comprehensively detect variants in 100s to 1000s of genomes in a year. HiFi reads provide high precision and recall for single nucleotide variants (SNVs), indels, structural variants (SVs), and copy number variants (CNVs), including in difficult-to-map repetitive regions.
Structural variants (genomic differences ≥50 base pairs) contribute to the evolution of organismal traits and to human disease. Most structural variants (SVs) are too small to detect with array comparative genomic hybridization but too large to reliably discover with short-read DNA sequencing. Recent studies in human genomes show that PacBio SMRT Sequencing sensitively detects structural variants.
De novo assembly is a large part of JGI’s analysis portfolio. Repetitive DNA sequences are abundant in a wide range of the organisms we sequence and pose a significant technical challenge for assembly. We are interested in long-read technologies capable of spanning genomic repeats to produce better assemblies. We currently have three RS II and two Sequel PacBio machines. The RS II machines are primarily used for fungal and microbial genome assembly as well as synthetic biology validation. Between microbes and fungi we produce hundreds of PacBio libraries a year, and for throughput reasons the vast majority of these are >10 kb AMPure libraries. Throughput for the RS II is about 1 Gb per SMRT Cell. This is ideal for microbial-sized genomes but can be costly and labor-intensive for larger projects, which require multiple cells. JGI was an early access site for Sequel and began testing with real samples in January 2016. Since then we have had the opportunity to sequence microbes, fungi, metagenomes, and plants. Here we present our experience over the last 18 months using the Sequel platform and provide comparisons with RS II results.
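To make the throughput point concrete, the following is a minimal sketch of the arithmetic behind "larger projects require multiple cells". Only the figure of about 1 Gb per RS II SMRT Cell comes from the text; the genome sizes and target coverages in the examples are hypothetical round numbers chosen for illustration, not JGI project values.

```python
import math

# From the text: RS II throughput is about 1 Gb per SMRT Cell.
RS_II_GB_PER_CELL = 1.0


def smrt_cells_needed(genome_size_bp, target_coverage,
                      yield_gb_per_cell=RS_II_GB_PER_CELL):
    """Estimate how many SMRT Cells are needed to reach a target coverage.

    genome_size_bp and target_coverage are supplied by the caller; the
    values used below are illustrative assumptions only.
    """
    required_bases = genome_size_bp * target_coverage
    bases_per_cell = yield_gb_per_cell * 1e9
    return math.ceil(required_bases / bases_per_cell)


# Hypothetical examples (sizes and coverages are assumptions):
microbe_cells = smrt_cells_needed(5_000_000, 100)    # 5 Mb genome at 100x
fungus_cells = smrt_cells_needed(40_000_000, 100)    # 40 Mb genome at 100x
plant_cells = smrt_cells_needed(500_000_000, 50)     # 500 Mb genome at 50x

print(microbe_cells, fungus_cells, plant_cells)
```

Under these assumptions a microbial genome fits in a single cell, while the larger genomes need several to many cells, which is the cost and labor concern the abstract describes.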
From RNA to full-length transcripts: The PacBio Iso-Seq method for transcriptome analysis and genome annotation
A single gene may encode a surprising number of proteins, each with a distinct biological function. This is especially true in complex eukaryotes. Short-read RNA sequencing (RNA-seq) works by physically shearing transcript isoforms into smaller pieces and bioinformatically reassembling them, leaving opportunity for misassembly or incomplete capture of the full diversity of isoforms from genes of interest. The PacBio Isoform Sequencing (Iso-Seq™) method employs long reads to sequence transcript isoforms from the 5’ end to their poly-A tails, eliminating the need for transcript reconstruction and inference. These long reads result in complete, unambiguous information about alternatively spliced exons, transcriptional start sites, and polyadenylation sites. This allows for the characterization of the full complement of isoforms within targeted genes, or across an entire transcriptome. Here we present improved genome annotations for two avian models of vocal learning, Anna’s hummingbird (Calypte anna) and zebra finch (Taeniopygia guttata), using the Iso-Seq method. We present graphical user interface and command line analysis workflows for the data sets. From brain total RNA, we characterize more than 15,000 isoforms in each species, 9% and 5% of which were previously unannotated in hummingbird and zebra finch, respectively. We highlight one example where capturing full-length transcripts identifies additional exons and UTRs.
Microbes play an important role in nearly every part of our world, as they affect human health, our environment, and agriculture, and aid in waste management. Complete closed genome sequences, which have become the gold standard with PacBio long-read sequencing, can be key to understanding microbial functional characteristics. However, input requirements, consumables costs, and the labor required to prepare and sequence a microbial genome have in the past put PacBio sequencing out of reach for some larger projects. We have developed a multiplexed library prep approach that is simple, fast, and cost-effective, and can produce 4 to 16 closed bacterial genomes from one Sequel SMRT Cell. Additionally, we are introducing a streamlined analysis pipeline for processing multiplexed genome sequence data through de novo HGAP assembly, making the entire process easy for lab personnel to perform. Here we present the entire workflow from shearing through assembly, with times for each step. We show HGAP assembly results with single or very few contigs for bacteria with genomes of different sizes, sequenced with or without size selection. These data illustrate the benefits and potential of the PacBio multiplexed library prep and the Sequel System for sequencing large numbers of microbial genomes.
Structural variants (SVs), defined as genomic differences ≥50 base pairs, are few by count compared to single nucleotide variants (SNVs) and indels but include most of the base pairs that differ between two humans.
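The size-based definition above, and the contrast between counting variants and counting affected base pairs, can be sketched in a few lines of Python. This is an illustrative simplification, not a PacBio tool: the function names are hypothetical, variants are reduced to REF/ALT allele strings, balanced events such as inversions are not modeled, and the toy variant list is invented purely to show the tallying.

```python
from collections import Counter

# From the text: structural variants are differences of 50 bp or more.
SV_MIN_BP = 50


def classify(ref, alt):
    """Classify a variant by allele lengths (simplified, hypothetical helper)."""
    if len(ref) == 1 and len(alt) == 1:
        return "SNV"
    size = abs(len(alt) - len(ref))
    if size >= SV_MIN_BP:
        return "SV"
    if size > 0:
        return "indel"
    return "other"  # equal-length substitutions; not handled further here


def affected_bp(ref, alt):
    """Bases gained or lost; for equal-length alleles, the allele length."""
    return abs(len(alt) - len(ref)) or len(ref)


def summarize(variants):
    """Return (count per class, affected base pairs per class)."""
    counts, bases = Counter(), Counter()
    for ref, alt in variants:
        kind = classify(ref, alt)
        counts[kind] += 1
        bases[kind] += affected_bp(ref, alt)
    return counts, bases


# Invented toy data: five SNVs, two small indels, one 300 bp deletion.
TOY_VARIANTS = (
    [("A", "G")] * 5
    + [("A", "ACGT"), ("ACGTACGTACG", "A")]
    + [("A" + "C" * 300, "A")]
)

counts, bases = summarize(TOY_VARIANTS)
print(dict(counts))
print(dict(bases))
```

In this toy set the single SV is the rarest class by count yet accounts for most of the affected base pairs, mirroring the pattern the text describes for human genomes.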