Learn how Single Molecule, Real-Time (SMRT) Sequencing and the Sequel IIe System will accelerate your research by delivering highly accurate long reads to provide the most comprehensive view of genomes, transcriptomes and epigenomes.
Dan Geraghty explains that despite decades’ worth of studies on the genetics of the major histocompatibility complex (MHC) and the highly polymorphic HLA class 1 and 2 genes, we still haven’t found the key mutations for a variety of autoimmune diseases such as type 1 diabetes, rheumatoid arthritis, multiple sclerosis, and others. Extensive linkage disequilibrium in these regions is one obstacle, as is the difficulty of obtaining phased information, so longer stretches of sequence are needed. Recently Geraghty has begun using SMRT Technology with hopes of drilling down to the causal genetics.
In this webinar, the presenters describe a targeted sequencing workflow that combines Roche NimbleGen’s SeqCap EZ enrichment technology with PacBio’s SMRT Sequencing to provide a more comprehensive view of variants and haplotype information over multi-kilobase, contiguous regions. They demonstrate that 6 kb fragments can also be utilized to enrich for long fragments that extend beyond the targeted capture site and well into (and often across) the adjacent intronic regions. When combined with SMRT Sequencing, multi-kilobase genomic regions can be phased and variants, including complex structural variants, can be detected in exons, introns and intergenic regions.
Euan Ashley from Stanford University started with the premise that while current efforts in the field of genomic medicine address 30% of patient cases, there’s a need for new approaches to make sense of the remaining 70%. Toward that end, he said that accurately calling structural variants is a major need. In one translational research example, Ashley said that SMRT Sequencing with the Sequel System allowed his team to identify six potentially causative genes in an individual with complex and varied symptoms; one gene was associated with Carney syndrome, which was a match for the person’s physiology and was later…
This tutorial provides an overview of the Long Amplicon Analysis (LAA) application. The LAA algorithm generates highly accurate, phased and full-length consensus sequences from long amplicons. Applications of LAA include HLA typing, alternative haplotyping, and localized de novo assemblies of targeted genes. This tutorial covers features of SMRT Link v5.0.0.
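To make the idea of phased consensus concrete, the following is a minimal toy sketch and not the LAA implementation in SMRT Link. It assumes the amplicon reads are already aligned and of equal length, treats any column with a sufficiently frequent second base as a variant site, groups reads by their bases at those sites, and takes a majority-vote consensus per group. The function names and the `min_minor_fraction` threshold are hypothetical choices made for illustration.

```python
from collections import Counter, defaultdict


def column_consensus(reads):
    """Majority base at each column of equal-length, pre-aligned reads."""
    return "".join(Counter(col).most_common(1)[0][0] for col in zip(*reads))


def variable_columns(reads, min_minor_fraction=0.2):
    """Columns whose second most common base is frequent enough to look
    like a real variant rather than random sequencing error (assumed
    threshold, purely illustrative)."""
    cols = []
    for i, col in enumerate(zip(*reads)):
        top_two = Counter(col).most_common(2)
        if len(top_two) == 2 and top_two[1][1] / len(reads) >= min_minor_fraction:
            cols.append(i)
    return cols


def phase_and_call(reads, min_minor_fraction=0.2):
    """Group reads by their bases at the variable columns, then return
    (consensus, read_count) per group, most abundant group first."""
    cols = variable_columns(reads, min_minor_fraction)
    groups = defaultdict(list)
    for read in reads:
        key = "".join(read[i] for i in cols)
        groups[key].append(read)
    calls = [(column_consensus(g), len(g)) for g in groups.values()]
    return sorted(calls, key=lambda call: -call[1])
```

In this sketch, a sequencing error at a non-variant column is voted out within its group, while an error at a variant column would split off a spurious group; real tools handle this with proper clustering and quality models.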
In a talk at AGBT 2017, Histogenetics CEO Nezih Cereb reported on how SMRT Sequencing is allowing his team to produce full-length, phased sequences for HLA alleles, which are important for matching organ transplants to recipients. The company is typing thousands of samples per day on their PacBio RS II systems and their new Sequel System. Cereb noted that SMRT Sequencing is unique in its ability to reliably phase mutations in the HLA alleles without imputation. Cereb concluded with his plans to use this approach for other complex regions, such as KIR, and announced their continued increasing HLA typing capacity…
In this ASHG workshop presentation, Stuart Scott of the Icahn School of Medicine at Mount Sinai presented on using the PacBio system for amplicon sequencing in pharmacogenomics and clinical genomics workflows. Accurate, phased amplicon sequence for the CYP2D6 gene, for example, has allowed his team to reclassify up to 20% of samples, providing data that’s critical for drug metabolism and dosing. In clinical genomics, Scott presented several case studies illustrating the utility of highly accurate, long-read sequencing for assessing copy number variants and for confirming a suspected medical diagnosis in rare disease patients. He noted that the latest Sequel System…
In this presentation, Elizabeth Tseng explains how PacBio’s full-length RNA Sequencing using the Iso-Seq method can characterize full-length transcripts without the need for computational transcript assembly. The Iso-Seq method is fully supported bioinformatically through PacBio’s SMRT Analysis software, which outputs high-quality, full-length transcript sequences that can be used for genome annotation and novel gene discovery. Elizabeth shows that the highly accurate reads can be used to discover allele-specific isoform expression in transcriptome data.
In this presentation at PAG 2020, Bart Nijland of Genetwister Technologies explains how his team set out to make a haplotype-aware assembly of the highly complex tetraploid Rosa x hybrida L. genome in order to capture its full range of genetic variation. HiFi reads generated from PacBio’s Sequel II System have made it possible to parse out critical information from many of the plant’s parental genes.
An important need in analyzing complex genomes is the ability to separate and phase haplotypes. While whole genome assembly can deliver this information, it cannot reveal whether there is allele-specific gene or isoform expression. The PacBio Iso-Seq method, which can produce high-quality transcript sequences of 10 kb and longer, has been used to annotate many important plant and animal genomes. We present an algorithm called IsoPhase that post-processes Iso-Seq data for transcript-based haplotyping. We applied IsoPhase to a maize Iso-Seq dataset consisting of two homozygous parents and two F1 cross hybrids. We validated the majority of the SNPs called with…
Background: The sequencing and haplotype phasing of entire gene sequences improves the understanding of the genetic basis of disease and drug response. One example is cystic fibrosis (CF). Cystic fibrosis transmembrane conductance regulator (CFTR) modulator therapies have revolutionized CF treatment, but only in a minority of CF subjects. Observed heterogeneity in CFTR modulator efficacy is related to the range of CFTR mutations; revertant mutations can modify the response to CFTR modulators, and other intronic variations in the ~200 kb CFTR gene have been linked to disease severity. Heterogeneity in the CFTR gene may also be linked to differential responses to…
We show that linearizing and directly sequencing full-length fosmids simplifies the assembly problem such that it is possible to unambiguously assemble individual haplotypes for the highly repetitive 100-200 kb killer Ig-like receptor (KIR) gene loci of chromosome 19. A tiling of targeted fosmids can be used to clone extended lengths of genomic DNA, 100s of kb in length, but repeat complexity in regions of particular interest, such as the KIR locus, means that sequence assembly of pooled samples into complete haplotypes is difficult and in many cases impossible. The current maximum read length generated by SMRT Sequencing exceeds the length…
As a cost-effective alternative to whole genome human sequencing, targeted sequencing of specific regions, such as exomes or panels of relevant genes, has become increasingly common. These methods typically include direct PCR amplification of the genomic DNA of interest, or the capture of these targets via probe-based hybridization. Commonly, these approaches are designed to amplify or capture exonic regions and thereby result in amplicons or fragments that are a few hundred base pairs in length, a length that is well-addressed with short-read sequencing technologies. These approaches typically provide very good coverage and can identify SNPs in the targeted region, but…
While the identification of individual SNPs has been readily available for some time, the ability to accurately phase SNPs and structural variation across a haplotype has been a challenge. With individual reads of an average length of 9 kb (P5-C3), and individual reads beyond 30 kb in length, SMRT Sequencing technology allows the identification of mutation combinations such as microdeletions, insertions, and substitutions without any predetermined reference sequence. Long-amplicon analysis is a novel protocol that identifies and reports the abundance of differing clusters of sequencing reads within a single library. Graphs generated via hierarchical clustering of individual sequencing reads…
One of the major applications of DNA sequencing technology is to bring together information that is distant in sequence space so that understanding genome structure and function becomes easier on a large scale. The Single Molecule, Real-Time (SMRT) Sequencing platform provides direct sequencing data that can span several thousand bases to tens of thousands of bases in a high-throughput fashion. In contrast to solving genomic puzzles by patching together smaller pieces of information, long sequence reads can decrease potential computational complexity by reducing combinatorial factors significantly. We demonstrate algorithmic approaches to construct accurate consensus when the differences between reads…