Haplotype-resolved genomes are important for understanding how combinations of variants impact phenotypes. Studies of disease, quantitative traits, forensics, and organ donor matching are aided by phased genomes. Phase is commonly resolved by using familial data or population-based imputation, or by isolating and sequencing single haplotypes using fosmids, BACs, or haploid tissues. Because these methods can be prohibitively expensive, or samples may not be available, alternative approaches are required. De novo genome assembly with PacBio Single Molecule, Real-Time (SMRT) data produces highly contiguous, accurate assemblies. For non-inbred samples, including humans, the separate resolution of haplotypes results in higher base accuracy and more…
Yunfei Guo, from the University of Southern California, presents his ASHG 2015 poster on a de novo assembly of a diploid Asian genome. The uniform coverage of long-read sequencing helped access regions previously unresolvable due to high GC bias or long repeats. The assembly allowed scientists to fill some 400 gaps in the latest human reference genome, including some as long as 50 kb.
Brett Hannigan, Computational Biology Project Leader at DNAnexus, demonstrates a fast, accurate, and cost-efficient solution for diploid-aware de novo genome assembly utilizing FALCON on the DNAnexus platform.
PacBio Sequencing is characterized by very long sequence reads (averaging > 10,000 bases), lack of GC-bias, and high consensus accuracy. These features have allowed the method to provide a new gold standard in de novo genome assemblies, producing highly contiguous (contig N50 > 1 Mb) and accurate (> QV 50) genome assemblies. We will briefly describe the technology and then highlight the full workflow, from sample preparation through sequencing to data analysis, using examples of insect genome assemblies. We will also illustrate the difference these high-quality genomes make to biological insights, compared to fragmented draft assemblies generated by short-read sequencing.
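For readers unfamiliar with the two metrics quoted above, the following is a minimal sketch of how contig N50 and a Phred-scaled quality value (QV) are conventionally computed. It is not taken from the presentation; the function names and the contig lengths in the usage note are hypothetical and chosen only for illustration.

```python
import math


def n50(contig_lengths):
    """Return the contig N50.

    N50 is the length L such that contigs of length >= L together
    account for at least half of the total assembly length.
    """
    if not contig_lengths:
        raise ValueError("need at least one contig length")
    total = sum(contig_lengths)
    running = 0
    # Walk contigs from longest to shortest until half the assembly is covered.
    for length in sorted(contig_lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length


def qv_to_error_rate(qv):
    """Convert a Phred-scaled quality value to a per-base error probability."""
    return 10 ** (-qv / 10)


def error_rate_to_qv(error_rate):
    """Convert a per-base error probability to a Phred-scaled quality value."""
    return -10 * math.log10(error_rate)
```

Under these definitions, QV 50 corresponds to roughly one expected error per 100,000 bases, and a hypothetical assembly with contig lengths `[2, 2, 2, 3, 3, 4, 8, 8]` (arbitrary units) has an N50 of 8, because the two longest contigs already cover half of the total length.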
Jonas Korlach spoke about recent SMRT Sequencing updates, such as the latest Sequel System chemistry release (1.2.1) and updates to the Integrative Genomics Viewer, which is now optimized for PacBio data. He presented the recent data release of structural variation detected in the NA12878 genome, including many more insertions and deletions than short-read-based technologies were able to find.
At AGBT 2017, Mike Schatz from Johns Hopkins University and Cold Spring Harbor Laboratory presented data from sequencing, assembling, and analyzing personalized, phased diploid genomes with Illumina, 10x Genomics, or PacBio SMRT Sequencing. Compared to the short-read-based methods, PacBio data assembled into large, complete contigs and contained the broadest range of structural variants with the best resolution. Plus: unexpected translocation findings with SMRT Sequencing, validated in follow-up studies.
The goal of this session is to help users complete their PacBio genome assembly and generate the best resource for their research. Kingan begins with a brief review of the diploid assembly process used by FALCON and FALCON-Unzip, highlighting the enhanced phasing of the Unzip module, and concluding with recommendations for genome polishing. Next, she explores how heterozygosity can influence the assembly process and how read coverage depth along the assembly can reveal important characteristics of assembly structure. Kingan then recommends approaches, including specific tools, that can be used to quality filter and curate the assembly, including annotation-, coverage-, and…
SMRT Sequencing is a DNA sequencing technology characterized by long read lengths and high consensus accuracy, regardless of the sequence complexity or GC content of the DNA sample. These characteristics can be harnessed to address medically relevant genes, mRNA transcripts, and other genomic features that are otherwise difficult or impossible to resolve. I will describe examples of such new clinical research in diverse areas, including full-length gene sequencing with allelic haplotype phasing, gene/pseudogene discrimination, sequencing extreme DNA contexts, high-resolution pharmacogenomics, biomarker discovery, structural variant resolution, full-length mRNA isoform cataloging, and direct methylation detection.
This webinar, presented by Nisha Pillai, provides an overview of bioinformatics approaches for PacBio Single Molecule, Real-Time (SMRT) Sequencing data and discusses the whole genome sequencing application, including assembly workflow designs, an overview of analysis tools for de novo assembly of SMRT Sequencing data (HGAP4, FALCON & FALCON-Unzip), and finally best practices and case studies.
This webinar highlights global initiatives currently underway to use Single Molecule, Real-Time (SMRT) Sequencing to de novo assemble genomes of individuals representing multiple ethnic populations, thereby extending the diversity of available human reference genomes. In their presentations, Tina Graves-Lindsay from Washington University and Adam Ameur from Uppsala University spoke about diploid assemblies, discovering novel sequence and improving diversity of the current human reference genome. Finally, Paul Peluso of PacBio presented data from the recent effort to sequence a Puerto Rican genome and shared a SMRT Sequencing technology roadmap showing the next several upgrades for the Sequel System.
In this PacBio User Group Meeting presentation, Tina Graves-Lindsay of the McDonnell Genome Institute and the Genome Reference Consortium speaks about the importance of phasing human reference genomes. Her team is now working on its fifteenth human genome assembly — part of a major effort to improve genomic representation of ethnic diversity — with a pipeline that generates 60-fold PacBio coverage for a de novo assembly, followed by scaffolding with other technologies. They are also using FALCON-Unzip to separate haplotypes, leading to reference-grade diploid assemblies. This approach has already helped resolve errors seen in other genomes and even the gold-standard…
In this PacBio User Group Meeting presentation, Zev Kronenberg of PacBio presents on using the combination of PacBio and Phase Genomics data and analysis tools to create highly contiguous genome assemblies.