In this PacBio User Group Meeting presentation, Tina Graves-Lindsay of the McDonnell Genome Institute and the Genome Reference Consortium speaks about the importance of phasing human reference genomes. Her team is now working on its fifteenth human genome assembly — part of a major effort to improve genomic representation of ethnic diversity — with a pipeline that generates 60-fold PacBio coverage for a de novo assembly, followed by scaffolding with other technologies. They are also using FALCON-Unzip to separate haplotypes, leading to reference-grade diploid assemblies. This approach has already helped resolve errors seen in other genomes and even the gold-standard…
In this PacBio User Group Meeting presentation, Zev Kronenberg of PacBio describes combining PacBio and Phase Genomics data and analysis tools to create highly contiguous genome assemblies.
Jonas Korlach, Chief Scientific Officer at PacBio, discussed the technology waves that have followed the initial human genome sequencing project, where we are today, and where we are going. Today, we are in what Korlach calls the 4th wave, where more comprehensive whole-genome re-sequencing is occurring, and we are nearing the 5th, when we will actually be able to free ourselves from reference genomes and sequence everything de novo.
Yunfei Guo, from the University of Southern California, presents his ASHG 2015 poster on a de novo assembly of a diploid Asian genome. The uniform coverage of long-read sequencing helped access regions previously unresolvable due to high GC bias or long repeats. The assembly allowed scientists to fill some 400 gaps in the latest human reference genome, including some as long as 50 kb.
Yunfei Guo, a grad student at the University of Southern California, discusses the benefits of SMRT Sequencing: very long reads that make it possible to resolve long repetitive regions and discover structural variants, and a random error mode that allows for extremely high accuracy.
During this presentation from ASHG 2015, Maria Nattestad of Cold Spring Harbor Laboratory described the study of a Her2-amplified breast cancer cell line using long-read sequencing from PacBio. With reads as long as 71 kb, she was able to characterize extensive and complex rearrangements and found more than 11,000 structural variants. She also used the Iso-Seq method to find gene fusions, including some novel ones.
Jeong-Sun Seo of Macrogen and Seoul National University College of Medicine reports on sequencing many Asian genomes to better understand genetic variation in that population. He shows that identifying certain structural variants may explain diseases that disproportionately affect Asian people.
In his talk from the AGBT 2015 PacBio workshop, Craig Venter detailed plans to sequence 1 million genomes and gather extensive phenotypic data to make sense of them. Included: generating 30 reference genomes to represent ethnogeographic diversity; the need for long-range continuity in sequencing; and truly predictive genomics.
Yuta Suzuki from the University of Tokyo presents his AGBT poster on heterozygotic DNA methylation patterns. He used kinetic data from SMRT Sequencing to generate epigenetic information on samples ranging from human to medaka fish and was able to analyze haplotype-specific methylation data. He also shows that long reads are better able to capture data about CpG islands than short-read sequences.
Jason Chin, senior director of bioinformatics at PacBio, talks about using long-read sequence data and string graph assembly for assembling diploid genomes. A major challenge in diploid genome assembly is distinguishing homologous regions from repeats, so he discusses how long reads are essential for resolving repeat regions. In the presentation, Chin displays data from two inbred Arabidopsis strains used to create a synthetic diploid assembly.
In this AGBT virtual poster video, Jason Chin, a bioinformatician at PacBio, describes a polyploidy-aware de novo assembly approach called FALCON and a new algorithm, dubbed FALCON-Unzip, that involves “unzipping” diploid genomes for de novo haplotype reconstructions from SMRT Sequencing data. These methods are illustrated in studies of fungal, Arabidopsis, and human datasets for the resolution of structural variation and characterization of haplotypes.
Swati Ranade from PacBio presents her AGBT poster demonstrating the use of SMRT Sequencing to characterize complex immune regions from human, macaque, and hummingbird. Included: a de novo assembly of complete KIR haplotypes, the MHC region, and MHC alleles.
Jason Chin, senior director of bioinformatics at PacBio, talks about using long-read sequence data to generate diploid genome assemblies that yield comprehensive haplotype sequence reconstructions. In the presentation, Chin describes the FALCON-Unzip process, which combines SNP phasing with the assembly process and allows for determination of the haplotype sequences and identification of structural variants. He presents an example of diploid assembly from inbred Arabidopsis strains.
Brett Hannigan, Computational Biology Project Leader at DNAnexus, demonstrates a fast, accurate, and cost-efficient solution for diploid-aware de novo genome assembly utilizing FALCON on the DNAnexus platform.