Part IV of The New Biology documentary. This documentary film features the wave of cutting-edge technologies that now provide the opportunity to create predictive models of living systems, and gain wisdom about the fundamental nature of life itself. The potential impact for humanity is immense: from fighting complex diseases such as cancer to enabling proactive surveillance of virulent pathogens and increasing food crop production.
Kim Worley from Baylor’s Human Genome Sequencing Center describes the improvement of the sooty mangabey primate genome. Sooty mangabey is a model organism for HIV research, since this particular primate can be infected with the immunodeficiency virus and never develop any symptoms. Worley and her team used PacBio long reads in conjunction with their own assembly tool, PBJelly, closing 64% and improving another 19% of the gaps.
Judson Ward, principal scientist at Driscoll’s Strawberries in California, introduces a genome assembly for Potentilla micrantha, which is closely related to strawberry but lacks fleshy ‘fruits’ or berries. Comparative genomics between P. micrantha and strawberry will yield significant information regarding the genetic mechanisms controlling fruit development. Using SMRT Sequencing, Driscoll’s sequenced the 240 Mb P. micrantha genome and produced a draft genome assembly spanning the majority of the predicted sequence length. A comparison of sequence data produced using the Illumina HiSeq2000 and the PacBio RS platform demonstrated that PacBio sequencing produced a significantly longer N50 contig size and permitted a more complete genome…
Simon Chan of UC Davis describes how PacBio long-read sequencing revealed higher-order repeats in the centromeres of switchgrass that would have remained hidden with the much shorter Sanger reads.
In this presentation, Greg Harhay from the USDA offers data on pathogens involved in bovine respiratory disease complex, known as “shipping fever.” His team used PacBio sequencing to analyze several isolates from two different pathogens, looking at their DNA sequence and methylation patterns.
From USDA’s Agricultural Research Service, molecular biologist Sean Gordon discusses the need for long-read sequencing to map an organism’s transcriptome. His team analyzed the wood-decaying fungus Plicaturopsis crispa first with short reads and found that they were missing exons and other important information. They switched to SMRT Sequencing so they could observe, rather than infer, full-length transcripts.
Chongyuan Luo from the Salk Institute for Biological Studies describes sequencing three strains of Arabidopsis thaliana using PacBio technology. The goal: uncover structural variants that have been missed by short-read and other sequencers. Luo notes that PacBio sequencing provides highly accurate SNP detection and also extends the mappability of reads beyond what is possible with short-read data, producing better and more accurate assemblies.
Allen Van Deynze from UC Davis presents the genome sequencing and assembly project for spinach, an organism with a 980 Mb genome. Results indicate a high-accuracy assembly with significantly higher N50 contig length than a previous short-read assembly. The PacBio assembly has also made it possible to fill gaps in the prior assembly.
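Several of these talks compare assemblies by N50 contig length. As a rough illustration of what that statistic measures, here is a minimal Python sketch using the common definition: the length of the contig at which the running total, counted from the longest contig down, first reaches half of the total assembly length. The function name and the contig lengths are hypothetical and are not data from any of the presentations.

```python
def n50(lengths):
    """Return the N50 of a collection of contig lengths.

    Contigs are visited from longest to shortest; the N50 is the length
    of the contig at which the running total first reaches at least
    half of the summed length of all contigs.
    """
    if not lengths:
        raise ValueError("need at least one contig length")
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length


# Hypothetical contig lengths (arbitrary units), for illustration only.
fragmented = [50, 40, 30, 30, 20, 20, 10]  # many short contigs
contiguous = [120, 80]                     # same total length, fewer contigs

print(n50(fragmented))
print(n50(contiguous))
```

Both toy assemblies cover the same total length, but the one built from fewer, longer contigs has the higher N50, which is why the statistic is used as a shorthand for assembly contiguity.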
Dick McCombie from Cold Spring Harbor Laboratory describes de novo sequencing of several organisms, including yeast, Arabidopsis, and rice. With SMRT Sequencing, structural differences are preserved and full chromosomes can assemble into single contigs. Longest read observed: 54 kb.
Shane Brubaker from renewable oil manufacturer Solazyme reports using the PacBio system to sequence the genome of a GC-rich strain of algae that couldn’t be fully assembled with short-read sequence data. He notes that CCS reads exceed Sanger quality at significantly lower cost.
PacBio CSO Jonas Korlach describes the Iso-Seq method for full-length transcript isoform characterization using SMRT Sequencing. He presents published research using the method for full isoform characterization, including papers from Stanford scientists who analyzed full transcriptomes with SMRT Sequencing. With the Iso-Seq method, researchers found novel isoforms and novel genes even in well-studied cell lines.
This seminar features great hands-on information and best practices for analyzing SMRT Sequencing data for eukaryotic genome assembly. Michael Schatz surveys the assembly tools, recommends when to use each one, and discusses the challenges of short-read assemblies. James Gurtowski gives an in-depth overview of hybrid assembly methods, where short-read data are used to correct errors in longer reads. Finally, Sergey Koren presents on chromosome-scale assembly, including the MinHash Alignment Process (MHAP) he developed to dramatically reduce the computational processing power required for genome assemblies.
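For readers unfamiliar with MinHash, the general idea can be sketched in a few lines of Python. This is not MHAP itself and does not reproduce its parameters or implementation; it only illustrates the underlying technique of summarizing each sequence's k-mers as a small sketch and estimating set similarity from the sketches instead of comparing sequences directly. The function names, the k-mer size, the number of hash functions, and the toy sequences are all assumptions made for illustration.

```python
import hashlib


def kmers(seq, k):
    """Return the set of all length-k substrings of seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def _hash(kmer, seed):
    """Deterministic 64-bit hash of a k-mer; the seed selects the hash function."""
    data = f"{seed}:{kmer}".encode()
    return int.from_bytes(hashlib.blake2b(data, digest_size=8).digest(), "big")


def minhash_sketch(seq, k=4, num_hashes=64):
    """Keep, for each seeded hash function, the minimum hash over all k-mers."""
    ks = kmers(seq, k)
    if not ks:
        raise ValueError("sequence is shorter than k")
    return [min(_hash(km, seed) for km in ks) for seed in range(num_hashes)]


def estimate_jaccard(sketch_a, sketch_b):
    """Fraction of matching sketch positions, an estimate of Jaccard similarity."""
    if len(sketch_a) != len(sketch_b):
        raise ValueError("sketches must have the same length")
    matches = sum(a == b for a, b in zip(sketch_a, sketch_b))
    return matches / len(sketch_a)


# Toy sequences, for illustration only.
read_a = "ACGTACGTTAGCCGATAGGCTTAACG"
read_b = "ACGTACGTTAGCCGATAGGCTTAACG"
read_c = "TTTTTTTTTTTTTTTTTTTTTTTTTT"

print(estimate_jaccard(minhash_sketch(read_a), minhash_sketch(read_b)))
print(estimate_jaccard(minhash_sketch(read_a), minhash_sketch(read_c)))
```

Because each sketch has a fixed size regardless of sequence length, comparing two sketches is cheap, which is the general reason this family of methods can reduce the cost of finding candidate overlaps between long reads.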
Swati Ranade from PacBio presents recent efforts to look at challenging regions of the human genome using SMRT Sequencing. She highlights a study just published that fully sequences a particular mucin gene for the first time, as well as work on KIR haplotypes in humans and other primates.
Jason Chin, senior director of bioinformatics at PacBio, talks about using long-read sequence data and string graph assembly for assembling diploid genomes. A major challenge for diploid genome assembly is in distinguishing homologous regions from repeats, so he discusses how long reads are essential for resolving repeat regions. In the presentation, Chin displays data from two inbred Arabidopsis strains used to create a synthetic diploid assembly.