Mark Gerstein is the co-director of the Yale Computational Biology and Bioinformatics program, where he focuses on better annotation of the human genome and better ways to mine big genomics data. He has played a major role in several of the large genomics initiatives since the original Human Genome Project, including ENCODE and the 1000 Genomes Project. “I’m very enthusiastic, of course, about the thousand dollar genome, but I don’t think that a true human genome has arrived for a thousand dollars,” Mark says at the outset of this Mendelspod interview. “The great excitement of next generation sequencing, which is deserved, has…
Marc Salit is the leader of the Genome Scale Measurement Group at the National Institute of Standards and Technology (NIST). In this Mendelspod podcast, he explains how NIST played a pivotal, foundational role in enabling the “Century of Physics.” Now Marc and NIST are looking for the right set of standards to enable the already-upon-us “Century of Biology.” The human reference genome is an example of a standard that Marc and his team are developing. Currently they are piloting what they call “Genome in a Bottle,” a physical reference standard against which all other human genomes can be measured.…
By 2050, there will be 9 billion people on the planet. What will they eat? This is the question that led Rod Wing, Director of the Arizona Genomics Institute, into the field of plant genomics. What has been accomplished so far in the mission to come up with some super green crops? And how does Rod see anti-GMO sentiment and the recent trend toward gluten-free diets factoring in? After answering these questions, he discusses which sequencing instruments he has used for plant work. Unsurprisingly, Rod prefers the PacBio long reads even though the cost is…
One of the popular questions on the Mendelspod program is how those doing sequencing decide between the quality of PacBio’s long reads and cheaper short-read technology, such as that of Illumina or Thermo Fisher. Steve Marsh, the Director of Bioinformatics at the Anthony Nolan Research Institute in London, provides the clearest and most dramatic answer yet: use the PacBio system exclusively. Established in 1974 by the mother of a boy with a rare blood disease, the Anthony Nolan Institute is a world leader in blood crossmatching and donor/patient registries. Steve and his team at the Institute have dramatically…
Jim Lupski is a professor at Baylor College of Medicine, where he’s on the frontline of incorporating genomic research into everyday clinical practice. The story begins with Jim’s own genome, which is perhaps the most sequenced genome ever. Jim’s life as a leading genomic researcher has been driven in part by a strong personal reason. He has a rare genetic disease named after the three researchers who first defined it, Charcot-Marie-Tooth neuropathy. What began as a personal journey to uncover the source of his own disease led Jim to seminal work that launched the field of structural variation. Working…
In this webinar, Jonas Korlach, Chief Scientific Officer, PacBio, provides an overview of the features and advantages of the new Sequel II System. Kiran Garimella, Senior Computational Scientist, Broad Institute of MIT and Harvard University, describes his work sequencing humans with HiFi reads, enabling discovery of structural variants undetectable in short reads. Luke Tallon, Scientific Director, Genomics Resource Center, Institute for Genome Sciences, University of Maryland School of Medicine, covers the GRC’s work on bacterial multiplexing, 16S microbiome profiling, and shotgun metagenomics. Finally, Shane McCarthy, Senior Research Associate, University of Cambridge, focuses on the scaling and affordability of high-quality…
Dr. Wenger gives attendees an update on PacBio’s long-read sequencing and variant detection capabilities on the Sequel II System and shares recommendations on how to design your own study using HiFi reads. Then, Dr. Sund from Cincinnati Children’s Hospital Medical Center describes how she has used long-read sequencing to solve rare neurological diseases involving complex structural rearrangements that were previously unsolved with standard methods.
Recent improvements in sequencing chemistry and instrument performance combine to create a new PacBio data type, Single Molecule High-Fidelity reads (HiFi reads). Increased read length and improvements in library construction enable average read lengths of 10-20 kb with average sequence identity greater than 99% from raw single molecule reads. The resulting reads have accuracy comparable to short-read NGS but with 50-100 times longer read length. Here we benchmark the performance of this data type by sequencing and genotyping the Genome in a Bottle (GIAB) HG002 human reference sample from the National Institute of Standards and Technology (NIST). We…
In 2012, NIST convened the Genome in a Bottle Consortium to develop the metrology infrastructure needed to enable confidence in human whole genome variant calls.
2015 SMRT Informatics Developers Conference Presentation Slides: Adam English, from the Human Genome Sequencing Center at Baylor College of Medicine, presents on the structural variation tools being developed at Baylor.
2015 SMRT Informatics Developers Conference Presentation Slides: Kevin Corcoran of PacBio provided a brief review of community involvement in the development of analysis tools and showed a preview of upcoming sample preparation, chemistry and informatics improvements.
Purpose: Clinical laboratories, research laboratories and technology developers all need DNA samples with reliably known genotypes in order to help validate and improve their methods. The Genome in a Bottle Consortium (genomeinabottle.org) has been developing Reference Materials with high-accuracy whole genome sequences to support these efforts. Methodology: Our pilot reference material is based on Coriell sample NA12878 and was released in May 2015 as NIST RM 8398 (tinyurl.com/giabpilot). To minimize bias and improve accuracy, 11 whole-genome and 3 exome data sets produced using 5 different technologies were integrated using a systematic arbitration method [1]. The Genome in a Bottle Analysis Group…
The Genome in a Bottle Consortium is developing the reference materials, reference methods, and reference data…
In recent years, human genomic research has focused on comparing short-read data sets to a single human reference genome. However, it is becoming increasingly clear that significant structural variations present in individual human genomes are missed or ignored by this approach. Additionally, remapping short-read data limits the phasing of variation among individual chromosomes. This reduces the newly sequenced genome to a table of single nucleotide polymorphisms (SNPs) with little to no information as to the co-linearity (phasing) of these variants, resulting in a “mosaic” reference representing neither of the parental chromosomes. The variation between the homologous chromosomes is lost in…
The widespread occurrence of repetitive stretches of DNA in genomes of organisms across the tree of life imposes fundamental challenges for sequencing, genome assembly, and automated annotation of genes and proteins. This multi-level problem can lead to errors in genome and protein databases that are often not recognized or acknowledged. As a consequence, end users working with sequences with repetitive regions are faced with ‘ready-to-use’ deposited data whose trustworthiness is difficult to determine, let alone to quantify. Here, we provide a review of the problems associated with tandem repeat sequences that originate from different stages during the sequencing-assembly-annotation-deposition workflow, and…