This webinar highlights global initiatives currently underway to use Single Molecule, Real-Time (SMRT) Sequencing to de novo assemble genomes of individuals representing multiple ethnic populations, thereby extending the diversity of available human reference genomes. In their presentations, Tina Graves-Lindsay from Washington University and Adam Ameur from Uppsala University spoke about diploid assemblies, discovering novel sequence, and improving the diversity of the current human reference genome. Finally, Paul Peluso of PacBio presented data from the recent effort to sequence a Puerto Rican genome and shared a SMRT Sequencing technology roadmap showing the next several upgrades for the Sequel System.
The lack of diversity in genomic data has been an issue of growing concern. It threatens to limit the benefits from the massive investment that has been made to date to transform biomedical research, drug development, and the clinical care of patients. We spoke to Jonas Korlach, chief scientific officer of Pacific Biosciences, about the problem, how it’s being addressed, and the role advancing technology can play in gleaning greater insights from the genomes that are analyzed.
Tina Graves-Lindsay from the McDonnell Genome Institute reports at AGBT 2020 on how her team is using PacBio sequencing to produce reference-grade human genome assemblies. With highly accurate HiFi reads, no error correction step is needed during the sequencing and analysis process, and they can produce reference-grade assemblies with half the sequence coverage needed before. They are now generating diploid assemblies and will be contributing to the human pangenome reference project.
A recent study on human structural variation indicates insufficiencies and errors in the human reference genome, GRCh38, and argues for the construction of a human pan-genome.
In recent genome analyses, population-specific reference panels have proven important. However, reference panels based on short-read sequencing data do not sufficiently cover long insertions. Therefore, the nature of long insertions has not been well documented. Here, we assembled a Japanese genome using single-molecule real-time sequencing data and characterized insertions found in the assembled genome. We identified 3691 insertions ranging from 100 bp to ~10,000 bp in the assembled genome relative to the international reference sequence (GRCh38). To validate and characterize these insertions, we mapped short reads from 1070 Japanese individuals and 728 individuals from eight other populations to insertions integrated into GRCh38. With…