‘The Quality of PacBio Data Is Beyond Compare’: Eric Schadt on Applications of SMRT Sequencing to Human Genetics
Monday, October 6, 2014
As part of its continuing series on long-read sequencing, last week Mendelspod aired an engaging interview with Eric Schadt, Professor & Chair of Genetics and Genomic Sciences, and Director of the Icahn Institute for Genomics and Multiscale Biology at Mount Sinai.
Now three years into his role at the groundbreaking institute, Schadt reports that it is making great progress in the quest to build data-driven health profiles of individuals that may better guide healthcare choices.
On short-read versus long-read sequencing
Short-read sequencing technologies still maintain the advantage in terms of throughput, says Schadt, but there are a variety of important genomic features that cannot be characterized without long-read sequencing, such as long tandem repeats, bigger structural variations, and focal variants important in cancer.
“I definitely think [short-read] technologies were tuned for certain problems and had certain advantages that enabled this big advance, but they are absolutely not hitting the entire problem like we need it hit,” he told Mendelspod.
Cancer is a major area of study in which Schadt believes long-read sequencing is needed to understand the complicated genomic features driving tumor cells. Outside of human applications, he called out plant genomics. “Plant genomes are so complicated and so flooded with repeat sequences, their only hope is to have long-read data,” he said.
In general, Schadt believes that the scientific community is recognizing the need for long-read data to provide complete characterizations of genomes. “The aim in all of these is to unambiguously resolve all the structural features of a genome, to de novo assemble those genomes and get away from reference-based assemblies. You are just not going to be able to do that with short-read technologies.”
The quality of PacBio sequencing is ‘beyond compare’
Schadt noted that early misconceptions about the error profiles seen in single-molecule data led people to believe, wrongly, that the data was of lower quality. He explained that the errors are random and can easily be washed out with a modest amount of coverage, whereas other next-generation sequencing technologies have systematic errors that cannot be removed by adding coverage.
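To make the distinction between random and systematic errors concrete, here is a minimal simulation sketch. It is not from the interview and does not model any real instrument: the 15% per-read error rate, the substitution-only error model, the 2% of "biased" sites, and all function names are assumptions chosen purely for illustration. It calls a consensus by majority vote across reads and compares how the two kinds of error respond to added coverage.

```python
import random
from collections import Counter

BASES = "ACGT"


def make_truth(length, rng):
    """Generate a random 'true' sequence (hypothetical toy genome)."""
    return "".join(rng.choice(BASES) for _ in range(length))


def make_biased_sites(truth, fraction, rng):
    """Pick a fraction of positions where every read reports the same wrong base."""
    n = int(len(truth) * fraction)
    sites = rng.sample(range(len(truth)), n)
    return {i: rng.choice([b for b in BASES if b != truth[i]]) for i in sites}


def simulate_read(truth, error_rate, rng, biased_sites=None):
    """Return a noisy copy of `truth`.

    Random errors: each base is swapped for a different base with
    probability `error_rate`, independently in every read.
    Systematic errors (toy model): positions in `biased_sites` always
    report the same wrong base, in every read.
    """
    biased_sites = biased_sites or {}
    read = []
    for i, base in enumerate(truth):
        if i in biased_sites:
            read.append(biased_sites[i])
        elif rng.random() < error_rate:
            read.append(rng.choice([b for b in BASES if b != base]))
        else:
            read.append(base)
    return "".join(read)


def consensus(reads):
    """Majority vote at each position across all reads."""
    return "".join(Counter(col).most_common(1)[0][0] for col in zip(*reads))


def consensus_error_rate(truth, coverage, error_rate, rng, biased_sites=None):
    """Fraction of consensus positions that disagree with the truth."""
    reads = [simulate_read(truth, error_rate, rng, biased_sites)
             for _ in range(coverage)]
    called = consensus(reads)
    return sum(c != t for c, t in zip(called, truth)) / len(truth)


if __name__ == "__main__":
    rng = random.Random(0)
    truth = make_truth(2000, rng)
    biased = make_biased_sites(truth, 0.02, rng)
    for cov in (1, 5, 15, 30):
        rand_err = consensus_error_rate(truth, cov, 0.15, rng)
        sys_err = consensus_error_rate(truth, cov, 0.0, rng, biased)
        print(f"coverage {cov:>2}: random-only {rand_err:.4f}, "
              f"systematic-only {sys_err:.4f}")
```

In this toy model the random-only consensus error falls quickly as coverage grows, because independent mistakes rarely agree on the same wrong base, while the systematic-only error stays pinned at the fraction of biased sites no matter how many reads are added. Real single-molecule errors are dominated by insertions and deletions rather than substitutions, so a real consensus requires alignment; the sketch ignores that step.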
At Mount Sinai they used PacBio® technology to sequence the human genome and saw “very dramatic improvements in the quality of the de novo assemblies, revealing features that have never been seen before.” He said he believes this type of sequencing will become the standard. “The quality of that PacBio data is just beyond compare.”
Schadt also noted that because the PacBio instrument is the only single-molecule sequencer, certain applications cannot currently be done with any other product. Examples he discussed include assembling bacterial genomes de novo, calling cancer variants in heterogeneous samples, dealing with viral mixtures, sequencing mitochondrial DNA, and detecting methylation as part of the sequencing. “It’s just amazing what the instrument is capable of doing,” he said.
Next up in the series
Mendelspod will interview Gene Myers from the Max Planck Institute. Stay tuned for programming information.