AGBT Day 4: A Better Gorilla Assembly, and Data from the Sequel System
Monday, February 22, 2016
On the final day of AGBT, attendees strapped in for the last talks of the conference before the ’80s-themed dance party to close out the meeting. Two of those talks focused on SMRT Sequencing, one including new data from our Sequel System.
Christopher Hill from the Eichler lab at the University of Washington gave a fascinating talk on creating reference-grade assemblies for great ape species. These resources will help shed light on the biological mechanisms behind speech, disease, neurological behavior, and other traits that separate us from our closest primate relatives. Current assemblies for these apes (including bonobo, chimpanzee, gorilla, and orangutan) are highly fragmented, with contig N50s in the tens of kilobases, Hill noted. He and his team are using SMRT Sequencing to resolve repetitive and highly complex regions and build a new gorilla assembly.
With PacBio sequencing and the FALCON assembler, the new assembly has just 16,000 contigs (compared to more than 460,000 in the existing assembly) and a contig N50 length of 9.6 Mb (compared to 11.7 kb in the existing assembly). The new gorilla assembly closed 94% of gaps from the existing assembly, added 164 Mb of new euchromatic sequence, and corrected previous misassemblies. Hill noted that structural variation in particular can be detected more robustly with this new resource. He also said that the gorilla reference is now more in line with the human reference thanks to this marked increase in contiguity. His team is currently working to bring the chimpanzee genome up to the same standard.
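For readers less familiar with the contig N50 metric quoted above: it is the length such that contigs of that length or longer hold at least half of the assembled bases, so a higher N50 means a more contiguous assembly. The snippet below is a minimal sketch of that calculation, not the method used in the talk. The function name `n50` is hypothetical, and the contig lengths are made-up toy values, not data from the gorilla assemblies.

```python
def n50(lengths):
    """Return the N50 of a collection of contig lengths.

    N50 is the length L such that contigs of length >= L together
    contain at least half of the total assembled bases.
    """
    if not lengths:
        raise ValueError("need at least one contig length")
    total = sum(lengths)
    running = 0
    # Walk from the longest contig down, accumulating bases until
    # at least half of the assembly has been covered.
    for length in sorted(lengths, reverse=True):
        running += length
        if running * 2 >= total:
            return length


# Toy example (illustrative values only): the same 32 bases split
# into a few long contigs versus many short ones.
contiguous = [8, 8, 4, 3, 3, 2, 2, 2]
fragmented = [1] * 32

print(n50(contiguous))
print(n50(fragmented))
```

Both toy assemblies contain the same number of bases, but the first reaches the halfway point with its two longest contigs while the second needs sixteen, which is why the metric rewards contiguity rather than total size.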
Our own CSO, Jonas Korlach, also gave a talk in the closing session of AGBT on the value of SMRT Sequencing for addressing complex diseases. He briefed attendees on the new, higher-throughput Sequel System and showed comparisons of Sequel data with data from the PacBio RS II system across a variety of applications. He noted the strong concordance between the platforms in studies such as highly multiplexed targeted sequencing of breast and ovarian cancer samples, a de novo E. coli genome assembly, and Iso-Seq analyses of full-length mRNA in control and cancer samples. Korlach stressed the value of long reads for high-quality DNA sequencing and assembly, but noted that read length alone isn’t enough; other essential elements include lack of GC bias and high consensus accuracy, he said.
We hope that you enjoyed this year’s AGBT as much as we did. We’re already looking forward to next year’s meeting!