Complete Genomes Within Reach: Multiplexing Enables Efficient Microbial Sequencing
Monday, April 8, 2019
Research interest in the human microbiome and the roles our bacterial, viral, and single-cell eukaryote co-inhabitants play in health, nutrition, immunity, and disease has exploded. Yet accurately measuring the composition of these microbial communities remains complex.
Sequence-based approaches allow the genetic material from complete collections of microbes to be analyzed without the need to cultivate the microorganisms. But each step in the process of collecting, extracting, preparing, sequencing and analyzing the DNA and data introduces its own set of errors and biases.
At the Innovation Lab of the University of Minnesota Genomics Center, research scientist Ben Auch and his colleagues are developing new tools to improve both microbiome and isolate characterization, and the Sequel System is providing some solutions.
As Auch explained in a recent webinar, DNA extraction is a significant source of bias. He referenced a 2017 study in Nature Biotechnology led by German researchers, which compared how 21 labs handled two fecal specimens, and whether different DNA extraction protocols affected microbiome test results. They found wide variation in results, particularly among gram-positive bacteria, which were often under-reported.
“It would be helpful to have a tool to assess this bias consistently across labs and samples, that could potentially be used as an inline process control to track bias across the entire microbiome workflow, including extraction,” Auch said.
His lab has partnered with Minnesota company Microbiologics to develop a “xenobiological microbial standard” — a cell-based microbial spike-in control made up of organisms not found in the human microbiome.
“We hope to be able to use this tool to capture the diversity of microbial properties that might be present in a microbiome sample and track how those properties influence the resulting data, and also to calculate absolute microbial abundance,” Auch said.
The prototype contains an even mixture of 12 organisms (six gram-positive and six gram-negative) that vary in environmental origin, GC (guanine-cytosine) content, and genome size, which ranges from 2.14 to 9 Mb.
But to use this assemblage as a control, he needed its members to be well characterized. Unfortunately, in cases where the microbes had previously been sequenced, the existing genome assemblies tended to be highly fragmented. Others had no assemblies at all. In addition, Auch wanted to be sure his information reflected the exact strains his team was using.
He turned to PacBio technology to comprehensively sequence all 12 organisms. And in order to make the endeavor more efficient and affordable, he used multiplexing.
“Compared to preparing libraries individually, this protocol is highly streamlined,” he said.
As he explained in the webinar, samples are first sheared to a consistent size of around 10 kb, then cleaned up and checked for fragment size and concentration. Library prep stays separate for each sample at this stage, following the typical PacBio protocol through the ligation step. There, barcoded adaptors are substituted for the default adaptor, the ligase is inactivated, and the samples can be pooled using a calculator that PacBio provides. The calculator determines how much of each ligation reaction to pool based on the concentration, shear size, and estimated genome size of all the samples.
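The pooling idea can be illustrated with a short sketch. This is not PacBio's calculator: it is a simplified, hypothetical model that assumes every genome should receive the same expected coverage, so the DNA mass pooled from each sample is proportional to its estimated genome size. It ignores shear size, which the real calculator also takes into account. The function name, field names, sample values, and the total pooled mass are all assumptions made for illustration.

```python
def pooling_volumes(samples, total_mass_ng):
    """Return the volume (in µL) of each ligation reaction to pool.

    Simplifying assumption (NOT PacBio's formula): equal expected coverage
    per genome, so pooled mass is proportional to genome size. Shear size
    is deliberately ignored in this sketch.
    """
    total_genome_mb = sum(s["genome_mb"] for s in samples)
    volumes = {}
    for s in samples:
        # Share of the pooled mass assigned to this sample.
        mass_ng = total_mass_ng * s["genome_mb"] / total_genome_mb
        # Convert mass to volume using the measured concentration.
        volumes[s["name"]] = mass_ng / s["conc_ng_per_ul"]
    return volumes


# Hypothetical samples; none of these values come from the webinar.
samples = [
    {"name": "isolate_A", "genome_mb": 2.0, "conc_ng_per_ul": 20.0},
    {"name": "isolate_B", "genome_mb": 6.0, "conc_ng_per_ul": 30.0},
]
volumes = pooling_volumes(samples, total_mass_ng=800.0)
for name, vol in volumes.items():
    print(f"{name}: {vol:.1f} µL")
```

Under these assumptions the larger genome contributes three times the DNA mass of the smaller one, which is why a more dilute sample of a small genome can still need less volume than a concentrated sample of a large one.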
From here, the pooled libraries are treated as a single sample and moved into an optional, yet recommended, size selection step.
Once the pooled library has been sequenced, the reads are demultiplexed in SMRT Link and are ready for the downstream assembly process.
“You can generally plan to sequence 6-8 typical microbes on each SMRT Cell, or a total genome content of between 30 and 40 megabases. For smaller and less repetitive genomes, you might be able to get as many as 16 libraries on a single SMRT Cell,” Auch said.
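As a rough planning aid, the rule of thumb in that quote can be turned into a small check. The sketch below is hypothetical: it treats 40 Mb of total genome content and 16 libraries as hard per-cell limits and groups genomes with a simple greedy first-fit heuristic. Real capacity also depends on repeat content and library quality, which it does not model. Apart from 2.14 and 9 Mb, the ends of the size range mentioned above, the genome sizes are invented for illustration.

```python
def plan_cells(genomes_mb, budget_mb=40.0, max_libraries=16):
    """Group genomes onto SMRT Cells with a greedy first-fit-decreasing pass.

    Assumption: budget_mb and max_libraries are treated as hard limits,
    which is a simplification of the rule of thumb quoted in the text.
    """
    cells = []  # each cell is {"names": [...], "total_mb": float}
    # Place the largest genomes first so small ones fill the leftover space.
    for name, size in sorted(genomes_mb.items(), key=lambda kv: kv[1], reverse=True):
        if size > budget_mb:
            raise ValueError(f"{name} ({size} Mb) exceeds the per-cell budget")
        for cell in cells:
            fits_size = cell["total_mb"] + size <= budget_mb
            fits_count = len(cell["names"]) < max_libraries
            if fits_size and fits_count:
                cell["names"].append(name)
                cell["total_mb"] += size
                break
        else:
            # No existing cell had room, so open a new one.
            cells.append({"names": [name], "total_mb": size})
    return cells


# Hypothetical genome sizes in Mb.
genomes = {
    "g1": 9.0, "g2": 2.14, "g3": 5.0, "g4": 6.5,
    "g5": 4.0, "g6": 7.5, "g7": 3.0, "g8": 8.0,
}
cells = plan_cells(genomes)
for i, cell in enumerate(cells, start=1):
    print(f"Cell {i}: {cell['names']} ({cell['total_mb']:.2f} Mb)")
```

With these made-up sizes the eight genomes total about 45 Mb, so they do not fit within a single 40 Mb budget and the sketch spreads them over two cells.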
He shared the results of one run, which included seven samples, each of which was assembled in just one or two contigs.
“Considering the diversity of these microbes, I think it’s quite impressive to be able to get seven or more complete genomes out of a single prep and a single SMRT Cell,” Auch said.
As a bonus, the raw sequencing reads contain information about methylation, which can be further mined and shared with other researchers via the community database REBASE, curated by Nobel Laureate Rich Roberts.
“We’ve shown that diverse microbes across GC content, genome size, and environment, can be efficiently multiplexed on the PacBio Sequel System and they result in highly contiguous genomes,” Auch said. “High quality, complete microbial genomes are now very much within reach, from both a technical and cost perspective.”
To learn more about the Microbial Multiplexing Workflow on the Sequel System using the SMRTbell Express Template Prep Kit 2.0, check out this handy application guide.