June 1, 2021

Comparison of sequencing approaches applied to complex soil metagenomes to resolve proteins of interest

Background: Long-read sequencing offers several potential advantages for more complete gene profiling of metagenomic samples. Long reads can capture multiple genes in a single read, and longer reads typically yield assemblies with better contiguity, especially for higher-abundance organisms. However, a major challenge with long reads has been the higher cost per base, which may lead to insufficient coverage of low-abundance species. Additionally, lower single-pass accuracy can make gene discovery for low-abundance organisms difficult.
Methods: To evaluate the pros and cons of long reads for metagenomics, we directly compared PacBio and Illumina sequencing on a soil-derived sample that included spike-in controls of pure reference samples at known concentrations. For PacBio sequencing, a 10 kb library was sequenced on the Sequel System with 3.0 chemistry. Highly accurate long reads (HiFi reads) of Q20 or higher were generated for downstream analyses using PacBio Circular Consensus Sequencing (CCS) mode. Results were assessed according to the following criteria: DNA extraction capacity, bioinformatics pipeline status, percentage of proteins with ambiguous amino acids (AAs), total unique error-free genes per $1000, total proteins observed in spike-ins per $1000, proteins of interest per $1000, median length of contigs with proteins, and assembly requirements.
Results: Each method had areas of superior performance. For Illumina, DNA extraction capacity was higher, the bioinformatics pipeline is well tested, and a lower proportion of proteins had ambiguous AAs. With PacBio, on the other hand, twice as many unique error-free genes, twice as many total proteins from spike-ins, and ~6 times more proteins of interest were found per $1000 of cost. PacBio data produced contigs capturing proteins of interest that were on average 5 times longer. Additionally, unlike with Illumina data, assembly was not required for gene or protein finding.
Conclusions: In this comparison of the PacBio Sequel System with the Illumina NextSeq on a complex microbiome, we conclude that the sequencing system of choice may vary depending on the goals and resources of the project. PacBio sequencing requires a longer DNA extraction method, and the bioinformatics pipeline may require development. On the other hand, the Sequel System generates hundreds of thousands of long HiFi reads per SMRT Cell, producing more genes, more proteins, and longer contigs, thereby offering more information about the metagenomic samples at a lower cost.
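
As a minimal illustration, and not part of the study itself, the two quantities that recur in this abstract can be written out in Python: the Phred-scaled quality behind the Q20 read threshold (Q = -10 log10 of the error probability, so Q20 corresponds to about 99% accuracy) and the per-$1000 normalization used for the comparison criteria. The function names and the example numbers are hypothetical and chosen only for illustration; they are not values from the study.

```python
import math


def phred_to_error_prob(q: float) -> float:
    """Convert a Phred-scaled quality value to an error probability."""
    return 10 ** (-q / 10)


def error_prob_to_phred(p: float) -> float:
    """Convert an error probability to a Phred-scaled quality value."""
    return -10 * math.log10(p)


def per_thousand_dollars(count: float, cost_usd: float) -> float:
    """Normalize a raw count (genes, proteins, ...) to a per-$1000 rate."""
    return count / cost_usd * 1000


if __name__ == "__main__":
    # Q20 corresponds to an error probability of about 1 in 100.
    print(phred_to_error_prob(20))

    # Hypothetical example: 500 proteins found in a run costing $2000.
    print(per_thousand_dollars(500, 2000))
```

Normalizing by cost rather than by run or by base lets platforms with very different throughput and pricing be compared on a single scale, which is the design choice the criteria above reflect.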

June 1, 2021

Unbiased characterization of metagenome composition and function using HiFi sequencing on the PacBio Sequel II System

Recent work comparing metagenomic sequencing methods indicates that a comprehensive picture of the taxonomic and functional diversity of complex communities will be difficult to achieve with one sequencing technology alone. While the lower cost of short reads has enabled greater sequencing depth, the greater contiguity of long-read assemblies and the lack of GC bias in SMRT Sequencing have enabled better gene finding. However, since long-read assembly typically requires high coverage for error correction, these benefits have in the past been lost for low-abundance species. The introduction of the Sequel II System has enabled a new, higher-throughput, assembly-optional data type that addresses these challenges: HiFi reads. HiFi reads combine QV20 accuracy with long read lengths, eliminating the need for assembly for most metagenome applications, including gene discovery and metabolic pathway reconstruction. In fact, the read lengths and accuracy of HiFi data match or outperform the quality metrics of most metagenome assemblies, enabling cost-effective recovery of intact genes and operons while omitting the resource-intensive and data-inefficient assembly step. Here we present the application of HiFi sequencing to both mock and human fecal samples using full-length 16S and shotgun methods. This proof-of-concept work demonstrates the unique strengths of the HiFi method. First, the high correspondence between the expected community composition and the 16S and shotgun profiling data reflects low context bias. In addition, every HiFi read yields ~5-8 predicted genes, without assembly, using standard tools. If assembly is desired, excellent results can be achieved with Canu and contig binning tools. In summary, HiFi sequencing is a new, cost-effective option for high-resolution functional profiling of metagenomes that complements existing short-read workflows.

June 1, 2021

Low-input single molecule HiFi sequencing for metagenomic samples

HiFi sequencing on the PacBio Sequel II System enables complete microbial community profiling of complex metagenomic samples using whole genome shotgun sequences. With HiFi sequencing, highly accurate long reads overcome the challenges posed by the presence of intergenic and extragenic repeat elements in microbial genomes, thus greatly improving phylogenetic profiling and sequence assembly. Recent improvements in library construction protocols enable HiFi sequencing starting from as little as 5 ng of input DNA. Here, we demonstrate comparative analyses of a control sample of known composition and a human fecal sample, each prepared from varying amounts of input genomic DNA (1 µg, 200 ng, 5 ng), and present the corresponding library preparation workflows for the standard, low-input, and Ultra-Low methods. We demonstrate that the metagenome assembly, taxonomic assignment, and gene finding analyses are comparable across all methods for both samples, providing access to HiFi sequencing even for DNA-limited sample types.
