Metagenome assembly poses many challenges, including the presence of multiple species, uneven species abundances, and conserved genomic regions shared across species. Highly accurate long reads offer clear advantages over short reads and can overcome many of these obstacles. PacBio HiFi sequencing of metagenomic samples on the Sequel IIe system regularly produces reads 8–15 kb in length with a median QV of 30–45 (roughly 99.9% to over 99.99% accuracy). With the development of metagenome assembly algorithms designed for HiFi reads (hifiasm-meta, metaFlye), it is now possible to reconstruct complete metagenome-assembled genomes (MAGs) for many high-abundance species. These MAGs often consist of a single circular contig, representing a high-quality, complete bacterial genome. However, discontiguous assemblies still occur for lower-abundance taxa, and post-assembly tools are required to identify MAGs in this category. Here, we present the HiFi-MAG-Pipeline, a comprehensive workflow for processing long-read metagenome assemblies.
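The QV figures quoted above are Phred-scaled quality values, which map to per-base accuracy as accuracy = 1 - 10^(-QV/10). The short Python sketch below works through that conversion for a few values in the stated range. The function names are illustrative only and are not part of the HiFi-MAG-Pipeline.

```python
import math


def qv_to_accuracy(qv: float) -> float:
    """Convert a Phred-scaled quality value to per-base accuracy (0..1)."""
    return 1.0 - 10.0 ** (-qv / 10.0)


def accuracy_to_qv(accuracy: float) -> float:
    """Convert per-base accuracy (0..1, strictly below 1) to a Phred-scaled QV."""
    return -10.0 * math.log10(1.0 - accuracy)


# Hypothetical helper names; the values span the QV range cited in the text.
for qv in (30, 40, 45):
    print(f"QV{qv}: {qv_to_accuracy(qv) * 100:.4f}% accuracy")
```

Under this definition, QV30 corresponds to 99.9% accuracy and QV40 to 99.99%, while QV45 lies slightly above 99.99%.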
Organization: PacBio
Year: 2022