Recombinant adeno-associated virus (rAAV) is one of the most actively investigated gene therapy vehicles. In its most basic form, an rAAV vector is stripped of all viral components except the flanking inverted terminal repeats (ITRs), which are needed to guide genome replication and packaging during vector production.
Traditional QC methods for rAAV, such as qPCR or gel electrophoresis, can only confirm the size distribution of the vectors; they cannot identify the prevalence, fragmentation, or composition of the packaged genomes. Next-generation sequencing has also been used for AAV sequencing, but its short reads cannot span full-length AAV molecules, and full-length reads are essential for identifying truncation events and chimeras. The ITRs can also occur in different flip/flop configurations, which are difficult to distinguish without full-length information.
While other long-read technologies are limited by accuracy and cannot resolve similar AAV species or quantify their prevalence, PacBio SMRT sequencing generates long, accurate reads (HiFi reads) that can cover an AAV genome from ITR to ITR, revealing otherwise difficult-to-detect issues.
See how HiFi sequencing is used in different phases of AAV vector development:
HiFi sequencing for AAV vector discovery
Novel capsids from human viromes can serve as therapeutic candidates. Using HiFi sequencing from a single tissue, Hsu et al. (2020) discovered a novel capsid (AAVv66) that is similar to the commercially popular AAV2 but exhibits enhanced production yield, virion stability, and CNS transduction. The discovery was made possible by targeted sequencing with primers designed against regions conserved across serotypes. Notably, the ability to sequence the full target region (~2.2 kb) with high accuracy allowed the researchers to tabulate and quantify the distinct novel AAV species in the sample, leading to the identification of a dominant novel capsid, AAVv66, that made up 45% of the population.
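The tabulate-and-quantify step can be illustrated with a minimal Python sketch. It is not the pipeline used by Hsu et al.; it assumes, for illustration only, that each distinct full-length read sequence represents one species, whereas a real analysis would typically cluster reads and handle residual sequencing errors. The function name `tabulate_species` and the toy reads are hypothetical.

```python
from collections import Counter


def tabulate_species(reads):
    """Count identical full-length reads.

    Returns (sequence, count, fraction) rows, most abundant first.
    Assumption: one distinct sequence == one species (illustrative only).
    """
    counts = Counter(reads)
    total = sum(counts.values())
    return [(seq, n, n / total) for seq, n in counts.most_common()]


if __name__ == "__main__":
    # Toy placeholder reads, not real AAV capsid sequences.
    toy_reads = ["ACGTAC"] * 5 + ["ACGTTC"] * 3 + ["ACGAAC"] * 2
    for seq, n, frac in tabulate_species(toy_reads):
        print(f"{seq}\t{n}\t{frac:.0%}")
```

Exact-match counting like this is only meaningful when reads are accurate and span the whole target region; with short or error-prone reads, distinct species would be fragmented or merged.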
HiFi sequencing for AAV vector design
The ability to sequence full-length AAV informs vector designs and reveals packaging issues that are not otherwise observable. With HiFi sequencing, Tai et al. (2018) discovered that a seemingly homogeneous genome population actually contained less than 50% full-length species. Specifically, they sequenced a population of self-complementary AAV (scAAV) and found that the inclusion of short hairpin DNA resulted in undesirable truncated genomes. Similarly, in a follow-up publication (Tran et al. 2020), the inclusion of dual single guide RNA (sgRNA) expression cassettes in a tail-to-tail configuration was found to cause truncated packaging in single-stranded AAV (ssAAV). In both studies, HiFi sequencing also revealed impurities, such as sequences originating from the plasmid backbone and from the packaging and helper plasmids.
Thus, HiFi sequencing can be used to iteratively improve vector design by measuring the frequency of truncations, fragmentation, and other non-full-length anomalies. Further, to confirm that the transgene produces the desired mRNA transcripts, targeted full-length isoform sequencing (the targeted Iso-Seq method) can be used to characterize and quantify the expressed isoforms.
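As a rough illustration of tracking truncation frequency between design iterations, the Python sketch below bins reads by length relative to the expected ITR-to-ITR genome length. This is a length-only heuristic under stated assumptions, not the method of the cited studies: the function names, the 5% tolerance, and the toy lengths are all hypothetical, and a real analysis would align reads to the vector reference to locate breakpoints and identify contaminating sequences.

```python
from collections import Counter


def classify_read(length, expected_length, tolerance=0.05):
    """Label a read by its length relative to the expected genome length.

    Assumption: a read within +/- tolerance of the expected length is
    treated as full length (illustrative threshold only).
    """
    lower = expected_length * (1 - tolerance)
    upper = expected_length * (1 + tolerance)
    if length < lower:
        return "truncated"
    if length > upper:
        return "oversized"
    return "full_length"


def summarize_lengths(lengths, expected_length, tolerance=0.05):
    """Return the fraction of reads in each length class."""
    counts = Counter(
        classify_read(n, expected_length, tolerance) for n in lengths
    )
    total = sum(counts.values())
    if total == 0:
        return {}
    labels = ("full_length", "truncated", "oversized")
    return {label: counts[label] / total for label in labels}


if __name__ == "__main__":
    # Toy read lengths in bases; the expected length is a placeholder.
    toy_lengths = [4700, 4650, 2300, 1200, 5200]
    print(summarize_lengths(toy_lengths, expected_length=4700))
```

Comparing these fractions across candidate designs gives a quick first view of whether a change, such as repositioning a hairpin-forming element, shifts the population toward full-length genomes.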
HiFi sequencing for host integration study
Understanding potential host integration events is also important for creating a safe gene therapy product. Dalwadi et al. (2021) used HiFi sequencing to interrogate the frequency of rAAV integration into human hepatocytes and found chromosomal integrations at a surprisingly high frequency of 1%–3%, both in vitro and in vivo. Importantly, most of the inserted rAAV sequences were heavily rearranged and were accompanied by deletions of the host genomic sequence at the integration site.
HiFi sequencing for AAV production
Recently, Tran et al. (2022) showed that different production platforms can result in different AAV genome populations. Using HiFi sequencing, they found that, compared with pTx/HEK293-produced vectors, rBV/Sf9-produced vectors had a higher degree of unresolved genomes as a consequence of truncated, self-complementary single-ITR species. Further, sequencing revealed that what were previously thought to be empty capsids in fact contain short, ITR-bearing DNA fragments. This discovery has implications for standard clinical production workflows that rely on qPCR or ddPCR assays that quantify vector titers from sequences proximal to the ITRs. Finally, the authors showed that, regardless of the production platform, heterogeneity in the ITRs directly influences the genome population.
Gene therapy is at an inflection point. High accuracy and complete visibility are critical to the success of novel vector discovery, vector design, and manufacturing quality control for gene therapy products. Using HiFi sequencing, researchers have characterized and quantified ssAAV and scAAV in different production systems, uncovering previously unknown issues with partial or empty capsids that could have important implications for safety and efficacy. Meanwhile, the human virome harbors many more therapeutic candidates that could be discovered via sequencing.