January 4, 2024  |  Gene therapy + editing

Mastering AAV vector design: Best practices for gene therapy product characterization using HiFi sequencing

By Claire Aldridge, PhD

Dr. Aldridge, Chief Strategy Officer at Form Bio, pioneers advanced pharmaceutical solutions to get life-saving therapeutics to patients. She was recently selected as one of Forbes’ 10 Women Leading the SynBio Revolution, redefining the future of the pharmaceutical industry. Her distinguished career includes biotech venture investing, entrepreneurship, and translating scientific discovery into practical solutions. Dr. Aldridge holds a BS in Biomedical Science from Texas A&M University and a PhD in Immunology and Genetics from Duke University.

For gene therapies, 2023 will go down as a landmark year. In the final weeks of June, the FDA approved two AAV-based therapies, Elevidys for Duchenne muscular dystrophy and Roctavian for severe hemophilia A, offering new hope to those living with these genetic disorders.1,2 More recently, in December, two cell-based gene therapies, Casgevy and Lyfgenia, received FDA approval for treating sickle cell disease.13

The approval of these therapies constitutes a major “mission accomplished” moment for the preclinical and clinical teams that worked on them. Casgevy marks a significant milestone for the genome editing community as the first FDA-approved CRISPR-based therapy. In the case of Elevidys, the drug’s manufacturer, Sarepta, spent six years working towards commercialization.3 And the road was not easy going: During clinical testing, the team navigated two clinical holds over safety concerns.

A clinical hold can rapidly derail years of work on a potential therapy, and those venturing down the gene therapy development path face plenty of risks beyond unforeseen safety signals. On the preclinical R&D side, gene therapies are incredibly complex to design: developers must construct a Goldilocks therapy that gets into the right cells and makes the right amount of protein. Manufacturing presents another hurdle to commercialization, as quality can be easily compromised for these intricate, multi-component biologics. Gene therapies aim to introduce persisting genetic instructions into human genomes, making safety the number one priority. Any challenges that arise in the preclinical design and small-scale manufacturing stages make clinical development all the riskier for patients and add to the time and money developers will spend on their path to regulatory approval.

In particular, product-related impurities are just starting to be put under the spotlight, as their presence has been linked to several safety issues. To address some of these issues, we need new techniques, technologies, and ways of thinking about characterizing gene therapies and assessing critical quality attributes.4


The crucial role of vector design in gene therapy development success

There are several types of viral vectors used in gene therapy development:5

  • Adenovirus-based (ADV) vectors: Wild-type ADVs commonly cause upper respiratory tract infections and are being investigated clinically as a vector for novel vaccine development against infectious diseases and as cancer therapies.
  • Lentivirus (LV) vectors: LVs are complex retroviruses and are primarily used as a vector for engineering ex vivo gene therapies, such as generating chimeric antigen receptor T (CAR-T) cells.
  • Adeno-associated virus (AAV) vectors: AAVs don’t cause any known human disease. They have nonetheless become a popular laboratory tool because they have a relatively simple genome that can be easily manipulated. AAV-based gene therapies are being investigated in many rare diseases; over 200 clinical trials worldwide use AAVs, and 5 AAV-based therapies are FDA approved.

Given the level of activity and the clinical success of AAV-based gene therapies, much attention has focused on improving the success rate and streamlining development. AAV vector research and design is a critical aspect of preclinical gene therapy development and presents an opportunity to optimize the final product. This step could become analogous to hit-to-lead medicinal chemistry in traditional small molecule drug development.

Vector design: Front and center for safety

When done thoughtfully, vector engineering and design can improve translation efficiency, refine tissue specificity (via promoter and regulatory region optimization), reduce immune response activation, and increase manufacturing yield and quality.6 However, when vector engineering issues arise, they can cause severe safety or efficacy issues. Some vector designs that have not been rigorously evaluated may lead to products with immune system triggers or protein expression defects.

For example, certain vector design decisions can lead to improper capsid packaging and the production of encapsidated non-vector or partial vector DNA.7 This impurity can necessitate higher dosing and lead to the production of immunogenic peptides. Other contaminants, like empty capsids or hypomethylated CpG motifs, can increase innate immune responses. As described in an FDA Briefing document, the higher doses needed to compensate for these impurities can lead to life-threatening immune activation and adverse events, such as T-cell-mediated liver injury and thrombotic microangiopathy associated with complement activation.8

The importance of complete AAV genome sequencing in vector design

Given the potential downstream effects of these impurities, identifying and quantifying some of these issues must be a core piece of the gene therapy research puzzle. Once these potential impurities are identified and characterized, we can help troubleshoot and enable optimization of vector designs to reduce or eliminate their production in the first place.

AAV genome sequencing has been invaluable for characterizing and quantifying encapsidated, non-therapeutic DNA in AAV manufacturing products. The nuances of the AAV vectors (such as inverted repeat sequences) and the importance of comprehensive vector characterization have led to the development of new sequencing methodologies, such as PacBio HiFi long-read sequencing, that enable highly accurate sequencing of encapsidated DNA.

Recently, Tran et al. used a PacBio long-read platform to develop AAV-genome population sequencing (AAV-Gpseq) and used the technique to interrogate the effect of CRISPR components on AAV vector design.9  Using AAV-Gpseq, the group discovered that dual guide RNA designs can lead to AAV genome truncations during viral packaging. This finding is consistent with previous findings that secondary or tertiary DNA conformations can lead to replication errors during capsid packaging.10

The shortcomings of short-read sequencing in AAV vector design

The publication of the AAV-Gpseq methodology and the further development of HiFi sequencing on the PacBio Sequel II/IIe sequencing platform represent a major advancement over the short-read sequencing methods that have long dominated the genome sequencing world.

Short-read sequencing produces reads that are between 50 and 600 bases long.11 To analyze these results, bioinformatics pipelines must be employed to accurately reconstruct in silico an AAV genome sequence, which is usually between 4 and 5 kb long. Correctly counting these short snippets and placing them at their original position within the AAV vector can be challenging, even when done computationally. This short-read process is error-prone, analogous to reconstructing a 500-page novel in the correct order from a jumbled-up collection of random phrases or sentences.

In addition, short-read sequencing techniques have trouble with repetitive DNA sequences or secondary structures, which can cause polymerase stalling during library preparation. Both features are found in AAV vectors, and short-read sequencing technologies cannot fully sequence the inverted terminal repeat (ITR) regions due to the inhibition of polymerase processivity.10  In addition, ITR deletions have been reported in AAV manufacturing products. While there is an incomplete understanding of their effect on therapeutic efficacy and safety, they are another impurity produced during AAV manufacturing that must be considered during vector design.9
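
To make the point about repeats and secondary structure concrete, here is a minimal Python sketch that scans a sequence for short inverted repeats, meaning a stretch followed nearby by its own reverse complement, which is the pattern that lets single-stranded DNA fold back into a hairpin. The function names, the arm length, the loop limit, and the example sequence are all illustrative assumptions; this is a toy exact-match scan, not a structure predictor and not the method used in the cited work.

```python
# Toy scan for inverted repeats (hairpin-forming candidates).
# All names and thresholds here are hypothetical, chosen for illustration.

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase/lowercase DNA string."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def find_hairpin_candidates(seq: str, arm: int = 6, max_loop: int = 10):
    """Find (left_start, right_start, loop_length) triples where an arm of
    `arm` bases is followed, after at most `max_loop` bases, by its exact
    reverse complement."""
    seq = seq.upper()
    hits = []
    for i in range(len(seq) - 2 * arm + 1):
        target = reverse_complement(seq[i:i + arm])
        for loop in range(max_loop + 1):
            j = i + arm + loop
            if j + arm > len(seq):
                break  # right arm would run off the end of the sequence
            if seq[j:j + arm] == target:
                hits.append((i, j, loop))
    return hits


# Made-up example: a 6-base arm, a 4-base loop, then the arm's reverse complement.
example = "AGGTCC" + "TTTT" + "GGACCT"
hits = find_hairpin_candidates(example)
```

For the made-up example, the scan reports a single candidate with the left arm at position 0, the right arm at position 10, and a 4-base loop. Real ITRs are far longer and more intricate than this toy case, which is part of why they are difficult for polymerases and for short-read assembly.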

Uncovering hidden AAV packaging problems with HiFi sequencing

By contrast, long-read sequencing fills these (often literal) gaps left by short-read sequencing of AAV vectors.

HiFi reads can span the entire AAV genome and beyond, reaching up to 25 kb.12 Consequently, there is no need to use additional bioinformatics pipelines to stitch shorter reads back together in the proper order; computational analysis can be performed by simply mapping long reads to a reference genome. In addition, the accuracy of HiFi sequencing is 99.9%, enabling the precise identification of mutations that arise during AAV production workflows. Long-read HiFi sequencing can also span ITRs, enabling the identification of truncations and other mutations in these regions, which previously went unappreciated due to their absence from short-read sequencing data.
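
As a rough illustration of what 99.9% accuracy means in practice, the short Python sketch below converts an accuracy figure to a Phred quality score (Q = -10 × log10 of the error rate) and estimates the expected number of base errors in a single read spanning an AAV genome. The 4,700-base read length is an assumed example within the 4 to 5 kb range mentioned earlier, and treating the accuracy as a uniform per-base error rate is a simplification.

```python
import math


def phred_from_accuracy(accuracy: float) -> float:
    """Convert an accuracy between 0 and 1 into a Phred quality score."""
    error_rate = 1.0 - accuracy
    return -10.0 * math.log10(error_rate)


def expected_errors(read_length: int, accuracy: float) -> float:
    """Expected number of erroneous bases, assuming a uniform error rate."""
    return read_length * (1.0 - accuracy)


q_score = phred_from_accuracy(0.999)   # approximately Q30
errors = expected_errors(4700, 0.999)  # assumed 4,700-base read; roughly 4.7 errors
```

Under these assumptions, a read at 99.9% accuracy corresponds to about Q30 and carries only a handful of expected errors across the whole vector genome, so a variant seen consistently across many reads is unlikely to be a sequencing artifact.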

Taken together, long-read sequencing more comprehensively uncovers partial AAV genomes and non-vector impurities in AAV manufacturing products. Before HiFi sequencing, these and the above issues remained practically invisible to gene therapy researchers and developers. Now that they can be easily identified without extensive preparation, complications in vector design can be found earlier in the research and design cycle, potentially informing design decisions and helping to avoid downstream effects on efficacy and safety.
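
The kind of summary this enables can be sketched in a few lines of Python. The example below assumes that reads have already been mapped by a long-read aligner and reduced to a reference name plus start and end coordinates; it then labels each read as full-length, partial, non-vector, or unmapped. The `Alignment` record, the reference names, the 4,700-base vector length, and the 50-base tolerance are all hypothetical choices for illustration, not a PacBio or Form Bio workflow.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Dict, Iterable, Optional


@dataclass(frozen=True)
class Alignment:
    """Hypothetical minimal record for one mapped read."""
    read_id: str
    ref_name: Optional[str]  # None means the read did not map anywhere
    ref_start: int = 0       # 0-based, inclusive
    ref_end: int = 0         # 0-based, exclusive


def classify_read(aln: Alignment, vector_name: str, vector_length: int,
                  tolerance: int = 50) -> str:
    """Label a read by how it relates to the intended vector genome."""
    if aln.ref_name is None:
        return "unmapped"
    if aln.ref_name != vector_name:
        return "non_vector"  # e.g. plasmid backbone or host-cell DNA
    covers_start = aln.ref_start <= tolerance
    covers_end = aln.ref_end >= vector_length - tolerance
    return "full_length" if covers_start and covers_end else "partial"


def summarize(alignments: Iterable[Alignment], vector_name: str,
              vector_length: int, tolerance: int = 50) -> Dict[str, float]:
    """Return the fraction of reads falling into each category."""
    counts = Counter(
        classify_read(a, vector_name, vector_length, tolerance)
        for a in alignments
    )
    total = sum(counts.values())
    return {label: n / total for label, n in counts.items()} if total else {}


# Made-up example reads against an assumed 4,700-base vector reference.
example_reads = [
    Alignment("r1", "vector", 3, 4695),            # spans ITR to ITR
    Alignment("r2", "vector", 0, 2100),            # truncated genome
    Alignment("r3", "helper_plasmid", 500, 3000),  # non-vector DNA
    Alignment("r4", None),                         # unmapped
]
fractions = summarize(example_reads, "vector", 4700)
```

Because each long read represents one encapsidated molecule end to end, a simple per-read tally like this is enough to estimate population fractions; no assembly step is involved.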

Elevating AAV gene therapy product characterization through the use of AI

AAV gene therapy product characterization demands precision. With HiFi sequencing, the impact of vector design choices can be assessed throughout the gene therapy development pathway, helping to reveal hidden design flaws such as fragmentation and truncation. HiFi sequencing data can uncover these issues, but employing artificial intelligence in your workflow can take product characterization further and enable optimization of AAV quality, potentially enhancing the safety, efficacy, and manufacturability of gene therapy products.

Specifically, using HiFi sequencing data as foundational training data for predictive AI helps researchers better understand construct composition. This ranges from visibility into the truncation propensity of different parts of the construct, to insight into the secondary and tertiary DNA structures leading to those truncations, to insight into overall GC content and resulting CpG islands, to an in silico view of how different components of the regulatory cassette affect design stability.
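
Two of the sequence features mentioned above, GC content and CpG enrichment, are simple to compute directly. The Python sketch below calculates both over sliding windows, using the commonly used observed/expected CpG ratio (number of CpG dinucleotides × window length ÷ (number of C × number of G)). It is a minimal illustration with hypothetical function names, window sizes, and a made-up sequence; it is not the predictive AI model described here, only the kind of per-window feature such analyses typically start from.

```python
from typing import List, Tuple


def gc_content(seq: str) -> float:
    """Fraction of bases that are G or C."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq) if seq else 0.0


def cpg_obs_exp(seq: str) -> float:
    """Observed/expected CpG ratio: (#CpG * length) / (#C * #G)."""
    seq = seq.upper()
    c, g = seq.count("C"), seq.count("G")
    if c == 0 or g == 0:
        return 0.0
    return seq.count("CG") * len(seq) / (c * g)


def window_scan(seq: str, window: int, step: int) -> List[Tuple[int, float, float]]:
    """Return (start, GC fraction, CpG obs/exp) for each full window."""
    return [
        (start, gc_content(seq[start:start + window]),
         cpg_obs_exp(seq[start:start + window]))
        for start in range(0, len(seq) - window + 1, step)
    ]


# Made-up 12-base example: a CpG-rich half followed by an A-only half.
profile = window_scan("ACGCGTAAAAAA", window=6, step=6)
```

In the made-up example, the first window has a GC fraction of about 0.67 and a CpG ratio of 3.0, while the second has zero for both. On a real construct, windows in the hundreds of bases would be more typical, and the thresholds used to call a CpG island vary between tools.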

The ability to predictively eliminate problematic vector designs enables researchers to take a design for manufacturing (DFM) approach – the practice of designing products with the specific aim of improving manufacturability.  This interdisciplinary approach and the close collaboration between computational biologists, bioinformaticians, geneticists, genetic engineers, and clinicians will continue to accelerate the pace of gene therapy successes, with many landmark years to come.

We invite you to watch our webinar for a more in-depth exploration of PacBio gene therapy and gene editing solutions developed in collaboration with our strategic partners at Form Bio. Topics discussed include leveraging artificial intelligence to characterize and optimize AAV quality, helping to enhance the safety, efficacy, and manufacturability of gene therapies.

Watch the webinar

  1. ELEVIDYS. Published July 13, 2023. Accessed October 20, 2023.
  2. ROCTAVIAN. Published July 26, 2023. Accessed October 20, 2023.
  3. Sarepta’s Elevidys Reaches Finish Line as First Gene Therapy Approved for Duchenne Muscular Dystrophy. Published June 28, 2023. Accessed October 20, 2023.
  4. Gimpel AL, Katsikis G, Sha S, et al. Analytical methods for process and product characterization of recombinant adeno-associated virus-based gene therapies. Mol Ther Methods Clin Dev. 2021;20:740-754.
  5. Bulcha JT, Wang Y, Ma H, Tai PWL, Gao G. Viral vector platforms within the gene therapy landscape. Signal Transduct Target Ther. 2021;6(1):1-24.
  6. Li C, Samulski RJ. Engineering adeno-associated virus vectors for gene therapy. Nat Rev Genet. 2020;21(4):255-272.
  7. FDA, Cellular, Tissue, and Gene Therapies Advisory Committee (CTGTAC) Meeting #70. Toxicity Risks of AAV for Gene Therapy. September 2-3, 2021.
  8. FDA Briefing Document. BLA#125781/00. Cellular, Tissue, and Gene Therapies Advisory Committee Meeting. 05/12/2023.
  9. Tran NT, Heiner C, Weber K, et al. AAV-Genome Population Sequencing of Vectors Packaging CRISPR Components Reveals Design-Influenced Heterogeneity. Mol Ther Methods Clin Dev. 2020;18:639-651.
  10. Developing Machine Learning Powered Solutions for Cell and Gene Therapy Candidate Validation. Published November 17, 2022. Accessed October 19, 2023.
  11. Amarasinghe SL, Su S, Dong X, Zappia L, Ritchie ME, Gouil Q. Opportunities and challenges in long-read sequencing data analysis. Genome Biol. 2020;21(1):30.
  12. Highly accurate long-read HiFi sequencing data for five complex genomes. Published November 17, 2020. Accessed October 20, 2023.
  13. FDA Approves First Gene Therapies to Treat Patients with Sickle Cell Disease. Published December 8, 2023. Accessed December 20, 2023.
