July 12, 2022

Typing CYP2D6 star alleles from fully phased variants using PacBio HiFi reads

Author(s): John Harting¹, Zev Kronenberg², Nina Gonzaludo³, Jenny Ekholm⁴, Geoff Henno⁵, Edd Lee⁶

¹Pacific Biosciences (PacBio), Computational Research, Menlo Park, United States; ²Pacific Biosciences (PacBio), Bioinformatic Engineering, Menlo Park, United States; ³Pacific Biosciences (PacBio), Market Development, Menlo Park, United States; ⁴Pacific Biosciences (PacBio), Segment Marketing, Menlo Park, United States; ⁵Pacific Biosciences (PacBio), Precision Health Segment Marketing, Menlo Park, United States; ⁶Pacific Biosciences (PacBio), Human Genomics Segment Marketing, Menlo Park, United States

Background/Objectives:

The CYP2D6 locus is well known for its importance to pharmacogenetics as well as for its high diversity and complex genomic setting. Resolving individual alleles at this locus using short-read sequencing technologies requires inference-based methods due to ambiguous mapping in the presence of highly homologous pseudogenes. In contrast, long-range sequencing with PacBio HiFi reads directly resolves and phases a wide range of complicated and difficult genetic loci without inference. We present a novel bioinformatics workflow using PacBio HiFi reads which enables rapid and precise diplotyping and star(*)-allele classification of CYP2D6.

Methods:

In this work, we designed a set of primers to amplify full CYP2D6 genes and flanking sequence. A multi-primer approach was used to separately amplify primary CYP2D6 genes, duplicate genes, hybrid genes, and fully deleted *5 alleles. We applied this targeted strategy to 22 samples from Coriell and sequenced the amplicons on a PacBio Sequel II System. To generate resolved CYP2D6 *-allele diplotypes, we describe a two-step process: 1) clustering and consensus generation of PacBio HiFi reads, and 2) direct comparison of phased variant sets from the consensus sequences to star alleles described in PharmVar. Additional information regarding fusion alleles is also provided to further identify hybrid categories.
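
The second step, comparing a phased variant set to star-allele definitions, can be illustrated with a minimal sketch. This is not the workflow's actual implementation. The allele names (*ref, *H1, *H2, *H3), the variant labels (v1, v2, v3), and both function names are hypothetical placeholders rather than PharmVar content. The sketch assumes each consensus sequence has already been reduced to the set of variants on one haplotype, and it falls back to the largest fully contained definition when no exact match exists.

```python
from typing import Dict, FrozenSet, List

# Hypothetical star-allele definitions: allele name -> defining variant set.
# Names and variant labels are placeholders, NOT real PharmVar entries.
DEFINITIONS: Dict[str, FrozenSet[str]] = {
    "*ref": frozenset(),
    "*H1": frozenset({"v1"}),
    "*H2": frozenset({"v1", "v2"}),
    "*H3": frozenset({"v3"}),
}


def match_star_allele(
    observed: FrozenSet[str], definitions: Dict[str, FrozenSet[str]]
) -> List[str]:
    """Return the allele name(s) that best explain one phased variant set."""
    # Prefer definitions whose variant set equals the observed set exactly.
    exact = sorted(name for name, d in definitions.items() if d == observed)
    if exact:
        return exact
    # Otherwise keep definitions fully contained in the observed set
    # and report the one(s) with the most defining variants.
    contained = [(len(d), name) for name, d in definitions.items() if d <= observed]
    if not contained:
        return []
    best = max(size for size, _ in contained)
    return sorted(name for size, name in contained if size == best)


def call_diplotype(
    hap_a: FrozenSet[str],
    hap_b: FrozenSet[str],
    definitions: Dict[str, FrozenSet[str]],
) -> str:
    """Join the per-haplotype calls into a diplotype string such as '*H1/*H3'."""
    calls = []
    for hap in (hap_a, hap_b):
        names = match_star_allele(hap, definitions)
        # Ambiguous matches are joined with '|'; no match is reported as 'unknown'.
        calls.append("|".join(names) if names else "unknown")
    return "/".join(calls)


if __name__ == "__main__":
    print(call_diplotype(frozenset({"v1", "v2"}), frozenset({"v3"}), DEFINITIONS))
```

Because each haplotype arrives as a complete, phased variant set, the comparison reduces to set equality and set containment, with no statistical inference of phase.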

Results:

Direct CYP2D6 *-allele typing with this workflow produced results concordant with orthogonal technologies. Where results from previous technologies differed from PacBio HiFi sequencing, the differences were due to the higher resolution and improved calls of our method, including better CNV calls, *5 deletion calling, and high-resolution subtyping for all alleles.

Organization: PacBio
Year: 2022
