The immunoglobulin heavy chain constant (IGHC) domain of antibodies (Ab) is responsible for effector functions critical to Ab mediated immunity. In humans, this domain is encoded by genes within the IGHC locus, where descriptions of genomic diversity remain incomplete. To address this, we utilized long-read genomic datasets to build a high-quality IGHC haplotype/variant catalog from 105 individuals of diverse ancestry, and developed a high-throughput approach for targeted long-read IGHC locus sequencing and assembly. From locally phased assemblies, we discovered previously uncharacterized single nucleotide variants (SNV) and complex structural variants (SVs, n=7), as well as novel genes and alleles. Of the 262 identified IGHC coding alleles, 235 (89.6%) were undocumented. SNV, SV, and gene allele/genotype frequencies revealed significant population differentiation, including; (i) hundreds of SNVs in African and East Asian populations exceeding fixation index (FST) of 0.3, (ii) and an IGHG4 haplotype carrying specific coding variants uniquely enriched in East and South Asian populations. Our results illuminate missing signatures of haplotype diversity in the IGHC locus, including evidence of natural selection, and establish a new foundation for investigating IGHC germline variation and its role in Ab function and disease.
Journal: bioRxiv
DOI: 10.1101/2025.02.12.634878
Year: 2025