A paper from scientists at the National Marrow Donor Program, Center for International Blood and Marrow Transplant Research, Fred Hutchinson Cancer Research Center, and other institutions reports the use of SMRT Sequencing to characterize the challenging killer cell immunoglobulin-like receptor (KIR) region in eight human genomes. By sequencing full-length fosmids, they found previously unreported haplotype structures.
“Revealing Complete Complex KIR Haplotypes Phased By Long-Read Sequencing Technology” comes from lead author David Roe, senior author Martin Maiers, and collaborators. They targeted the KIR region — which has implications in autoimmune disease, transplantation, infections, and more — because it has historically been very challenging to sequence. Containing as many as 16 genes and pseudogenes, the highly homologous KIR haplotypes are shaped by tandem duplications, deletions, and frequent recombination. “These characteristics of homology, repetitiveness, and structural diversity have made the region difficult to haplotype,” the scientists note. “A sequencing approach that precisely captures the complexity of KIR haplotypes for functional annotation is desirable.”
To that end, they incorporated SMRT Sequencing, which produces reads long enough to span fosmids. “Using this method, we have for the first time comprehensively sequenced and phased sixteen KIR haplotypes from eight individuals without imputation,” the authors report. “Sixteen haplotypes from eight individuals were completely and [unambiguously] sequenced except for two haplotypes whose KIR3DL3 genes were not captured in the fosmid and a small gap in one of the haplotypes, located in a repetitive insertion spanning over 100,000 bp.” Haplotypes were as short as 69 kb and as long as 269 kb, and included four novel structures. The team also uncovered a new gene fusion as well as previously unreported structural variants.
One of the most important elements for resolving this difficult region was eliminating the need to shotgun shear the fosmids prior to sequencing. Because the longest SMRT Sequencing reads are long enough to cover a full-length fosmid, “it is therefore possible to span an entire fosmid insert with a single continuous read,” the scientists write. “Bypassing the shearing … helped improve the phasing accuracy of the individual fosmid sequences and the high-quality sequences of complete fosmids easily tiled into full haplotypes.”
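To make the tiling idea concrete, here is a toy sketch of how overlapping full-length insert sequences could be merged into one longer sequence. It is not the authors' pipeline: the function names (`overlap_length`, `tile_inserts`), the toy sequences, and the exact-match overlap rule are all assumptions chosen for illustration. It also assumes the inserts are already ordered along a single haplotype, whereas a real assembly would have to tolerate sequencing errors and work out the order itself.

```python
def overlap_length(left: str, right: str, min_overlap: int = 3) -> int:
    """Return the length of the longest suffix of `left` that equals a prefix of `right`.

    Only exact matches count (an illustrative simplification).
    Returns 0 if no overlap of at least `min_overlap` bases exists.
    """
    max_len = min(len(left), len(right))
    for size in range(max_len, min_overlap - 1, -1):
        if left.endswith(right[:size]):
            return size
    return 0


def tile_inserts(inserts: list[str], min_overlap: int = 3) -> str:
    """Merge insert sequences, assumed to be ordered along one haplotype."""
    if not inserts:
        return ""
    haplotype = inserts[0]
    for insert in inserts[1:]:
        size = overlap_length(haplotype, insert, min_overlap)
        if size == 0:
            # No shared sequence: report the gap instead of guessing.
            raise ValueError("adjacent inserts do not overlap")
        # Append only the part of the insert that extends the tiling.
        haplotype += insert[size:]
    return haplotype


# Hypothetical toy inserts; real fosmid inserts are tens of kilobases long.
toy_inserts = ["ACGTTGCA", "TGCAGGAT", "GGATCCTA"]
print(tile_inserts(toy_inserts))
```

The sketch raises an error when two neighbors share no sequence, which mirrors the idea that a missing overlap leaves a gap in the haplotype rather than something to be filled in by inference.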
This workflow made it possible to phase centromeric and telomeric regions, among other accomplishments. “Such completely de novo assembled sequences not only provide the ability to discover and annotate KIR gene alleles at the highest resolution, but also provide value as references, evolutionary informers, and source material for imputation,” the team writes.
August 1, 2017 | General