June 1, 2021  |  

Genomic Architecture of the KIR and MHC-B and -C Regions in Orangutan

PacBio 2013 User Group Meeting Presentation Slides: Lisbeth Guethlein from Stanford University School of Medicine looked at highly repetitive and variable immune regions of the orangutan genome. Guethlein reported that “PacBio managed to accomplish in a week what I have been working on for a couple years” (with Sanger sequencing), and the results were concordant. “Long story short, I was a happy customer.”


June 1, 2021  |  

Assembly of complete KIR haplotypes from a diploid individual by the direct sequencing of full-length fosmids.

We show that linearizing and directly sequencing full-length fosmids simplifies the assembly problem such that it is possible to unambiguously assemble individual haplotypes for the highly repetitive 100-200 kb killer Ig-like receptor (KIR) gene loci of chromosome 19. A tiling of targeted fosmids can be used to clone extended lengths of genomic DNA, hundreds of kb in length, but repeat complexity in regions of particular interest, such as the KIR locus, means that sequence assembly of pooled samples into complete haplotypes is difficult and in many cases impossible. The current maximum read length generated by SMRT Sequencing exceeds the length of a 40 kb fosmid; it is therefore possible to span an entire fosmid in one sequencing read. Shearing, sequencing and assembling fosmids in a shotgun approach is prone to errors when the underlying sequence is highly repetitive. We show that it is possible to directly sequence linearized fosmids and generate a high-quality consensus by simple alignment, removing the need for an error-prone assembly step. The high-quality sequence of complete fosmids can then be tiled into full haplotypes. We demonstrate the method on DNA samples from a number of individuals and fully recover the sequence of both haplotypes from a pool of KIR fosmids. The ability to haplotype and sequence complex immunogenetic regions will bring exciting opportunities to explore the evolution of disease associations of the immune sub-genome. This simple and robust approach can be scaled up, allowing a complex genomic region to be sequenced at a population level. We expect such sequencing to be valuable in disease association research.


June 1, 2021  |  

HLA variant identification techniques

The Human Leukocyte Antigen (HLA) genes located on chromosome 6 are responsible for regulating immune function via antigen presentation and are one of the determining factors for stem cell and organ transplantation compatibility. Additionally, various alleles within this region have been implicated in autoimmune disorders, cancer, vaccine response and both non-infectious and infectious disease risk. The HLA region is highly variable, contains repetitive regions, and its genes are co-dominantly expressed. This complicates short read mapping and means that assessing the effect of variation within a gene requires full phase information to resolve haplotypes. One solution to the problem of HLA identification is the use of statistical inference to suggest the most likely diploid alleles given the genotypes observed. This approach assumes the availability of an extensive reference panel. Whilst there exists good population genetics data for imputing European populations, there remains a paucity of information about variation in African populations. Filling this gap is one of the aims of the Genome Diversity in Africa Project and as a first step we are performing a pilot study to identify the optimal method for determining HLA type information for large numbers of samples from African populations. To that end we have obtained samples from 125 consented African participants selected from 5 populations across Africa (Moroccan, Ashanti, Igbo, Kalenjin, and Zulu). The methods included in our pilot study are Sanger sequencing (ABI); NGS on the HiSeq X Ten platform (Illumina); long-range PCR combined with single molecule real-time (SMRT) sequencing (PacBio); and, for a subset of samples, library preparation on the GemCode Platform (10x Genomics), which delivers valuable long range contextual information, combined with Illumina NGS sequencing. Results from capillary sequencing suggest the presence of a minimum of two novel alleles.
Long-range PCR has been performed initially on a subset of samples using both primers sourced from GenDX and primers designed as described in Shiina et al. (2012). Initial results from both primer sets were promising on Promega DNA test samples, but only the GenDX primers proved effective on the African samples, consistently producing PCR products of the expected size in the Igbo, Ashanti, Moroccan and Zulu samples. We will present early results from our evaluation of the different sequencing technologies.


June 1, 2021  |  

Resolving KIR genotypes and haplotypes simultaneously using Single Molecule, Real-Time Sequencing

The killer immunoglobulin-like receptor (KIR) genes belong to the immunoglobulin superfamily and are widely studied due to the critical role they play in coordinating the innate immune response to infection and disease. Highly accurate, contiguous, long reads, like those generated by SMRT Sequencing, when combined with target-enrichment protocols, provide a straightforward strategy for generating complete de novo assembled KIR haplotypes. We have explored two different methods to capture the KIR region: one using fosmid clones and one using Nimblegen capture.


June 1, 2021  |  

Whole gene sequencing of KIR-3DL1 with SMRT Sequencing and the distribution of allelic variants in different ethnic groups

The killer-cell immunoglobulin-like receptor (KIR) gene family is involved in immune modulation during viral infection, autoimmune disease and in allogeneic stem cell transplantation. Most studies of KIR gene diversity and its impact on transplant outcome are performed using gene presence/absence assays. However, it is well known that KIR gene allelic variations have biological significance. Allele-level typing of KIR genes has been very challenging until recently due to the homologous nature of those genes and their very long intronic sequences. SMRT (Single Molecule Real-Time) Sequencing generates long reads averaging 10 to 15 kb and allows us to obtain in-phase long sequence reads. We have developed a PCR assay for SMRT Sequencing on the PacBio RS II platform in our lab for 3DL1 whole gene sequencing. This approach allows us to obtain allele-level typing for 3DL1 genes and could serve as a model for typing other KIR genes at the allelic level.


June 1, 2021  |  

Characterizing haplotype diversity at the immunoglobulin heavy chain locus across human populations using novel long-read sequencing and assembly approaches

The human immunoglobulin heavy chain locus (IGH) remains among the most understudied regions of the human genome. Recent efforts have shown that haplotype diversity within IGH is elevated and exhibits population specific patterns; for example, our re-sequencing of the locus from only a single chromosome uncovered >100 Kb of novel sequence, including descriptions of six novel alleles, and four previously unmapped genes. Historically, this complex locus architecture has hindered the characterization of IGH germline single nucleotide, copy number, and structural variants (SNVs; CNVs; SVs), and as a result, there remains little known about the role of IGH polymorphisms in inter-individual antibody repertoire variability and disease. To remedy this, we are taking a multi-faceted approach to improving existing genomic resources in the human IGH region. First, from whole-genome and fosmid-based datasets, we are building the largest and most ethnically diverse set of IGH reference assemblies to date, by employing PacBio long-read sequencing combined with novel algorithms for phased haplotype assembly. In total, our effort will result in the characterization of >15 phased haplotypes from individuals of Asian, African, and European descent, to be used as a representative reference set by the genomics and immunogenetics community. Second, we are utilizing this more comprehensive sequence catalogue to inform the design and analysis of novel targeted IGH genotyping assays. Standard targeted DNA enrichment methods (e.g., exome capture) are currently optimized for the capture of only very short (100’s of bp) DNA segments. Our platform uses a modified bench protocol to pair existing capture-array technologies with the enrichment of longer fragments of DNA, enabling the use of PacBio sequencing of DNA segments up to 7 Kb. 
This substantial increase in contiguity disambiguates many of the complex repeated structures inherent to the locus, while yielding the base pair fidelity required to call SNVs. Together these resources will establish a stronger framework for further characterizing IGH genetic diversity and facilitate IGH genomic profiling in the clinical and research settings, which will be key to fully understanding the role of IGH germline variation in antibody repertoire development and disease.


June 1, 2021  |  

Allelic specificity of immunoglobulin heavy chain (IGH) translocation in B-cell acute lymphoblastic leukemia (B-ALL) unveiled by long-read sequencing

Oncogenic fusion of IGH-DUX4 has recently been reported as a hallmark that defines a B-ALL subtype present in up to 7% of adolescent and young adult B-ALL cases. The translocation of DUX4 into IGH results in aberrant activation of DUX4 by hijacking the intronic IGH enhancer (Eµ). How IGH-DUX4 translocation interplays with IGH allelic exclusion has never been explored. We investigated this in the Nalm6 B-ALL cell line, using long-read (PacBio Iso-Seq method and 10X Chromium WGS), short-read (Illumina total stranded RNA and WGS), epigenome (H3K27ac ChIP-seq, ATAC-seq) and 3-D genome (Hi-C, H3K27ac HiChIP, Capture-C) assays.


April 21, 2020  |  

A comparison of immunoglobulin IGHV, IGHD and IGHJ genes in wild-derived and classical inbred mouse strains.

The genomes of classical inbred mouse strains include genes derived from all three major subspecies of the house mouse, Mus musculus. We recently posited that genetic diversity in the immunoglobulin heavy chain (IGH) gene loci of C57BL/6 and BALB/c mice reflects differences in subspecies origin. To investigate this hypothesis, we conducted high-throughput sequencing of IGH gene rearrangements to document IGH variable (IGHV), joining (IGHJ), and diversity (IGHD) genes in four inbred wild-derived mouse strains (CAST/EiJ, LEWES/EiJ, MSM/MsJ, and PWD/PhJ), and a single disease model strain (NOD/ShiLtJ), collectively representing genetic backgrounds of several major mouse subspecies. A total of 341 germline IGHV sequences were inferred in the wild-derived strains, including 247 not curated in the International Immunogenetics Information System. In contrast, 83/84 inferred NOD IGHV genes had previously been observed in C57BL/6 mice. Variability among the strains examined was observed for only a single IGHJ gene, involving a description of a novel allele. In contrast, unexpected variation was found in the IGHD gene loci, with four previously unreported IGHD gene sequences being documented. Very few IGHV sequences of C57BL/6 and BALB/c mice were shared with strains representing major subspecies, suggesting that their IGH loci may be complex mosaics of genes of disparate origins. This suggests a similar level of diversity is likely present in the IGH loci of other classical inbred strains. This must now be documented if we are to properly understand inter-strain variation in models of antibody-mediated disease. This article is protected by copyright. All rights reserved.


April 21, 2020  |  

Transcriptional initiation of a small RNA, not R-loop stability, dictates the frequency of pilin antigenic variation in Neisseria gonorrhoeae.

Neisseria gonorrhoeae, the sole causative agent of gonorrhea, constitutively undergoes diversification of the Type IV pilus. Gene conversion occurs between one of the several donor silent copies located in distinct loci and the recipient pilE gene, encoding the major pilin subunit of the pilus. A guanine quadruplex (G4) DNA structure and a cis-acting sRNA (G4-sRNA) are located upstream of the pilE gene and both are required for pilin antigenic variation (Av). We show that reduced sRNA transcription lowers pilin Av frequencies. Extended transcriptional elongation is not required for Av, since limiting the transcript to 32 nt allows for normal Av frequencies. Using chromatin immunoprecipitation (ChIP) assays, we show that cellular G4s are less abundant when sRNA transcription is lower. In addition, using ChIP, we demonstrate that the G4-sRNA forms a stable RNA:DNA hybrid (R-loop) with its template strand. However, modulating R-loop levels by controlling RNase HI expression does not alter G4 abundance quantified through ChIP. Since pilin Av frequencies were not altered when modulating R-loop levels by controlling RNase HI expression, we conclude that transcription of the sRNA is necessary, but stable R-loops are not required to promote pilin Av. © 2019 John Wiley & Sons Ltd.


April 21, 2020  |  

Acquired N-Linked Glycosylation Motifs in B-Cell Receptors of Primary Cutaneous B-Cell Lymphoma and the Normal B-Cell Repertoire.

Primary cutaneous follicle center lymphoma (PCFCL) is a rare mature B-cell lymphoma with an unknown etiology. PCFCL resembles follicular lymphoma (FL) by cytomorphologic and microarchitectural criteria. FL B cells are selected for N-linked glycosylation motifs in their B-cell receptors (BCRs) that are acquired during continuous somatic hypermutation. The stimulation of mannosylated BCR by lectins on the tumor microenvironment is therefore a candidate driver in FL pathogenesis. We investigated whether the same mechanism could play a role in PCFCL pathogenesis. Full-length functional variable, diversity, and joining gene sequences of 18 PCFCL and 8 primary cutaneous diffuse large B-cell lymphoma, leg-type were identified by unbiased Anchoring Reverse Transcription of Immunoglobulin Sequences and Amplification by Nested PCR and BCR reconstruction from RNA sequencing data. Low BCR variation demonstrated negligible ongoing somatic hypermutation in PCFCL and primary cutaneous diffuse large B-cell lymphoma, leg-type, and indicated that the PCFCL microarchitecture does not act as a functional germinal center. Similar to FL but in contrast to primary cutaneous diffuse large B-cell lymphoma, leg-type, BCR genes of 15 PCFCLs (83%) had acquired N-linked glycosylation motifs. These motifs were located at the BCR positions converted to N-linked glycosylation motifs in normal B-cell repertoires with low prevalence but mostly at different positions than those found in FL. The cutaneous localization of PCFCL might suggest a role for lectins from commensal skin bacteria in PCFCL lymphomagenesis. Copyright © 2019 The Authors. Published by Elsevier Inc. All rights reserved.


April 21, 2020  |  

Large-scale ruminant genome sequencing provides insights into their evolution and distinct traits.

The ruminants are one of the most successful mammalian lineages, exhibiting morphological and habitat diversity and containing several key livestock species. To better understand their evolution, we generated and analyzed de novo assembled genomes of 44 ruminant species, representing all six Ruminantia families. We used these genomes to create a time-calibrated phylogeny to resolve topological controversies, overcoming the challenges of incomplete lineage sorting. Population dynamic analyses show that population declines commenced between 100,000 and 50,000 years ago, which is concomitant with expansion in human populations. We also reveal genes and regulatory elements that possibly contribute to the evolution of the digestive system, cranial appendages, immune system, metabolism, body size, cursorial locomotion, and dentition of the ruminants. Copyright © 2019 The Authors, some rights reserved; exclusive licensee American Association for the Advancement of Science. No claim to original U.S. Government Works.

