New technologies and analysis methods are enabling genomic structural variants (SVs) to be detected with ever-increasing accuracy, resolution, and comprehensiveness. Translating these methods to routine research and clinical practice requires robust benchmark sets. We developed the first benchmark set for identification of both false negative and false positive germline SVs, which complements recent efforts emphasizing increasingly comprehensive characterization of SVs. To create this benchmark for a broadly consented son in a Personal Genome Project trio with broadly available cells and DNA, the Genome in a Bottle (GIAB) Consortium integrated 19 sequence-resolved variant calling methods, both alignment- and de novo assembly-based, from short-, linked-, and long-read sequencing, as well as optical and electronic mapping. The final benchmark set contains 12745 isolated, sequence-resolved insertion and deletion calls ≥50 base pairs (bp) discovered by at least 2 technologies or 5 callsets, genotyped as heterozygous or homozygous variants by long reads. The Tier 1 benchmark regions, for which any extra calls are putative false positives, cover 2.66 Gbp and 9641 SVs supported by at least one diploid assembly. Support for SVs was assessed using svviz with short-, linked-, and long-read sequence data. In general, there was strong support from multiple technologies for the benchmark SVs, with 90% of the Tier 1 SVs having support in reads from more than one technology. The Mendelian genotype error rate was 0.3%, and genotype concordance with manual curation was >98.7%. We demonstrate the utility of the benchmark set by showing it reliably identifies both false negatives and false positives in high-quality SV callsets from short-, linked-, and long-read sequencing and optical mapping.
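The size and support thresholds stated in the abstract (insertions or deletions of at least 50 bp, found by at least 2 technologies or 5 callsets) can be illustrated with a minimal sketch. The `SVCall` record and its field names are hypothetical and are not taken from the GIAB pipeline, which also requires calls to be isolated, sequence-resolved, and genotyped by long reads; none of that is modeled here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SVCall:
    """Hypothetical summary record for one candidate SV (not a GIAB data type)."""
    sv_type: str          # "INS" or "DEL"
    length_bp: int        # absolute SV length in base pairs
    n_technologies: int   # distinct technologies supporting the call
    n_callsets: int       # distinct callsets supporting the call


def meets_support_criteria(call: SVCall,
                           min_len: int = 50,
                           min_tech: int = 2,
                           min_callsets: int = 5) -> bool:
    """Apply only the thresholds quoted in the abstract.

    Assumption: a call passes if it is an insertion or deletion of at least
    `min_len` bp and is supported by at least `min_tech` technologies OR at
    least `min_callsets` callsets.
    """
    if call.sv_type not in ("INS", "DEL"):
        return False
    if call.length_bp < min_len:
        return False
    return call.n_technologies >= min_tech or call.n_callsets >= min_callsets


if __name__ == "__main__":
    candidates = [
        SVCall("DEL", 320, n_technologies=3, n_callsets=4),
        SVCall("INS", 75, n_technologies=1, n_callsets=6),
        SVCall("INS", 49, n_technologies=4, n_callsets=9),   # too short
        SVCall("DEL", 1200, n_technologies=1, n_callsets=2), # too little support
    ]
    for c in candidates:
        print(c, meets_support_criteria(c))
```

The defaults mirror the numbers in the abstract; keeping them as parameters makes it easy to see how the candidate set would change under stricter or looser support requirements.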
Acquired N-Linked Glycosylation Motifs in B-Cell Receptors of Primary Cutaneous B-Cell Lymphoma and the Normal B-Cell Repertoire.
Primary cutaneous follicle center lymphoma (PCFCL) is a rare mature B-cell lymphoma with an unknown etiology. PCFCL resembles follicular lymphoma (FL) by cytomorphologic and microarchitectural criteria. FL B cells are selected for N-linked glycosylation motifs in their B-cell receptors (BCRs) that are acquired during continuous somatic hypermutation. The stimulation of mannosylated BCR by lectins in the tumor microenvironment is therefore a candidate driver in FL pathogenesis. We investigated whether the same mechanism could play a role in PCFCL pathogenesis. Full-length functional variable, diversity, and joining gene sequences of 18 PCFCL and 8 primary cutaneous diffuse large B-cell lymphoma, leg-type were identified by unbiased Anchoring Reverse Transcription of Immunoglobulin Sequences and Amplification by Nested PCR and BCR reconstruction from RNA sequencing data. Low BCR variation demonstrated negligible ongoing somatic hypermutation in PCFCL and primary cutaneous diffuse large B-cell lymphoma, leg-type, and indicated that the PCFCL microarchitecture does not act as a functional germinal center. Similar to FL but in contrast to primary cutaneous diffuse large B-cell lymphoma, leg-type, BCR genes of 15 PCFCLs (83%) had acquired N-linked glycosylation motifs. These motifs were located at the BCR positions converted to N-linked glycosylation motifs in normal B-cell repertoires with low prevalence but mostly at different positions than those found in FL. The cutaneous localization of PCFCL might suggest a role for lectins from commensal skin bacteria in PCFCL lymphomagenesis. Copyright © 2019 The Authors. Published by Elsevier Inc. All rights reserved.
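As an illustration of what an "acquired" N-linked glycosylation motif means at the sequence level, the sketch below scans amino acid sequences for the canonical sequon N-X-S/T (X being any residue except proline) and reports sites present in a mutated BCR sequence but absent from its germline counterpart. This is not the authors' analysis pipeline: the function names are hypothetical, the toy sequences are invented, and the comparison assumes the two sequences are already aligned and of equal length.

```python
import re

# Canonical N-glycosylation sequon: Asn, any residue except Pro, then Ser or Thr.
# The lookahead lets overlapping sequons (e.g. "NNTT") be reported separately.
SEQUON = re.compile(r"N(?=[^P][ST])")


def find_sequons(protein: str) -> list[int]:
    """Return 0-based positions of the Asn in every N-X-S/T sequon."""
    return [m.start() for m in SEQUON.finditer(protein.upper())]


def acquired_sequons(germline: str, mutated: str) -> list[int]:
    """Return sequon positions found in `mutated` but not in `germline`.

    Assumption: both sequences are pre-aligned and of equal length, so that
    positions are directly comparable.
    """
    if len(germline) != len(mutated):
        raise ValueError("sequences must be aligned to the same length")
    germline_sites = set(find_sequons(germline))
    return [pos for pos in find_sequons(mutated) if pos not in germline_sites]


if __name__ == "__main__":
    # Toy sequences, invented for illustration only.
    germline = "ASGSQNPTNAT"
    mutated = "ANGSQNPTNAT"   # S->N at position 1 creates a new N-G-S sequon
    print(find_sequons(germline))
    print(find_sequons(mutated))
    print(acquired_sequons(germline, mutated))
```

In the toy example the N-P-T stretch is not counted because proline in the middle position disrupts the motif, while the N-A-T site present in both sequences is treated as germline-encoded rather than acquired.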
Do the toll-like receptors and complement systems play equally important roles in freshwater adapted Dolly Varden char (Salvelinus malma)?
Unlike populations with the normal anadromous lifestyle, the Chinese native Dolly Varden char (Salvelinus malma) is landlocked and spends its entire life in fresh water. To explore the effect of freshwater adaptation on its immune system, we constructed a pooled cDNA library from the hepatopancreas and spleen of Chinese freshwater Dolly Varden char (S. malma). A total of 27,829 unigenes were generated from 31,233 high-quality transcripts, and 17,670 complete open reading frames (ORFs) were identified. In total, 25,809 unigenes were successfully annotated. The annotation classified more innate than adaptive immunity-associated genes, and more genes involved in the toll-like receptor signaling pathway than in the complement and coagulation cascades (51 vs 3), implying that toll-like receptors play a relatively more important role than the complement system under bacterial injection in the freshwater Dolly Varden char. This large difference in the numbers of TLR and complement system genes identified in freshwater Dolly Varden char is probably caused by distinct evolutionary pressure patterns between the fish TLR and complement systems, represented by TLR3 and TLR5 as well as C4 and C6, which were under purifying and positive selection pressure, respectively. A further seawater adaptation experiment and a comparison with our library will help to elucidate the effect of freshwater adaptation on the immune system of the Chinese native Dolly Varden char. Copyright © 2018 Elsevier Ltd. All rights reserved.
The antibody repertoire of Bos taurus is characterized by a subset of variable heavy (VH) chain regions with ultralong third complementarity determining regions (CDR3) which, compared to other species, can provide a potent response to challenging antigens like HIV env. These unusual CDR3 can range to over seventy highly diverse amino acids in length and form unique β-ribbon ‘stalk’ and disulfide bonded ‘knob’ structures, far from the typical antigen binding site. The genetic components and processes for forming these unusual cattle antibody VH CDR3 are not well understood. Here we analyze sequences of Bos taurus antibody VH domains and find that the subset with ultralong CDR3 exclusively uses a single variable gene, IGHV1-7 (VHBUL) rearranged to the longest diversity gene, IGHD8-2. An eight nucleotide duplication at the 3′ end of IGHV1-7 encodes a longer V-region producing an extended F β-strand that contributes to the stalk in a rearranged CDR3. A low amino acid variability was observed in CDR1 and CDR2, suggesting that antigen binding for this subset most likely only depends on the CDR3. Importantly a novel, potentially AID mediated, deletional diversification mechanism of the B. taurus VH ultralong CDR3 knob was discovered, in which interior codons of the IGHD8-2 region are removed while maintaining integral structural components of the knob and descending strand of the stalk in place. These deletions serve to further diversify cysteine positions, and thus disulfide bonded loops. Hence, both germline and somatic genetic factors and processes appear to be involved in diversification of this structurally unusual cattle VH ultralong CDR3 repertoire.
The genomic structure of the Major Histocompatibility Complex (MHC) region and variation in selected MHC class I related genes in Old World camels, Camelus bactrianus and Camelus dromedarius, were studied. The overall genomic organization of the camel MHC region follows a general pattern observed in other mammalian species and individual MHC loci appear to be well conserved. Selected MHC class I genes B-67 and BL3-7 exhibited unexpectedly low variability, even when compared to other camel MHC class I related genes MR1 and MICA. Interspecific SNP and allele sharing are relatively common, and frequencies of heterozygotes are usually low. Such a low variation in a genomic region generally considered as one of the most polymorphic in vertebrate genomes is unusual. Evolutionary relationships between MHC class I related genes and their counterparts from other species seem to be rather complex. Often, they do not follow the general evolutionary history of the species concerned. Close evolutionary relationships of individual MHC class I loci between camels, humans and dogs were observed. Based on the results of this study and on our data on MHC class II genes, the extent and the pattern of polymorphism of the MHC region of Old World camelids differed from most mammalian groups studied so far. Camels thus seem to be an important model for our understanding of the role of genetic diversity in immune functions, especially in the context of unique features of their immunoglobulin and T-cell receptor genes. © 2019 John Wiley & Sons A/S. Published by John Wiley & Sons Ltd.
Long-read sequencing technologies have advantages in genome assembly, structural variant detection and haplotype phasing, but are less suited for single-nucleotide variant (SNV) and insertion/deletion (indel) calling because of their higher error rate compared with short-read sequencing. Wenger et al., from Pacific Biosciences, optimized the circular consensus sequencing (CCS) protocol to achieve long, high-fidelity reads, selecting SMRTbell library fractions tightly distributed around 15 kb for high-coverage sequencing.