June 1, 2021

Immune regions are no longer incomprehensible with SMRT Sequencing

The complex immune regions of the genome, including MHC and KIR, contain large copy number variants (CNVs), a high density of genes, hyper-polymorphic gene alleles, and conserved extended haplotypes (CEHs) with enormous linkage disequilibrium (LD). This complexity, together with the inherent biases of short-read sequencing, makes it challenging to extract immune region haplotype information from reference-reliant shotgun sequencing and GWAS methods. As NGS-based genome and exome sequencing and SNP arrays have become routine for population studies, numerous efforts are being made to develop software that extracts or imputes immune gene information from these datasets. Despite these efforts, fine mapping of causal variants in immune genes, which have well-documented associations with cancer, drug-induced hypersensitivity, and immune-related diseases, has been slower than expected. This has in many ways limited our understanding of the mechanisms leading to immune disease. In the present work, we demonstrate the advantages of long reads delivered by SMRT Sequencing for assembling complete haplotypes of MHC and KIR gene clusters, as well as for calling correct genotypes of the genes within them. All genotype information is detected at allele level with full phasing information across SNP-poor regions. Genotypes were called correctly from targeted gene amplicons, from haplotypes, and from a completely assembled 5 Mb contig of the MHC region from a de novo assembly of whole genome shotgun data. The de novo analysis pipeline used in all these approaches allowed reference-free analysis without imputation, which is key for interrogation without prior knowledge of ethnic background. These methods are thus easily adoptable for previously uncharacterized human or non-human species.


April 21, 2020

Tandem repeats lead to sequence assembly errors and impose multi-level challenges for genome and protein databases.

The widespread occurrence of repetitive stretches of DNA in genomes of organisms across the tree of life imposes fundamental challenges for sequencing, genome assembly, and automated annotation of genes and proteins. This multi-level problem can lead to errors in genome and protein databases that are often not recognized or acknowledged. As a consequence, end users working with sequences with repetitive regions are faced with ‘ready-to-use’ deposited data whose trustworthiness is difficult to determine, let alone to quantify. Here, we provide a review of the problems associated with tandem repeat sequences that originate from different stages during the sequencing-assembly-annotation-deposition workflow, and that may proliferate in public database repositories affecting all downstream analyses. As a case study, we provide examples of the Atlantic cod genome, whose sequencing and assembly were hindered by a particularly high prevalence of tandem repeats. We complement this case study with examples from other species, where mis-annotations and sequencing errors have propagated into protein databases. With this review, we aim to raise the awareness level within the community of database users, and alert scientists working in the underlying workflow of database creation that the data they omit or improperly assemble may well contain important biological information valuable to others. © The Author(s) 2019. Published by Oxford University Press on behalf of Nucleic Acids Research.


April 21, 2020

Adaptive archaic introgression of copy number variants and the discovery of previously unknown human genes

As they migrated out of Africa and into Europe and Asia, anatomically modern humans interbred with archaic hominins, such as Neanderthals and Denisovans. The result of this genetic introgression on the recipient populations has been of considerable interest, especially in cases of selection for specific archaic genetic variants. Hsieh et al. characterized adaptive structural variants and copy number variants that are likely targets of positive selection in Melanesians. Focusing on population-specific regions of the genome that carry duplicated genes and show an excess of amino acid replacements provides evidence for one of the mechanisms by which genetic novelty can arise and result in differentiation between human genomes. Science, this issue p. eaax2083. INTRODUCTION: Characterizing genetic variants underlying local adaptations in human populations is one of the central goals of evolutionary research. Most studies have focused on adaptive single-nucleotide variants that either arose as new beneficial mutations or were introduced after interbreeding with our now-extinct relatives, including Neanderthals and Denisovans. The adaptive role of copy number variants (CNVs), another well-known form of genomic variation generated through deletions or duplications that affect more base pairs in the genome, is less well understood, despite evidence that such mutations are subject to stronger selective pressures. RATIONALE: This study focuses on the discovery of introgressed and adaptive CNVs that have become enriched in specific human populations. We combine whole-genome CNV calling and population genetic inference methods to discover CNVs and then assess signals of selection after controlling for demographic history. We examine 266 publicly available modern human genomes from the Simons Genome Diversity Project and genomes of three ancient hominins: a Denisovan, a Neanderthal from the Altai Mountains in Siberia, and a Neanderthal from Croatia.
We apply long-read sequencing methods to sequence-resolve complex CNVs of interest specifically in the Melanesians, an Oceanian population distributed from Papua New Guinea to as far east as the islands of Fiji and known to harbor some of the greatest amounts of Neanderthal and Denisovan ancestry. RESULTS: Consistent with the hypothesis of archaic introgression outside Africa, we find a significant excess of CNV sharing between modern non-African populations and archaic hominins (P = 0.039). Among Melanesians, we observe an enrichment of CNVs with potential signals of positive selection (n = 37 CNVs), of which 19 CNVs likely introgressed from archaic hominins. We show that Melanesian-stratified CNVs are significantly associated with signals of positive selection (P = 0.0323). Many map near or within genes associated with metabolism (e.g., ACOT1 and ACOT2), development and cell cycle or signaling (e.g., TNFRSF10D and CDK11A and CDK11B), or immune response (e.g., IFNLR1). We characterize two of the largest and most complex CNVs on chromosomes 16p11.2 and 8p21.3 that introgressed from Denisovans and Neanderthals, respectively, and are absent from most other human populations. At chromosome 16p11.2, we sequence-resolve a large duplication of >383 thousand base pairs (kbp) that originated from Denisovans and introgressed into the ancestral Melanesian population 60,000 to 170,000 years ago. This large duplication occurs at high frequency (>79%) in diverse Melanesian groups, shows signatures of positive selection, and maps adjacent to Homo sapiens-specific duplications that predispose to rearrangements associated with autism. On chromosome 8p21.3, we identify a Melanesian haplotype that carries two CNVs, a ~6-kbp deletion, and a ~38-kbp duplication, with a Neanderthal origin and that introgressed into non-Africans 40,000 to 120,000 years ago.
This CNV haplotype occurs at high frequency (44%) and shows signals consistent with a partial selective sweep in Melanesians. Using long-read sequencing genomic and transcriptomic data, we reconstruct the structure and complex evolutionary history for these two CNVs and discover previously undescribed duplicated genes (TNFRSF10D1, TNFRSF10D2, and NPIPB16) that show an excess of amino acid replacements consistent with the action of positive selection. CONCLUSION: Our results suggest that large CNVs originating in archaic hominins and introgressed into modern humans have played an important role in local population adaptation and represent an insufficiently studied source of large-scale genetic variation that is absent from current reference genomes. Large adaptive-introgressed CNVs at chromosomes 8p21.3 and 16p11.2 in Melanesians. The magnifying glasses highlight structural differences between the archaic (top) and reference (bottom) genomes. Neanderthal (red) and Denisovan (blue) haplotypes encompassing large CNVs occur at high frequencies in Melanesians (44 and 79%, respectively) but are absent (black) in all non-Melanesians. These CNVs create positively selected genes (TNFRSF10D1, TNFRSF10D2, and NPIPB16) that are absent from the reference genome. Copy number variants (CNVs) are subject to stronger selective pressure than single-nucleotide variants, but their roles in archaic introgression and adaptation have not been systematically investigated. We show that stratified CNVs are significantly associated with signatures of positive selection in Melanesians and provide evidence for adaptive introgression of large CNVs at chromosomes 16p11.2 and 8p21.3 from Denisovans and Neanderthals, respectively. Using long-read sequence data, we reconstruct the structure and complex evolutionary history of these polymorphisms and show that both encode positively selected genes absent from most human populations.
Our results collectively suggest that large CNVs originating in archaic hominins and introgressed into modern humans have played an important role in local population adaptation and represent an insufficiently studied source of large-scale genetic variation.


April 21, 2020

The comparative genomics and complex population history of Papio baboons.

Recent studies suggest that closely related species can accumulate substantial genetic and phenotypic differences despite ongoing gene flow, thus challenging traditional ideas regarding the genetics of speciation. Baboons (genus Papio) are Old World monkeys consisting of six readily distinguishable species. Baboon species hybridize in the wild, and prior data imply a complex history of differentiation and introgression. We produced a reference genome assembly for the olive baboon (Papio anubis) and whole-genome sequence data for all six extant species. We document multiple episodes of admixture and introgression during the radiation of Papio baboons, thus demonstrating their value as a model of complex evolutionary divergence, hybridization, and reticulation. These results help inform our understanding of similar cases, including modern humans, Neanderthals, Denisovans, and other ancient hominins.


April 21, 2020

Current advances in HIV vaccine preclinical studies using Macaque models.

The macaque simian or simian/human immunodeficiency virus (SIV/SHIV) challenge model has been widely used to inform and guide human vaccine trials. Substantial advances have been made recently in applying the repeated-low-dose (RLD) challenge approach to assess SIV/SHIV vaccine efficacies (VE). Some candidate HIV vaccines have shown protective effects in preclinical studies using the macaque SIV/SHIV model, but the model’s true predictive value for screening potential HIV vaccine candidates needs to be evaluated further. Here, we review key parameters used in the RLD approach and discuss their relevance for evaluating VE to improve preclinical studies of candidate HIV vaccines. Crown Copyright © 2019. Published by Elsevier Ltd. All rights reserved.


April 21, 2020

Alternative Splicing of the Delta-Opioid Receptor Gene Suggests Existence of New Functional Isoforms.

The delta-opioid receptor (DOPr) participates in mediating the effects of opioid analgesics. However, no selective agonists have entered clinical care despite potential to ameliorate many neurological and psychiatric disorders. In an effort to address the drug development challenges, the functional contribution of receptor isoforms created by alternative splicing of the three-exonic coding gene, OPRD1, has been overlooked. We report that the gene is transcriptionally more diverse than previously demonstrated, producing novel protein isoforms in humans and mice. We provide support for the functional relevance of splice variants through context-dependent expression profiling (tissues, disease model) and conservation of the transcriptional landscape in closely related vertebrates. The conserved alternative transcriptional events have two distinct patterns. First, cassette exon inclusions between exons 1 and 2 interrupt the reading frame, producing truncated receptor fragments comprising only the first transmembrane (TM) domain, despite the lack of exact exon orthologues between distant species. Second, a novel promoter and transcriptional start site upstream of exon 2 produces a transcript of an N-terminally truncated 6TM isoform. However, a fundamental difference in the exonic landscaping as well as translation and translation products poses limits for modelling the human DOPr receptor system in mice.


April 21, 2020

Vaccine-induced protection from homologous tier 2 SHIV challenge in nonhuman primates depends on serum-neutralizing antibody titers.

Passive administration of HIV neutralizing antibodies (nAbs) can protect macaques from hard-to-neutralize (tier 2) chimeric simian-human immunodeficiency virus (SHIV) challenge. However, conditions for nAb-mediated protection after vaccination have not been established. Here, we selected groups of 6 rhesus macaques with either high or low serum nAb titers from a total of 78 animals immunized with recombinant native-like (SOSIP) Env trimers. Repeat intrarectal challenge with homologous tier 2 SHIVBG505 led to rapid infection in unimmunized and low-titer animals. High-titer animals, however, demonstrated protection that was gradually lost as nAb titers waned over time. An autologous serum ID50 nAb titer of ~1:500 afforded more than 90% protection from medium-dose SHIV infection. In contrast, antibody-dependent cellular cytotoxicity and T cell activity did not correlate with protection. Therefore, Env protein-based vaccination strategies can protect against hard-to-neutralize SHIV challenge in rhesus macaques by inducing tier 2 nAbs, provided appropriate neutralizing titers can be reached and maintained. Copyright © 2018 The Author(s). Published by Elsevier Inc. All rights reserved.


April 21, 2020

Characterization of Mauritian cynomolgus macaque FcγR alleles using long-read sequencing.

The FcγRs are immune cell surface proteins that bind IgG and facilitate cytokine production, phagocytosis, and Ab-dependent, cell-mediated cytotoxicity. FcγRs play a critical role in immunity; variation in these genes is implicated in autoimmunity and other diseases. Cynomolgus macaques are an excellent animal model for many human diseases, and Mauritian cynomolgus macaques (MCMs) are particularly useful because of their restricted genetic diversity. Previous studies of MCM immune gene diversity have focused on the MHC and killer cell Ig-like receptor. In this study, we characterize FcγR diversity in 48 MCMs using PacBio long-read sequencing to identify novel alleles of each of the four expressed MCM FcγR genes. We also developed a high-throughput FcγR genotyping assay, which we used to determine allele frequencies and identify FcγR haplotypes in more than 500 additional MCMs. We found three alleles for FcγR1A, seven each for FcγR2A and FcγR2B, and four for FcγR3A; these segregate into eight haplotypes. We also assessed whether different FcγR alleles confer different Ab-binding affinities by surface plasmon resonance and found minimal difference in binding affinities across alleles for a panel of wild type and Fc-engineered human IgG. This work suggests that although MCMs may not fully represent the diversity of FcγR responses in humans, they may offer highly reproducible results for mAb therapy and toxicity studies. Copyright © 2018 by The American Association of Immunologists, Inc.


April 21, 2020

Long-read assembly of the Chinese rhesus macaque genome and identification of ape-specific structural variants.

We present a high-quality de novo genome assembly (rheMacS) of the Chinese rhesus macaque (Macaca mulatta) using long-read sequencing and multiplatform scaffolding approaches. Compared to the current Indian rhesus macaque reference genome (rheMac8), rheMacS increases sequence contiguity 75-fold, closing 21,940 of the remaining assembly gaps (60.8 Mbp). We improve gene annotation by generating more than two million full-length transcripts from ten different tissues by long-read RNA sequencing. We sequence resolve 53,916 structural variants (96% novel) and identify 17,000 ape-specific structural variants (ASSVs) based on comparison to ape genomes. Many ASSVs map within ChIP-seq predicted enhancer regions where apes and macaque show diverged enhancer activity and gene expression. We further characterize a subset that may contribute to ape- or great-ape-specific phenotypic traits, including taillessness, brain volume expansion, improved manual dexterity, and large body size. The rheMacS genome assembly serves as an ideal reference for future biomedical and evolutionary studies.


October 23, 2019

Transmission, evolution, and endogenization: Lessons learned from recent retroviral invasions.

Viruses of the subfamily Orthoretrovirinae are defined by the ability to reverse transcribe an RNA genome into DNA that integrates into the host cell genome during the intracellular virus life cycle. Exogenous retroviruses (XRVs) are horizontally transmitted between host individuals, with disease outcome depending on interactions between the retrovirus and the host organism. When retroviruses infect germ line cells of the host, they may become endogenous retroviruses (ERVs), which are permanent elements in the host germ line that are subject to vertical transmission. These ERVs sometimes remain infectious and can themselves give rise to XRVs. This review integrates recent developments in the phylogenetic classification of retroviruses and the identification of retroviral receptors to elucidate the origins and evolution of XRVs and ERVs. We consider whether ERVs may recurrently pressure XRVs to shift receptor usage to sidestep ERV interference. We discuss how related retroviruses undergo alternative fates in different host lineages after endogenization, with koala retrovirus (KoRV) receiving notable interest as a recent invader of its host germ line. KoRV is heritable but also infectious, which provides insights into the early stages of germ line invasions as well as XRV generation from ERVs. The relationship of KoRV to primate and other retroviruses is placed in the context of host biogeography and the potential role of bats and rodents as vectors for interspecies viral transmission. Combining studies of extant XRVs and “fossil” endogenous retroviruses in koalas and other Australasian species has broadened our understanding of the evolution of retroviruses and host-retrovirus interactions. Copyright © 2017 American Society for Microbiology.


October 23, 2019

Bioengineered viral platform for intramuscular passive vaccine delivery to human skeletal muscle.

Skeletal muscle is ideal for passive vaccine administration as it is easily accessible by intramuscular injection. Recombinant adeno-associated virus (rAAV) vectors are in consideration for passive vaccination clinical trials for HIV and influenza. However, greater human skeletal muscle transduction is needed for therapeutic efficacy than is possible with existing serotypes. To bioengineer capsids with therapeutic levels of transduction, we utilized a directed evolution approach to screen libraries of shuffled AAV capsids in pools of surgically resected human skeletal muscle cells from five patients. Six rounds of evolution were performed in various muscle cell types, and evolved variants were validated against existing muscle-tropic serotypes rAAV1, 6, and 8. We found that evolved variants NP22 and NP66 had significantly increased primary human and rhesus skeletal muscle fiber transduction from surgical explants ex vivo and in various primary and immortalized myogenic lines in vitro. Importantly, we demonstrated reduced seroreactivity compared to existing serotypes against normal human serum from 50 adult donors. These capsids represent powerful tools for human skeletal muscle expression and secretion of antibodies from passive vaccines.


October 23, 2019

Bioengineered AAV capsids with combined high human liver transduction in vivo and unique humoral seroreactivity.

Existing recombinant adeno-associated virus (rAAV) serotypes for delivering in vivo gene therapy treatments for human liver diseases have not yielded combined high-level human hepatocyte transduction and favorable humoral neutralization properties in diverse patient groups. Yet, these combined properties are important for therapeutic efficacy. To bioengineer capsids that exhibit both unique seroreactivity profiles and functionally transduce human hepatocytes at therapeutically relevant levels, we performed multiplexed sequential directed evolution screens using diverse capsid libraries in both primary human hepatocytes in vivo and with pooled human sera from thousands of patients. AAV libraries were subjected to five rounds of in vivo selection in xenografted mice with human livers to isolate an enriched human-hepatotropic library that was then used as input for a sequential on-bead screen against pooled human immunoglobulins. Evolved variants were vectorized and validated against existing hepatotropic serotypes. Two of the evolved AAV serotypes, NP40 and NP59, exhibited dramatically improved functional human hepatocyte transduction in vivo in xenografted mice with human livers, along with favorable human seroreactivity profiles, compared with existing serotypes. These novel capsids represent enhanced vector delivery systems for future human liver gene therapy applications. Copyright © 2017. Published by Elsevier Inc.


September 22, 2019

Rapid infectious disease identification by next-generation DNA sequencing.

Currently, there is a critical need to rapidly identify infectious organisms in clinical samples. Next-Generation Sequencing (NGS) could surmount the deficiencies of culture-based methods; however, there are no standardized, automated programs to process NGS data. To address this deficiency, we developed the Rapid Infectious Disease Identification (RIDI™) system. The system requires minimal guidance, which reduces operator errors. The system is compatible with the three major NGS platforms. It automatically interfaces with the sequencing system, detects its data format, configures the analysis type, applies appropriate quality control, and analyzes the results. Sequence information is characterized using both the NCBI database and RIDI™-specific databases. RIDI™ was designed to identify high-probability sequence matches and more divergent matches that could represent different or novel species. We challenged the system using defined American Type Culture Collection (ATCC) reference standards of 27 species, both individually and in varying combinations. The system was able to rapidly detect known organisms in <12 h with multi-sample throughput. The system accurately identifies 99.5% of the DNA sequence reads at the genus level and 75.3% at the species level in reference standards. It has a limit of detection of 146 cells/ml in simulated clinical samples, and is also able to identify the components of polymicrobial samples with 16.9% discrepancy at the genus level and 31.2% at the species level. Thus, the system's effectiveness may exceed current methods, especially in situations where culture methods could produce false negatives or where rapid results would influence patient outcomes. Copyright © 2016 Elsevier B.V. All rights reserved.


September 22, 2019

The role of MHC-E in T cell immunity is conserved among humans, rhesus macaques, and cynomolgus macaques.

MHC-E is a highly conserved nonclassical MHC class Ib molecule that predominantly binds and presents MHC class Ia leader sequence-derived peptides for NK cell regulation. However, MHC-E also binds pathogen-derived peptide Ags for presentation to CD8+ T cells. Given this role in adaptive immunity and its highly monomorphic nature in the human population, HLA-E is an attractive target for novel vaccine and immunotherapeutic modalities. Development of HLA-E-targeted therapies will require a physiologically relevant animal model that recapitulates HLA-E-restricted T cell biology. In this study, we investigated MHC-E immunobiology in two common nonhuman primate species, Indian-origin rhesus macaques (RM) and Mauritian-origin cynomolgus macaques (MCM). Compared to humans and MCM, RM expressed a greater number of MHC-E alleles at both the population and individual level. Despite this difference, human, RM, and MCM MHC-E molecules were expressed at similar levels across immune cell subsets, equivalently upregulated by viral pathogens, and bound and presented identical peptides to CD8+ T cells. Indeed, SIV-specific, Mamu-E-restricted CD8+ T cells from RM recognized antigenic peptides presented by all MHC-E molecules tested, including cross-species recognition of human and MCM SIV-infected CD4+ T cells. Thus, MHC-E is functionally conserved among humans, RM, and MCM, and both RM and MCM represent physiologically relevant animal models of HLA-E-restricted T cell immunobiology. Copyright © 2017 by The American Association of Immunologists, Inc.

