June 1, 2021  |  

Access full spectrum of polymorphisms in HLA class I & II genes, without imputation for disease association and evolutionary research.

MHC class I and II genes are closely monitored by high-resolution sequencing for organ transplant decisions because of their role in GVHD. Their direct or linkage-based causal associations have increased their prominence as targets for drug sensitivity, autoimmune, cancer and infectious disease research. Monitoring HLA genes can, however, be tricky because they are highly polymorphic, so allele-level resolution is strongly preferred. Historically, technological challenges focused most studies on the peptide-binding domains of the HLA genes. As a result, knowledge about the functional role of polymorphisms outside exons 2 and 3 of HLA genes has been limited. There are also relatively few full-length gene references currently available in the IMGT/HLA database, which has made it difficult to quickly adopt high-throughput, reference-reliant methods for allele-level HLA sequencing. Nonetheless, increasing awareness of the role of regulatory-region polymorphisms of HLA genes in disease association¹ has brought about a revolution in full-length HLA gene sequencing. Researchers are now exploring ways to obtain complete information for HLA genes and integrate it with the current HLA database so that it can be interpreted and used by clinical researchers. We explored the advantages of SMRT Sequencing for obtaining fully phased, allele-specific sequences of HLA class I and II genes for 96 samples, using a completely de novo consensus generation approach for imputation-free 4-field typing. With long read lengths (average >10 kb) and consensus accuracy exceeding 99.999% (Q50), a comprehensive snapshot of variants in exons, introns and UTRs could be obtained, capturing the spectrum of polymorphisms in phase across SNP-poor regions. Such information can provide invaluable insights for future causality association and population diversity research.
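The pairing of "99.999%" with "Q50" above follows the standard Phred scale, Q = -10 · log10(error probability). The short sketch below illustrates that arithmetic only; the function names are hypothetical and are not part of any PacBio tool.

```python
import math


def accuracy_to_phred(accuracy: float) -> float:
    """Convert a per-base accuracy (0 <= accuracy < 1) to a Phred quality value."""
    if not 0.0 <= accuracy < 1.0:
        raise ValueError("accuracy must be in the range [0, 1)")
    error_probability = 1.0 - accuracy
    return -10.0 * math.log10(error_probability)


def phred_to_accuracy(quality: float) -> float:
    """Convert a Phred quality value back to a per-base accuracy."""
    return 1.0 - 10.0 ** (-quality / 10.0)


if __name__ == "__main__":
    # 99.999% consensus accuracy is one expected error per 100,000 bases.
    print(f"Q for 99.999% accuracy: {accuracy_to_phred(0.99999):.1f}")
    print(f"Accuracy at Q50: {phred_to_accuracy(50):.5%}")
```

On this scale each additional 10 Q units corresponds to a tenfold drop in the expected error rate.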


June 1, 2021  |  

Immune regions are no longer incomprehensible with SMRT Sequencing

The complex immune regions of the genome, including MHC and KIR, contain large copy number variants (CNVs), a high density of genes, hyper-polymorphic gene alleles, and conserved extended haplotypes (CEH) with enormous linkage disequilibrium (LD). This level of complexity, together with the inherent biases of short-read sequencing, makes it challenging to extract immune region haplotype information from reference-reliant, shotgun sequencing and GWAS methods. As NGS-based genome and exome sequencing and SNP arrays have become routine for population studies, numerous efforts are being made to develop software that extracts and/or imputes immune gene information from these datasets. Despite these efforts, the fine mapping of causal variants of immune genes for their well-documented association with cancer, drug-induced hypersensitivity and immune-related diseases has been slower than expected. This has in many ways limited our understanding of the mechanisms leading to immune disease. In the present work, we demonstrate the advantages of long reads delivered by SMRT Sequencing for assembling complete haplotypes of MHC and KIR gene clusters, as well as for calling correct genotypes of the genes contained within them. All the genotype information is detected at the allele level with full phasing information across SNP-poor regions. Genotypes were called correctly from targeted gene amplicons, from haplotypes, and from a completely assembled 5 Mb contig of the MHC region from a de novo assembly of whole genome shotgun data. The de novo analysis pipeline used in all these approaches allowed for reference-free analysis without imputation, which is key for interrogation without prior knowledge about ethnic backgrounds. These methods are thus easily adoptable for previously uncharacterized human or non-human species.


June 1, 2021  |  

Full-Length RNA-seq of Alzheimer brain on the PacBio Sequel II System

The PacBio Iso-Seq method produces high-quality, full-length transcripts and can characterize a whole transcriptome with a single SMRT Cell 8M. We sequenced an Alzheimer whole brain sample on a single SMRT Cell 8M on the Sequel II System. Using the Iso-Seq bioinformatics pipeline followed by SQANTI2 analysis, we detected 162,290 transcripts for 17,670 genes, up to 14 kb in length. More than 60% of the transcripts are novel isoforms, the vast majority of which have supporting CAGE peak data and polyadenylation signals, demonstrating the utility of long-read sequencing for human disease research.


June 1, 2021  |  

A workflow for the comprehensive detection and prioritization of variants in human genomes with PacBio HiFi reads

PacBio HiFi reads (minimum 99% accuracy, 15-25 kb read length) have emerged as a powerful data type for comprehensive variant detection in human genomes. The HiFi read length extends confident mapping and variant calling to repetitive regions of the genome that are not accessible with short reads. Read length also improves detection of structural variants (SVs), with recall exceeding that of short reads by over 30%. High read quality allows for accurate single nucleotide variant and small indel detection, with precision and recall matching those of short reads. While many tools have been developed to take advantage of these qualities of HiFi reads, there is no end-to-end workflow for the filtering and prioritization of variants uniquely detected with long reads for rare and undiagnosed disease research. We have developed a flexible, modular workflow and web portal for variant analysis from HiFi reads and applied it to a set of rare disease cases unsolved by short-read whole genome sequencing. We expect that broad application of long-read variant detection workflows will solve many more rare disease cases. We have made these tools available at https://github.com/williamrowell/pbRUGD-workflow, and we hope they serve as a starting point for developing a robust analysis framework for long-read variant detection for rare diseases.
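For readers less familiar with the benchmarking terms used above, precision and recall can be computed from counts of true-positive, false-positive and false-negative variant calls. The sketch below is illustrative only: the function name and the example counts are hypothetical and are not taken from the workflow or its results.

```python
def precision_recall(true_positives: int, false_positives: int, false_negatives: int):
    """Return (precision, recall) for a set of variant calls.

    precision = TP / (TP + FP): fraction of reported calls that are correct.
    recall    = TP / (TP + FN): fraction of true variants that were reported.
    """
    called = true_positives + false_positives
    truth = true_positives + false_negatives
    precision = true_positives / called if called else 0.0
    recall = true_positives / truth if truth else 0.0
    return precision, recall


if __name__ == "__main__":
    # Hypothetical counts, chosen only to illustrate the calculation.
    precision, recall = precision_recall(
        true_positives=90, false_positives=10, false_negatives=30
    )
    print(f"precision={precision:.2f} recall={recall:.2f}")
```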


April 21, 2020  |  

Schizophrenia risk variants influence multiple classes of transcripts of sorting nexin 19 (SNX19).

Genome-wide association studies (GWAS) have identified many genomic loci associated with risk for schizophrenia, but unambiguous identification of the relationship between disease-associated variants and specific genes, and in particular their effect on risk-conferring transcripts, has proven difficult. To better understand the specific molecular mechanism(s) at the schizophrenia locus in 11q25, we undertook cis expression quantitative trait loci (cis-eQTL) mapping for this 2 megabase genomic region using postmortem human brain samples. To comprehensively assess the effects of genetic risk upon local expression, we evaluated multiple transcript features (genes, exons, and exon-exon junctions) in multiple brain regions: dorsolateral prefrontal cortex (DLPFC), hippocampus, and caudate. Genetic risk variants strongly associated with expression of SNX19 transcript features that tag multiple rare classes of SNX19 transcripts, whereas they only weakly affected expression of an exon-exon junction that tags the majority of abundant transcripts. The most prominent class of SNX19 risk-associated transcripts is predicted to be overexpressed, is defined by an exon-exon splice junction between exons 8 and 10 (junc8.10), and is predicted to encode proteins that lack the characteristic nexin C-terminal domain. Risk alleles were also associated with either increased or decreased expression of multiple additional classes of transcripts. With RACE, molecular cloning, and long-read sequencing, we found a number of novel SNX19 transcripts that further define the set of potential etiological transcripts. We explored epigenetic regulation of SNX19 expression and found that DNA methylation at CpG sites near the primary transcription start site and within exon 2 partially mediates the effects of risk variants on risk-associated expression. ATAC sequencing revealed that some of the most strongly risk-associated SNPs are located within a region of open chromatin, suggesting that a nearby regulatory element is involved. These findings indicate a potentially complex molecular etiology, in which risk alleles for schizophrenia generate epigenetic alterations and dysregulation of multiple classes of SNX19 transcripts.


April 21, 2020  |  

The ADEP Biosynthetic Gene Cluster in Streptomyces hawaiiensis NRRL 15010 Reveals an Accessory clpP Gene as a Novel Antibiotic Resistance Factor.

The increasing threat posed by multiresistant bacterial pathogens necessitates the discovery of novel antibacterials with unprecedented modes of action. ADEP1, a natural compound produced by Streptomyces hawaiiensis NRRL 15010, is the prototype for a new class of acyldepsipeptide (ADEP) antibiotics. ADEP antibiotics deregulate the proteolytic core ClpP of the bacterial caseinolytic protease, thereby exhibiting potent antibacterial activity against Gram-positive bacteria, including multiresistant pathogens. ADEP1 and derivatives, here collectively called ADEP, have been previously investigated for their antibiotic potency against different species, structure-activity relationship, and mechanism of action; however, knowledge of the biosynthesis of the natural compound and of producer self-resistance has remained elusive. In this study, we identified and analyzed the ADEP biosynthetic gene cluster in S. hawaiiensis NRRL 15010, which comprises two NRPSs, genes necessary for the biosynthesis of (4S,2R)-4-methylproline, and a type II polyketide synthase (PKS) for the assembly of highly reduced polyenes. While no resistance factor could be identified within the gene cluster itself, we discovered an additional clpP homologous gene (named clpPADEP) located further downstream of the biosynthetic genes, separated from the biosynthetic gene cluster by several transposable elements. Heterologous expression of ClpPADEP in three ADEP-sensitive Streptomyces species proved its role in conferring ADEP resistance, thereby revealing a novel type of antibiotic resistance determinant.

IMPORTANCE Antibiotic acyldepsipeptides (ADEPs) represent a promising new class of potent antibiotics and, at the same time, are valuable tools to study the molecular functioning of their target, ClpP, the proteolytic core of the bacterial caseinolytic protease. Here, we present a straightforward purification procedure for ADEP1 that yields substantial amounts of the pure compound in a time- and cost-efficient manner, which is a prerequisite to conveniently study the antimicrobial effects of ADEP and the operating mode of bacterial ClpP machineries in diverse bacteria. Identification and characterization of the ADEP biosynthetic gene cluster in Streptomyces hawaiiensis NRRL 15010 enables future bioinformatics screenings for similar gene clusters and/or subclusters to find novel natural compounds with specific substructures. Most strikingly, we identified a cluster-associated clpP homolog (named clpPADEP) as an ADEP resistance gene. ClpPADEP constitutes a novel bacterial resistance factor that alone is necessary and sufficient to confer high-level ADEP resistance to Streptomyces across species. Copyright © 2019 American Society for Microbiology.


April 21, 2020  |  

Genomic and Functional Analysis of Emerging Virulent and Multidrug-Resistant Escherichia coli Lineage Sequence Type 648.

The pathogenic extended-spectrum-beta-lactamase (ESBL)-producing Escherichia coli lineage ST648 is increasingly reported from multiple origins. Our study of a large and global ST648 collection from various hosts (87 whole-genome sequences), combining core and accessory genomics with functional analyses and in vivo experiments, suggests that ST648 is a nascent and generalist lineage, lacking clear phylogeographic and host association signals. By including large numbers of ST131 (n = 107) and ST10 (n = 96) strains for comparative genomics and phenotypic analysis, we demonstrate that the combination of multidrug resistance and high-level virulence is the hallmark of ST648, similar to the international high-risk clonal lineage ST131. Specifically, our in silico, in vitro, and in vivo results demonstrate that ST648 is well equipped with biofilm-associated features, while ST131 shows sophisticated signatures indicative of adaptation to urinary tract infection, potentially conveying individual ecological niche adaptation. In addition, we used a recently developed NFDS (negative frequency-dependent selection) population model, which suggests that ST648 will increase significantly in frequency as a cause of bacteremia within the next few years. Also, ESBL plasmids impacting biofilm formation aided in shaping and maintaining ST648 strains to successfully emerge worldwide across different ecologies. Our study contributes to understanding what factors drive the evolution and spread of emerging international high-risk clonal lineages. Copyright © 2019 American Society for Microbiology.

