June 1, 2021  |  

Complex alternative splicing patterns in hematopoietic cell subpopulations revealed by third-generation long reads.

Background: Alternative splicing expands the repertoire of gene functions and is a signature for different cell populations. Here we characterize the transcriptome of human bone marrow subpopulations including progenitor cells to understand their contribution to homeostasis and pathological conditions such as atherosclerosis and tumor metastasis. To obtain full-length transcript structures, we utilized long reads in addition to RNA-seq for estimating isoform diversity and abundance. Method: Freshly harvested, viable human bone marrow tissues were extracted from discarded harvesting equipment and separated into total bone marrow (total), lineage-negative (lin-) progenitor cells and differentiated cells (lin+) by magnetic bead sorting with antibodies to surface markers of hematopoietic cell lineages. Sequencing was done with SOLiD, Illumina HiSeq (100 bp paired-end reads), and PacBio RS II (full-length cDNA library protocol for 1–6 kb libraries). Short reads were assembled using both Trinity for de novo assembly and Cufflinks for genome-guided assembly. Full-length transcript consensus sequences were obtained for the PacBio data using the RS_IsoSeq protocol from PacBio's SMRT Analysis software. Quantitation for each sample was done independently for each sequencing platform using Sailfish to obtain the TPM (transcripts per million) using k-mer matching. Results: PacBio's long-read sequencing technology is capable of sequencing full-length transcripts up to 10 kb and reveals heretofore-unseen isoform diversity and complexity within the hematopoietic cell populations. A comparison of sequencing depth and de novo transcript assembly with short-read, second-generation sequencing reveals that, while short reads provide precision in determining portions of isoform structure and supporting larger 5′ and 3′ UTR regions, they fail to provide a complete structure, especially when multiple isoforms are present at the same locus.
Long reads reveal an increased breadth of isoform complexity, which permits further elaboration of full isoform diversity and specific isoform abundance within each separate cell population. Sorting the distribution of major and minor isoforms reveals a cell population-specific balance focused on distinct genome loci and shows how tissue specificity and diversity are modulated by alternative splicing.
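
For readers unfamiliar with the TPM unit mentioned in the abstract above, the following is a minimal sketch of the standard TPM normalization. It is not the Sailfish implementation, which estimates abundances from k-mers and effective transcript lengths; the function name `tpm` and the example counts and lengths are hypothetical.

```python
def tpm(counts, lengths):
    """Convert per-transcript read counts to transcripts per million (TPM).

    counts  -- reads (or estimated reads) assigned to each transcript
    lengths -- transcript lengths in bases (assumed > 0), same order as counts
    """
    # Length-normalized rate for each transcript.
    rates = [c / l for c, l in zip(counts, lengths)]
    total = sum(rates)
    if total == 0:
        # No reads at all: every transcript gets a TPM of zero.
        return [0.0] * len(rates)
    # Rescale so that the values across all transcripts sum to one million.
    return [r / total * 1e6 for r in rates]


# Hypothetical example: two isoforms with equal coverage per base.
print(tpm([10, 20], [1000, 2000]))
```

Because the values always sum to one million within a sample, TPM is convenient for comparing the relative abundance of major and minor isoforms at a locus.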


June 1, 2021  |  

Full-length HIV-1 env deep sequencing in a donor with broadly neutralizing V1/V2 antibodies.

Background: Understanding the co-evolution of HIV populations and broadly neutralizing antibodies (bNAbs) may inform vaccine design. Novel long-read, next-generation sequencing methods allow, for the first time, full-length deep sequencing of HIV env populations. Methods: We longitudinally examined HIV-1 env populations (12 time points) in a subtype A infected individual from the IAVI primary infection cohort (Protocol C) who developed bNAbs (62% ID50>50 on a diverse panel of 105 viruses) targeting the V1/V2 loop region. We developed a PacBio single molecule, real-time sequencing protocol to deeply sequence full-length env from HIV RNA. Bioinformatics tools were developed to align env sequences, infer phylogenies, and interrogate escape dynamics of key residues and glycosylation sites. PacBio env sequences were compared to env sequences generated through amplification and cloning. Env dynamics and viral escape motif evolution were interpreted in the context of the development of V1/V2-targeting broadly neutralizing antibodies. Results: We collected a median of 6799 (range: 1770-14727) high quality full-length HIV env circular consensus sequences (CCS) per SMRT Cell, per time point. Using only CCS reads comprising 6 or more passes over the HIV env insert (≥16 kb read length) ensured that our median per-base accuracy was 99.7%. A phylogeny inferred with PacBio and 100 cloned env sequences (10 time points) found the cloned sequences evenly distributed among PacBio sequences. Viral escape from the V1/V2 targeted bNAbs was evident at V2 positions 160, 166, 167, 169 and 181 (HxB2 numbering), exhibiting several distinct escape pathways by 40 months post-infection. Conclusions: Our PacBio full-length env sequencing method provided an unprecedented view of HIV-1 env dynamics throughout the first four years of infection and the ability to characterize them.
Longitudinal full-length env deep sequencing allows accurate phylogenetic inference, provides a detailed picture of escape dynamics in epitope regions, and can identify minority variants, all of which will prove critical for increasing our understanding of how env evolution drives the development of antibody breadth.


June 1, 2021  |  

Full-length env deep sequencing in a donor with broadly neutralizing V1/V2 antibodies.

Background: Understanding the co-evolution of HIV populations and broadly neutralizing antibody (bNAb) lineages may inform vaccine design. Novel long-read, next-generation sequencing methods allow, for the first time, full-length deep sequencing of HIV env populations. Methods: We longitudinally examined env populations (12 time points) in a subtype A infected individual from the IAVI primary infection cohort (Protocol C) who developed bNAbs (62% ID50>50 on a diverse panel of 105 viruses) targeting the V1/V2 region. We developed a Pacific Biosciences single molecule, real-time sequencing protocol to deeply sequence full-length env from HIV RNA. Bioinformatics tools were developed to align env sequences, infer phylogenies, and interrogate escape dynamics of key residues and glycosylation sites. PacBio env sequences were compared to env sequences generated through amplification and cloning. Env dynamics were interpreted in the context of the development of a V1/V2-targeting bNAb lineage isolated from the donor. Results: We collected a median of 6799 high quality full-length env sequences per timepoint (median per-base accuracy of 99.7%). A phylogeny inferred with PacBio and 100 cloned env sequences (10 time points) found cloned env sequences evenly distributed among PacBio sequences. Phylogenetic analyses also revealed a potential transient intra-clade superinfection visible as a minority variant (~5%) at 9 months post-infection (MPI), and peaking in prevalence at 12MPI (~64%), just preceding the development of heterologous neutralization. Viral escape from the bNAb lineage was evident at V2 positions 160, 166, 167, 169 and 181 (HxB2 numbering), exhibiting several distinct escape pathways by 40MPI. Conclusions: Our PacBio full-length env sequencing method allowed unprecedented characterization of env dynamics and revealed an intra-clade superinfection that was not detected through conventional methods. 
The importance of superinfection in the development of this donor’s V1/V2-directed bNAb lineage is under investigation. Longitudinal full-length env deep sequencing allows accurate phylogenetic inference, provides a detailed picture of escape dynamics in epitope regions, and can identify minority variants, all of which may prove useful for understanding how env evolution can drive the development of antibody breadth.


June 1, 2021  |  

Characterizing haplotype diversity at the immunoglobulin heavy chain locus across human populations using novel long-read sequencing and assembly approaches

The human immunoglobulin heavy chain locus (IGH) remains among the most understudied regions of the human genome. Recent efforts have shown that haplotype diversity within IGH is elevated and exhibits population-specific patterns; for example, our re-sequencing of the locus from only a single chromosome uncovered >100 Kb of novel sequence, including descriptions of six novel alleles, and four previously unmapped genes. Historically, this complex locus architecture has hindered the characterization of IGH germline single nucleotide, copy number, and structural variants (SNVs; CNVs; SVs), and as a result, little is known about the role of IGH polymorphisms in inter-individual antibody repertoire variability and disease. To remedy this, we are taking a multi-faceted approach to improving existing genomic resources in the human IGH region. First, from whole-genome and fosmid-based datasets, we are building the largest and most ethnically diverse set of IGH reference assemblies to date, by employing PacBio long-read sequencing combined with novel algorithms for phased haplotype assembly. In total, our effort will result in the characterization of >15 phased haplotypes from individuals of Asian, African, and European descent, to be used as a representative reference set by the genomics and immunogenetics community. Second, we are utilizing this more comprehensive sequence catalogue to inform the design and analysis of novel targeted IGH genotyping assays. Standard targeted DNA enrichment methods (e.g., exome capture) are currently optimized for the capture of only very short (hundreds of bp) DNA segments. Our platform uses a modified bench protocol to pair existing capture-array technologies with the enrichment of longer fragments of DNA, enabling the use of PacBio sequencing of DNA segments up to 7 Kb.
This substantial increase in contiguity disambiguates many of the complex repeated structures inherent to the locus, while yielding the base pair fidelity required to call SNVs. Together these resources will establish a stronger framework for further characterizing IGH genetic diversity and facilitate IGH genomic profiling in the clinical and research settings, which will be key to fully understanding the role of IGH germline variation in antibody repertoire development and disease.


April 21, 2020  |  

DART-seq: an antibody-free method for global m6A detection.

N6-methyladenosine (m6A) is a widespread RNA modification that influences nearly every aspect of the messenger RNA lifecycle. Our understanding of m6A has been facilitated by the development of global m6A mapping methods, which use antibodies to immunoprecipitate methylated RNA. However, these methods have several limitations, including high input RNA requirements and cross-reactivity to other RNA modifications. Here, we present DART-seq (deamination adjacent to RNA modification targets), an antibody-free method for detecting m6A sites. In DART-seq, the cytidine deaminase APOBEC1 is fused to the m6A-binding YTH domain. APOBEC1-YTH expression in cells induces C-to-U deamination at sites adjacent to m6A residues, which are detected using standard RNA-seq. DART-seq identifies thousands of m6A sites in cells from as little as 10 ng of total RNA and can detect m6A accumulation in cells over time. Additionally, we use long-read DART-seq to gain insights into m6A distribution along the length of individual transcripts.


April 21, 2020  |  

Characterization of Reference Materials for Genetic Testing of CYP2D6 Alleles: A GeT-RM Collaborative Project.

Pharmacogenetic testing increasingly is available from clinical and research laboratories. However, only a limited number of quality control and other reference materials currently are available for the complex rearrangements and rare variants that occur in the CYP2D6 gene. To address this need, the Division of Laboratory Systems, CDC-based Genetic Testing Reference Material Coordination Program, in collaboration with members of the pharmacogenetic testing and research communities and the Coriell Cell Repositories (Camden, NJ), has characterized 179 DNA samples derived from Coriell cell lines. Testing included the recharacterization of 137 genomic DNAs that were genotyped in previous Genetic Testing Reference Material Coordination Program studies and 42 additional samples that had not been characterized previously. DNA samples were distributed to volunteer testing laboratories for genotyping using a variety of commercially available and laboratory-developed tests. These publicly available samples will support the quality-assurance and quality-control programs of clinical laboratories performing CYP2D6 testing. Published by Elsevier Inc.


April 21, 2020  |  

A comparison of immunoglobulin IGHV, IGHD and IGHJ genes in wild-derived and classical inbred mouse strains.

The genomes of classical inbred mouse strains include genes derived from all three major subspecies of the house mouse, Mus musculus. We recently posited that genetic diversity in the immunoglobulin heavy chain (IGH) gene loci of C57BL/6 and BALB/c mice reflects differences in subspecies origin. To investigate this hypothesis, we conducted high-throughput sequencing of IGH gene rearrangements to document IGH variable (IGHV), joining (IGHJ), and diversity (IGHD) genes in four inbred wild-derived mouse strains (CAST/EiJ, LEWES/EiJ, MSM/MsJ, and PWD/PhJ), and a single disease model strain (NOD/ShiLtJ), collectively representing genetic backgrounds of several major mouse subspecies. A total of 341 germline IGHV sequences were inferred in the wild-derived strains, including 247 not curated in the International Immunogenetics Information System. In contrast, 83/84 inferred NOD IGHV genes had previously been observed in C57BL/6 mice. Variability among the strains examined was observed for only a single IGHJ gene, involving a description of a novel allele. In contrast, unexpected variation was found in the IGHD gene loci, with four previously unreported IGHD gene sequences being documented. Very few IGHV sequences of C57BL/6 and BALB/c mice were shared with strains representing major subspecies, suggesting that their IGH loci may be complex mosaics of genes of disparate origins. This suggests a similar level of diversity is likely present in the IGH loci of other classical inbred strains. This must now be documented if we are to properly understand inter-strain variation in models of antibody-mediated disease. This article is protected by copyright. All rights reserved.


April 21, 2020  |  

Transcriptional initiation of a small RNA, not R-loop stability, dictates the frequency of pilin antigenic variation in Neisseria gonorrhoeae.

Neisseria gonorrhoeae, the sole causative agent of gonorrhea, constitutively undergoes diversification of the Type IV pilus. Gene conversion occurs between one of the several donor silent copies located in distinct loci and the recipient pilE gene, encoding the major pilin subunit of the pilus. A guanine quadruplex (G4) DNA structure and a cis-acting sRNA (G4-sRNA) are located upstream of the pilE gene and both are required for pilin antigenic variation (Av). We show that reduced sRNA transcription lowers pilin Av frequencies. Extended transcriptional elongation is not required for Av, since limiting the transcript to 32 nt allows for normal Av frequencies. Using chromatin immunoprecipitation (ChIP) assays, we show that cellular G4s are less abundant when sRNA transcription is lower. In addition, using ChIP, we demonstrate that the G4-sRNA forms a stable RNA:DNA hybrid (R-loop) with its template strand. However, modulating R-loop levels by controlling RNase HI expression does not alter G4 abundance quantified through ChIP. Since pilin Av frequencies were not altered when modulating R-loop levels by controlling RNase HI expression, we conclude that transcription of the sRNA is necessary, but stable R-loops are not required to promote pilin Av. © 2019 John Wiley & Sons Ltd.


April 21, 2020  |  

RNA sequencing: the teenage years.

Over the past decade, RNA sequencing (RNA-seq) has become an indispensable tool for transcriptome-wide analysis of differential gene expression and differential splicing of mRNAs. However, as next-generation sequencing technologies have developed, so too has RNA-seq. Now, RNA-seq methods are available for studying many different aspects of RNA biology, including single-cell gene expression, translation (the translatome) and RNA structure (the structurome). Exciting new applications are being explored, such as spatial transcriptomics (spatialomics). Together with new long-read and direct RNA-seq technologies and better computational tools for data analysis, innovations in RNA-seq are contributing to a fuller understanding of RNA biology, from questions such as when and where transcription occurs to the folding and intermolecular interactions that govern RNA function.


April 21, 2020  |  

The replication-competent HIV-1 latent reservoir is primarily established near the time of therapy initiation.

Although antiretroviral therapy (ART) is highly effective at suppressing HIV-1 replication, the virus persists as a latent reservoir in resting CD4+ T cells during therapy. This reservoir forms even when ART is initiated early after infection, but the dynamics of its formation are largely unknown. The viral reservoirs of individuals who initiate ART during chronic infection are generally larger and genetically more diverse than those of individuals who initiate therapy during acute infection, consistent with the hypothesis that the reservoir is formed continuously throughout untreated infection. To determine when viruses enter the latent reservoir, we compared sequences of replication-competent viruses from resting peripheral CD4+ T cells from nine HIV-positive women on therapy to viral sequences circulating in blood collected longitudinally before therapy. We found that, on average, 71% of the unique viruses induced from the post-therapy latent reservoir were most genetically similar to viruses replicating just before ART initiation. This proportion is far greater than would be expected if the reservoir formed continuously and was always long lived. We conclude that ART alters the host environment in a way that allows the formation or stabilization of most of the long-lived latent HIV-1 reservoir, which points to new strategies targeted at limiting the formation of the reservoir around the time of therapy initiation. Copyright © 2019 The Authors, some rights reserved; exclusive licensee American Association for the Advancement of Science. No claim to original U.S. Government Works.


April 21, 2020  |  

Rapid transcriptional responses to serum exposure are associated with sensitivity and resistance to antibody-mediated complement killing in invasive Salmonella Typhimurium ST313

Background: Salmonella Typhimurium ST313 exhibits signatures of adaptation to invasive human infection, including higher resistance to humoral immune responses than gastrointestinal isolates. Full resistance to antibody-mediated complement killing (serum resistance) among nontyphoidal Salmonellae is uncommon, but selection of highly resistant strains could compromise vaccine-induced antibody immunity. Here, we address the hypothesis that serum resistance is due to a distinct genotype or transcriptome response in S. Typhimurium ST313.


April 21, 2020  |  

A Novel Bacteriophage Exclusion (BREX) System Encoded by the pglX Gene in Lactobacillus casei Zhang.

The bacteriophage exclusion (BREX) system is a novel prokaryotic defense system against bacteriophages. To our knowledge, no study has systematically characterized the function of the BREX system in lactic acid bacteria. Lactobacillus casei Zhang is a probiotic bacterium originating from koumiss. By using single-molecule real-time sequencing, we previously identified N6-methyladenine (m6A) signatures in the genome of L. casei Zhang and a putative methyltransferase (MTase), namely, pglX. This work further analyzed the genomic locus near the pglX gene and identified it as a component of the BREX system. To decipher the biological role of pglX, an L. casei Zhang pglX mutant (ΔpglX) was constructed. Interestingly, m6A methylation of the 5′-ACRCAG-3′ motif was eliminated in the ΔpglX mutant. The wild-type and mutant strains exhibited no significant difference in morphology or growth performance in de Man-Rogosa-Sharpe (MRS) medium. A significantly higher plasmid acquisition capacity was observed for the ΔpglX mutant than for the wild type if the transformed plasmids contained pglX recognition sites (i.e., 5′-ACRCAG-3′). In contrast, no significant difference was observed in plasmid transformation efficiency between the two strains when plasmids lacking pglX recognition sites were tested. Moreover, the ΔpglX mutant had a lower capacity to retain the plasmids than the wild type, suggesting a decrease in genetic stability. Since the Rebase database predicted that the L. casei PglX protein was bifunctional, as both an MTase and a restriction endonuclease, the PglX protein was heterologously expressed and purified but failed to show restriction endonuclease activity. Taken together, the results show that the L. casei Zhang pglX gene is a functional adenine MTase that belongs to the BREX system. IMPORTANCE: Lactobacillus casei Zhang is a probiotic that confers beneficial effects on the host, and it is thus increasingly used in the dairy industry.
The possession of an effective bacterial immune system that can defend against invasion of phages and exogenous DNA is a desirable feature for industrial bacterial strains. The bacteriophage exclusion (BREX) system is a recently described phage resistance system in prokaryotes. This work confirmed the function of the BREX system in L. casei and that the methyltransferase (pglX) is an indispensable part of the system. Overall, our study characterizes a BREX system component gene in lactic acid bacteria. Copyright © 2019 American Society for Microbiology.
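
The abstract above distinguishes plasmids that contain the 5′-ACRCAG-3′ recognition site from those that do not. As an illustration only, and not the authors' method, here is a minimal sketch of how such sites could be located on both strands of a DNA sequence, treating R as the IUPAC code for A or G. The function names and the example sequences are hypothetical.

```python
import re

# IUPAC codes needed for this motif (a subset of the full table).
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T",
         "R": "[AG]", "Y": "[CT]", "N": "[ACGT]"}
# Complement table that also handles the ambiguity codes above.
COMPLEMENT = str.maketrans("ACGTRYN", "TGCAYRN")


def reverse_complement(motif):
    """Return the reverse complement of an IUPAC motif."""
    return motif.translate(COMPLEMENT)[::-1]


def motif_regex(motif):
    """Compile a lookahead regex so that overlapping sites are all reported."""
    return re.compile("(?=(" + "".join(IUPAC[b] for b in motif) + "))")


def find_sites(seq, motif="ACRCAG"):
    """Return sorted (0-based start, strand) pairs for every motif occurrence."""
    seq = seq.upper()
    hits = []
    for strand, pattern in (("+", motif), ("-", reverse_complement(motif))):
        for match in motif_regex(pattern).finditer(seq):
            hits.append((match.start(), strand))
    return sorted(hits)


# Hypothetical sequences: one site on the forward strand, one on the reverse.
print(find_sites("TTACACAGTT"))
print(find_sites("CTGCGT"))
```

A sequence for which `find_sites` returns an empty list would correspond to a plasmid lacking recognition sites in the sense used in the abstract.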

