Icahn Institute Archives

June 1, 2021 |

Characterizing haplotype diversity at the immunoglobulin heavy chain locus across human populations using novel long-read sequencing and assembly approaches

The human immunoglobulin heavy chain locus (IGH) remains among the most understudied regions of the human genome. Recent efforts have shown that haplotype diversity within IGH is elevated and exhibits population-specific patterns; for example, our re-sequencing of the locus from only a single chromosome uncovered >100 Kb of novel sequence, including descriptions of six novel alleles and four previously unmapped genes. Historically, this complex locus architecture has hindered the characterization of IGH germline single nucleotide, copy number, and structural variants (SNVs; CNVs; SVs), and as a result, little is known about the role of IGH polymorphisms in inter-individual antibody repertoire variability and disease. To remedy this, we are taking a multi-faceted approach to improving existing genomic resources in the human IGH region. First, from whole-genome and fosmid-based datasets, we are building the largest and most ethnically diverse set of IGH reference assemblies to date, by employing PacBio long-read sequencing combined with novel algorithms for phased haplotype assembly. In total, our effort will result in the characterization of >15 phased haplotypes from individuals of Asian, African, and European descent, to be used as a representative reference set by the genomics and immunogenetics community. Second, we are utilizing this more comprehensive sequence catalogue to inform the design and analysis of novel targeted IGH genotyping assays. Standard targeted DNA enrichment methods (e.g., exome capture) are currently optimized for the capture of only very short (hundreds of bp) DNA segments. Our platform uses a modified bench protocol to pair existing capture-array technologies with the enrichment of longer fragments of DNA, enabling the use of PacBio sequencing of DNA segments up to 7 Kb. This substantial increase in contiguity disambiguates many of the complex repeated structures inherent to the locus, while yielding the base pair fidelity required to call SNVs. Together these resources will establish a stronger framework for further characterizing IGH genetic diversity and facilitate IGH genomic profiling in the clinical and research settings, which will be key to fully understanding the role of IGH germline variation in antibody repertoire development and disease.

February 5, 2021 |

AGBT 2015 Highlights: Customer interviews day 2

Commentary from PacBio users on their applications of SMRT Sequencing, including Ulf Gyllensten (Uppsala University), Tim Smith (USDA-ARS) and Bobby Sebra (Icahn School of Medicine)

February 5, 2021 |

Nature Webinar: Using long-read sequencing to characterize population diversity at the immunoglobulin heavy chain locus

Melissa Laird Smith from Icahn Institute at Mt. Sinai reviews her work studying the genetic background of immune response by characterizing population diversity at the immunoglobulin heavy chain locus. Webinar…

February 5, 2021 |

ASHG PacBio Workshop: SMRT Sequencing as a translational research tool to investigate germline, somatic and infectious diseases

Melissa Laird Smith discussed how the Icahn School of Medicine at Mount Sinai uses long-read sequencing for translational research. She gave several examples of targeted sequencing projects run on the…

February 5, 2021 |

Webinar: Complete genomes within reach – Closing bacterial genomes from the lakes of Minnesota to NYC hospitals

In this webinar, Ben Auch, Research Scientist, Innovation Lab, University of Minnesota Genomics Center, Cody Sheik, Assistant Professor of Biology, University of Minnesota Duluth, and Harm van Bakel, Assistant Professor…

February 5, 2021 |

User Group Meeting: Targeted PacBio Sequencing using Sage HLS-CATCH

In this PacBio User Group Meeting presentation, Mount Sinai’s Ethan Ellis presents results from the HLS-CATCH method, which involves the use of the SageHLS instrument with CRISPR design methods to…

February 5, 2021 |

Webinar: Understanding SARS-CoV-2 and host immune response to COVID-19 with PacBio sequencing

Studying microbial genomics and infectious disease? Learn how the PacBio Sequel II System can help advance your research, with first-hand perspectives from scientists who are investigating SARS-CoV-2 and COVID-19. In…

February 5, 2021 |

Webinar: Sequencing 101 – How long-read sequencing improves access to genetic information

In this webinar, Kristin Mars, Sequencing Specialist, PacBio, presents an introduction to PacBio’s technology and its applications followed by a panel discussion among sequencing experts. The panel discussion addresses such…

April 21, 2020 |

Characterization of Reference Materials for Genetic Testing of CYP2D6 Alleles: A GeT-RM Collaborative Project.

Pharmacogenetic testing is increasingly available from clinical and research laboratories. However, only a limited number of quality control and other reference materials are currently available for the complex rearrangements and rare variants that occur in the CYP2D6 gene. To address this need, the Division of Laboratory Systems, CDC-based Genetic Testing Reference Material Coordination Program, in collaboration with members of the pharmacogenetic testing and research communities and the Coriell Cell Repositories (Camden, NJ), has characterized 179 DNA samples derived from Coriell cell lines. Testing included the recharacterization of 137 genomic DNAs that were genotyped in previous Genetic Testing Reference Material Coordination Program studies and 42 additional samples that had not been characterized previously. DNA samples were distributed to volunteer testing laboratories for genotyping using a variety of commercially available and laboratory-developed tests. These publicly available samples will support the quality-assurance and quality-control programs of clinical laboratories performing CYP2D6 testing. Published by Elsevier Inc.

April 21, 2020 |

Deciphering bacterial epigenomes using modern sequencing technologies.

Prokaryotic DNA contains three types of methylation: N6-methyladenine, N4-methylcytosine and 5-methylcytosine. The lack of tools to analyse the frequency and distribution of methylated residues in bacterial genomes has prevented a full understanding of their functions. Now, advances in DNA sequencing technology, including single-molecule, real-time sequencing and nanopore-based sequencing, have provided new opportunities for systematic detection of all three forms of methylated DNA at a genome-wide scale and offer unprecedented opportunities for achieving a more complete understanding of bacterial epigenomes. Indeed, as the number of mapped bacterial methylomes approaches 2,000, increasing evidence supports roles for methylation in regulation of gene expression, virulence and pathogen-host interactions.

September 22, 2019 |

HIV-1 infection of primary CD4(+) T cells regulates the expression of specific HERV-K (HML-2) elements.

Endogenous retroviruses (ERVs) occupy extensive regions of the human genome. Although many of these retroviral elements have lost their ability to replicate, those whose insertion took place more recently, such as the HML-2 group of HERV-K elements, still retain intact open reading frames and the capacity to produce certain viral RNA and/or proteins. Transcription of these ERVs is, however, tightly regulated by dedicated epigenetic control mechanisms. Nonetheless, it has been reported that some pathologic states, such as viral infections and certain cancers, coincide with ERV expression, suggesting transcriptional reawakening is possible. HML-2 elements are reportedly induced during HIV-1 infection, but the conserved nature of these elements has, until recently, rendered their expression profiling problematic. Here, we provide comprehensive HERV-K HML-2 expression profiles specific for productively HIV-1 infected primary human CD4(+) T cells. We combined enrichment of HIV-1 infected cells using a reporter virus expressing a surface reporter for gentle and efficient purification with long-read Single Molecule Real-Time sequencing. We show that three HML-2 proviruses, 6q25.1, 8q24.3, and 19q13.42, are up-regulated on average between 3- and 5-fold in HIV-1 infected CD4(+) T cells. One provirus, HML-2 12q24.33, in contrast, was repressed in the presence of active HIV replication. In conclusion, this report identifies the HERV-K HML-2 loci whose expression profiles differ upon HIV-1 infection in primary human CD4(+) T cells. These data will help pave the way for further studies on the influence of endogenous retroviruses on HIV-1 replication.

Importance: Endogenous retroviruses occupy large portions of our genome. Although they are mainly inert, some of the evolutionarily younger members retain the ability to express both RNA and proteins. We have developed an approach based on long-read SMRT sequencing that allows us to obtain detailed and accurate HERV-K HML-2 expression profiles. We have now applied this approach to study HERV-K expression in the presence and absence of productive HIV-1 infection of primary human CD4(+) T cells. In addition to using SMRT sequencing, our strategy also includes the magnetic selection of the infected cells so that levels of background expression due to uninfected cells are kept at a minimum. The results in this manuscript provide the blueprint for in-depth studies of the interactions of the authentic upregulated HERV-K HML-2 elements and HIV-1. Copyright © 2017 American Society for Microbiology.

September 22, 2019 |

Metagenomic binning and association of plasmids with bacterial host genomes using DNA methylation.

Shotgun metagenomics methods enable characterization of microbial communities in human microbiome and environmental samples. Assembly of metagenome sequences does not output whole genomes, so computational binning methods have been developed to cluster sequences into genome ‘bins’. These methods exploit sequence composition, species abundance, or chromosome organization but cannot fully distinguish closely related species and strains. We present a binning method that incorporates bacterial DNA methylation signatures, which are detected using single-molecule real-time sequencing. Our method takes advantage of these endogenous epigenetic barcodes to resolve individual reads and assembled contigs into species- and strain-level bins. We validate our method using synthetic and real microbiome sequences. In addition to genome binning, we show that our method links plasmids and other mobile genetic elements to their host species in a real microbiome sample. Incorporation of DNA methylation information into shotgun metagenomics analyses will complement existing methods to enable more accurate sequence binning.
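The idea of linking a plasmid to its host through shared methylation signatures can be illustrated with a minimal sketch. It assumes each sequence has already been summarized as a vector of per-motif methylation fractions; the motif set, the scores, and the nearest-profile rule are all illustrative assumptions and are not the method or data of the paper.

```python
import math

# Illustrative motif set (assumption); each profile holds the fraction of
# sites of each motif that were detected as methylated on that sequence.
MOTIFS = ("GATC", "GANTC", "CTGCAG")


def profile_distance(p, q):
    """Euclidean distance between two methylation profiles of equal length."""
    if len(p) != len(q):
        raise ValueError("profiles must cover the same motifs")
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def assign_to_host(plasmid_profile, host_profiles):
    """Return the name of the host bin whose profile is closest to the plasmid's."""
    return min(
        host_profiles,
        key=lambda name: profile_distance(plasmid_profile, host_profiles[name]),
    )


if __name__ == "__main__":
    # Made-up scores for two hypothetical host bins and one plasmid contig.
    hosts = {
        "species_A": (0.95, 0.02, 0.01),
        "species_B": (0.03, 0.90, 0.05),
    }
    plasmid = (0.91, 0.05, 0.00)
    print(assign_to_host(plasmid, hosts))
```

With these made-up numbers the plasmid is assigned to `species_A`, because both carry near-complete methylation at the first motif only. A simple nearest-profile rule like this only works when hosts have clearly distinct methylation patterns.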

September 22, 2019 |

Improved OTU-picking using long-read 16S rRNA gene amplicon sequencing and generic hierarchical clustering

BACKGROUND: High-throughput bacterial 16S rRNA gene sequencing followed by clustering of short sequences into operational taxonomic units (OTUs) is widely used for microbiome profiling. However, clustering of short 16S rRNA gene reads into biologically meaningful OTUs is challenging, in part because nucleotide variation along the 16S rRNA gene is only partially captured by short reads. The recent emergence of long-read platforms, such as single-molecule real-time (SMRT) sequencing from Pacific Biosciences, offers the potential for improved taxonomic and phylogenetic profiling. Here, we evaluate the performance of long- and short-read 16S rRNA gene sequencing using simulated and experimental data, followed by OTU inference using computational pipelines based on heuristic and complete-linkage hierarchical clustering. RESULTS: In simulated data, long-read sequencing was shown to improve OTU quality and decrease variance. We then profiled 40 human gut microbiome samples using a combination of Illumina MiSeq and Blautia-specific SMRT sequencing, further supporting the notion that long reads can identify additional OTUs. We implemented a complete-linkage hierarchical clustering strategy using a flexible computational pipeline, tailored specifically for PacBio circular consensus sequencing (CCS) data that outperforms heuristic methods in most settings: https://github.com/oscar-franzen/oclust/. CONCLUSION: Our data demonstrate that long reads can improve OTU inference; however, the choice of clustering algorithm and associated clustering thresholds has significant impact on performance.
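As a rough illustration of complete-linkage hierarchical clustering for OTU inference, the sketch below groups toy sequences at a distance threshold. It is not the oclust implementation: the function names are hypothetical, the sequences are invented, and the distance is a simple mismatch fraction that assumes the reads are already aligned to equal length.

```python
from itertools import combinations


def mismatch_fraction(a, b):
    """Fraction of differing positions between two pre-aligned, equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be pre-aligned to equal length")
    return sum(x != y for x, y in zip(a, b)) / len(a)


def complete_linkage_otus(seqs, threshold):
    """Agglomerative clustering with complete linkage.

    Two clusters merge only if the LARGEST pairwise distance between their
    members is <= threshold, so every pair inside an OTU is within threshold.
    """
    dist = {
        frozenset((i, j)): mismatch_fraction(seqs[i], seqs[j])
        for i, j in combinations(seqs, 2)
    }
    clusters = [{name} for name in seqs]

    def linkage(c1, c2):
        return max(dist[frozenset((i, j))] for i in c1 for j in c2)

    while len(clusters) > 1:
        # Find the closest pair of clusters under complete linkage.
        d, a, b = min(
            (linkage(clusters[a], clusters[b]), a, b)
            for a, b in combinations(range(len(clusters)), 2)
        )
        if d > threshold:
            break
        clusters[a] |= clusters[b]  # a < b, so deleting b keeps index a valid
        del clusters[b]
    return clusters


if __name__ == "__main__":
    toy = {
        "s1": "ACGTACGTAC",
        "s2": "ACGTACGTAA",  # 1 mismatch vs s1
        "s3": "ACGTACGTTA",  # 1 mismatch vs s2, 2 mismatches vs s1
        "s4": "TTTTGGGGCC",  # unrelated
    }
    print(sorted(sorted(c) for c in complete_linkage_otus(toy, 0.1)))
```

At a threshold of 0.1 the toy data yield three OTUs: `s1` and `s2` together, with `s3` and `s4` each on their own. `s3` is within threshold of `s2` but not of `s1`, so complete linkage keeps it separate, whereas single linkage would chain it into the same OTU. This shows how the choice of algorithm and threshold changes the outcome.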

September 22, 2019 |

The features of mucosa-associated microbiota in primary sclerosing cholangitis.

Little is known about the role of the microbiome in primary sclerosing cholangitis. This study aimed to explore the mucosa-associated microbiota in primary sclerosing cholangitis (PSC) patients across different locations in the gut, and to compare it with inflammatory bowel disease (IBD)-only patients and healthy controls. Biopsies from the terminal ileum, right colon, and left colon were collected from patients and healthy controls undergoing colonoscopy. Microbiota profiling using bacterial 16S rRNA sequencing was performed on all biopsies. Forty-four patients were recruited: 20 with PSC (19 with PSC-IBD and one with PSC-only), 15 with IBD-only and nine healthy controls. The overall microbiome profile was similar throughout different locations in the gut. No differences in the global microbiome profile were found. However, we observed significant PSC-associated enrichment in Barnesiellaceae at the family level, and in Blautia and an unidentified Barnesiellaceae at the genus level. At the operational taxonomic unit level, most shifts in PSC were observed in Clostridiales and Bacteroidales orders, with approximately 86% of shifts occurring within the former order. The overall microbiota profile was similar across multiple locations in the gut from the same individual regardless of disease status. In this study, the mucosa-associated microbiota of patients with primary sclerosing cholangitis was characterised by enrichment of Blautia and Barnesiellaceae and by major shifts in operational taxonomic units within Clostridiales order. © 2016 John Wiley & Sons Ltd.

September 22, 2019 |

Iso-Seq analysis of Nepenthes ampullaria, Nepenthes rafflesiana and Nepenthes × hookeriana for hybridisation study in pitcher plants.

Tropical pitcher plants in the species-rich Nepenthaceae family of carnivorous plants possess unique pitcher organs. Hybridisation, natural or artificial, in this family is extensive, resulting in pitchers with diverse features. The pitcher functions as a passive insect trap with digestive fluid for nutrient acquisition in nitrogen-poor habitats. This organ shows specialisation according to the dietary habit of different Nepenthes species. In this study, we performed the first single-molecule real-time isoform sequencing (Iso-Seq) analysis of full-length cDNA from Nepenthes ampullaria, which can feed on leaf litter, compared to carnivorous Nepenthes rafflesiana, and their carnivorous hybrid Nepenthes × hookeriana. This allows the comparison of pitcher transcriptomes from the parents and the hybrid to understand how hybridisation could shape the evolution of dietary habit in Nepenthes. Raw reads have been deposited in the SRA database with the accession numbers SRX2692198 (N. ampullaria), SRX2692197 (N. rafflesiana), and SRX2692196 (N. × hookeriana).

