April 21, 2020  |  

Fast and accurate genomic analyses using genome graphs.

The human reference genome serves as the foundation for genomics by providing a scaffold for alignment of sequencing reads, but currently only reflects a single consensus haplotype, thus impairing analysis accuracy. Here we present a graph reference genome implementation that enables read alignment across 2,800 diploid genomes encompassing 12.6 million SNPs and 4.0 million insertions and deletions (indels). The pipeline processes one whole-genome sequencing sample in 6.5 h using a system with 36 CPU cores. We show that using a graph genome reference improves read mapping sensitivity and produces a 0.5% increase in variant calling recall, with unaffected specificity. Structural variations incorporated into a graph genome can be genotyped accurately under a unified framework. Finally, we show that iterative augmentation of graph genomes yields incremental gains in variant calling accuracy. Our implementation is an important advance toward fulfilling the promise of graph genomes to radically enhance the scalability and accuracy of genomic analyses.


April 21, 2020  |  

The CF Canada-Sick Kids Program in individual CF therapy: A resource for the advancement of personalized medicine in CF.

Therapies targeting certain CFTR mutants have been approved, yet variations in clinical response highlight the need for in-vitro and genetic tools that predict patient-specific clinical outcomes. Toward this goal, the CF Canada-Sick Kids Program in Individual CF Therapy (CFIT) is generating a “first of its kind”, comprehensive resource containing patient-specific cell cultures and data from 100 CF individuals that will enable modeling of therapeutic responses. The CFIT program is generating: 1) nasal cells from drug-naïve patients suitable for culture and the study of drug responses in vitro, 2) matched gene expression data obtained by sequencing the RNA from the primary nasal tissue, 3) whole genome sequencing of blood-derived DNA from each of the 100 participants, 4) induced pluripotent stem cells (iPSCs) generated from each participant’s blood sample, 5) CRISPR-edited isogenic control iPSC lines and 6) prospective clinical data from patients treated with CF modulators. To date, we have recruited 57 of 100 individuals to CFIT, most of whom are homozygous for F508del (to assess in-vitro:in-vivo correlations with respect to ORKAMBI response) or heterozygous for F508del and a minimal function mutation. In addition, several donors are homozygous for rare nonsense and missense mutations. Nasal epithelial cell cultures and matched iPSC lines are available for many of these donors. This accessible resource will enable development of tools that predict individual outcomes to current and emerging modulators targeting F508del-CFTR and facilitate therapy discovery for rare CF-causing mutations. Copyright © 2018 The Authors. Published by Elsevier B.V. All rights reserved.


April 21, 2020  |  

Development of CRISPR-Cas systems for genome editing and beyond

The development of clustered regularly interspaced short-palindromic repeat (CRISPR)-Cas systems for genome editing has transformed the way life science research is conducted and holds enormous potential for the treatment of disease as well as for many aspects of biotechnology. Here, I provide a personal perspective on the development of CRISPR-Cas9 for genome editing within the broader context of the field and discuss our work to discover novel Cas effectors and develop them into additional molecular tools. The initial demonstration of Cas9-mediated genome editing launched the development of many other technologies, enabled new lines of biological inquiry, and motivated a deeper examination of natural CRISPR-Cas systems, including the discovery of new types of CRISPR-Cas systems. These new discoveries in turn spurred further technological developments. I review these exciting discoveries and technologies as well as provide an overview of the broad array of applications of these technologies in basic research and in the improvement of human health. It is clear that we are only just beginning to unravel the potential within microbial diversity, and it is quite likely that we will continue to discover other exciting phenomena, some of which it may be possible to repurpose as molecular technologies. The transformation of mysterious natural phenomena to powerful tools, however, takes a collective effort to discover, characterize, and engineer them, and it has been a privilege to join the numerous researchers who have contributed to this transformation of CRISPR-Cas systems.


April 21, 2020  |  

Long-Read Sequencing Emerging in Medical Genetics

The wide implementation of next-generation sequencing (NGS) technologies has revolutionized the field of medical genetics. However, the short read lengths of currently used sequencing approaches pose a limitation for identification of structural variants, sequencing repetitive regions, phasing alleles and distinguishing highly homologous genomic regions. These limitations may significantly contribute to the diagnostic gap in patients with genetic disorders who have undergone standard NGS, like whole exome or even genome sequencing. Now, the emerging long-read sequencing (LRS) technologies may offer improvements in the characterization of genetic variation and regions that are difficult to assess with the currently prevailing NGS approaches. LRS has so far mainly been used to investigate genetic disorders with previously known or strongly suspected disease loci. While these targeted approaches already show the potential of LRS, it remains to be seen whether LRS technologies can soon enable true whole genome sequencing routinely. Ultimately, this could allow the de novo assembly of individual whole genomes used as a generic test for genetic disorders. In this article, we summarize the current LRS-based research on human genetic disorders and discuss the potential of these technologies to facilitate the next major advancements in medical genetics.


April 21, 2020  |  

Construction of JRG (Japanese reference genome) with single-molecule real-time sequencing

In recent genome analyses, population-specific reference panels have proven important. However, reference panels based on short-read sequencing data do not sufficiently cover long insertions. Therefore, the nature of long insertions has not been well documented. Here, we assembled a Japanese genome using single-molecule real-time sequencing data and characterized insertions found in the assembled genome. We identified 3691 insertions ranging from 100 bp to ~10,000 bp in the assembled genome relative to the international reference sequence (GRCh38). To validate and characterize these insertions, we mapped short reads from 1070 Japanese individuals and 728 individuals from eight other populations to insertions integrated into GRCh38. With this result, we constructed JRGv1 (Japanese Reference Genome version 1) by integrating the 903 verified insertions, totaling 1,086,173 bases, shared by at least two Japanese individuals into GRCh38. We also constructed decoyJRGv1 by concatenating 3559 verified insertions, totaling 2,536,870 bases, shared by at least two Japanese individuals or by six other assemblies. This assembly improved the alignment ratio by 0.4% on average. These results demonstrate the importance of refining the reference assembly and creating a population-specific reference genome. JRGv1 and decoyJRGv1 are available at the JRG website.


October 23, 2019  |  

Structural determination of the broadly reactive anti-IGHV1-69 anti-idiotypic antibody G6 and its idiotope.

The heavy chain IGHV1-69 germline gene exhibits a high level of polymorphism and shows biased use in protective antibody (Ab) responses to infections and vaccines. It is also highly expressed in several B cell malignancies and autoimmune diseases. G6 is an anti-idiotypic monoclonal Ab that selectively binds to IGHV1-69 heavy chain germline gene 51p1 alleles that have been implicated in these Ab responses and disease processes. Here, we determine the co-crystal structure of humanized G6 (hG6.3) in complex with anti-influenza hemagglutinin stem-directed broadly neutralizing Ab D80. The core of the hG6.3 idiotope is a continuous string of CDR-H2 residues starting with M53 and ending with N58. G6 binding studies demonstrate the remarkable breadth of binding to 51p1 IGHV1-69 Abs with diverse CDR-H3, light chain, and antigen binding specificities. These studies detail the broad expression of the G6 cross-reactive idiotype (CRI) and further define its potential role in precision medicine. Copyright © 2017 The Authors. Published by Elsevier Inc. All rights reserved.


September 22, 2019  |  

Advantages of genome sequencing by long-read sequencer using SMRT technology in medical area.

PacBio RS II is the first commercialized third-generation DNA sequencer able to sequence a single molecule DNA in real-time without amplification. PacBio RS II’s sequencing technology is novel and unique, enabling the direct observation of DNA synthesis by DNA polymerase. PacBio RS II confers four major advantages compared to other sequencing technologies: long read lengths, high consensus accuracy, a low degree of bias, and simultaneous capability of epigenetic characterization. These advantages surmount the obstacle of sequencing genomic regions such as high/low G+C, tandem repeat, and interspersed repeat regions. Moreover, PacBio RS II is ideal for whole genome sequencing, targeted sequencing, complex population analysis, RNA sequencing, and epigenetics characterization. With PacBio RS II, we have sequenced and analyzed the genomes of many species, from viruses to humans. Herein, we summarize and review some of our key genome sequencing projects, including full-length viral sequencing, complete bacterial genome and almost-complete plant genome assemblies, and long amplicon sequencing of a disease-associated gene region. We believe that PacBio RS II is not only an effective tool for use in the basic biological sciences but also in the medical/clinical setting.


September 22, 2019  |  

Defining a personal, allele-specific, and single-molecule long-read transcriptome.

Personal transcriptomes in which all of an individual’s genetic variants (e.g., single nucleotide variants) and transcript isoforms (transcription start sites, splice sites, and polyA sites) are defined and quantified for full-length transcripts are expected to be important for understanding individual biology and disease, but have not been described previously. To obtain such transcriptomes, we sequenced the lymphoblastoid transcriptomes of three family members (GM12878 and the parents GM12891 and GM12892) by using a Pacific Biosciences long-read approach complemented with Illumina 101-bp sequencing and made the following observations. First, we found that reads representing all splice sites of a transcript are evident for most sufficiently expressed genes ≤3 kb and often for genes longer than that. Second, we added and quantified previously unidentified splicing isoforms to an existing annotation, thus creating the first personalized annotation to our knowledge. Third, we determined SNVs in a de novo manner and connected them to RNA haplotypes, including HLA haplotypes, thereby assigning single full-length RNA molecules to their transcribed allele, and demonstrated Mendelian inheritance of RNA molecules. Fourth, we show how RNA molecules can be linked to personal variants on a one-by-one basis, which allows us to assess differential allelic expression (DAE) and differential allelic isoforms (DAI) from the phased full-length isoform reads. The DAI method is largely independent of the distance between exon and SNV, in contrast to fragmentation-based methods. Overall, in addition to improving eukaryotic transcriptome annotation, these results describe, to our knowledge, the first large-scale and full-length personal transcriptome.


September 22, 2019  |  

Androgen receptor variant AR-V9 is co-expressed with AR-V7 in prostate cancer metastases and predicts abiraterone resistance.

Purpose: Androgen receptor (AR) variant AR-V7 is a ligand-independent transcription factor that promotes prostate cancer resistance to AR-targeted therapies. Accordingly, efforts are underway to develop strategies for monitoring and inhibiting AR-V7 in castration-resistant prostate cancer (CRPC). The purpose of this study was to understand whether other AR variants may be co-expressed with AR-V7 and promote resistance to AR-targeted therapies. Experimental Design: We utilized complementary short- and long-read sequencing of intact AR mRNA isoforms to characterize AR expression in CRPC models. Co-expression of AR-V7 and AR-V9 mRNA in CRPC metastases and circulating tumor cells was assessed by RNA-seq and RT-PCR, respectively. Expression of AR-V9 protein in CRPC models was evaluated with polyclonal antisera. Multivariate analysis was performed to test whether AR variant mRNA expression in metastatic tissues was associated with a 12-week progression-free survival endpoint in a prospective clinical trial of 78 CRPC-stage patients initiating therapy with the androgen synthesis inhibitor, abiraterone acetate. Results: AR-V9 was frequently co-expressed with AR-V7. Both AR variant species were found to share a common 3′ terminal cryptic exon, which rendered AR-V9 susceptible to experimental manipulations that were previously thought to target AR-V7 uniquely. AR-V9 promoted ligand-independent growth of prostate cancer cells. High AR-V9 mRNA expression in CRPC metastases was predictive of primary resistance to abiraterone acetate (HR = 4.0, 95% CI = 1.31-12.2, P = 0.02). Conclusions: AR-V9 may be an important component of therapeutic resistance in CRPC. Copyright ©2017, American Association for Cancer Research.


September 22, 2019  |  

Single-molecule DNA sequencing of acute myeloid leukemia and myelodysplastic syndromes with multiple TP53 alterations.

Although the frequency of TP53 mutations in hematologic malignancies is low, these mutations have a high clinical relevance and are usually associated with poor prognosis. Somatic TP53 mutations have been detected in up to 73.3% of cases of acute myeloid leukemia (AML) with complex karyotype and 18.9% of AML with other unfavorable cytogenetic risk factors. AML with TP53 mutations, and/or chromosomal aneuploidy, has been defined as a distinct AML subtype. In low-risk myelodysplastic syndromes (MDS), TP53 mutations occur at an early disease stage and predict disease progression. TP53 mutation diagnosis is now part of the revised European LeukemiaNet (ELN) guidelines.


September 22, 2019  |  

Current progress in EBV-associated B-cell lymphomas.

Epstein-Barr virus (EBV) was the first human tumor virus to be identified, discovered more than 50 years ago. EBV-associated lymphomagenesis is still a significant viral-associated disease as it involves a diverse range of pathologies, especially B-cell lymphomas. Recent development of high-throughput next-generation sequencing technologies and in vivo mouse models has significantly advanced our understanding of the fundamental molecular mechanisms that drive these cancers and allowed for the development of therapeutic intervention strategies. This review will highlight the current advances in EBV-associated B-cell lymphomas, focusing on transcriptional regulation, chromosome aberrations, in vivo studies of EBV-mediated lymphomagenesis, as well as the treatment strategies to target viral-associated lymphomas.


September 22, 2019  |  

Role of clinicogenomics in infectious disease diagnostics and public health microbiology.

Clinicogenomics is the exploitation of genome sequence data for diagnostic, therapeutic, and public health purposes. Central to this field is the high-throughput DNA sequencing of genomes and metagenomes. The role of clinicogenomics in infectious disease diagnostics and public health microbiology was the topic of discussion during a recent symposium (session 161) presented at the 115th general meeting of the American Society for Microbiology that was held in New Orleans, LA. What follows is a collection of the most salient and promising aspects from each presentation at the symposium. Copyright © 2016, American Society for Microbiology. All Rights Reserved.


September 22, 2019  |  

Ensembl 2018

The Ensembl project has been aggregating, processing, integrating and redistributing genomic datasets since the initial releases of the draft human genome, with the aim of accelerating genomics research through rapid open distribution of public data. Large amounts of raw data are thus transformed into knowledge, which is made available via a multitude of channels, in particular our browser (http://www.ensembl.org). Over time, we have expanded in multiple directions. First, our resources describe multiple fields of genomics, in particular gene annotation, comparative genomics, genetics and epigenomics. Second, we cover a growing number of genome assemblies; Ensembl Release 90 contains exactly 100. Third, our databases feed simultaneously into an array of services designed around different use cases, ranging from quick browsing to genome-wide bioinformatic analysis. We present here the latest developments of the Ensembl project, with a focus on managing an increasing number of assemblies, supporting efforts in genome interpretation and improving our browser.


September 22, 2019  |  

Transcriptome sequencing reveals thousands of novel long non-coding RNAs in B cell lymphoma.

Gene profiling of diffuse large B cell lymphoma (DLBCL) has revealed broad gene expression deregulation compared to normal B cells. While many studies have interrogated well known and annotated genes in DLBCL, none have yet performed a systematic analysis to uncover novel unannotated long non-coding RNAs (lncRNA) in DLBCL. In this study we sought to uncover these lncRNAs by examining RNA-seq data from primary DLBCL tumors and performed supporting analysis to identify potential roles of these lncRNAs in DLBCL. We performed a systematic analysis of novel lncRNAs from the poly-adenylated transcriptome of 116 primary DLBCL samples. RNA-seq data were processed using a de novo transcript assembly pipeline to discover novel lncRNAs in DLBCL. Systematic functional, mutational, cross-species, and co-expression analyses using numerous bioinformatics tools and statistical analysis were performed to characterize these novel lncRNAs. We identified 2,632 novel, multi-exonic lncRNAs expressed in more than one tumor, two-thirds of which are not expressed in normal B cells. Long read single molecule sequencing supports the splicing structure of many of these lncRNAs. More than one-third of novel lncRNAs are differentially expressed between the two major DLBCL subtypes, ABC and GCB. Novel lncRNAs are enriched at DLBCL super-enhancers, with a fraction of them conserved between human and dog lymphomas. We see transposable element (TE) overlap in the exonic regions, particularly in the last exon of the novel lncRNAs, suggesting potential usage of cryptic TE polyadenylation signals. We identified highly co-expressed protein coding genes for at least 88% of the novel lncRNAs. Functional enrichment analysis of co-expressed genes predicts a potential function for about half of novel lncRNAs. Finally, systematic structural analysis of candidate point mutations (SNVs) suggests that such mutations frequently stabilize lncRNA structures instead of destabilizing them. Discovery of these 2,632 novel lncRNAs in DLBCL significantly expands the lymphoma transcriptome and our analysis identifies potential roles of these lncRNAs in lymphomagenesis and/or tumor maintenance. For further studies, these novel lncRNAs also provide an abundant source of new targets for antisense oligonucleotide pharmacology, including shared targets between human and dog lymphomas.


September 22, 2019  |  

Emergence and genomic analysis of MDR Laribacter hongkongensis strain HLGZ1 from Guangzhou, China.

Laribacter hongkongensis is a facultative anaerobic, non-fermentative, Gram-negative bacillus associated with community-acquired gastroenteritis and traveller’s diarrhoea. No clinical MDR L. hongkongensis isolate has been reported yet. We performed WGS (PacBio and Illumina) on a clinical L. hongkongensis strain HLGZ1 with an MDR phenotype. HLGZ1 was resistant to eight classes of commonly used antibiotics. Its complete genome was a single circular chromosome of 3,424,272 bp with a G + C content of 62.29%. In comparison with the reference strain HLHK9, HLGZ1 had a higher abundance of genes associated with DNA metabolism and recombination. Several inserts including two acquired resistance gene clusters (RC1 and RC2) were also identified. RC1 carried two resistance gene cassette arrays, aac(6′)-Ib-cr-aadA2-Δqac-Δsul1-floR-tetR-tetG and arr-3-dfrA32-ereA2-Δqac-sul1, which shared significant nucleotide sequence identities with the MDR region of Salmonella Genomic Island 1 from Salmonella enterica serovar Typhimurium DT104. There was also an integron-like structure, intl1-arr3-dfrA27-Δqac-sul1-aph(3′)-Ic, and a tetR-tetA operon located on RC2. MLST analysis identified HLGZ1 as ST167, a novel ST clustered with two strains previously isolated from frogs. This study provides insight into the genomic characteristics of MDR L. hongkongensis and highlights the possibilities of horizontal resistance gene transfer in this bacterium with other pathogens.

