Menu
April 21, 2020

Patterns of non-ARD variation in more than 300 full-length HLA-DPB1 alleles.

Our understanding of sequence variation in the HLA-DPB1 gene is largely restricted to the hypervariable antigen recognition domain (ARD) encoded by exon 2. Here, we employed a redundant sequencing strategy combining long-read and short-read data to accurately phase and characterise in full length the majority of common and well-documented (CWD) DPB1 alleles as well as alleles with an observed frequency of at least 0.0006% in our predominantly European sample set. We generated 664 DPB1 sequences, comprising 279 distinct allelic variants. This allows us to present the, to date, most comprehensive analysis of the nature and extent of DPB1 sequence variation. The full-length sequence analysis revealed the existence of two highly diverged allele clades. These clades correlate with the rs9277534 A???G variant, a known expression marker located in the 3′-UTR. The two clades are fully differentiated by 174 fixed polymorphisms throughout a 3.6?kb stretch at the 3′-end of DPB1. The region upstream of this differentiation zone is characterised by increasingly shared variation between the clades. The low-expression A clade comprises 59% of the distinct allelic sequences including the three by far most frequent DPB1 alleles, DPB1*04:01, DPB1*02:01 and DPB1*04:02. Alleles in the A clade show reduced nucleotide diversity with an excess of rare variants when compared to the high-expression G clade. This pattern is consistent with a scenario of recent proliferation of A-clade alleles. The full-length characterisation of all but the most rare DPB1 alleles will benefit the application of NGS for DPB1 genotyping and provides a helpful framework for a deeper understanding of high- and low-expression alleles and their implications in the context of unrelated haematopoietic stem-cell transplantation.Copyright © 2018 The Authors. Published by Elsevier Inc. All rights reserved.


April 21, 2020

The red bayberry genome and genetic basis of sex determination.

Morella rubra, red bayberry, is an economically important fruit tree in south China. Here, we assembled the first high-quality genome for both a female and a male individual of red bayberry. The genome size was 313-Mb, and 90% sequences were assembled into eight pseudo chromosome molecules, with 32 493 predicted genes. By whole-genome comparison between the female and male and association analysis with sequences of bulked and individual DNA samples from female and male, a 59-Kb region determining female was identified and located on distal end of pseudochromosome 8, which contains abundant transposable element and seven putative genes, four of them are related to sex floral development. This 59-Kb female-specific region was likely to be derived from duplication and rearrangement of paralogous genes and retained non-recombinant in the female-specific region. Sex-specific molecular markers developed from candidate genes co-segregated with sex in a genetically diverse female and male germplasm. We propose sex determination follow the ZW model of female heterogamety. The genome sequence of red bayberry provides a valuable resource for plant sex chromosome evolution and also provides important insights for molecular biology, genetics and modern breeding in Myricaceae family. © 2018 The Authors. Plant Biotechnology Journal published by Society for Experimental Biology and The Association of Applied Biologists and John Wiley & Sons Ltd.


April 21, 2020

Physiological properties and genetic analysis related to exopolysaccharide (EPS) production in the fresh-water unicellular cyanobacterium Aphanothece sacrum (Suizenji Nori).

The clonal strains, phycoerythrin(PE)-rich- and PE-poor strains, of the unicellular, fresh water cyanobacterium Aphanothece sacrum (Suringar) Okada (Suizenji Nori, in Japanese) were isolated from traditional open-air aquafarms in Japan. A. sacrum appeared to be oligotrophic on the basis of its growth characteristics. The optimum temperature for growth was around 20°C. Maximum growth and biomass increase at 20°C was obtained under light intensities between 40 to 80 µmol m-2 s-1 (fluorescent lamps, 12 h light/12 h dark cycles) and between 40 to 120 µmol m-2 s-1 for PE-rich and PE-poor strains, respectively, of A. sacrum . Purified exopolysaccharide (EPS) of A. sacrum has a molecular weight of ca. 104 kDa with five major monosaccharides (glucose, xylose, rhamnose, galactose and mannose; =85 mol%). We also deciphered the whole genome sequence of the two strains of A. sacrum. The putative genes involved in the polymerization, chain length control, and export of EPS would contribute to understand the biosynthetic process of their extremely high molecular weight EPS. The putative genes encoding Wzx-Wzy-Wzz- and Wza-Wzb-Wzc were conserved in the A. sacrum strains FPU1 and FPU3. This result suggests that the Wzy-dependent pathway participates in the EPS production of A. sacrum.


April 21, 2020

Discovery of tandem and interspersed segmental duplications using high-throughput sequencing.

Several algorithms have been developed that use high-throughput sequencing technology to characterize structural variations (SVs). Most of the existing approaches focus on detecting relatively simple types of SVs such as insertions, deletions and short inversions. In fact, complex SVs are of crucial importance and several have been associated with genomic disorders. To better understand the contribution of complex SVs to human disease, we need new algorithms to accurately discover and genotype such variants. Additionally, due to similar sequencing signatures, inverted duplications or gene conversion events that include inverted segmental duplications are often characterized as simple inversions, likewise, duplications and gene conversions in direct orientation may be called as simple deletions. Therefore, there is still a need for accurate algorithms to fully characterize complex SVs and thus improve calling accuracy of more simple variants.We developed novel algorithms to accurately characterize tandem, direct and inverted interspersed segmental duplications using short read whole genome sequencing datasets. We integrated these methods to our TARDIS tool, which is now capable of detecting various types of SVs using multiple sequence signatures such as read pair, read depth and split read. We evaluated the prediction performance of our algorithms through several experiments using both simulated and real datasets. In the simulation experiments, using a 30× coverage TARDIS achieved 96% sensitivity with only 4% false discovery rate. For experiments that involve real data, we used two haploid genomes (CHM1 and CHM13) and one human genome (NA12878) from the Illumina Platinum Genomes set. Comparison of our results with orthogonal PacBio call sets from the same genomes revealed higher accuracy for TARDIS than state-of-the-art methods. Furthermore, we showed a surprisingly low false discovery rate of our approach for discovery of tandem, direct and inverted interspersed segmental duplications prediction on CHM1 (<5% for the top 50 predictions).TARDIS source code is available at https://github.com/BilkentCompGen/tardis, and a corresponding Docker image is available at https://hub.docker.com/r/alkanlab/tardis/.Supplementary data are available at Bioinformatics online. © The Author(s) 2019. Published by Oxford University Press. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.


April 21, 2020

RADAR-seq: A RAre DAmage and Repair sequencing method for detecting DNA damage on a genome-wide scale.

RAre DAmage and Repair sequencing (RADAR-seq) is a highly adaptable sequencing method that enables the identification and detection of rare DNA damage events for a wide variety of DNA lesions at single-molecule resolution on a genome-wide scale. In RADAR-seq, DNA lesions are replaced with a patch of modified bases that can be directly detected by Pacific Biosciences Single Molecule Real-Time (SMRT) sequencing. RADAR-seq enables dynamic detection over a wide range of DNA damage frequencies, including low physiological levels. Furthermore, without the need for DNA amplification and enrichment steps, RADAR-seq provides sequencing coverage of damaged and undamaged DNA across an entire genome. Here, we use RADAR-seq to measure the frequency and map the location of ribonucleotides in wild-type and RNaseH2-deficient E. coli and Thermococcus kodakarensis strains. Additionally, by tracking ribonucleotides incorporated during in vivo lagging strand DNA synthesis, we determined the replication initiation point in E. coli, and its relation to the origin of replication (oriC). RADAR-seq was also used to map cyclobutane pyrimidine dimers (CPDs) in Escherichia coli (E. coli) genomic DNA exposed to UV-radiation. On a broader scale, RADAR-seq can be applied to understand formation and repair of DNA damage, the correlation between DNA damage and disease initiation and progression, and complex biological pathways, including DNA replication.Copyright © 2019 The Authors. Published by Elsevier B.V. All rights reserved.


April 21, 2020

A 12-kb structural variation in progressive myoclonic epilepsy was newly identified by long-read whole-genome sequencing.

We report a family with progressive myoclonic epilepsy who underwent whole-exome sequencing but was negative for pathogenic variants. Similar clinical courses of a devastating neurodegenerative phenotype of two affected siblings were highly suggestive of a genetic etiology, which indicates that the survey of genetic variation by whole-exome sequencing was not comprehensive. To investigate the presence of a variant that remained unrecognized by standard genetic testing, PacBio long-read sequencing was performed. Structural variant (SV) detection using low-coverage (6×) whole-genome sequencing called 17,165 SVs (7,216 deletions and 9,949 insertions). Our SV selection narrowed down potential candidates to only five SVs (two deletions and three insertions) on the genes tagged with autosomal recessive phenotypes. Among them, a 12.4-kb deletion involving the CLN6 gene was the top candidate because its homozygous abnormalities cause neuronal ceroid lipofuscinosis. This deletion included the initiation codon and was found in a GC-rich region containing multiple repetitive elements. These results indicate the presence of a causal variant in a difficult-to-sequence region and suggest that such variants that remain enigmatic after the application of current whole-exome sequencing technology could be uncovered by unbiased application of long-read whole-genome sequencing.


April 21, 2020

Fast and accurate genomic analyses using genome graphs.

The human reference genome serves as the foundation for genomics by providing a scaffold for alignment of sequencing reads, but currently only reflects a single consensus haplotype, thus impairing analysis accuracy. Here we present a graph reference genome implementation that enables read alignment across 2,800 diploid genomes encompassing 12.6 million SNPs and 4.0 million insertions and deletions (indels). The pipeline processes one whole-genome sequencing sample in 6.5?h using a system with 36?CPU cores. We show that using a graph genome reference improves read mapping sensitivity and produces a 0.5% increase in variant calling recall, with unaffected specificity. Structural variations incorporated into a graph genome can be genotyped accurately under a unified framework. Finally, we show that iterative augmentation of graph genomes yields incremental gains in variant calling accuracy. Our implementation is an important advance toward fulfilling the promise of graph genomes to radically enhance the scalability and accuracy of genomic analyses.


April 21, 2020

The CF Canada-Sick Kids Program in individual CF therapy: A resource for the advancement of personalized medicine in CF.

Therapies targeting certain CFTR mutants have been approved, yet variations in clinical response highlight the need for in-vitro and genetic tools that predict patient-specific clinical outcomes. Toward this goal, the CF Canada-Sick Kids Program in Individual CF Therapy (CFIT) is generating a “first of its kind”, comprehensive resource containing patient-specific cell cultures and data from 100 CF individuals that will enable modeling of therapeutic responses.The CFIT program is generating: 1) nasal cells from drug naïve patients suitable for culture and the study of drug responses in vitro, 2) matched gene expression data obtained by sequencing the RNA from the primary nasal tissue, 3) whole genome sequencing of blood derived DNA from each of the 100 participants, 4) induced pluripotent stem cells (iPSCs) generated from each participant’s blood sample, 5) CRISPR-edited isogenic control iPSC lines and 6) prospective clinical data from patients treated with CF modulators.To date, we have recruited 57 of 100 individuals to CFIT, most of whom are homozygous for F508del (to assess in-vitro: in-vivo correlations with respect to ORKAMBI response) or heterozygous for F508del and a minimal function mutation. In addition, several donors are homozygous for rare nonsense and missense mutations. Nasal epithelial cell cultures and matched iPSC lines are available for many of these donors.This accessible resource will enable development of tools that predict individual outcomes to current and emerging modulators targeting F508del-CFTR and facilitate therapy discovery for rare CF causing mutations.Copyright © 2018 The Authors. Published by Elsevier B.V. All rights reserved.


April 21, 2020

Snf2 controls pulcherriminic acid biosynthesis and antifungal activity of the biocontrol yeast Metschnikowia pulcherrima.

Metschnikowia pulcherrima synthesises the pigment pulcherrimin, from cyclodileucine (cyclo(Leu-Leu)) as a precursor, and exhibits strong antifungal activity against notorious plant pathogenic fungi. This yeast therefore has great potential for biocontrol applications against fungal diseases; particularly in the phyllosphere where this species is frequently found. To elucidate the molecular basis of the antifungal activity of M. pulcherrima, we compared a wild-type strain with a spontaneously occurring, pigmentless, weakly antagonistic mutant derivative. Whole genome sequencing of the wild-type and mutant strains identified a point mutation that creates a premature stop codon in the transcriptional regulator gene SNF2 in the mutant. Complementation of the mutant strain with the wild-type SNF2 gene restored pigmentation and recovered the strong antifungal activity. Mass spectrometry (UPLC HR HESI-MS) proved the presence of the pulcherrimin precursors cyclo(Leu-Leu) and pulcherriminic acid and identified new precursor and degradation products of pulcherriminic acid and/or pulcherrimin. All of these compounds were identified in the wild-type and complemented strain, but were undetectable in the pigmentless snf2 mutant strain. These results thus identify Snf2 as a regulator of antifungal activity and pulcherriminic acid biosynthesis in M. pulcherrima and provide a starting point for deciphering the molecular functions underlying the antagonistic activity of this yeast. © 2019 The Authors. Molecular Microbiology Published by John Wiley & Sons Ltd.


April 21, 2020

Hybrid sequencing-based personal full-length transcriptomic analysis implicates proteostatic stress in metastatic ovarian cancer.

Comprehensive molecular characterization of myriad somatic alterations and aberrant gene expressions at personal level is key to precision cancer therapy, yet limited by current short-read sequencing technology, individualized catalog of complete genomic and transcriptomic features is thus far elusive. Here, we integrated second- and third-generation sequencing platforms to generate a multidimensional dataset on a patient affected by metastatic epithelial ovarian cancer. Whole-genome and hybrid transcriptome dissection captured global genetic and transcriptional variants at previously unparalleled resolution. Particularly, single-molecule mRNA sequencing identified a vast array of unannotated transcripts, novel long noncoding RNAs and gene chimeras, permitting accurate determination of transcription start, splice, polyadenylation and fusion sites. Phylogenetic and enrichment inference of isoform-level measurements implicated early functional divergence and cytosolic proteostatic stress in shaping ovarian tumorigenesis. A complementary imaging-based high-throughput drug screen was performed and subsequently validated, which consistently pinpointed proteasome inhibitors as an effective therapeutic regime by inducing protein aggregates in ovarian cancer cells. Therefore, our study suggests that clinical application of the emerging long-read full-length analysis for improving molecular diagnostics is feasible and informative. An in-depth understanding of the tumor transcriptome complexity allowed by leveraging the hybrid sequencing approach lays the basis to reveal novel and valid therapeutic vulnerabilities in advanced ovarian malignancies.


April 21, 2020

Oenococcus sicerae sp. nov., isolated from French cider.

Two Gram-stain-positive, small ellipsoidal cocci, non-motile, oxidase- and catalase-negative, and facultative anaerobic strains (UCMA15228T and UCMA17102) were isolated in France, from fermented apple juices (ciders). The 16S rRNA gene sequence was identical between the two isolates and showed 97 % similarity with respect to the closest related species Oenococcus oeni and O. kitaharae. Therefore, the two isolates were classified within the genus Oenococcus. The phylogeny based on the pheS gene sequences also confirmed the position of the new taxon. DNA-DNA hybridizations based on in silico genome-to-genome comparisons (GGDC) and Average Nucleotide Identity (ANI) values, as well as species-specific PCR, validated the novelty of the taxon. Various phenotypic characteristics such as the optimum temperature and pH for growth, the ability to metabolise sugars, the aptitude to perform the malolactic fermentation, and the resistance to ethanol and NaCl, revealed that the two strains are distinguishable from the other members of the Oenococcus genus. The combined genotypic and phenotypic data support the classification of strains UCMA15228T and UCMA17102 into a novel species of Oenococcus, for which the name O. sicerae sp. nov. is proposed. The type strain is UCMA15228T (=DSM107163T=CIRM-BIA2288T).Copyright © 2018 Elsevier GmbH. All rights reserved.


April 21, 2020

Alternative Splicing of the Delta-Opioid Receptor Gene Suggests Existence of New Functional Isoforms.

The delta-opioid receptor (DOPr) participates in mediating the effects of opioid analgesics. However, no selective agonists have entered clinical care despite potential to ameliorate many neurological and psychiatric disorders. In an effort to address the drug development challenges, the functional contribution of receptor isoforms created by alternative splicing of the three-exonic coding gene, OPRD1, has been overlooked. We report that the gene is transcriptionally more diverse than previously demonstrated, producing novel protein isoforms in humans and mice. We provide support for the functional relevance of splice variants through context-dependent expression profiling (tissues, disease model) and conservation of the transcriptional landscape in closely related vertebrates. The conserved alternative transcriptional events have two distinct patterns. First, cassette exon inclusions between exons 1 and 2 interrupt the reading frame, producing truncated receptor fragments comprising only the first transmembrane (TM) domain, despite the lack of exact exon orthologues between distant species. Second, a novel promoter and transcriptional start site upstream of exon 2 produces a transcript of an N-terminally truncated 6TM isoform. However, a fundamental difference in the exonic landscaping as well as translation and translation products poses limits for modelling the human DOPr receptor system in mice.


April 21, 2020

Long-read sequencing identified intronic repeat expansions in SAMD12 from Chinese pedigrees affected with familial cortical myoclonic tremor with epilepsy.

The locus for familial cortical myoclonic tremor with epilepsy (FCMTE) has long been mapped to 8q24 in linkage studies, but the causative mutations remain unclear. Recently, expansions of intronic TTTCA and TTTTA repeat motifs within SAMD12 were found to be involved in the pathogenesis of FCMTE in Japanese pedigrees. We aim to identify the causative mutations of FCMTE in Chinese pedigrees.We performed genetic linkage analysis by microsatellite markers in a five-generation Chinese pedigree with 55 members. We also used array-comparative genomic hybridisation (CGH) and next-generation sequencing (NGS) technologies (whole-exome sequencing, capture region deep sequencing and whole-genome sequencing) to identify the causative mutations in the disease locus. Recently, we used low-coverage (~10×) long-read genome sequencing (LRS) on the PacBio Sequel and Oxford Nanopore platforms to identify the causative mutations, and used repeat-primed PCR for validation of the repeat expansions.Linkage analysis mapped the disease locus to 8q23.3-24.23. Array-CGH and NGS failed to identify causative mutations in this locus. LRS identified the intronic TTTCA and TTTTA repeat expansions in SAMD12 as the causative mutations, thus corroborating the recently published results in Japanese pedigrees.We identified the pentanucleotide repeat expansion in SAMD12 as the causative mutation in Chinese FCMTE pedigrees. Our study also suggested that LRS is an effective tool for molecular diagnosis of genetic disorders, especially for neurological diseases that cannot be positively diagnosed by conventional clinical microarray and NGS technologies. © Author(s) (or their employer(s)) 2019. No commercial re-use. See rights and permissions. Published by BMJ.


April 21, 2020

Iron-associated protein interaction networks reveal the key functional modules related to survival and virulence of Pasteurella multocida.

Pasteurella multocida causes respiratory infectious diseases in a multitude of birds and mammals. A number of virulence-associated genes were reported across different strains of P. multocida, including those involved in the iron transport and metabolism. Comparative iron-associated genes of P. multocida among different animal hosts towards their interaction networks have not been fully revealed. Therefore, this study aimed to identify the iron-associated genes from core- and pan-genomes of fourteen P. multocida strains and to construct iron-associated protein interaction networks using genome-scale network analysis which might be associated with the virulence. Results showed that these fourteen strains had 1587 genes in the core-genome and 3400 genes constituting their pan-genome. Out of these, 2651 genes associated with iron transport and metabolism were selected to construct the protein interaction networks and 361 genes were incorporated into the iron-associated protein interaction network (iPIN) consisting of nine different iron-associated functional modules. After comparing with the virulence factor database (VFDB), 21 virulence-associated proteins were determined and 11 of these belonged to the heme biosynthesis module. From this study, the core heme biosynthesis module and the core outer membrane hemoglobin receptor HgbA were proposed as candidate targets to design novel antibiotics and vaccines for preventing pasteurellosis across the serotypes or animal hosts for enhanced precision agriculture to ensure sustainability in food security. Copyright © 2018. Published by Elsevier Ltd.


April 21, 2020

Long-read sequence capture of the haemoglobin gene clusters across codfish species.

Combining high-throughput sequencing with targeted sequence capture has become an attractive tool to study specific genomic regions of interest. Most studies have so far focused on the exome using short-read technology. These approaches are not designed to capture intergenic regions needed to reconstruct genomic organization, including regulatory regions and gene synteny. Here, we demonstrate the power of combining targeted sequence capture with long-read sequencing technology for comparative genomic analyses of the haemoglobin (Hb) gene clusters across eight species separated by up to 70 million years. Guided by the reference genome assembly of the Atlantic cod (Gadus morhua) together with genome information from draft assemblies of selected codfishes, we designed probes covering the two Hb gene clusters. Use of custom-made barcodes combined with PacBio RSII sequencing led to highly continuous assemblies of the LA (~100 kb) and MN (~200 kb) clusters, which include syntenic regions of coding and intergenic sequences. Our results revealed an overall conserved genomic organization of the Hb genes within this lineage, yet with several, lineage-specific gene duplications. Moreover, for some of the species examined, we identified amino acid substitutions at two sites in the Hbb1 gene as well as length polymorphisms in its regulatory region, which has previously been linked to temperature adaptation in Atlantic cod populations. This study highlights the use of targeted long-read capture as a versatile approach for comparative genomic studies by generation of a cross-species genomic resource elucidating the evolutionary history of the Hb gene family across the highly divergent group of codfishes. © 2018 The Authors. Molecular Ecology Resources Published by John Wiley & Sons Ltd.


Talk with an expert

If you have a question, need to check the status of an order, or are interested in purchasing an instrument, we're here to help.