July 19, 2019

Characterization of a human-specific tandem repeat associated with bipolar disorder and schizophrenia.

Bipolar disorder (BD) and schizophrenia (SCZ) are highly heritable diseases that affect more than 3% of individuals worldwide. Genome-wide association studies have strongly and repeatedly linked risk for both of these neuropsychiatric diseases to a 100 kb interval in the third intron of the human calcium channel gene CACNA1C. However, the causative mutation is not yet known. We have identified a human-specific tandem repeat in this region that is composed of 30 bp units, often repeated hundreds of times. This large tandem repeat is unstable using standard polymerase chain reaction and bacterial cloning techniques, which may have resulted in its incorrect size in the human reference genome. The large 30-mer repeat region is polymorphic in both size and sequence in human populations. Particular sequence variants of the 30-mer are associated with risk status at several flanking single-nucleotide polymorphisms in the third intron of CACNA1C that have previously been linked to BD and SCZ. The tandem repeat arrays function as enhancers that increase reporter gene expression in a human neural progenitor cell line. Different human arrays vary in the magnitude of enhancer activity, and the 30-mer arrays associated with increased psychiatric disease risk status have decreased enhancer activity. Changes in the structure and sequence of these arrays likely contribute to changes in CACNA1C function during human evolution and may modulate neuropsychiatric disease risk in modern human populations. Copyright © 2018. Published by Elsevier Inc.
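As an illustration of how the size of such an array might be estimated from a long read, the following is a minimal sketch that counts back-to-back exact copies of a repeat unit. The function name `longest_tandem_run` and the 30 bp unit are made up for this example and are not taken from the study; because the real arrays are polymorphic in sequence, an exact-match count like this would likely underestimate their length.

```python
def longest_tandem_run(sequence: str, unit: str) -> tuple[int, int]:
    """Return (start, copies) for the longest run of back-to-back exact
    copies of `unit` in `sequence`; (-1, 0) if the unit never occurs."""
    if not unit:
        raise ValueError("unit must be a non-empty string")
    n, k = len(sequence), len(unit)
    # copies[i] = number of consecutive exact copies of unit starting at i
    copies = [0] * (n + 1)
    best_start, best_copies = -1, 0
    # Scan right to left so each position can reuse the count k bases ahead.
    for i in range(n - k, -1, -1):
        if sequence.startswith(unit, i):
            copies[i] = 1 + copies[i + k]
            # ">=" keeps the leftmost start when two runs tie in length.
            if copies[i] >= best_copies:
                best_start, best_copies = i, copies[i]
    return best_start, best_copies


# Hypothetical 30 bp unit (NOT the repeat described in the abstract).
UNIT = "ACGTTGCAAGCTTAGGCTAACCGGTTAACG"
read = "TTT" + UNIT * 5 + "GGG"
start, copies = longest_tandem_run(read, UNIT)
array_length_bp = copies * len(UNIT)
```

With an array of hundreds of units, the same calculation gives an array length of several kilobases, which is consistent with the abstract's point that such repeats are hard to size by standard PCR and cloning.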


July 19, 2019

Reference grade characterization of polymorphisms in full-length HLA class I and II genes with short-read sequencing on the Ion PGM system and long-reads generated by Single Molecule, Real-time Sequencing on the PacBio platform

Although NGS technologies fuel advances in high-throughput HLA genotyping methods for identification and classification of HLA genes to assist with precision medicine efforts in disease and transplantation, the efficiency of these methods is impeded by the absence of adequately characterized high-frequency HLA allele reference sequence databases for the highly polymorphic HLA gene system. Here, we report on producing a comprehensive collection of full-length HLA allele sequences for eight classical HLA loci found in the Japanese population. We augmented the second-generation short read data generated by the Ion Torrent technology with long amplicon spanning consensus reads delivered by the third-generation SMRT sequencing method to create reference grade high-quality sequences of HLA class I and II gene alleles resolved at the genomic coding and non-coding level. Forty-six DNA samples were obtained from a reference set used previously to establish the HLA allele frequency data in Japanese subjects. The samples included alleles with a collective allele frequency in the Japanese population of more than 99.2%. The HLA loci were independently amplified by long-range PCR using previously designed HLA-locus specific primers and subsequently sequenced using SMRT and Ion PGM sequencers. The mapped long and short reads were used to produce a reference library of consensus HLA allelic sequences with the help of the reference-aware software tool LAA for SMRT Sequencing. A total of 253 distinct alleles were determined for 46 healthy subjects. Of them, 137 were novel alleles: 101 SNVs and/or indels and 36 extended alleles at a partial or full-length level. Comparing the HLA sequences from the perspective of nucleotide diversity revealed that HLA-DRB1 was the most divergent among the eight HLA genes, and that the HLA-DPB1 gene sequences diverged into two distinct groups, DP2 and DP5, with evidence of independent polymorphisms generated in exon 2. We also identified two specific intronic variations in HLA-DRB1 that might be involved in rheumatoid arthritis. In conclusion, full-length HLA allele sequencing by third-generation and second-generation technologies has provided polymorphic gene reference sequences at a genomic allelic resolution including allelic variations assigned up to the field-4 level for a stronger foundation in precision medicine and HLA-related disease and transplantation studies.


July 19, 2019

De novo assembly of two Swedish genomes reveals missing segments from the human GRCh38 reference and improves variant calling of population-scale sequencing data.

The current human reference sequence (GRCh38) is a foundation for large-scale sequencing projects. However, recent studies have suggested that GRCh38 may be incomplete and give a suboptimal representation of specific population groups. Here, we performed a de novo assembly of two Swedish genomes that revealed over 10 Mb of sequences absent from the human GRCh38 reference in each individual. Around 6 Mb of these novel sequences (NS) are shared with a Chinese personal genome. The NS are highly repetitive, have an elevated GC-content, and are primarily located in centromeric or telomeric regions. Up to 1 Mb of NS can be assigned to chromosome Y, and large segments are also missing from GRCh38 at chromosomes 14, 17, and 21. Inclusion of NS into the GRCh38 reference radically improves the alignment and variant calling from short-read whole-genome sequencing data at several genomic loci. A re-analysis of a Swedish population-scale sequencing project yields > 75,000 putative novel single nucleotide variants (SNVs) and removes > 10,000 false positive SNV calls per individual, some of which are located in protein coding regions. Our results highlight that the GRCh38 reference is not yet complete and demonstrate that personal genome assemblies from local populations can improve the analysis of short-read whole-genome sequencing data.


July 19, 2019

Global genetic diversity of var2csa in Plasmodium falciparum with implications for malaria in pregnancy and vaccine development.

Malaria infection during pregnancy, caused by the sequestering of Plasmodium falciparum parasites in the placenta, leads to high infant mortality and maternal morbidity. The parasite-placenta adherence mechanism is mediated by the VAR2CSA protein, a target for naturally occurring immunity. Currently, vaccine development is based on its ID1-DBL2Xb domain; however, little is known about the global genetic diversity of the encoding var2csa gene, which could influence vaccine efficacy. In a comprehensive analysis of the var2csa gene in >2,000 P. falciparum field isolates across 23 countries, we found that var2csa is duplicated in high prevalence (>25%), that African and Oceanian populations harbour a much higher diversity than other regions, and that insertions/deletions are abundant, leading to an underestimation of the diversity of the locus. Further, ID1-DBL2Xb haplotypes associated with adverse birth outcomes are present globally, and African-specific haplotypes exist, which should be incorporated into vaccine design.


July 19, 2019

Prediction of smoking by multiplex bisulfite PCR with long amplicons considering allele-specific effects on DNA methylation.

Methylation of DNA is associated with a variety of biological processes. With whole-genome studies of DNA methylation, it became possible to determine a set of genomic sites where DNA methylation is associated with a specific phenotype. A method is needed that allows detailed follow-up studies of the sites, including taking into account genetic information. Bisulfite PCR is a natural choice for this kind of task, but multiplexing is one of the most important problems impeding its implementation. To address this task, we took advantage of a recently published method based on PacBio sequencing of long bisulfite PCR products (single-molecule real-time bisulfite sequencing, SMRT-BS) and tested the validity of the improved methodology with a smoking phenotype. Herein, we describe the “panhandle” modification of the method, which permits a more robust PCR with multiple targets. We applied this technique to determine smoking by DNA methylation in 71 healthy people and 83 schizophrenia patients (n = 50 smokers and n = 104 non-smokers, Russians of the Moscow region). We used five targets known to be influenced by smoking (regions of genes AHRR, ALPPL2, IER3, GNG12, and GFI1). We discovered significant allele-specific methylation effects in the AHRR and IER3 regions and assessed how this information could be exploited to improve the prediction of smoking based on the collected DNA methylation data. We found no significant difference in the methylation profiles of selected targets in relation to schizophrenia, suggesting that smoking affects methylation at the studied genomic sites in healthy people and schizophrenia patients in a similar way. We determined that SMRT-BS with “panhandle” modification performs well in the described setting. Additional information regarding methylation and allele-specific effects could improve the predictive accuracy of DNA methylation-based models, which could be valuable for both basic research and clinical applications.


July 19, 2019

Mapping the landscape of tandem repeat variability by targeted long read single molecule sequencing in familial X-linked intellectual disability.

The etiology of more than half of all patients with X-linked intellectual disability remains elusive, despite array-based comparative genomic hybridization, whole exome or genome sequencing. Since short read massive parallel sequencing approaches do not allow the detection of larger tandem repeat expansions, we hypothesized that such expansions could be a hidden cause of X-linked intellectual disability. We selectively captured over 1800 tandem repeats on the X chromosome and characterized them by long read single molecule sequencing in 3 families with idiopathic X-linked intellectual disability. In male DNA samples, full tandem repeat length sequences were obtained for 88-93% of the targets and up to 99.6% of the repeats with a moderate guanine-cytosine content. Read length and analysis pipeline allow the detection of tandem repeat expansions of >900 bp. In one family, one repeat expansion co-occurs with down-regulation of the neighboring MIR222 gene. This gene has previously been implicated in intellectual disability and is apparently linked to FMR1 and NEFH overexpression associated with neurological disorders. This study demonstrates the power of single molecule sequencing to measure tandem repeat lengths and detect expansions, and suggests that tandem repeat mutations may be a hidden cause of X-linked intellectual disability.


July 19, 2019

A forward genetic screen reveals a primary role for Plasmodium falciparum Reticulocyte Binding Protein Homologue 2a and 2b in determining alternative erythrocyte invasion pathways.

Invasion of human erythrocytes is essential for Plasmodium falciparum parasite survival and pathogenesis, and is also a complex phenotype. While some later steps in invasion appear to be invariant and essential, the earlier steps of recognition are controlled by a series of redundant, and only partially understood, receptor-ligand interactions. Reverse genetic analysis of laboratory adapted strains has identified multiple genes that when deleted can alter invasion, but how the relative contributions of each gene translate to the phenotypes of clinical isolates is far from clear. We used a forward genetic approach to identify genes responsible for variable erythrocyte invasion by phenotyping the parents and progeny of previously generated experimental genetic crosses. Linkage analysis using whole genome sequencing data revealed a single major locus was responsible for the majority of phenotypic variation in two invasion pathways. This locus contained the PfRh2a and PfRh2b genes, members of one of the major invasion ligand gene families, but not widely thought to play such a prominent role in specifying invasion phenotypes. Variation in invasion pathways was linked to significant differences in PfRh2a and PfRh2b expression between parasite lines, and their role in specifying alternative invasion was confirmed by CRISPR-Cas9-mediated genome editing. Expansion of the analysis to a large set of clinical P. falciparum isolates revealed common deletions, suggesting that variation at this locus is a major cause of invasion phenotypic variation in the endemic setting. This work has implications for blood-stage vaccine development and will help inform the design and location of future large-scale studies of invasion in clinical isolates.


July 19, 2019

Whole-genome sequencing reveals principles of brain retrotransposition in neurodevelopmental disorders.

Neural progenitor cells undergo somatic retrotransposition events, mainly involving L1 elements, which can be potentially deleterious. Here, we analyzed the whole genomes of 20 brain samples and 80 non-brain samples, and characterized the retrotransposition landscape of patients affected by a variety of neurodevelopmental disorders including Rett syndrome, tuberous sclerosis, ataxia-telangiectasia and autism. We report that the number of retrotranspositions in brain tissues is higher than that observed in non-brain samples and even higher in pathologic vs normal brains. The majority of somatic brain retrotransposons integrate into pre-existing repetitive elements, preferentially A/T rich L1 sequences, resulting in nested insertions. Our findings document the fingerprints of encoded endonuclease independent mechanisms in the majority of L1 brain insertion events. The insertions are “non-classical” in that they are truncated at both ends, integrate in the same orientation as the host element, and their target sequences are enriched with a CCATT motif in contrast to the classical endonuclease motif of most other retrotranspositions. We show that L1Hs elements integrate preferentially into genes associated with neural functions and diseases. We propose that pre-existing retrotransposons act as “lightning rods” for novel insertions, which may give fine modulation of gene expression while safeguarding from deleterious events. Overwhelmingly uncontrolled retrotransposition may breach this safeguard mechanism and increase the risk of harmful mutagenesis in neurodevelopmental disorders.


July 7, 2019

Comparative genome analysis of Pseudomonas knackmussii B13, the first bacterium known to degrade chloroaromatic compounds.

Pseudomonas knackmussii B13, isolated in 1974, was the first strain found to degrade chlorinated aromatic hydrocarbons. This discovery was the prologue to the subsequent characterization of numerous bacterial metabolic pathways and to genetic and biochemical studies, and it spurred ideas for pollutant bioremediation. In this study, we determined the complete genome sequence of B13 using next generation sequencing technologies and optical mapping. Genome annotation indicated that B13 has a variety of metabolic pathways for degrading monoaromatic hydrocarbons including chlorobenzoate, aminophenol, anthranilate and hydroxyquinol, but not polyaromatic compounds. Comparative genome analysis revealed that B13 is closest to Pseudomonas denitrificans and Pseudomonas aeruginosa. The B13 genome contains at least eight genomic islands [prophages and integrative conjugative elements (ICEs)], which were absent in closely related pseudomonads. We confirm that two ICEs are identical copies of the 103 kb self-transmissible element ICEclc that carries the genes for chlorocatechol metabolism. Comparison of ICEclc showed that it is composed of a variable and a ‘core’ region, which is very conserved among proteobacterial genomes, suggesting a widely distributed family of so far uncharacterized ICE. Resequencing of two spontaneous B13 mutants revealed a number of single nucleotide substitutions, as well as excision of a large 220 kb region and a prophage that drastically change the host metabolic capacity and survivability. © 2014 Society for Applied Microbiology and John Wiley & Sons Ltd.


July 7, 2019

A novel Tn3-like composite transposon harboring blaVIM-1 in Klebsiella pneumoniae spp. pneumoniae isolated from river water.

We present a new plasmid (pOW16C2) with a novel Tn3-like transposon harboring blaVIM-1 from a Klebsiella pneumoniae strain isolated from river water in Switzerland. Complete nucleotide sequence of pOW16C2 was obtained using a Pacific Biosciences SMRT sequencing approach and coding sequences were predicted. The 59,228 bp sequence included a typical IncN-like backbone and a mosaic structure with blaVIM-1, aacA4, aphA15, aadA1, catB2, qnrS1, sul1, and dfrA14 conferring resistance to carbapenems and other β-lactam antibiotics, aminoglycosides, chloramphenicol, quinolones, sulfonamides, and trimethoprim, respectively. Most of these resistance genes were inserted in a class 1 integron that was embedded in a novel Tn3-like composite transposon. IncN plasmids carrying carbapenemases are frequently isolated from K. pneumoniae strains in clinical settings. The dissemination of K. pneumoniae harboring blaVIM-1 in surface water is a cause for increased concern to public health.


July 7, 2019

Emergence of scarlet fever Streptococcus pyogenes emm12 clones in Hong Kong is associated with toxin acquisition and multidrug resistance.

A scarlet fever outbreak began in mainland China and Hong Kong in 2011 (refs. 1-6). Macrolide- and tetracycline-resistant Streptococcus pyogenes emm12 isolates represent the majority of clinical cases. Recently, we identified two mobile genetic elements that were closely associated with emm12 outbreak isolates: the integrative and conjugative element ICE-emm12, encoding genes for tetracycline and macrolide resistance, and prophage ΦHKU.vir, encoding the superantigens SSA and SpeC, as well as the DNase Spd1 (ref. 4). Here we sequenced the genomes of 141 emm12 isolates, including 132 isolated in Hong Kong between 2005 and 2011. We found that the introduction of several ICE-emm12 variants, ΦHKU.vir and a new prophage, ΦHKU.ssa, occurred in three distinct emm12 lineages late in the twentieth century. Acquisition of ssa and transposable elements encoding multidrug resistance genes triggered the expansion of scarlet fever-associated emm12 lineages in Hong Kong. The occurrence of multidrug-resistant ssa-harboring scarlet fever strains should prompt heightened surveillance within China and abroad for the dissemination of these mobile genetic elements.


July 7, 2019

Drug resistance analysis by next generation sequencing in Leishmania.

The use of next generation sequencing has the power to expedite the identification of drug resistance determinants and biomarkers and was applied successfully to drug resistance studies in Leishmania. This allowed the identification of modulation in gene expression, gene dosage alterations, changes in chromosome copy numbers and single nucleotide polymorphisms that correlated with resistance in Leishmania strains derived from the laboratory and from the field. An impressive heterogeneity at the population level was also observed, individual clones within populations often differing in both genotypes and phenotypes, hence complicating the elucidation of resistance mechanisms. This review summarizes the most recent highlights that whole genome sequencing brought to our understanding of Leishmania drug resistance and likely new directions.


July 7, 2019

Molecular characterization of plasmid pMoma1 of Moraxella macacae, a newly described bacterial pathogen of macaques.

We report the complete nucleotide sequence and characterization of a small cryptic plasmid of Moraxella macacae 0408225, a newly described bacterial species within the family Moraxellaceae and a causative agent of epistaxis in macaques. The complete nucleotide sequence of the plasmid pMoma1 was determined and found to be 5,375 bp in size with a GC content of 37.4 %. Computer analysis of the sequence data revealed five open reading frames encoding putative proteins of 54.4 kDa (ORF1), 17.6 kDa (ORF2), 13.3 kDa (ORF3), 51.6 kDa (ORF4), and 25.0 kDa (ORF5). ORF1, ORF2, and ORF3 encode putative proteins with high identity (72, 42, and 55 %, respectively) to mobilization proteins of plasmids found in other Moraxella species. ORF3 encodes a putative protein with similarity (about 40 %) to several plasmid replicase (RepA) proteins. The fifth open reading frame (ORF) was most similar to hypothetical proteins with unknown functions, although domain analysis of this sequence suggests it belongs to the Abi-like protein family. Upstream of the repA gene, a 470-bp intergenic region was identified that contained an AT-rich section and two sets of tandem direct and indirect repeats, consistent with a putative origin of replication site. In contrast to other plasmids of Moraxella, the occurrence of pMoma1 in M. macacae isolates appears to be common, as PCR testing showed that all 14 clinical isolates from two different research institutions contained the plasmid.
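For readers unfamiliar with the GC content figure quoted above, it is the percentage of bases in a sequence that are G or C. The following is a minimal sketch of that calculation; the function name `gc_percent` and the choice to ignore ambiguous bases such as N are assumptions made for this example, not details from the study.

```python
def gc_percent(sequence: str) -> float:
    """Return the percentage of G and C among the unambiguous bases
    (A, C, G, T) of a DNA sequence; case is ignored."""
    bases = [b for b in sequence.upper() if b in "ACGT"]
    if not bases:
        raise ValueError("sequence contains no unambiguous bases")
    gc_count = sum(1 for b in bases if b in "GC")
    return 100.0 * gc_count / len(bases)


# Toy input for illustration only (not the pMoma1 sequence).
example = gc_percent("ATGC")
```

Applied to a complete plasmid sequence read from a FASTA file, the same function would yield a single summary value of the kind reported in the abstract.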


July 7, 2019

Prevalence of subtilase cytotoxin-encoding subAB variants among Shiga toxin-producing Escherichia coli strains isolated from wild ruminants and sheep differs from that of cattle and pigs and is predominated by the new allelic variant subAB2-2.

Subtilase cytotoxin (SubAB) is an AB5 toxin produced by Shiga toxin (Stx)-producing Escherichia coli (STEC) strains usually lacking the eae gene product intimin. Three allelic variants of SubAB encoding genes have been described: subAB1, located on a plasmid, subAB2-1, located on the pathogenicity island SE-PAI and subAB2-2 located in an outer membrane efflux protein (OEP) region. SubAB is becoming increasingly recognized as a toxin potentially involved in human pathogenesis. Ruminants and cattle have been identified as reservoirs of subAB-positive STEC. The presence of the three subAB allelic variants was investigated by PCR for 152 STEC strains originating from chamois, ibex, red deer, roe deer, cattle, sheep and pigs. Overall, subAB genes were detected in 45.5% of the strains. Prevalence was highest for STEC originating from ibex (100%), chamois (92%) and sheep (65%). None of the STEC of bovine or of porcine origin tested positive for subAB. None of the strains tested positive for subAB1. The allelic variant subAB2-2 was detected the most commonly, with 51.4% possessing subAB2-1 together with subAB2-2. STEC of ovine origin, serotypes O91:H- and O128:H2, the saa gene, which encodes the autoagglutinating adhesin, and stx2b were significantly associated with subAB-positive STEC. Our results suggest that subAB2-1 and subAB2-2 are widespread among STEC from wild ruminants and sheep and may be important as virulence markers in STEC pathogenic to humans. Copyright © 2014 Elsevier GmbH. All rights reserved.


July 7, 2019

Burkholderia pseudomallei sequencing identifies genomic clades with distinct recombination, accessory, and epigenetic profiles.

Burkholderia pseudomallei (Bp) is the causative agent of the infectious disease melioidosis. To investigate population diversity, recombination, and horizontal gene transfer in closely related Bp isolates, we performed whole-genome sequencing (WGS) on 106 clinical, animal, and environmental strains from a restricted Asian locale. Whole-genome phylogenies resolved multiple genomic clades of Bp, largely congruent with multilocus sequence typing (MLST). We discovered widespread recombination in the Bp core genome, involving hundreds of regions associated with multiple haplotypes. Highly recombinant regions exhibited functional enrichments that may contribute to virulence. We observed clade-specific patterns of recombination and accessory gene exchange, and provide evidence that this is likely due to ongoing recombination between clade members. Reciprocally, interclade exchanges were rarely observed, suggesting mechanisms restricting gene flow between clades. Interrogation of accessory elements revealed that each clade harbored a distinct complement of restriction-modification (RM) systems, predicted to cause clade-specific patterns of DNA methylation. Using methylome sequencing, we confirmed that representative strains from separate clades indeed exhibit distinct methylation profiles. Finally, using an E. coli system, we demonstrate that Bp RM systems can inhibit uptake of non-self DNA. Our data suggest that RM systems borne on mobile elements, besides preventing foreign DNA invasion, may also contribute to limiting exchanges of genetic material between individuals of the same species. Genomic clades may thus represent functional units of genetic isolation in Bp, modulating intraspecies genetic diversity. © 2015 Nandi et al.; Published by Cold Spring Harbor Laboratory Press.

