September 22, 2019

Relationship between Alzheimer’s disease-associated SNPs within the CLU gene, local DNA methylation and episodic verbal memory in healthy and schizophrenia subjects.

Genetic variation may affect local DNA methylation patterns. Therefore, information about allele-specific DNA methylation (ASM) within disease-related loci has been proposed to be useful for the interpretation of GWAS results. To explore mechanisms that may underlie associations between CLU, a risk gene for both Alzheimer’s disease (AD) and schizophrenia, and verbal memory, one of the most affected cognitive domains in both conditions, we studied DNA methylation in a region between AD-associated SNPs rs9331888 and rs9331896 in 72 healthy individuals and 73 schizophrenia patients. Using single-molecule real-time bisulfite sequencing we assessed the haplotype-dependent ASM in this region. We then investigated whether its methylation could influence episodic verbal memory measured with the Rey Auditory Verbal Learning Test in these two cohorts. The region showed a complex methylation pattern, which was similar in healthy and schizophrenia individuals and unrelated to haplotypes. The pattern predicted memory scores in controls. The results suggest that epigenetic modifications within the CLU locus may play a role in memory variation, independent of ASM. Copyright © 2018 Elsevier B.V. All rights reserved.


September 21, 2019

Whole genome sequence of the soybean aphid, Aphis glycines.

Aphids are emerging as model organisms for both basic and applied research. Of the 5,000 estimated species, only three aphids have published whole genome sequences: the pea aphid Acyrthosiphon pisum, the Russian wheat aphid, Diuraphis noxia, and the green peach aphid, Myzus persicae. We present the whole genome sequence of a fourth aphid, the soybean aphid (Aphis glycines), which is an extreme specialist and an important invasive pest of soybean (Glycine max). The availability of genomic resources is important to establish effective and sustainable pest control, as well as to expand our understanding of aphid evolution. We generated a 302.9 Mbp draft genome assembly for Ap. glycines using a hybrid sequencing approach. This assembly shows high completeness with 19,182 predicted genes, 92% of known Ap. glycines transcripts mapping to contigs, and substantial continuity with a scaffold N50 of 174,505 bp. The assembly represents 95.5% of the predicted genome size of 317.1 Mbp based on flow cytometry. Ap. glycines contains the smallest known aphid genome to date, based on updated genome sizes for 19 aphid species. The repetitive DNA content of the Ap. glycines genome assembly (81.6 Mbp or 26.94% of the 302.9 Mbp assembly) shows a reduction in the number of classified transposable elements compared to Ac. pisum, and likely contributes to the small estimated genome size. We include comparative analyses of gene families related to host-specificity (cytochrome P450’s and effectors), which may be important in Ap. glycines evolution. This Ap. glycines draft genome sequence will provide a resource for the study of aphid genome evolution, their interaction with host plants, and candidate genes for novel insect control methods. Copyright © 2017 Elsevier Ltd. All rights reserved.
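The assembly statistics quoted in this abstract (scaffold N50, share of the flow-cytometry genome size, repeat fraction) can be checked with a few lines of arithmetic. The sketch below is illustrative only: the function names and the toy scaffold lengths are assumptions made for this example and are not taken from the paper or its assembly pipeline. Only the Mbp figures come from the abstract.

```python
def n50(lengths):
    """Return the N50 of a set of sequence lengths.

    N50 is the length L such that sequences of length >= L together
    contain at least half of the total assembled bases.
    """
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length
    raise ValueError("lengths must be non-empty")


def percent(part, whole):
    """Return part as a percentage of whole."""
    return 100.0 * part / whole


# Hypothetical toy scaffold lengths, used only to show how N50 behaves.
toy_scaffolds = [100, 80, 60, 40, 20]
toy_n50 = n50(toy_scaffolds)

# Figures quoted in the abstract (Mbp).
assembly_mbp = 302.9
predicted_genome_mbp = 317.1
repeat_mbp = 81.6

assembled_share = percent(assembly_mbp, predicted_genome_mbp)
repeat_share = percent(repeat_mbp, assembly_mbp)

print(f"toy N50: {toy_n50}")
print(f"assembly / predicted genome size: {assembled_share:.1f}%")
print(f"repeat content of assembly: {repeat_share:.2f}%")
```

Applied to the real scaffold lengths, the same `n50` function would be expected to give the reported 174,505 bp, although the authors' exact tooling is not stated in the abstract.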


September 21, 2019

Retrotransposons are the major contributors to the expansion of the Drosophila ananassae Muller F element.

The discordance between genome size and the complexity of eukaryotes can partly be attributed to differences in repeat density. The Muller F element (~5.2 Mb) is the smallest chromosome in Drosophila melanogaster, but it is substantially larger (>18.7 Mb) in D. ananassae. To identify the major contributors to the expansion of the F element and to assess their impact, we improved the genome sequence and annotated the genes in a 1.4-Mb region of the D. ananassae F element, and a 1.7-Mb region from the D element for comparison. We find that transposons (particularly LTR and LINE retrotransposons) are major contributors to this expansion (78.6%), while Wolbachia sequences integrated into the D. ananassae genome are minor contributors (0.02%). Both D. melanogaster and D. ananassae F-element genes exhibit distinct characteristics compared to D-element genes (e.g., larger coding spans, larger introns, more coding exons, and lower codon bias), but these differences are exaggerated in D. ananassae. Compared to D. melanogaster, the codon bias observed in D. ananassae F-element genes can primarily be attributed to mutational biases instead of selection. The 5′ ends of F-element genes in both species are enriched in dimethylation of lysine 4 on histone 3 (H3K4me2), while the coding spans are enriched in H3K9me2. Despite differences in repeat density and gene characteristics, D. ananassae F-element genes show a similar range of expression levels compared to genes in euchromatic domains. This study improves our understanding of how transposons can affect genome size and how genes can function within highly repetitive domains. Copyright © 2017 Leung et al.


July 19, 2019

A benchmark study on error assessment and quality control of CCS reads derived from the PacBio RS.

PacBio RS, a newly emerging third-generation DNA sequencing platform, is based on a real-time, single-molecule, nano-nitch sequencing technology that can generate very long reads (up to 20-kb) in contrast to the shorter reads produced by the first and second generation sequencing technologies. As a new platform, it is important to assess the sequencing error rate, as well as the quality control (QC) parameters associated with the PacBio sequence data. In this study, a mixture of 10 prior known, closely related DNA amplicons were sequenced using the PacBio RS sequencing platform. After aligning Circular Consensus Sequence (CCS) reads derived from the above sequencing experiment to the known reference sequences, we found that the median error rate was 2.5% without read QC, and improved to 1.3% with an SVM based multi-parameter QC method. In addition, a De Novo assembly was used as a downstream application to evaluate the effects of different QC approaches. This benchmark study indicates that even though CCS reads are post error-corrected it is still necessary to perform appropriate QC on CCS reads in order to produce successful downstream bioinformatics analytical results.


July 19, 2019

Error correction and assembly complexity of single molecule sequencing reads.

Third generation single molecule sequencing technology is poised to revolutionize genomics by enabling the sequencing of long, individual molecules of DNA and RNA. These technologies now routinely produce reads exceeding 5,000 basepairs, and can achieve reads as long as 50,000 basepairs. Here we evaluate the limits of single molecule sequencing by assessing the impact of long read sequencing in the assembly of the human genome and 25 other important genomes across the tree of life. From this, we develop a new data-driven model using support vector regression that can accurately predict assembly performance. We also present a novel hybrid error correction algorithm for long PacBio sequencing reads that uses pre-assembled Illumina sequences for the error correction. We apply it to several prokaryotic and eukaryotic genomes, and show it can achieve near-perfect assemblies of small genomes (< 100 Mbp) and substantially improved assemblies of larger ones. All source code and the assembly model are available open-source.


July 19, 2019

The architecture of a scrambled genome reveals massive levels of genomic rearrangement during development.

Programmed DNA rearrangements in the single-celled eukaryote Oxytricha trifallax completely rewire its germline into a somatic nucleus during development. This elaborate, RNA-mediated pathway eliminates noncoding DNA sequences that interrupt gene loci and reorganizes the remaining fragments by inversions and permutations to produce functional genes. Here, we report the Oxytricha germline genome and compare it to the somatic genome to present a global view of its massive scale of genome rearrangements. The remarkably encrypted genome architecture contains >3,500 scrambled genes, as well as >800 predicted germline-limited genes expressed, and some posttranslationally modified, during genome rearrangements. Gene segments for different somatic loci often interweave with each other. Single gene segments can contribute to multiple, distinct somatic loci. Terminal precursor segments from neighboring somatic loci map extremely close to each other, often overlapping. This genome assembly provides a draft of a scrambled genome and a powerful model for studies of genome rearrangement. Copyright © 2014 Elsevier Inc. All rights reserved.


July 19, 2019

Genome reference and sequence variation in the large repetitive central exon of human MUC5AC.

Despite modern sequencing efforts, the difficulty in assembly of highly repetitive sequences has prevented resolution of human genome gaps, including some in the coding regions of genes with important biological functions. One such gene, MUC5AC, encodes a large, secreted mucin, which is one of the two major secreted mucins in human airways. The MUC5AC region contains a gap in the human genome reference (hg19) across the large, highly repetitive, and complex central exon. This exon is predicted to contain imperfect tandem repeat sequences and multiple conserved cysteine-rich (CysD) domains. To resolve the MUC5AC genomic gap, we used high-fidelity long PCR followed by single molecule real-time (SMRT) sequencing. This technology yielded long sequence reads and robust coverage that allowed for de novo sequence assembly spanning the entire repetitive region. Furthermore, we used SMRT sequencing of PCR amplicons covering the central exon to identify genetic variation in four individuals. The results demonstrated the presence of segmental duplications of CysD domains, insertions/deletions (indels) of tandem repeats, and single nucleotide variants. Additional studies demonstrated that one of the identified tandem repeat insertions is tagged by nonexonic single nucleotide polymorphisms. Taken together, these data illustrate the successful utility of SMRT sequencing long reads for de novo assembly of large repetitive sequences to fill the gaps in the human genome. Characterization of the MUC5AC gene and the sequence variation in the central exon will facilitate genetic and functional studies for this critical airway mucin.


July 19, 2019

Progress, challenges and the future of crop genomes.

The availability of plant reference genomes has ushered in a new era of crop genomics. More than 100 plant genomes have been sequenced since 2000, 63% of which are crop species. These genome sequences provide insight into architecture, evolution and novel aspects of crop genomes such as the retention of key agronomic traits after whole genome duplication events. Some crops have very large, polyploid, repeat-rich genomes, which require innovative strategies for sequencing, assembly and analysis. Even low quality reference genomes have the potential to improve crop germplasm through genome-wide molecular markers, which decrease expensive phenotyping and breeding cycles. The next stage of plant genomics will require draft genome refinement, building resources for crop wild relatives, resequencing broad diversity panels, and plant ENCODE projects to better understand the complexities of these highly diverse genomes. Copyright © 2015 Elsevier Ltd. All rights reserved.


July 19, 2019

Comprehensive analysis of cancer-associated somatic mutations in class I HLA genes.

Detection of somatic mutations in human leukocyte antigen (HLA) genes using whole-exome sequencing (WES) is hampered by the high polymorphism of the HLA loci, which prevents alignment of sequencing reads to the human reference genome. We describe a computational pipeline that enables accurate inference of germline alleles of class I HLA-A, B and C genes and subsequent detection of mutations in these genes using the inferred alleles as a reference. Analysis of WES data from 7,930 pairs of tumor and healthy tissue from the same patient revealed 298 nonsilent HLA mutations in tumors from 266 patients. These 298 mutations are enriched for likely functional mutations, including putative loss-of-function events. Recurrence of mutations suggested that these ‘hotspot’ sites were positively selected. Cancers with recurrent somatic HLA mutations were associated with upregulation of signatures of cytolytic activity characteristic of tumor infiltration by effector lymphocytes, supporting immune evasion by altered HLA function as a contributory mechanism in cancer.


July 19, 2019

Fcγ receptors: genetic variation, function, and disease.

Fcγ receptors (FcγRs) are key immune receptors responsible for the effective control of both humoral and innate immunity and are central to maintaining the balance between generating appropriate responses to infection and preventing autoimmunity. When this balance is lost, pathology results in increased susceptibility to cancer, autoimmunity, and infection. In contrast, optimal FcγR engagement facilitates effective disease resolution and response to monoclonal antibody immunotherapy. The underlying genetics of the FcγR gene family are a central component of this careful balance. Complex in humans and generated through ancestral duplication events, here we review the evolution of the gene family in mammals, the potential importance of copy number, and functionally relevant single nucleotide polymorphisms, as well as discussing current approaches and limitations when exploring genetic variation in this region. © 2015 John Wiley & Sons A/S. Published by John Wiley & Sons Ltd.


July 19, 2019

Single-molecule sequencing reveals complex genomic variation of hepatitis B virus during 15 years of chronic infection following liver transplantation.

Chronic hepatitis B (CHB) is prevalent worldwide. The infectious agent, hepatitis B virus (HBV), replicates via an RNA intermediate and is error-prone, leading to rapid generation of closely related but not identical viral variants, including those that can escape host immune responses and antiviral treatments. The complexity of CHB can be further enhanced by the presence of HBV variants with large deletions in the genome, generated via splicing (spHBV). Although spHBV variants are incapable of autonomous replication, their replication is rescued by wild-type HBV. SpHBV variants have been shown to enhance wild-type virus replication, and their prevalence increases with liver disease progression. Single-molecule deep sequencing was performed on whole HBV genomes extracted from longitudinal samples of a post-liver transplant CHB subject, collected over a 15-year period that included the liver explant. By employing novel bioinformatics methods, this analysis showed complex dynamics of the viral population across a period of changing treatment regimens. The spHBV detected in the liver explant remained present post-transplantation, along with emergence of a highly diverse novel spHBV population as well as variants with multiple deletions in the preS genes. The identification of novel mutations outside the HBV reverse transcriptase gene that co-occur with known drug resistant mutations highlights the relevance of using full genome deep sequencing and supports the hypothesis that drug resistance involves interactions across the full-length HBV genome. Single-molecule sequencing allowed characterising, in unprecedented detail, the evolution of HBV populations and offered unique insights into the dynamics of defective and spHBV variants following liver transplantation and complex treatment regimes. This analysis also showed rapid adaptation of HBV populations to treatment regimens with evolving drug resistance phenotypes and evidence of purifying selection across the whole genome. Finally, the new open source bioinformatics tools are freely available, with the capacity to easily identify potential spliced variants from deep sequencing data. Copyright © 2016, American Society for Microbiology. All Rights Reserved.


July 19, 2019

Defective HIV-1 proviruses produce novel protein-coding RNA species in HIV-infected patients on combination antiretroviral therapy.

Despite years of plasma HIV-RNA levels <40 copies per milliliter during combination antiretroviral therapy (cART), the majority of HIV-infected patients exhibit persistent seropositivity to HIV-1 and evidence of immune activation. These patients also show persistence of proviruses of HIV-1 in circulating peripheral blood mononuclear cells. Many of these proviruses have been characterized as defective and thus thought to contribute little to HIV-1 pathogenesis. By combining 5'LTR-to-3'LTR single-genome amplification and direct amplicon sequencing, we have identified the presence of "defective" proviruses capable of transcribing novel unspliced HIV-RNA (usHIV-RNA) species in patients at all stages of HIV-1 infection. Although these novel usHIV-RNA transcripts had exon structures that were different from those of the known spliced HIV-RNA variants, they maintained translationally competent ORFs, involving elements of gag, pol, env, rev, and nef to encode a series of novel HIV-1 chimeric proteins. These novel usHIV-RNAs were detected in five of five patients, including four of four patients with prolonged viral suppression of HIV-RNA levels <40 copies per milliliter for more than 6 y. Our findings suggest that the persistent defective proviruses of HIV-1 are not "silent," but rather may contribute to HIV-1 pathogenesis by stimulating host-defense pathways that target foreign nucleic acids and proteins.


July 19, 2019

Living apart together: crosstalk between the core and supernumerary genomes in a fungal plant pathogen.

Eukaryotes display remarkable genome plasticity, which can include supernumerary chromosomes that differ markedly from the core chromosomes. Despite the widespread occurrence of supernumerary chromosomes in fungi, their origin, relation to the core genome and the reason for their divergent characteristics are still largely unknown. The complexity of genome assembly due to the presence of repetitive DNA partially accounts for this. Here we use single-molecule real-time (SMRT) sequencing to assemble the genome of a prominent fungal wheat pathogen, Fusarium poae, including at least one supernumerary chromosome. The core genome contains limited transposable elements (TEs) and no gene duplications, while the supernumerary genome holds up to 25% TEs and multiple gene duplications. The core genome shows all hallmarks of repeat-induced point mutation (RIP), a defense mechanism against TEs, specific for fungi. The absence of RIP on the supernumerary genome accounts for the differences between the two (sub)genomes, and results in a functional crosstalk between them. The supernumerary genome is a reservoir for TEs that migrate to the core genome, and even large blocks of supernumerary sequence (>200 kb) have recently translocated to the core. Vice versa, the supernumerary genome acts as a refuge for genes that are duplicated from the core genome. For the first time, a mechanism was determined that explains the differences that exist between the core and supernumerary genome in fungi. Different biology rather than origin was shown to be responsible. A “living apart together” crosstalk exists between the core and supernumerary genome, accelerating chromosomal and organismal evolution.


July 19, 2019

Comparative DNA methylation and gene expression analysis identifies novel genes for structural congenital heart diseases.

For the majority of congenital heart diseases (CHDs), the full complexity of the causative molecular network, which is driven by genetic, epigenetic, and environmental factors, is yet to be elucidated. Epigenetic alterations are suggested to play a pivotal role in modulating the phenotypic expression of CHDs and their clinical course during life. Candidate approaches implied that DNA methylation might have a developmental role in CHD and contributes to the long-term progress of non-structural cardiac diseases. The aim of the present study is to define the postnatal epigenome of two common cardiac malformations, representing epigenetic memory, and adaption to hemodynamic alterations, which are jointly relevant for the disease course. We present the first analysis of genome-wide DNA methylation data obtained from myocardial biopsies of Tetralogy of Fallot (TOF) and ventricular septal defect patients. We defined stringent sets of differentially methylated regions between patients and controls, which are significantly enriched for genomic features like promoters, exons, and cardiac enhancers. For TOF, we linked DNA methylation with genome-wide expression data and found a significant overlap for hypermethylated promoters and down-regulated genes, and vice versa. We validated and replicated the methylation of selected CpGs and performed functional assays. We identified a hypermethylated novel developmental CpG island in the promoter of SCO2 and demonstrate its functional impact. Moreover, we discovered methylation changes co-localized with novel, differential splicing events among sarcomeric genes as well as transcription factor binding sites. Finally, we demonstrated the interaction of differentially methylated and expressed genes in TOF with mutated CHD genes in a molecular network. By interrogating DNA methylation and gene expression data, we identify two novel mechanisms contributing to the phenotypic expression of CHDs: aberrant methylation of promoter CpG islands and methylation alterations leading to differential splicing. Published on behalf of the European Society of Cardiology. All rights reserved. © The Author 2016. For permissions please email: journals.permissions@oup.com.


July 19, 2019

Rapid functional and sequence differentiation of a tandemly repeated species-specific multigene family in Drosophila.

Gene clusters of recently duplicated genes are hotbeds for evolutionary change. However, our understanding of how mutational mechanisms and evolutionary forces shape the structural and functional evolution of these clusters is hindered by the high sequence identity among the copies, which typically results in their inaccurate representation in genome assemblies. The presumed testis-specific, chimeric gene Sdic originated, and tandemly expanded in Drosophila melanogaster, contributing to increased male-male competition. Using various types of massively parallel sequencing data, we studied the organization, sequence evolution, and functional attributes of the different Sdic copies. By leveraging long-read sequencing data, we uncovered both copy number and order differences from the currently accepted annotation for the Sdic region. Despite evidence for pervasive gene conversion affecting the Sdic copies, we also detected signatures of two episodes of diversifying selection, which have contributed to the evolution of a variety of C-termini and miRNA binding site compositions. Expression analyses involving RNA-seq datasets from 59 different biological conditions revealed distinctive expression breadths among the copies, with three copies being transcribed in females, opening the possibility to a sexually antagonistic effect. Phenotypic assays using Sdic knock-out strains indicated that should this antagonistic effect exist, it does not compromise female fertility. Our results strongly suggest that the genome consolidation of the Sdic gene cluster is more the result of a quick exploration of different paths of molecular tinkering by different copies than a mere dosage increase, which could be a recurrent evolutionary outcome in the presence of persistent sexual selection. © The Author 2016. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.

