The accumulation of different types of genetic alterations, such as nucleotide substitutions, structural rearrangements, and viral genome integrations, together with epigenetic alterations, contributes to carcinogenesis. Here, we report correlations between the occurrence of epigenetic features and genetic aberrations by whole-genome bisulfite, whole-genome shotgun, long-read, and virus capture sequencing of 373 liver cancers. Somatic substitutions and rearrangement breakpoints are enriched in tumor-specific hypo-methylated regions with inactive chromatin marks and actively transcribed highly methylated regions in the cancer genome. Individual mutation signatures depend on chromatin status; in particular, signatures with a higher transcriptional strand bias occur within active chromatin areas. Hepatitis B virus (HBV) integration sites are frequently detected within inactive chromatin regions in cancer cells, as a consequence of negative selection for integrations in active chromatin regions. Ultra-high structural instability and preserved unmethylation of integrated HBV genomes are observed. We conclude that both precancerous and somatic epigenetic features contribute to the cancer genome architecture.
Eukaryotic genomes are replete with repeated sequences in the form of transposable elements (TEs) dispersed across the genome or as satellite arrays, large stretches of tandemly repeated sequences. Many satellites clearly originated as TEs, but it is unclear how mobile genetic parasites can transform into megabase-sized tandem arrays. Comprehensive population genomic sampling is needed to determine the frequency and generative mechanisms of tandem TEs, at all stages from their initial formation to their subsequent expansion and maintenance as satellites. The best available population resources, short-read DNA sequences, are often considered to be of limited utility for analyzing repetitive DNA due to the challenge of mapping individual repeats to unique genomic locations. Here we develop a new pipeline called ConTExt that demonstrates that paired-end Illumina data can be successfully leveraged to identify a wide range of structural variation within repetitive sequence, including tandem elements. By analyzing 85 genomes from five populations of Drosophila melanogaster, we discover that TEs commonly form tandem dimers. Our results further suggest that insertion site preference is the major mechanism by which dimers arise and that, consequently, dimers form rapidly during periods of active transposition. This abundance of TE dimers has the potential to provide source material for future expansion into satellite arrays, and we discover one such copy number expansion of the DNA transposon hobo to approximately 16 tandem copies in a single line. The very process that defines TEs, transposition, thus regularly generates sequences from which new satellites can arise. © 2018 McGurk and Barbash; Published by Cold Spring Harbor Laboratory Press.
The hpRNA/RNAi pathway is essential to resolve intragenomic conflict in the Drosophila male germline.
Intragenomic conflicts are fueled by rapidly evolving selfish genetic elements, which induce selective pressures to innovate opposing repressive mechanisms. This is patently manifest in sex-ratio (SR) meiotic drive systems, in which distorter and suppressor factors bias and restore equal transmission of X and Y sperm. Here, we reveal that multiple SR suppressors in Drosophila simulans (Nmy and Tmy) encode related hairpin RNAs (hpRNAs), which generate endo-siRNAs that repress the paralogous distorters Dox and MDox. All components in this drive network are recently evolved and largely testis restricted. To connect SR hpRNA function to the RNAi pathway, we generated D. simulans null mutants of Dcr-2 and AGO2. Strikingly, these core RNAi knockouts massively derepress Dox and MDox and are in fact completely male sterile and exhibit highly defective spermatogenesis. Altogether, our data reveal how the adaptive capacity of hpRNAs is critically deployed to restrict selfish gonadal genetic systems that can exterminate a species. Copyright © 2018 Elsevier Inc. All rights reserved.
Asymmetric processing of DNA ends at a double-strand break leads to unconstrained dynamics and ectopic translocation.
Multiple pathways regulate the repair of double-strand breaks (DSBs) to suppress potentially dangerous ectopic recombination. Both sequence and chromatin context are thought to influence pathway choice between non-homologous end-joining (NHEJ) and homology-driven recombination. To test the effect of repetitive sequences on break processing, we have inserted TG-rich repeats on one side of an inducible DSB at the budding yeast MAT locus on chromosome III. Five clustered Rap1 sites within a break-proximal TG repeat are sufficient to block Mre11-Rad50-Xrs2 recruitment, impair resection, and favor elongation by telomerase. The two sides of the break lose end-to-end tethering and show enhanced, uncoordinated movement. Only the TG-free side is resected and shifts to the nuclear periphery. In contrast to persistent DSBs without TG repeats that are repaired by imprecise NHEJ, nearly all survivors of repeat-proximal DSBs repair the break by a homology-driven, non-reciprocal translocation from ChrIII-R to ChrVII-L. This suppression of imprecise NHEJ at TG-repeat-flanked DSBs requires the Uls1 translocase activity. Copyright © 2018 The Authors. Published by Elsevier Inc. All rights reserved.
The haploid genome of the pathogenic fungus Zymoseptoria tritici is contained on “core” and “accessory” chromosomes. While 13 core chromosomes are found in all strains, as many as eight accessory chromosomes show presence/absence variation and rearrangements among field isolates. The factors influencing these presence/absence polymorphisms are so far unknown. We investigated chromosome stability using experimental evolution, karyotyping, and genome sequencing. We report extremely high and variable rates of accessory chromosome loss during mitotic propagation in vitro and in planta. Spontaneous chromosome loss was observed in 2 to >50% of cells during 4 weeks of incubation. Similar rates of chromosome loss in the closely related Zymoseptoria ardabiliae suggest that this extreme chromosome dynamic is a conserved phenomenon in the genus. Elevating the incubation temperature greatly increases instability of accessory and even core chromosomes, causing severe rearrangements involving telomere fusion and chromosome breakage. Chromosome losses do not affect the fitness of Zymoseptoria tritici in vitro, but some lead to increased virulence, suggesting an adaptive role of this extraordinary chromosome instability. Copyright © 2018 by the Genetics Society of America.
The genome of tapeworm Taenia multiceps sheds light on understanding parasitic mechanism and control of coenurosis disease.
Coenurosis, caused by the larval coenurus of the tapeworm Taenia multiceps, is a fatal central nervous system disease in both sheep and humans. Though treatment and prevention options are available, the control of coenurosis still presents great challenges. Here, we present a high-quality genome sequence of T. multiceps in which 240 Mb (96%) of the genome has been successfully assembled using PacBio single-molecule real-time (SMRT) and Hi-C data with an N50 length of 44.8 Mb. In total, 49.5 Mb (20.6%) of repeat sequences and 13,013 gene models were identified. We found that Taenia genomes, but not Echinococcus genomes, show an expansion of transposable elements and recent small-scale gene duplications following the divergence of Taenia from Echinococcus, and that the genes underlying environmental adaptability and dosage effect tend to be over-retained in the T. multiceps genome. Moreover, we identified several genes encoding proteins involved in proglottid formation and interactions with the host central nervous system, which may contribute to the adaptation of T. multiceps to its parasitic lifestyle. Our study not only provides insights into the biology and evolution of T. multiceps, but also identifies a set of species-specific gene targets for developing novel treatment and control tools for coenurosis.
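The N50 length quoted for the assembly is the contig or scaffold length at which sequences of that length or longer account for at least half of the total assembled bases. A minimal Python sketch of that calculation follows; the function name `n50` and the toy lengths are illustrative assumptions, not data from the T. multiceps assembly.

```python
def n50(lengths):
    """Return the N50 of a list of contig or scaffold lengths."""
    if not lengths:
        raise ValueError("lengths must not be empty")
    ordered = sorted(lengths, reverse=True)  # longest first
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        # The first length at which the running sum reaches half the total is the N50
        if running >= half_total:
            return length


# Hypothetical scaffold lengths in Mb, for illustration only
toy_lengths = [50, 40, 30, 20, 10]
print(n50(toy_lengths))
```

Because the statistic depends only on the sorted length distribution, a few very long scaffolds, such as those produced by Hi-C scaffolding, can raise N50 sharply without changing the total assembled size.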
Meiotic drive is widespread in nature. The conflict it generates is expected to be an important motor for evolutionary change and innovation. In this study, we investigated the genomic consequences of two large multi-gene meiotic drive elements, Sk-2 and Sk-3, found in the filamentous ascomycete Neurospora intermedia. Using long-read sequencing, we generated the first complete and well-annotated genome assemblies of large, highly diverged, non-recombining regions associated with meiotic drive elements. Phylogenetic analysis shows that, even though Sk-2 and Sk-3 are located in the same chromosomal region, they do not form sister clades, suggesting independent origins or at least a long evolutionary separation. We conclude that they have in a convergent manner accumulated similar patterns of tandem inversions and dense repeat clusters, presumably in response to similar needs to create linkage between genes causing drive and resistance.
Chromosomal structural variations (SV) including insertions, deletions, inversions, and translocations occur within the genome and can have a significant effect on organismal phenotype. Some of these effects are caused by structural variations containing genes. Large structural variations represent a significant amount of the genetic diversity within a population. We used a global sampling of Drosophila melanogaster (Ithaca, Zimbabwe, Beijing, Tasmania, and Netherlands) to represent diverse populations within the species. We used long-read sequencing and optical mapping technologies to identify SVs in these genomes. Among the five lines examined, we found an average of 2,928 structural variants within these genomes. These structural variations varied greatly in size and location, included many exonic regions, and could impact adaptation and genomic evolution. Copyright © 2018 Long et al.
Constant conflict between Gypsy LTR retrotransposons and CHH methylation within a stress-adapted mangrove genome.
The evolutionary dynamics of the conflict between transposable elements (TEs) and their host genome remain elusive. This conflict will be intense in stress-adapted plants as stress can often reactivate TEs. Mangroves reduce TE load convergently in their adaptation to intertidal environments and thus provide a unique opportunity to address the host-TE conflict and its interaction with stress adaptation. Using the mangrove Rhizophora apiculata as a model, we investigated methylation and short interfering RNA (siRNA) targeting patterns in relation to the abundance and age of long terminal repeat (LTR) retrotransposons. We also examined the distance of LTR retrotransposons to genes, the impact on neighboring gene expression and population frequencies. We found differential accumulation amongst classes of LTR retrotransposons despite high overall methylation levels. This can be attributed to 24-nucleotide siRNA-mediated CHH methylation preferentially targeting Gypsy elements, particularly in their LTR regions. Old Gypsy elements possess unusually abundant siRNAs which show cross-mapping to young copies. Gypsy elements appear to be closer to genes and under stronger purifying selection than other classes. Our results suggest a continuous host-TE battle masked by the TE load reduction in R. apiculata. This conflict may enable mangroves, such as R. apiculata, to maintain genetic diversity and thus evolutionary potential during stress adaptation. © 2018 The Authors. New Phytologist © 2018 New Phytologist Trust.
Comparison of genome sequences of wild emmer wheat and Aegilops tauschii suggests a novel scenario of the evolution of rearranged wheat chromosomes 4A, 5A, and 7B. Past research suggested that wheat chromosome 4A was subjected to a reciprocal translocation T(4AL;5AL)1 that occurred in the diploid progenitor of the wheat A subgenome and to three major rearrangements that occurred in polyploid wheat: pericentric inversion Inv(4AS;4AL)1, paracentric inversion Inv(4AL;4AL)1, and reciprocal translocation T(4AL;7BS)1. Gene collinearity along the pseudomolecules of tetraploid wild emmer wheat (Triticum turgidum ssp. dicoccoides, subgenomes AABB) and diploid Aegilops tauschii (genomes DD) was employed to confirm these rearrangements and to analyze the breakpoints. The exchange of distal regions of chromosome arms 4AS and 4AL due to pericentric inversion Inv(4AS;4AL)1 was detected, and breakpoints were validated with an optical Bionano genome map. Both breakpoints contained satellite DNA. The breakpoints of reciprocal translocation T(4AL;7BS)1 were also found. However, the breakpoints that generated paracentric inversion Inv(4AL;4AL)1 appeared to be collocated with the 4AL breakpoints that had produced Inv(4AS;4AL)1 and T(4AL;7BS)1. Inv(4AS;4AL)1, Inv(4AL;4AL)1, and T(4AL;7BS)1 either originated sequentially, and Inv(4AL;4AL)1 was produced by recurrent chromosome breaks at the same breakpoints that generated Inv(4AS;4AL)1 and T(4AL;7BS)1, or Inv(4AS;4AL)1, Inv(4AL;4AL)1, and T(4AL;7BS)1 originated simultaneously. We prefer the latter hypothesis since it makes fewer assumptions about the sequence of events that produced these chromosome rearrangements.
Unexpected patterns of segregation distortion at a selfish supergene in the fire ant Solenopsis invicta.
The Sb supergene in the fire ant Solenopsis invicta determines the form of colony social organization, with colonies whose inhabitants bear the element containing multiple reproductive queens and colonies lacking it containing only a single queen. Several features of this supergene – including suppressed recombination, presence of deleterious mutations, association with a large centromere, and “green-beard” behavior – suggest that it may be a selfish genetic element that engages in transmission ratio distortion (TRD), defined as significant departures in progeny allele frequencies from Mendelian inheritance ratios. We tested this possibility by surveying segregation ratios in embryo progenies of 101 queens of the “polygyne” social form (3512 embryos) using three supergene-linked markers and twelve markers outside the supergene. Significant departures from Mendelian ratios were observed at the supergene loci in 3-5 times more progenies than expected in the absence of TRD and than found, on average, among non-supergene loci. Also, supergene loci displayed the greatest mean deviations from Mendelian ratios among all study loci, although these typically were modest. A surprising feature of the observed inter-progeny variation in TRD was that significant deviations involved not only excesses of supergene alleles but also similarly frequent excesses of the alternate alleles on the homologous chromosome. As expected given the common occurrence of such “drive reversal” in this system, alleles associated with the supergene gain no consistent transmission advantage over their alternate alleles at the population level.
Finally, we observed low levels of recombination and incomplete gametic disequilibrium across the supergene, including between adjacent markers within a single inversion. Our data confirm the prediction that the Sb supergene is a selfish genetic element capable of biasing its own transmission during reproduction, yet counterselection for suppressor loci evidently has produced an evolutionary stalemate in TRD between the variant homologous haplotypes on the “social chromosome”. Evidence implicates prezygotic segregation distortion as responsible for the TRD we document, with “true” meiotic drive the most likely mechanism. Low levels of recombination and incomplete gametic disequilibrium across the supergene suggest that selection does not preserve a single uniform supergene haplotype responsible for inducing polygyny.
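A departure from Mendelian ratios within a single progeny can be illustrated with a chi-square goodness-of-fit test against the 1:1 expectation for a heterozygous parent. The abstract does not state which statistical test the authors used, so the sketch below is an assumption for illustration only; the function name `mendelian_chi_square` and the embryo counts are hypothetical.

```python
import math


def mendelian_chi_square(count_a, count_b):
    """Chi-square goodness-of-fit test (1 degree of freedom) against a 1:1 ratio.

    Assumption: this is one common way to test for segregation distortion;
    it is not necessarily the test used in the study.
    """
    total = count_a + count_b
    if total == 0:
        raise ValueError("need at least one observation")
    expected = total / 2
    chi2 = ((count_a - expected) ** 2 + (count_b - expected) ** 2) / expected
    # For 1 degree of freedom, P(X > x) = erfc(sqrt(x / 2))
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Hypothetical progeny: 24 embryos carry one allele, 12 carry the alternate
chi2, p = mendelian_chi_square(24, 12)
print(f"chi2 = {chi2:.2f}, p = {p:.4f}")
```

When many progenies and loci are tested, some nominally significant results are expected by chance alone, which is why the abstract compares the observed number of significant progenies with the number expected in the absence of TRD.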
Tandem DNA repeats can be sequenced with long-read technologies, but cannot be accurately deciphered due to the lack of computational tools taking high error rates of these technologies into account. Here we introduce Noise-Cancelling Repeat Finder (NCRF) to uncover putative tandem repeats of specified motifs in noisy long reads produced by Pacific Biosciences and Oxford Nanopore sequencers. Using simulations, we validated the use of NCRF to locate tandem repeats with motifs of various lengths and demonstrated its superior performance as compared to two alternative tools. Using real human whole-genome sequencing data, NCRF identified long arrays of the (AATGG)n repeat involved in heat shock stress response.
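To make the task concrete, the sketch below locates runs of a specified motif by exact matching only. It is not the NCRF algorithm: NCRF is designed to tolerate the high error rates of long reads, and an exact matcher like this one would break up arrays at every sequencing error. The function name `find_exact_tandem_repeats`, the `min_copies` parameter, and the example read are hypothetical.

```python
import re


def find_exact_tandem_repeats(sequence, motif, min_copies=3):
    """Return (start, end, copies) for runs of >= min_copies exact motif copies.

    Assumption: exact matching only; no tolerance for substitutions or indels.
    """
    # Builds a pattern such as (?:AATGG){3,}
    pattern = re.compile(f"(?:{re.escape(motif)}){{{min_copies},}}")
    return [
        (m.start(), m.end(), (m.end() - m.start()) // len(motif))
        for m in pattern.finditer(sequence)
    ]


# Hypothetical error-free read: a 4-copy array, then a 2-copy array below the threshold
read = "ACGT" + "AATGG" * 4 + "TTCA" + "AATGG" * 2
print(find_exact_tandem_repeats(read, "AATGG"))
```

A single substitution inside the first array would split it into shorter runs that may fall below `min_copies`, which illustrates why error-aware tools are needed for noisy long reads.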
Loss of Rap1 supports recombination-based telomere maintenance independent of RNA-DNA hybrids in fission yeast.
To investigate the molecular changes needed for cells to maintain their telomeres by recombination, we monitored telomere appearance during serial culture of fission yeast cells lacking the telomerase recruitment factor Ccq1. Rad52 is loaded onto critically short telomeres shortly after germination despite continued telomere erosion, suggesting that recruitment of recombination factors is not sufficient to maintain telomeres in the absence of telomerase function. Instead, survivor formation coincides with the derepression of telomeric repeat-containing RNA (TERRA). Degradation of telomere-associated TERRA in this context drives a severe growth crisis, ultimately leading to a distinct type of linear survivor with altered cytological telomere characteristics and the eviction of the shelterin component Rap1 (but not the TRF1/TRF2 orthologue, Taz1) from the telomere. We demonstrate that deletion of Rap1 is protective, preventing the growth crisis that is otherwise triggered by degradation of telomere-engaged TERRA in survivors with linear chromosomes. Thus, modulating the stoichiometry of shelterin components appears to allow recombination-dependent survivors to persist in the absence of telomere-engaged TERRA.
Meiosis is a key cellular process of sexual reproduction that includes pairing of homologous sequences. In many species, however, meiosis can also involve the segregation of supernumerary chromosomes, which can lack a homolog. How these unpaired chromosomes undergo meiosis is largely unknown. In this study we investigated chromosome segregation during meiosis in the haploid fungus Zymoseptoria tritici that possesses a large complement of supernumerary chromosomes. We used isogenic whole chromosome deletion strains to compare meiotic transmission of chromosomes when paired and unpaired. Unpaired chromosomes inherited from the male parent as well as paired supernumerary chromosomes in general showed Mendelian inheritance. In contrast, unpaired chromosomes inherited from the female parent showed non-Mendelian inheritance but were amplified and transmitted to all meiotic products. We concluded that the supernumerary chromosomes of Z. tritici show a meiotic drive and propose an additional feedback mechanism during meiosis, which initiates amplification of unpaired female-inherited chromosomes. © 2018, Habig et al.
Advances in deciphering the functional architecture of eukaryotic genomes have been facilitated by recent breakthroughs in sequencing technologies, enabling a more comprehensive representation of genes and repeat elements in genome sequence assemblies, as well as more sensitive and tissue-specific analyses of gene expression. Here we show that PacBio sequencing has led to a substantially improved genome assembly of Medicago truncatula A17, a legume model species notable for endosymbiosis studies1, and has enabled the identification of genome rearrangements between genotypes at a near-base-pair resolution. Annotation of the new M. truncatula genome sequence has allowed for a thorough analysis of transposable elements and their dynamics, as well as the identification of new players involved in symbiotic nodule development, in particular 1,037 upregulated long non-coding RNAs (lncRNAs). We have also discovered that a substantial proportion (~35% and 38%, respectively) of the genes upregulated in nodules or expressed in the nodule differentiation zone colocalize in genomic clusters (270 and 211, respectively), here termed symbiotic islands. These islands contain numerous expressed lncRNA genes and display differential DNA methylation and histone marks. Epigenetic regulations and lncRNAs are therefore attractive candidate elements for the orchestration of symbiotic gene expression in the M. truncatula genome.