Cas9 can induce extensive on-target damage, including large deletions, inversions, and insertions.
Comprehensive genomic analysis of malignant pleural mesothelioma identifies recurrent mutations, gene fusions and splicing alterations.
We analyzed transcriptomes (n = 211), whole exomes (n = 99) and targeted exomes (n = 103) from 216 malignant pleural mesothelioma (MPM) tumors. Using RNA-seq data, we identified four distinct molecular subtypes: sarcomatoid, epithelioid, biphasic-epithelioid (biphasic-E) and biphasic-sarcomatoid (biphasic-S). Through exome analysis, we found BAP1, NF2, TP53, SETD2, DDX3X, ULK2, RYR2, CFAP45, SETDB1 and DDX51 to be significantly mutated (q-score ≥ 0.8) in MPMs. We identified recurrent mutations in several genes, including SF3B1 (~2%; 4/216) and TRAF7 (~2%; 5/216). SF3B1-mutant samples showed a splicing profile distinct from that of wild-type tumors. TRAF7 alterations occurred primarily in the WD40 domain and were, except in one case, mutually exclusive with NF2 alterations. We found recurrent gene fusions and splice alterations to be frequent mechanisms for inactivation of NF2, BAP1 and SETD2. Through integrated analyses, we identified alterations in Hippo, mTOR, histone methylation, RNA helicase and p53 signaling pathways in MPMs.
Arabica coffee (Coffea arabica) has a small gene pool limiting genetic improvement. Selection for caffeine content within this gene pool would be assisted by identification of the genes controlling this important trait. Sequencing of DNA bulks from 18 genotypes with extreme high- or low-caffeine content from a population of 232 genotypes was used to identify linked polymorphisms. To obtain a reference genome, a whole genome assembly of arabica coffee (variety K7) was achieved by sequencing using short-read (Illumina) and long-read (PacBio) technology. Assembly was performed using a range of assembly tools resulting in 76 409 scaffolds with a scaffold N50 of 54 544 bp and a total scaffold length of 1448 Mb. Validation of the genome assembly using different tools showed high completeness of the genome. More than 99% of transcriptome sequences mapped to the C. arabica draft genome, and 89% of BUSCOs were present. The assembled genome annotated using AUGUSTUS yielded 99 829 gene models. Using the draft arabica genome as reference in mapping and variant calling allowed the detection of 1444 nonsynonymous single nucleotide polymorphisms (SNPs) associated with caffeine content. Based on Kyoto Encyclopaedia of Genes and Genomes pathway-based analysis, 65 caffeine-associated SNPs were discovered, among which 11 SNPs were associated with genes encoding enzymes involved in the conversion of substrates, which participate in the caffeine biosynthesis pathways. This analysis demonstrated the complex genetic control of this key trait in coffee. © 2018 The Authors. Plant Biotechnology Journal published by Society for Experimental Biology and The Association of Applied Biologists and John Wiley & Sons Ltd.
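The scaffold N50 quoted above is a standard assembly contiguity statistic: the length L such that scaffolds of length L or longer together hold at least half of the assembled bases. The following is a minimal sketch of that calculation, not the tooling used for the arabica assembly; the function name `n50` and the toy lengths are illustrative assumptions.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths.

    N50 is the length of the shortest sequence in the smallest set of
    longest sequences that together cover at least half of the total.
    """
    if not lengths:
        raise ValueError("no sequence lengths given")
    ordered = sorted(lengths, reverse=True)  # longest first
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


# Hypothetical toy scaffold lengths in bp (not taken from the assembly).
toy_scaffolds = [100, 80, 60, 40, 20]
print(n50(toy_scaffolds))  # prints 80: 100 + 80 = 180 >= 150 (half of 300)
```

With many short scaffolds, as in a draft assembly of tens of thousands of pieces, N50 can be far smaller than the longest scaffold, which is why it is usually reported alongside the total assembled length.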
Mixed fibrolamellar hepatocellular carcinoma (mFL-HCC) is a rare liver tumor defined by the presence of both pure FL-HCC and conventional HCC components, represents up to 25% of cases of FL-HCC, and has been associated with worse prognosis. Recent genomic characterization of pure FL-HCC identified a highly recurrent transcript fusion (DNAJB1:PRKACA) not found in conventional HCC. We performed exome and transcriptome sequencing of a case of mFL-HCC. A novel BAC-capture approach was developed to identify a 400 kb deletion as the underlying genomic mechanism for a DNAJB1:PRKACA fusion in this case. A sensitive Nanostring Elements assay was used to screen for this transcript fusion in a second case of mFL-HCC, 112 additional HCC samples and 44 adjacent non-tumor liver samples. We report the first comprehensive genomic analysis of a case of mFL-HCC. No common HCC-associated mutations were identified. The very low mutation rate of this case, large number of mostly single-copy, long-range copy number variants, and high expression of ERBB2 were more consistent with previous reports of pure FL-HCC than conventional HCC. In particular, the DNAJB1:PRKACA fusion transcript specifically associated with pure FL-HCC was detected at very high expression levels. Subsequent analysis revealed the presence of this fusion in all primary and metastatic samples, including those with mixed or conventional HCC pathology. A second case of mFL-HCC confirmed our finding that the fusion was detectable in conventional components. An expanded screen identified a third case of fusion-positive HCC, which upon review, also had both conventional and fibrolamellar features. This screen confirmed the absence of the fusion in all conventional HCC and adjacent non-tumor liver samples. These results indicate that mFL-HCC is similar to pure FL-HCC at the genomic level and the DNAJB1:PRKACA fusion can be used as a diagnostic tool for both pure and mFL-HCC. © The Author 2016. Published by Oxford University Press on behalf of the European Society for Medical Oncology.
Haplotypes are fundamental to fully characterize the diploid genome of an individual, yet methods to directly chart the unique genetic makeup of each parental chromosome are lacking. Here we introduce single-cell DNA template strand sequencing (Strand-seq) as a novel approach to phasing diploid genomes along the entire length of all chromosomes. We demonstrate this by building a complete haplotype for a HapMap individual (NA12878) at high accuracy (concordance 99.3%), without using generational information or statistical inference. By use of this approach, we mapped all meiotic recombination events in a family trio with high resolution (median range ~14 kb) and phased larger structural variants like deletions, indels, and balanced rearrangements like inversions. Lastly, the single-cell resolution of Strand-seq allowed us to observe loss of heterozygosity regions in a small number of cells, a significant advantage for studies of heterogeneous cell populations, such as cancer cells. We conclude that Strand-seq is a unique and powerful approach to completely phase individual genomes and map inheritance patterns in families, while preserving haplotype differences between single cells. © 2016 Porubský et al.; Published by Cold Spring Harbor Laboratory Press.
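A phasing concordance figure such as the one reported above can be understood as the fraction of heterozygous sites at which a test haplotype agrees with a reference phasing. The sketch below is one simple way to compute such a figure and is not claimed to be the authors' procedure. It assumes chromosome-length haplotypes encoded as 0/1 alleles at the same ordered heterozygous sites, so the only ambiguity is which haplotype is labelled first; the function name and encoding are hypothetical.

```python
def phasing_concordance(test_hap, truth_hap):
    """Fraction of heterozygous sites phased consistently with the truth.

    Each argument lists the allele (0 or 1) carried by 'haplotype 1' at
    the same ordered heterozygous sites. Haplotype labels are arbitrary,
    so the complementary assignment is scored too and the better one kept.
    """
    if len(test_hap) != len(truth_hap):
        raise ValueError("haplotypes must cover the same sites")
    if not test_hap:
        raise ValueError("no sites to compare")
    n_sites = len(test_hap)
    matches = sum(a == b for a, b in zip(test_hap, truth_hap))
    # A global label swap turns every match into a mismatch and vice versa.
    return max(matches, n_sites - matches) / n_sites


# Hypothetical toy data: the test haplotype is the exact complement of the
# truth, i.e. perfectly phased with the two haplotype labels swapped.
print(phasing_concordance([0, 1, 1, 0], [1, 0, 0, 1]))
```

Methods that phase in short blocks need a block-aware measure such as switch error instead; the global comparison here only makes sense when phase is continuous along the whole chromosome.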
Somatic second hit mutation of RASA1 in vascular endothelial cells in capillary malformation-arteriovenous malformation.
Capillary malformation-arteriovenous malformation (CM-AVM) is an autosomal dominant vascular disorder that is associated with inherited inactivating mutations of the RASA1 gene in the majority of cases. Characteristically, patients exhibit one or more focal cutaneous CM that may occur alone or together with AVM, arteriovenous fistulas or lymphatic vessel abnormalities. The focal nature and varying presentation of lesions has led to the hypothesis that somatic “second hit” inactivating mutations of RASA1 are necessary for disease development. In this study, we examined CM from four different CM-AVM patients for the presence of somatically acquired RASA1 mutations. All four patients were shown to possess inactivating heterozygous germline RASA1 mutations. In one of the patients, a somatic inactivating RASA1 mutation (c.1534C > T, p.Arg512*) was additionally identified in CM lesion tissue. The somatic RASA1 mutation was detected within endothelial cells specifically and was in trans with the germline RASA1 mutation. Together with the germline RASA1 mutation (c.2125C > T, p.Arg709*) in the same patient, the endothelial cell somatic RASA1 mutation likely contributed to lesion development. These studies provide the first clear evidence of the second hit model of CM-AVM pathogenesis. Copyright © 2017 Elsevier Masson SAS. All rights reserved.
Nuclear and mitochondrial genomes of the hybrid fungal plant pathogen Verticillium longisporum display a mosaic structure
Allopolyploidization, genome duplication through interspecific hybridization, is an important evolutionary mechanism that can enable organisms to adapt to environmental changes or stresses. This increased adaptive potential of allopolyploids can be particularly relevant for plant pathogens in their quest for host immune response evasion. Allodiploidization likely caused the shift in host range of the fungal plant pathogen Verticillium longisporum, as V. longisporum mainly infects Brassicaceae plants in contrast to haploid Verticillium spp. In this study, we investigated the allodiploid genome structure of V. longisporum and its evolution in the hybridization aftermath. The nuclear genome of V. longisporum displays a mosaic structure, as numerous contigs consist of sections of both parental origins. V. longisporum encountered extensive genome rearrangements, whereas the contribution of gene conversion is negligible. Thus, the mosaic genome structure mainly resulted from genomic rearrangements between parental chromosome sets. Furthermore, a mosaic structure was also found in the mitochondrial genome, demonstrating its bi-parental inheritance. In conclusion, the nuclear and mitochondrial genomes of V. longisporum parents interacted dynamically in the hybridization aftermath. Conceivably, novel combinations of DNA sequence of different parental origin facilitated genome stability after hybridization and consecutive niche adaptation of V. longisporum.
Complete genome sequence and analysis of the industrial Saccharomyces cerevisiae strain N85 used in Chinese rice wine production.
Chinese rice wine is a popular traditional alcoholic beverage in China, but its brewing processes have rarely been explored. We herein report the first gapless, near-finished genome sequence of the yeast strain Saccharomyces cerevisiae N85 for Chinese rice wine production. Several assembly methods were used to integrate Pacific Biosciences (PacBio) and Illumina sequencing data to achieve high-quality genome sequencing of the strain. The genome encodes more than 6,000 predicted proteins and 238 long non-coding RNAs, which are validated by RNA-sequencing data. Moreover, our annotation predicts 171 novel genes that are not present in the reference S288c genome. We also identified 65,902 single nucleotide polymorphisms and small indels, many of which are located within genic regions. Dozens of larger copy-number variations and translocations were detected, mainly enriched in the subtelomeres, suggesting these regions may be related to genomic evolution. This study will serve as a milestone in the study of Chinese rice wine and related beverages in China and in other countries. It will help to develop more scientific and modern fermentation processes for Chinese rice wine, and to explore metabolic pathways of desired and harmful components in Chinese rice wine to improve its taste and nutritional value. © The Author(s) 2018. Published by Oxford University Press on behalf of Kazusa DNA Research Institute.
DNA strand-exchange patterns associated with double-strand break-induced and spontaneous mitotic crossovers in Saccharomyces cerevisiae.
Mitotic recombination can result in loss of heterozygosity and chromosomal rearrangements that shape genome structure and initiate human disease. Engineered double-strand breaks (DSBs) are a potent initiator of recombination, but whether spontaneous events initiate with the breakage of one or both DNA strands remains unclear. In the current study, a crossover (CO)-specific assay was used to compare heteroduplex DNA (hetDNA) profiles, which reflect strand exchange intermediates, associated with DSB-induced versus spontaneous events in yeast. Most DSB-induced CO products had the two-sided hetDNA predicted by the canonical DSB repair model, with a switch in hetDNA position from one product to the other at the position of the break. Approximately 40% of COs, however, had hetDNA on only one side of the initiating break. This anomaly can be explained by a modified model in which there is frequent processing of an early invasion (D-loop) intermediate prior to extension of the invading end. Finally, hetDNA tracts exhibited complexities consistent with frequent expansion of the DSB into a gap, migration of strand-exchange junctions, and template switching during gap-filling reactions. hetDNA patterns in spontaneous COs isolated in either a wild-type background or in a background with elevated levels of reactive oxygen species (tsa1Δ mutant) were similar to those associated with the DSB-induced events, suggesting that DSBs are the major instigator of spontaneous mitotic recombination in yeast.
Kluyveromyces marxianus is traditionally associated with fermented dairy products, but can also be isolated from diverse non-dairy environments. Because of thermotolerance, rapid growth and other traits, many different strains are being developed for food and industrial applications but there is, as yet, little understanding of the genetic diversity or population genetics of this species. K. marxianus shows a high level of phenotypic variation but the only phenotype that has been clearly linked to a genetic polymorphism is lactose utilisation, which is controlled by variation in the LAC12 gene. The genomes of several strains have been sequenced in recent years and, in this study, we sequenced a further nine strains from different origins. Analysis of the single nucleotide polymorphisms (SNPs) in 14 strains was carried out to examine genome structure and genetic diversity. SNP diversity in K. marxianus is relatively high, with up to 3% DNA sequence divergence between alleles. It was found that the isolates include haploid, diploid, and triploid strains, as shown by both SNP analysis and flow cytometry. Diploids and triploids contain long genomic tracts showing loss of heterozygosity (LOH). All six isolates from dairy environments were diploid or triploid, whereas 6 out of 7 isolates from non-dairy environments were haploid. This also correlated with the presence of functional LAC12 alleles only in dairy haplotypes. The diploids were hybrids between a non-dairy and a dairy haplotype, whereas triploids included three copies of a dairy haplotype.
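Ploidy is commonly inferred from SNP data by looking at read-level allele fractions: a haploid shows essentially no heterozygous sites, a diploid shows heterozygous sites near 1/2, and a triploid shows them near 1/3 and 2/3. The sketch below illustrates that general idea only. It is not the analysis used in the study, and the function name, the thresholds `het_min` and `min_het_sites`, and the toy values are all hypothetical assumptions.

```python
from statistics import median


def guess_ploidy(alt_fractions, het_min=0.15, min_het_sites=3):
    """Crude ploidy guess from per-site alternate-allele read fractions.

    alt_fractions: fraction of reads supporting the alternate allele at
    each variant site (0.0 to 1.0). All thresholds are illustrative.
    """
    # Keep sites where both alleles are clearly present in the reads.
    het = [f for f in alt_fractions if het_min <= f <= 1 - het_min]
    if len(het) < min_het_sites:
        return "haploid-like"
    # Fold fractions onto [0, 0.5] so that 1/3 and 2/3 coincide.
    folded = [min(f, 1 - f) for f in het]
    # 5/12 is the midpoint between the triploid (1/3) and diploid (1/2) modes.
    return "diploid-like" if median(folded) > 5 / 12 else "triploid-like"


# Hypothetical toy allele fractions for three imaginary isolates.
print(guess_ploidy([0.0, 1.0, 1.0, 0.02]))
print(guess_ploidy([0.48, 0.52, 0.50, 0.47, 1.0]))
print(guess_ploidy([0.33, 0.66, 0.35, 0.68, 0.30]))
```

Long LOH tracts remove heterozygous sites from diploids and triploids, so a real analysis would work per genomic window and be checked against an independent measurement such as flow cytometry.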
Large-scale population genomic surveys are essential to explore the phenotypic diversity of natural populations. Here we report the whole-genome sequencing and phenotyping of 1,011 Saccharomyces cerevisiae isolates, which together provide an accurate evolutionary picture of the genomic variants that shape the species-wide phenotypic landscape of this yeast. Genomic analyses support a single ‘out-of-China’ origin for this species, followed by several independent domestication events. Although domesticated isolates exhibit high variation in ploidy, aneuploidy and genome content, genome evolution in wild isolates is mainly driven by the accumulation of single nucleotide polymorphisms. A common feature is the extensive loss of heterozygosity, which represents an essential source of inter-individual variation in this mainly asexual species. Most of the single nucleotide polymorphisms, including experimentally identified functional polymorphisms, are present at very low frequencies. The largest numbers of variants identified by genome-wide association are copy-number changes, which have a greater phenotypic effect than do single nucleotide polymorphisms. This resource will guide future population genomics and genotype-phenotype studies in this classic model system.
Bdelloid rotifers are a class of microscopic invertebrates that have existed for millions of years apparently without sex or meiosis. They inhabit a variety of temporary and permanent freshwater habitats globally, and many species are remarkably tolerant of desiccation. Bdelloids offer an opportunity to better understand the evolution of sex and recombination, but previous work has emphasised desiccation as the cause of several unusual genomic features in this group. Here, we present high-quality whole-genome sequences of 3 bdelloid species: Rotaria macrura and R. magnacalcarata, which are both desiccation intolerant, and Adineta ricciae, which is desiccation tolerant. In combination with the published assembly of A. vaga, which is also desiccation tolerant, we apply a comparative genomics approach to evaluate the potential effects of desiccation tolerance and asexuality on genome evolution in bdelloids. We find that ancestral tetraploidy is conserved among all 4 bdelloid species, but homologous divergence in obligately aquatic Rotaria genomes is unexpectedly low. This finding is contrary to current models regarding the role of desiccation in shaping bdelloid genomes. In addition, we find that homologous regions in A. ricciae are largely collinear and do not form palindromic repeats as observed in the published A. vaga assembly. Consequently, several features interpreted as genomic evidence for long-term ameiotic evolution are not general to all bdelloid species, even within the same genus. Finally, we substantiate previous findings of high levels of horizontally transferred nonmetazoan genes in both desiccating and nondesiccating bdelloid species and show that this unusual feature is not shared by other animal phyla, even those with desiccation-tolerant representatives. These comparisons call into question the proposed role of desiccation in mediating horizontal genetic transfer.
The Phytophthora cactorum genome provides insights into the adaptation to host defense compounds and fungicides.
Phytophthora cactorum is a homothallic oomycete pathogen, which has a wide host range and high capability to adapt to host defense compounds and fungicides. Here we report the 121.5 Mb genome assembly of P. cactorum using the third-generation single-molecule real-time (SMRT) sequencing technology. It is the second-largest genome sequenced so far in the genus Phytophthora and contains 27,981 protein-coding genes. Comparison with other Phytophthora genomes showed that P. cactorum had a closer relationship with P. parasitica, P. infestans and P. capsici. P. cactorum has similar gene families in the secondary metabolism and pathogenicity-related effector proteins compared with other oomycete species, but specific gene families associated with detoxification enzymes and carbohydrate-active enzymes (CAZymes) underwent expansion in P. cactorum. P. cactorum had a higher utilization and detoxification ability against ginsenosides, a group of defense compounds from Panax notoginseng, compared with the narrow-host-range pathogen P. sojae. The elevated expression levels of detoxification enzymes and hydrolase activity-associated genes after exposure to ginsenosides further supported that the high detoxification and utilization ability of P. cactorum plays a crucial role in the rapid adaptability of the pathogen to host plant defense compounds and fungicides.
Characterization of phenotypic variation and genome aberrations observed among Phytophthora ramorum isolates from diverse hosts.
Accumulating evidence suggests that genome plasticity allows filamentous plant pathogens to adapt to changing environments. Recently, the generalist plant pathogen Phytophthora ramorum has been documented to undergo irreversible phenotypic alterations accompanied by chromosomal aberrations when infecting trunks of mature oak trees (genus Quercus). In contrast, genomes and phenotypes of the pathogen derived from the foliage of California bay (Umbellularia californica) are usually stable. We define this phenomenon as host-induced phenotypic diversification (HIPD). P. ramorum also causes a severe foliar blight in some ornamental plants such as Rhododendron spp. and Viburnum spp., and isolates from these hosts occasionally show phenotypes resembling those from oak trunks that carry chromosomal aberrations. The aim of this study was to investigate variations in phenotypes and genomes of P. ramorum isolates from non-oak hosts and substrates to determine whether HIPD changes may be equivalent to those among isolates from oaks. We analyzed genomes of diverse non-oak isolates including those taken from foliage of Rhododendron and other ornamental plants, as well as from natural host species, soil, and water. Isolates recovered from artificially inoculated oak logs were also examined. We identified diverse chromosomal aberrations including copy neutral loss of heterozygosity (cnLOH) and aneuploidy in isolates from non-oak hosts. Most identified aberrations in non-oak hosts were also common among oak isolates; however, trisomy, a frequent type of chromosomal aberration in oak isolates, was not observed in isolates from Rhododendron. This work cross-examined phenotypic variation and chromosomal aberrations in P. ramorum isolates from oak and non-oak hosts and substrates. The results suggest that HIPD comparable to that occurring in oak hosts occurs in non-oak environments such as in Rhododendron leaves. Rhododendron leaves are more easily available than mature oak stems and thus can potentially serve as a model host for the investigation of HIPD, the newly described plant-pathogen interaction.
Phenotypic diversification by enhanced genome restructuring after induction of multiple DNA double-strand breaks.
DNA double-strand break (DSB)-mediated genome rearrangements are assumed to provide diverse raw genetic materials enabling accelerated adaptive evolution; however, the consequences of massive simultaneous DSB formation in cells and the resulting phenotypic impact remain unclear. Here, we establish an artificial genome-restructuring technology by conditionally introducing multiple genomic DSBs in vivo using a temperature-dependent endonuclease TaqI. Application in yeast and Arabidopsis thaliana generates strains with phenotypes, including improved ethanol production from xylose at higher temperature and increased plant biomass, that are stably inherited by offspring after multiple passages. High-throughput genome resequencing revealed that these strains harbor diverse rearrangements, including copy number variations, translocations in retrotransposons, and direct end-joinings at TaqI-cleavage sites. Furthermore, large-scale rearrangements occur frequently in diploid yeasts (28.1%) and tetraploid plants (46.3%), whereas haploid yeasts and diploid plants undergo minimal rearrangement. This genome-restructuring system (TAQing system) will enable rapid genome breeding and aid genome-evolution studies.
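TaqI recognizes the four-base palindrome TCGA and cuts between the T and the C, so candidate cleavage sites can be listed by a plain string scan. The following is a minimal sketch of that scan on a toy sequence, not the resequencing analysis described above; the function name and the example sequence are hypothetical.

```python
def taqi_cut_positions(seq):
    """Return 0-based indices of the first base after each TaqI cut.

    TaqI recognizes TCGA and cleaves after the T (T^CGA). The site is
    palindromic, so scanning one strand finds sites on both strands.
    """
    site = "TCGA"
    seq = seq.upper()
    cuts = []
    start = seq.find(site)
    while start != -1:
        cuts.append(start + 1)  # cut falls between T (start) and C (start + 1)
        start = seq.find(site, start + 1)
    return cuts


# Hypothetical toy sequence containing two TCGA sites.
print(taqi_cut_positions("AATCGATTTCGAA"))  # prints [3, 9]
```

Because a four-base site is expected about once every 256 bp in random sequence, a genome contains a very large number of candidate sites, which fits the abstract's emphasis on conditional, temperature-dependent activation of the enzyme.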