Domestication of clonally propagated crops such as pineapple from South America was hypothesized to be a ‘one-step operation’. We sequenced the genome of Ananas comosus var. bracteatus CB5 and assembled 513 Mb into 25 chromosomes with 29,412 genes. Comparison of the genomes of CB5, F153 and MD2 elucidated the genomic basis of fiber production, color formation, sugar accumulation and fruit maturation. We also resequenced 89 Ananas genomes. Cultivars ‘Smooth Cayenne’ and ‘Queen’ exhibited ancient and recent admixture, while ‘Singapore Spanish’ supported a one-step operation of domestication. We identified 25 selective sweeps, including a strong sweep containing a pair of tandemly duplicated bromelain inhibitors. Four candidate genes for self-incompatibility were linked in F153, but were not functional in self-compatible CB5. Our findings support the coexistence of sexual recombination and a one-step operation in the domestication of clonally propagated crops. This work guides the exploration of sexual and asexual domestication trajectories in other clonally propagated crops.
Forest tree species are increasingly subject to severe mortalities from exotic pests, diseases, and invasive organisms, accelerated by climate change. Forest health issues are threatening multiple species and ecosystem sustainability globally. While sources of resistance may be available in related species, or among surviving trees, introgression of resistance genes into threatened tree species in reasonable time frames requires genome-wide breeding tools. Asian species of chestnut (Castanea spp.) are being employed as donors of disease resistance genes to restore native chestnut species in North America and Europe. To aid in the restoration of threatened chestnut species, we present the assembly of a reference genome with chromosome-scale sequences for Chinese chestnut (C. mollissima), the disease-resistance donor for American chestnut restoration. We also demonstrate the value of the genome as a platform for research and species restoration, including new insights into the evolution of blight resistance in Asian chestnut species, the locations in the genome of ecologically important signatures of selection differentiating American chestnut from Chinese chestnut, the identification of candidate genes for disease resistance, and preliminary comparisons of genome organization with related species.
Brassica napus (AACC, 2n = 38) is an important oilseed crop grown worldwide. However, little is known about the population evolution of this species, the genomic difference between its major genetic groups, such as European and Asian rapeseed, and the impacts of historical large-scale introgression events on this young tetraploid. In this study, we reported the de novo assembly of the genome sequences of an Asian rapeseed (B. napus), Ningyou 7, and its four progenitors and compared these genomes with other available genomic data from diverse European and Asian cultivars. Our results showed that Asian rapeseed originally derived from European rapeseed but subsequently significantly diverged, with rapid genome differentiation after hybridization and intensive local selective breeding. The first historical introgression of B. rapa dramatically broadened the allelic pool but decreased the deleterious variations of Asian rapeseed. The second historical introgression of the double-low traits of European rapeseed (canola) has reshaped Asian rapeseed into two groups (double-low and double-high), accompanied by an increase in genetic load in the double-low group. This study demonstrates distinctive genomic footprints and deleterious SNP (single nucleotide polymorphism) variants for local adaptation by recent intra- and interspecies introgression events and provides novel insights for understanding the rapid genome evolution of a young allopolyploid crop. © 2019 The Authors. Plant Biotechnology Journal published by Society for Experimental Biology and The Association of Applied Biologists and John Wiley & Sons Ltd.
Suppressed recombination allows divergence between homologous sex chromosomes and the functionality of their genes. Here, we reveal patterns of the earliest stages of sex-chromosome evolution in the diploid dioecious herb Mercurialis annua on the basis of cytological analysis, de novo genome assembly and annotation, genetic mapping, exome resequencing of natural populations, and transcriptome analysis. The genome assembly contained 34,105 expressed genes, of which 10,076 were assigned to linkage groups. Genetic mapping and exome resequencing of individuals across the species range both identified the largest linkage group, LG1, as the sex chromosome. Although the sex chromosomes of M. annua are karyotypically homomorphic, we estimate that about a third of the Y chromosome has ceased recombining, containing 568 transcripts and spanning 22.3 cM in the corresponding female map. Nevertheless, we found limited evidence for Y-chromosome degeneration in terms of gene loss and pseudogenization, and most X- and Y-linked genes appear to have diverged in the period subsequent to speciation between M. annua and its sister species M. huetii, which shares the same sex-determining region. Taken together, our results suggest that the M. annua Y chromosome has at least two evolutionary strata: a small old stratum shared with M. huetii, and a more recent larger stratum that is probably unique to M. annua and that stopped recombining about one million years ago. Patterns of gene expression within the non-recombining region are consistent with the idea that sexually antagonistic selection may have played a role in favoring suppressed recombination. Copyright © 2019, Genetics.
African cichlid fishes are well known for their rapid radiations and are a model system for studying evolutionary processes. Here we compare multiple, high-quality, chromosome-scale genome assemblies to elucidate the genetic mechanisms underlying cichlid diversification and study how genome structure evolves in rapidly radiating lineages. We re-anchored our recent assembly of the Nile tilapia (Oreochromis niloticus) genome using a new high-density genetic map. We also developed a new de novo genome assembly of the Lake Malawi cichlid, Metriaclima zebra, using high-coverage Pacific Biosciences sequencing, and anchored contigs to linkage groups (LGs) using 4 different genetic maps. These new anchored assemblies allow the first chromosome-scale comparisons of African cichlid genomes. Large intra-chromosomal structural differences (~2-28 megabase pairs) among species are common, while inter-chromosomal differences are rare (<10 megabase pairs total). Placement of the centromeres within the chromosome-scale assemblies identifies large structural differences that explain many of the karyotype differences among species. Structural differences are also associated with unique patterns of recombination on sex chromosomes. Structural differences on LG9, LG11, and LG20 are associated with reduced recombination, indicative of inversions between the rock- and sand-dwelling clades of Lake Malawi cichlids. M. zebra has a larger number of recent transposable element insertions compared with O. niloticus, suggesting that several transposable element families have a higher rate of insertion in the haplochromine cichlid lineage. This study identifies novel structural variation among East African cichlid genomes and provides a new set of genomic resources to support research on the mechanisms driving cichlid adaptation and speciation. © The Author(s) 2019. Published by Oxford University Press.
Newly emerged wheat blast disease is a serious threat to global wheat production. Wheat blast is caused by a distinct, exceptionally diverse lineage of the fungus causing rice blast disease. Through sequencing a recent field isolate, we report a reference genome that includes seven core chromosomes and mini-chromosome sequences that harbor effector genes normally found on ends of core chromosomes in other strains. No mini-chromosomes were observed in an early field strain, and at least two from another isolate each contain different effector genes and core chromosome end sequences. The mini-chromosome is enriched in transposons occurring most frequently at core chromosome ends. Additionally, transposons in mini-chromosomes lack the characteristic signature for inactivation by repeat-induced point (RIP) mutation genome defenses. Our results, collectively, indicate that dispensable mini-chromosomes and core chromosomes undergo divergent evolutionary trajectories, and mini-chromosomes and core chromosome ends are coupled as a mobile, fast-evolving effector compartment in the wheat pathogen genome.
A whole genome scan of SNP data suggests a lack of abundant hard selective sweeps in the genome of the broad host range plant pathogenic fungus Sclerotinia sclerotiorum.
The pathogenic fungus Sclerotinia sclerotiorum infects over 600 species of plant. It is present in numerous environments throughout the world and causes significant damage to many agricultural crops. Fragmentation and lack of gene flow between populations may lead to population sub-structure. Within discrete recombining populations, positive selection may lead to a ‘selective sweep’. This is characterised by an increase in frequency of a favourable allele, leading to a reduction in genotypic diversity in a localised genomic region due to the phenomenon of genetic hitchhiking. We aimed to assess whether isolates of S. sclerotiorum from around the world formed genotypic clusters associated with geographical origin and to determine whether signatures of population-specific positive selection could be detected. To do this, we sequenced the genomes of 25 isolates of S. sclerotiorum collected from four different continents: Australia, Africa (north and south), Europe and North America (Canada and the northern United States). We then conducted SNP-based analyses of population structure and selective sweeps. Among the 25 isolates, there was evidence for two major population clusters. One of these consisted of 11 isolates from Canada, the USA and France (population 1), and the other consisted of nine isolates from Australia and one from Morocco (population 2). The rest of the isolates were genotypic outliers. We found that there was evidence of outcrossing in these two populations based on linkage disequilibrium decay. However, only a single candidate selective sweep was observed, and it was present in population 2. This sweep was close to a Major Facilitator Superfamily transporter gene, and we speculate that this gene may have a role in nutrient uptake from the host. The low abundance of selective sweeps in the S. sclerotiorum genome contrasts with the numerous examples in the genomes of other fungal pathogens. This may be a result of its slow rate of evolution and low effective recombination rate due to self-fertilisation and vegetative reproduction.
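The linkage disequilibrium decay mentioned in the abstract above is usually built from pairwise r² values between SNPs. The following is a minimal sketch of that statistic only, assuming phased haplotypes coded as 0/1 at two biallelic sites; the function name `ld_r2` and the input format are illustrative assumptions, not the authors' pipeline.

```python
def ld_r2(hap_a, hap_b):
    """Return r^2 between two biallelic loci from phased 0/1 haplotypes.

    hap_a and hap_b hold one allele per haplotype, in the same sample order.
    (Hypothetical helper for illustration only.)
    """
    if len(hap_a) != len(hap_b) or not hap_a:
        raise ValueError("haplotype lists must be non-empty and of equal length")
    n = len(hap_a)
    p_a = sum(hap_a) / n                      # frequency of allele 1 at locus A
    p_b = sum(hap_b) / n                      # frequency of allele 1 at locus B
    p_ab = sum(1 for a, b in zip(hap_a, hap_b) if a == 1 and b == 1) / n
    d = p_ab - p_a * p_b                      # disequilibrium coefficient D
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        raise ValueError("both loci must be polymorphic")
    return d * d / denom
```

Plotting such r² values against the physical distance between SNP pairs gives a decay curve; a faster drop is commonly read as evidence of more effective recombination.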
In order to provide a comprehensive resource for human structural variants (SVs), we generated long-read sequence data and analyzed SVs for fifteen human genomes. We sequence resolved 99,604 insertions, deletions, and inversions including 2,238 (1.6 Mbp) that are shared among all discovery genomes with an additional 13,053 (6.9 Mbp) present in the majority, indicating minor alleles or errors in the reference. Genotyping in 440 additional genomes confirms the most common SVs in unique euchromatin are now sequence resolved. We report a ninefold SV bias toward the last 5 Mbp of human chromosomes with nearly 55% of all VNTRs (variable number of tandem repeats) mapping to this portion of the genome. We identify SVs affecting coding and noncoding regulatory loci improving annotation and interpretation of functional variation. These data provide the framework to construct a canonical human reference and a resource for developing advanced representations capable of capturing allelic diversity. Copyright © 2018 Elsevier Inc. All rights reserved.
Recombination between loci underlying mate choice and ecological traits is a major evolutionary force acting against speciation with gene flow. The evolution of linkage disequilibrium between such loci is therefore a fundamental step in the origin of species. Here, we show that this process can take place in the absence of physical linkage in hamlets, a group of closely related reef fishes from the wider Caribbean that differ essentially in colour pattern and are reproductively isolated through strong visually-based assortative mating. Using full-genome analysis, we identify four narrow genomic intervals that are consistently differentiated among sympatric species in a backdrop of extremely low genomic divergence. These four intervals include genes involved in pigmentation (sox10), axial patterning (hoxc13a), photoreceptor development (casz1) and visual sensitivity (SWS and LWS opsins) that develop islands of long-distance and inter-chromosomal linkage disequilibrium as species diverge. The relatively simple genomic architecture of species differences facilitates the evolution of linkage disequilibrium in the presence of gene flow.
Our understanding of sequence variation in the HLA-DPB1 gene is largely restricted to the hypervariable antigen recognition domain (ARD) encoded by exon 2. Here, we employed a redundant sequencing strategy combining long-read and short-read data to accurately phase and characterise in full length the majority of common and well-documented (CWD) DPB1 alleles as well as alleles with an observed frequency of at least 0.0006% in our predominantly European sample set. We generated 664 DPB1 sequences, comprising 279 distinct allelic variants. This allows us to present the most comprehensive analysis to date of the nature and extent of DPB1 sequence variation. The full-length sequence analysis revealed the existence of two highly diverged allele clades. These clades correlate with the rs9277534 A→G variant, a known expression marker located in the 3′-UTR. The two clades are fully differentiated by 174 fixed polymorphisms throughout a 3.6 kb stretch at the 3′-end of DPB1. The region upstream of this differentiation zone is characterised by increasingly shared variation between the clades. The low-expression A clade comprises 59% of the distinct allelic sequences, including the three by far most frequent DPB1 alleles, DPB1*04:01, DPB1*02:01 and DPB1*04:02. Alleles in the A clade show reduced nucleotide diversity with an excess of rare variants when compared to the high-expression G clade. This pattern is consistent with a scenario of recent proliferation of A-clade alleles. The full-length characterisation of all but the rarest DPB1 alleles will benefit the application of NGS for DPB1 genotyping and provides a helpful framework for a deeper understanding of high- and low-expression alleles and their implications in the context of unrelated haematopoietic stem-cell transplantation. Copyright © 2018 The Authors. Published by Elsevier Inc. All rights reserved.
Genetic map-guided genome assembly reveals a virulence-governing minichromosome in the lentil anthracnose pathogen Colletotrichum lentis.
Colletotrichum lentis causes anthracnose, which is a serious disease on lentil and can account for up to 70% crop loss. Two pathogenic races, 0 and 1, have been described in the C. lentis population from lentil. To unravel the genetic control of virulence, an isolate of the virulent race 0 was sequenced at 1481-fold genomic coverage. The 56.10-Mb genome assembly consists of 50 scaffolds with N50 scaffold length of 4.89 Mb. A total of 11 436 protein-coding gene models was predicted in the genome with 237 coding candidate effectors, 43 secondary metabolite biosynthetic enzymes and 229 carbohydrate-active enzymes (CAZymes), suggesting a contraction of the virulence gene repertoire in C. lentis. Scaffolds were assigned to 10 core and two minichromosomes using a population (race 0 × race 1, n = 94 progeny isolates) sequencing-based, high-density (14 312 single nucleotide polymorphisms) genetic map. Composite interval mapping revealed a single quantitative trait locus (QTL), qClVIR-11, located on minichromosome 11, explaining 85% of the variability in virulence of the C. lentis population. The QTL covers a physical distance of 0.84 Mb with 98 genes, including seven candidate effector and two secondary metabolite genes. Taken together, the study provides genetic and physical evidence for the existence of a minichromosome controlling the C. lentis virulence on lentil. © 2018 The Authors. New Phytologist © 2018 New Phytologist Trust.
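The abstract above reports an N50 scaffold length of 4.89 Mb. As a reminder of what that statistic means, here is a minimal sketch of the standard N50 definition: the length of the scaffold at which the running total, taken from longest to shortest, first reaches half of the assembly size. The function name `n50` and the plain list-of-lengths input are illustrative assumptions, not part of the study.

```python
def n50(lengths):
    """Return the N50 of a collection of scaffold (or contig) lengths.

    Hypothetical helper for illustration only.
    """
    if not lengths:
        raise ValueError("lengths must be non-empty")
    ordered = sorted(lengths, reverse=True)   # longest scaffolds first
    half_total = sum(ordered) / 2             # half of the assembly size
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length
```

For example, scaffolds of lengths 6, 5, 4, 3 and 2 total 20, and the two longest already cover 11, so the N50 is 5.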
A 12-kb structural variation in progressive myoclonic epilepsy was newly identified by long-read whole-genome sequencing.
We report a family with progressive myoclonic epilepsy who underwent whole-exome sequencing but was negative for pathogenic variants. Similar clinical courses of a devastating neurodegenerative phenotype of two affected siblings were highly suggestive of a genetic etiology, which indicates that the survey of genetic variation by whole-exome sequencing was not comprehensive. To investigate the presence of a variant that remained unrecognized by standard genetic testing, PacBio long-read sequencing was performed. Structural variant (SV) detection using low-coverage (6×) whole-genome sequencing called 17,165 SVs (7,216 deletions and 9,949 insertions). Our SV selection narrowed down potential candidates to only five SVs (two deletions and three insertions) on the genes tagged with autosomal recessive phenotypes. Among them, a 12.4-kb deletion involving the CLN6 gene was the top candidate because its homozygous abnormalities cause neuronal ceroid lipofuscinosis. This deletion included the initiation codon and was found in a GC-rich region containing multiple repetitive elements. These results indicate the presence of a causal variant in a difficult-to-sequence region and suggest that such variants that remain enigmatic after the application of current whole-exome sequencing technology could be uncovered by unbiased application of long-read whole-genome sequencing.
The human disease lymphatic filariasis causes the debilitating effects of elephantiasis and hydrocele. Lymphatic filariasis currently affects the lives of 90 million people in 52 countries. There are three nematodes that cause lymphatic filariasis, Brugia malayi, Brugia timori, and Wuchereria bancrofti, but 90% of all cases of lymphatic filariasis are caused solely by W. bancrofti (Wb). Here we use population genomics to reconstruct the probable route and timing of migration of Wb strains that currently infect Africa, Haiti, and Papua New Guinea (PNG). We used selective whole genome amplification to sequence 42 whole genomes of single Wb worms from populations in Haiti, Mali, Kenya, and PNG. Our results are consistent with a hypothesis of an Island Southeast Asia or East Asian origin of Wb. Our demographic models support divergence times that correlate with the migration of human populations. We hypothesize that PNG was infected at two separate times, first by the Melanesians and later by the migrating Austronesians. The migrating Austronesians also likely introduced Wb to Madagascar where later migrations spread it to continental Africa. From Africa, Wb spread to the New World during the transatlantic slave trade. Genome scans identified 17 genes that were highly differentiated among Wb populations. Among these are genes associated with human immune suppression, insecticide sensitivity, and proposed drug targets. Identifying the distribution of genetic diversity in Wb populations and selection forces acting on the genome will build a foundation to test future hypotheses and help predict response to current eradication efforts. © The Author(s) 2019. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution. All rights reserved.
Genome assembly of a tropical maize inbred line provides insights into structural variation and crop improvement.
Maize is one of the most important crops globally, and it shows remarkable genetic diversity. Knowledge of this diversity could help in crop improvement; however, gold-standard genomes have been elucidated only for modern temperate varieties. Here, we present a high-quality reference genome (contig N50 of 15.78 megabases) of the maize small-kernel (SK) inbred line, which is derived from a tropical landrace. Using haplotype maps derived from B73, Mo17 and SK, we identified 80,614 polymorphic structural variants across 521 diverse lines. Approximately 22% of these variants could not be detected by traditional single-nucleotide-polymorphism-based approaches, and some of them could affect gene expression and trait performance. To illustrate the utility of the diverse SK line, we used it to perform map-based cloning of a major effect quantitative trait locus controlling kernel weight, a key trait selected during maize improvement. The underlying candidate gene ZmBARELY ANY MERISTEM1d provides a target for increasing crop yields.
Adaptation and Phenotypic Diversification in Arabidopsis through Loss-of-Function Mutations in Protein-Coding Genes.
According to the less-is-more hypothesis, gene loss is an engine for evolutionary change. Loss-of-function (LoF) mutations resulting in the natural knockout of protein-coding genes not only provide information about gene function but also play important roles in adaptation and phenotypic diversification. Although the less-is-more hypothesis was proposed two decades ago, it remains to be explored on a large scale. In this study, we identified 60,819 LoF variants in 1071 Arabidopsis (Arabidopsis thaliana) genomes and found that 34% of Arabidopsis protein-coding genes annotated in the Columbia-0 genome do not have any LoF variants. We found that nucleotide diversity, transposable element density, and gene family size are strongly correlated with the presence of LoF variants. Intriguingly, 0.9% of LoF variants with minor allele frequency larger than 0.5% are associated with climate change. In addition, in the Yangtze River basin population, 1% of genes with LoF mutations were under positive selection, providing important insights into the contribution of LoF mutations to adaptation. In particular, our results demonstrate that LoF mutations shape diverse phenotypic traits. Overall, our results highlight the importance of the LoF variants for the adaptation and phenotypic diversification of plants. © 2019 American Society of Plant Biologists. All rights reserved.
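The abstract above filters LoF variants by a minor allele frequency (MAF) larger than 0.5%. A minimal sketch of such a filter follows, assuming each variant is summarised by an alternate-allele count and the total number of called alleles. The function names and the default threshold argument are illustrative assumptions, not the authors' code.

```python
def minor_allele_frequency(alt_count, total_alleles):
    """Return the frequency of the rarer allele at a biallelic site.

    Hypothetical helper for illustration only.
    """
    if total_alleles <= 0 or not 0 <= alt_count <= total_alleles:
        raise ValueError("counts must satisfy 0 <= alt_count <= total_alleles, total > 0")
    freq = alt_count / total_alleles
    return min(freq, 1 - freq)                # the minor allele may be ref or alt


def passes_maf(alt_count, total_alleles, threshold=0.005):
    """Return True if the variant's MAF is strictly larger than the threshold (0.5% by default)."""
    return minor_allele_frequency(alt_count, total_alleles) > threshold
```

With 1,000 called alleles, a variant seen once (MAF 0.1%) would be dropped by this filter, while one seen ten times (MAF 1%) would be kept.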