Species-specific, new, or “orphan” genes account for 10%-30% of eukaryotic genomes. Although initially considered to have limited function, an increasing number of orphan genes have been shown to provide important phenotypic innovation. How new genes acquire regulatory sequences for proper temporal and spatial expression is unknown. Orphan gene regulation may rely in part on origination in open chromatin adjacent to preexisting promoters, although this has not yet been assessed by genome-wide analysis of chromatin states. Here, we combine taxon-rich nematode phylogenies with Iso-Seq, RNA-seq, ChIP-seq, and ATAC-seq to identify the gene structure and epigenetic signature of orphan genes in the satellite model nematode Pristionchus pacificus. Consistent with previous findings, we find young genes are shorter, contain fewer exons, and are on average less strongly expressed than older genes. However, the subset of orphan genes that are expressed exhibit distinct chromatin states from similarly expressed conserved genes. Orphan gene transcription is determined by a lack of repressive histone modifications, confirming long-held hypotheses that open chromatin is important for new gene formation. Yet orphan gene start sites more closely resemble enhancers defined by H3K4me1, H3K27ac, and ATAC-seq peaks, in contrast to conserved genes that exhibit traditional promoters defined by H3K4me3 and H3K27ac. Although the majority of orphan genes are located on chromosome arms that contain high recombination rates and repressive histone marks, strongly expressed orphan genes are more randomly distributed. Our results support a model of new gene origination by rare integration into open chromatin near enhancers. © 2018 Werner et al.; Published by Cold Spring Harbor Laboratory Press.
Rose is the world’s most important ornamental plant, with economic, cultural and symbolic value. Roses are cultivated worldwide and sold as garden roses, cut flowers and potted plants. Roses are outbred and can have various ploidy levels. Our objectives were to develop a high-quality reference genome sequence for the genus Rosa by sequencing a doubled haploid, combining long and short reads, and anchoring to a high-density genetic map, and to study the genome structure and genetic basis of major ornamental traits. We produced a doubled haploid rose line (‘HapOB’) from Rosa chinensis ‘Old Blush’ and generated a rose genome assembly anchored to seven pseudo-chromosomes (512 Mb with N50 of 3.4 Mb and 564 contigs). The length of 512 Mb represents 90.1-96.1% of the estimated haploid genome size of rose. Of the assembly, 95% is contained in only 196 contigs. The anchoring was validated using high-density diploid and tetraploid genetic maps. We delineated hallmark chromosomal features, including the pericentromeric regions, through annotation of transposable element families and positioned centromeric repeats using fluorescent in situ hybridization. The rose genome displays extensive synteny with the Fragaria vesca genome, and we delineated only two major rearrangements. Genetic diversity was analysed using resequencing data of seven diploid and one tetraploid Rosa species selected from various sections of the genus. Combining genetic and genomic approaches, we identified potential genetic regulators of key ornamental traits, including prickle density and the number of flower petals. A rose APETALA2/TOE homologue is proposed to be the major regulator of petal number in rose. This reference sequence is an important resource for studying polyploidization, meiosis and developmental processes, as we demonstrated for flower and prickle development. It will also accelerate breeding through the development of molecular markers linked to traits, the identification of the genes underlying them and the exploitation of synteny across Rosaceae.
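The contiguity figures quoted in the rose abstract (an N50 value, and the number of contigs holding 95% of the assembly) follow from a simple cumulative sum over contig lengths sorted from largest to smallest. The sketch below illustrates that arithmetic in Python under stated assumptions: the function names `nx_length` and `contigs_needed` are hypothetical, the contig lengths are toy values rather than data from the rose assembly, and this is not the authors' pipeline.

```python
def nx_length(contig_lengths, fraction=0.5):
    """Return the Nx length: the length of the contig at which the
    cumulative sum of contigs, sorted longest first, reaches `fraction`
    of the total assembly length (fraction=0.5 gives N50)."""
    if not contig_lengths:
        raise ValueError("contig_lengths must not be empty")
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")
    target = fraction * sum(contig_lengths)
    running = 0
    for length in sorted(contig_lengths, reverse=True):
        running += length
        if running >= target:
            return length


def contigs_needed(contig_lengths, fraction):
    """Return how many of the longest contigs are needed to cover
    `fraction` of the total assembly length."""
    if not contig_lengths:
        raise ValueError("contig_lengths must not be empty")
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")
    target = fraction * sum(contig_lengths)
    running = 0
    for count, length in enumerate(sorted(contig_lengths, reverse=True), start=1):
        running += length
        if running >= target:
            return count


# Toy contig lengths (arbitrary units), NOT taken from the rose assembly.
toy_lengths = [8, 5, 4, 2, 1]
toy_n50 = nx_length(toy_lengths)             # half of 20 is reached at the 5-unit contig
toy_count = contigs_needed(toy_lengths, 0.95)  # 95% of 20 needs the 4 longest contigs
```

Applied to a real list of contig lengths, the same two calls would yield statistics of the kind reported in the abstract, although assembly tools may differ in details such as whether scaffolds or contigs are counted.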
Whole-genome resequencing and pan-transcriptome reconstruction highlight the impact of genomic structural variation on secondary metabolite gene clusters in the grapevine Esca pathogen Phaeoacremonium minimum.
The Ascomycete fungus Phaeoacremonium minimum is one of the primary causal agents of Esca, a widespread and damaging grapevine trunk disease. Variation in virulence among Pm. minimum isolates has been reported, but the underlying genetic basis of the phenotypic variability remains unknown. The goal of this study was to characterize intraspecific genetic diversity and explore its potential impact on virulence functions associated with secondary metabolism, cellular transport, and cell wall decomposition. We generated a chromosome-scale genome assembly, using single molecule real-time sequencing, and resequenced the genomes and transcriptomes of multiple isolates to identify sequence and structural polymorphisms. Numerous insertion and deletion events were found, totaling about 1 Mbp in each isolate. Structural variation in this extremely gene-dense genome frequently caused presence/absence polymorphisms of multiple adjacent genes, mostly belonging to biosynthetic clusters associated with secondary metabolism. Because of the observed intraspecific diversity in gene content due to structural variation, we concluded that a transcriptome reference developed from a single isolate is insufficient to represent the virulence factor repertoire of the species. We therefore compiled a pan-transcriptome reference of Pm. minimum comprising a non-redundant set of 15,245 protein-coding sequences. Using naturally infected field samples expressing Esca symptoms, we demonstrated that mapping of meta-transcriptomics data on a multi-species reference that included the Pm. minimum pan-transcriptome allows the profiling of an expanded set of virulence factors, including variable genes associated with secondary metabolism and cellular transport.
Genomic characterization reveals significant divergence within Chlorella sorokiniana (Chlorellales, Trebouxiophyceae)
Selection of highly productive algal strains is crucial for establishing economically viable biomass and bioproduct cultivation systems. Characterization of algal genomes, including understanding strain-specific differences in genome content and architecture, is a critical step in this process. Using genomic analyses, we demonstrate significant differences between three strains of Chlorella sorokiniana (strain 1228, UTEX 1230, and DOE1412). We found that unique, strain-specific genes comprise a substantial proportion of each genome, and genomic regions with >80% local nucleotide identity constitute <15% of each genome among the strains, indicating substantial strain-specific evolution. Furthermore, cataloging of meiosis and other sex-related genes in C. sorokiniana strains suggests strategic breeding could be utilized to improve biomass and bioproduct yields if a sexual cycle can be characterized. Finally, preliminary investigation of epigenetic machinery suggests the presence of potentially unique transcriptional regulation in each strain. Our data demonstrate that these three C. sorokiniana strains represent significantly different genomic content. Based on these findings, we propose individualized assessment of each strain for potential performance in cultivation systems.
Genome sequences of two diploid wild relatives of cultivated sweetpotato reveal targets for genetic improvement
Sweetpotato [Ipomoea batatas (L.) Lam.] is a globally important staple food crop, especially for sub-Saharan Africa. Agronomic improvement of sweetpotato has lagged behind other major food crops due to a lack of genomic and genetic resources and inherent challenges in breeding a heterozygous, clonally propagated polyploid. Here, we report the genome sequences of its two diploid relatives, I. trifida and I. triloba, and show that these high-quality genome assemblies are robust references for hexaploid sweetpotato. Comparative and phylogenetic analyses reveal insights into the ancient whole-genome triplication history of Ipomoea and evolutionary relationships within the Batatas complex. Using resequencing data from 16 genotypes widely used in African breeding programs, genes and alleles associated with carotenoid biosynthesis in storage roots are identified, which may enable efficient breeding of varieties with high provitamin A content. These resources will facilitate genome-enabled breeding in this important food security crop.