Cauliflower is an important variety of Brassica oleracea and is planted worldwide. Here, we report a high-quality genome sequence of cauliflower. The assembled cauliflower genome was 584.60 Mb in size, with a contig N50 of 2.11 Mb, and contained 47,772 genes; 56.65% of the genome was composed of repetitive sequences. Among these sequences, long terminal repeats (LTRs) were the most abundant (32.71% of the genome), followed by transposable elements (TEs) (12.62%). Comparative genomic analysis confirmed that after an ancient paleohexaploidy (γ) event, cauliflower underwent two whole-genome duplication (WGD) events shared with Arabidopsis and an additional whole-genome triplication (WGT) event shared with other Brassica species. The present cultivated cauliflower diverged from the ancestral B. oleracea species ~3.0 million years ago (Mya). The speciation of cauliflower (~2.0 Mya) was later than that of B. oleracea L. var. capitata (approximately 2.6 Mya) and other Brassica species (over 2.0 Mya). Chromosome 03 of cauliflower shared the most syntenic blocks with the A, B, and C genomes of Brassica species and with the eight other cauliflower chromosomes, implying that chromosome 03 might be the most ancient one in the cauliflower genome, consistent with its inheritance from the common ancestor of Brassica species. In addition, 2,718 specific genes, 228 expanded genes, 2 contracted genes, and 1,065 positively selected genes were identified in cauliflower and functionally annotated. These findings provide new insights into the genomic diversity of Brassica species and serve as a valuable reference for molecular breeding of cauliflower.
Journal: Horticulture Research