The seed is the pharmaceutical and breeding organ of Cassia obtusifolia, a well-known medical herb containing aurantio-obtusin (a kind of anthraquinone), food, and landscape. In order to understand the molecular mechanism of the biosynthesis of aurantio-obtusin, seed formation and development, and stress response of C. obtusifolia, it is necessary to understand the genomics information. Although previous seed transcriptome of C. obtusifolia has been carried out by short-read next-generation sequencing (NGS) technology, the vast majority of the resulting unigenes did not represent full-length cDNA sequences and supply enough gene expression profile information of the various organs or tissues. In this study, fifteen cDNA libraries, which were constructed from the seed, root, stem, leaf, and flower (three repetitions with each organ) of C. obtusifolia, were sequenced using hybrid approach combining single-molecule real-time (SMRT) and NGS platform. More than 4,315,774 long reads with 9.66 Gb sequencing data and 361,427,021 short reads with 108.13 Gb sequencing data were generated by SMRT and NGS platform, respectively. 67,222 consensus isoforms were clustered from the reads and 81.73% (61,016) of which were longer than 1000 bp. Furthermore, the 67,222 consensus isoforms represented 58,106 nonredundant transcripts, 98.25% (57,092) of which were annotated and 25,573 of which were assigned to specific metabolic pathways by KEGG. CoDXS and CoDXR genes were directly used for functional characterization to validate the accuracy of sequences obtained from transcriptome. A total of 658 seed-specific transcripts indicated their special roles in physiological processes in seed. Analysis of transcripts which were involved in the early stage of anthraquinone biosynthesis suggested that the aurantio-obtusin in C. obtusifolia was mainly generated from isochorismate and Mevalonate/methylerythritol phosphate (MVA/MEP) pathway, and three reactions catalyzed by Menaquinone-specific isochorismate synthase (ICS), 1-deoxy-d-xylulose-5-phosphate synthase (DXS) and isopentenyl diphosphate (IPPS) might be the limited steps. Several seed-specific CYPs, SAM-dependent methyltransferase, and UDP-glycosyltransferase (UDPG) supplied promising candidate genes in the late stage of anthraquinone biosynthesis. In addition, four seed-specific transcriptional factors including three MYB Transcription Factor (MYB) and one MADS-box Transcription Factor (MADS) transcriptional factors) and alternative splicing might be involved with seed formation and development. Meanwhile, most members of Hsp20 genes showed high expression level in seed and flower; seven of which might have chaperon activities under various abiotic stresses. Finally, the expressional patterns of genes with particular interests showed similar trends in both transcriptome assay and qRT-PCR. In conclusion, this is the first full-length transcriptome sequencing reported in Caesalpiniaceae family, and thus providing a more complete insight into aurantio-obtusin biosynthesis, seed formation and development, and stress response as well in C. obtusifolia.
Journal: International journal of molecular sciences