Reconstruction and annotation of transcripts, particularly for a species without reference genome, plays a critical role in gene discovery, investigation of genomic signatures, and genome annotation in the pre-genomic era. This study generated 33,330 full-length transcripts of diploid M. anguillicaudatus using PacBio SMRT Sequencing. A total of 6,918 gene families were identified with two or more isoforms, and 26,683 complete ORFs with an average length of 1,497?bp were detected. Totally, 1,208 high-confidence lncRNAs were identified, and most of these appeared to be precursor transcripts of miRNAs or snoRNAs. Phylogenetic tree of the Misgurnus species was inferred based on the 1,905 single copy orthologous genes. The tetraploid and diploid M. anguillicaudatus grouped into a clade, and M. bipartitus showed a closer relationship with the M. anguillicaudatus. The overall evolutionary rates of tetraploid M. anguillicaudatus were significantly higher than those of other Misgurnus species. Meanwhile, 28 positively selected genes were identified in M. anguillicaudatus clade. These positively selected genes may play critical roles in the adaptation to various habitat environments for M. anguillicaudatus. This study could facilitate further exploration of the genomic signatures of M. anguillicaudatus and provide potential insights into unveiling the evolutionary history of tetraploid loach.
Journal: Scientific reports