The lakes on the Qinghai-Tibet Plateau (QTP) form the largest and highest lake group in the world. Gymnocypris selincuoensis is the only cyprinid fish living in Lake Selincuo, the largest lake on the QTP. However, genetic resources for this species are still lacking, which limits molecular and genetic studies. In this study, the first transcriptome of G. selincuoensis was generated using PacBio Iso-Seq and Illumina RNA-seq. Iso-Seq yielded a full-length (FL) transcriptome of 75,435 transcripts with an N50 length of 3,870 bp. Of these transcripts, 75,016 were annotated against public databases, 64,710 contained complete open reading frames, and 2,811 were long non-coding RNAs. Based on all-vs.-all BLAST, 2,069 alternative splicing events were detected, and 80% of them were validated by reverse transcription polymerase chain reaction (RT-PCR). A tissue gene expression atlas showed that the number of expressed transcripts detected ranged from 37,397 in brain to 19,914 in muscle, with 10,488 transcripts detected in all seven tissues. Comparative genomic analysis with other cyprinid fishes identified 77 orthologous genes under potential positive selection (Ka/Ks > 0.3). A total of 56,696 perfect simple sequence repeats were identified in the FL transcripts. Our results provide valuable genetic resources for further studies on adaptive evolution, gene expression and population genetics in G. selincuoensis and other congeneric fishes. © The Author(s) 2019. Published by Oxford University Press on behalf of Kazusa DNA Research Institute.
Journal: DNA Research
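The N50 length quoted in the abstract is a length-weighted summary statistic: the length of the shortest transcript in the smallest set of longest transcripts that together cover at least half of all bases. The following is a minimal Python sketch of that standard definition. The function name `n50` and the toy lengths are hypothetical and are not taken from the study's data or pipeline.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths (standard definition)."""
    if not lengths:
        raise ValueError("lengths must be non-empty")
    # Walk from the longest sequence down, accumulating bases.
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        # The first length at which the running sum reaches half of
        # the total bases is the N50.
        if running >= half_total:
            return length


# Hypothetical toy transcript lengths (bp), not data from the study.
toy_lengths = [2, 3, 4, 5, 6]
print(n50(toy_lengths))  # total is 20 bp; 6 + 5 = 11 >= 10, so this prints 5
```

Because N50 weights by bases rather than by transcript count, a few very long transcripts can raise it well above the median transcript length.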