Photosynthetic euglenids are major components of aquatic ecosystems and relatives of trypanosomes. Euglena gracilis has considerable biotechnological potential and great adaptability, but exploitation remains hampered by the absence of a comprehensive gene catalogue. We address this through genome, RNA and protein sequencing: the E. gracilis genome is >2 Gb, with 36,526 predicted proteins. Large lineage-specific paralog families are present, with evidence for flexibility in environmental monitoring, divergent mechanisms for metabolic control, and novel solutions for adaptation to extreme environments. Contributions from photosynthetic eukaryotes to the nuclear genome, consistent with the shopping bag model, are found, together with transitions between kinetoplastid and canonical systems. Control of protein expression is almost exclusively post-transcriptional. These data are a major advance in understanding the nuclear genomes of euglenids and provide a platform for investigating the contributions of E. gracilis and its relatives to the biosphere.
Journal: BioRxiv
DOI: 10.1101/228015
Year: 2017