Developmentally programmed genome rearrangements are rare in vertebrates, but have been reported in scattered lineages including the bandicoot, hagfish, lamprey, and zebra finch (Taeniopygia guttata). In the finch, a well-studied animal model for neuroendocrinology and vocal learning, one such programmed genome rearrangement involves a germline-restricted chromosome, or GRC, which is found in the germlines of both sexes but eliminated from mature sperm [3, 4]. Transmitted only through the oocyte, it displays uniparental, female-driven inheritance, and early in embryonic development it is apparently eliminated from all somatic tissue in both sexes [3, 4]. The GRC is the longest finch chromosome at over 120 million base pairs, and previously the only known GRC-derived sequence was repetitive and non-coding. Because the zebra finch genome project was sourced from male muscle (somatic) tissue, the remaining genomic sequence and protein-coding content of the GRC remain unknown. Here we report the first protein-coding gene from the GRC: a member of the α-soluble N-ethylmaleimide-sensitive fusion protein (NSF) attachment protein (α-SNAP) family hitherto missing from zebra finch gene annotations. In addition to the GRC-encoded α-SNAP, we find a paralogous α-SNAP residing in the somatic genome (a somatolog), making the zebra finch the first example in which α-SNAP is not a single-copy gene. We show divergent, sex-biased expression for the paralogs and also that positive selection is detectable across the bird α-SNAP lineage, including the GRC-encoded α-SNAP. This study presents the identification and evolutionary characterization of the first protein-coding GRC gene in any organism. Copyright © 2018 Elsevier Ltd. All rights reserved.
Journal: Current Biology