In comparison to humans and chimpanzees, gorillas show low diversity at MHC class I genes (Gogo), as reflected by an overall reduced level of allelic variation as well as the absence of a functionally important sequence motif that interacts with killer cell immunoglobulin-like receptors (KIR). Here, we use recently generated large-scale genomic sequence data for a reassessment of allelic diversity at Gogo-C, the gorilla orthologue of HLA-C. Through the combination of long-range amplifications and long-read sequencing technology, we obtained, among the 35 gorillas reanalyzed, three novel full-length genomic sequences including a coding region sequence that has not been previously described. The newly identified Gogo-C*03:01 allele has a divergent recombinant structure that sets it apart from other Gogo-C alleles. Domain-by-domain phylogenetic analysis shows that Gogo-C*03:01 has segments in common with Gogo-B*07, the additional B-like gene that is present on some gorilla MHC haplotypes. Identified in ~ 50% of the gorillas analyzed, the Gogo-C*03:01 allele exclusively encodes the C1 epitope among Gogo-C allotypes, indicating its important function in controlling natural killer cell (NK cell) responses via KIR. We further explored the hypothesis whether gorillas experienced a selective sweep which may have resulted in a general reduction of the gorilla MHC class I repertoire. Our results provide little support for a selective sweep but rather suggest that the overall low Gogo class I diversity can be best explained by drastic demographic changes gorillas experienced in the ancient and recent past.
Next-generation sequencing requires sufficient DNA to be available. If limited, whole-genome amplification is applied to generate additional amounts of DNA. Such amplification often results in many chimeric DNA fragments, in particular artificial palindromic sequences, which limit the usefulness of long sequencing reads.Here, we present Pacasus, a tool for correcting such errors. Two datasets show that it markedly improves read mapping and de novo assembly, yielding results similar to these that would be obtained with non-amplified DNA.With Pacasus long-read technologies become available for sequencing targets with very small amounts of DNA, such as single cells or even single chromosomes.
Accurate sequence and assembly of genomes is a critical first step for studies of genetic variation. We generated a high-quality assembly of the gorilla genome using single-molecule, real-time sequence technology and a string graph de novo assembly algorithm. The new assembly improves contiguity by two to three orders of magnitude with respect to previously released assemblies, recovering 87% of missing reference exons and incomplete gene models. Although regions of large, high-identity segmental duplications remain largely unresolved, this comprehensive assembly provides new biological insight into genetic diversity, structural variation, gene loss, and representation of repeat structures within the gorilla genome. The approach provides a path forward for the routine assembly of mammalian genomes at a level approaching that of the current quality of the human genome. Copyright © 2016, American Association for the Advancement of Science.
Comparisons of MHC gene content and diversity among closely related species can provide insights into the evolutionary mechanisms shaping immune system variation. After chimpanzees and bonobos, gorillas are humans’ closest living relatives; but in contrast, relatively little is known about the structure and variation of gorilla MHC class I genes (Gogo). Here, we combined long-range amplifications and long-read sequencing technology to analyze full-length MHC class I genes in 35 gorillas. We obtained 50 full-length genomic sequences corresponding to 15 Gogo-A alleles, 4 Gogo-Oko alleles, 21 Gogo-B alleles, and 10 Gogo-C alleles including 19 novel coding region sequences. We identified two previously undetected MHC class I genes related to Gogo-A and Gogo-B, respectively, thereby illustrating the potential of this approach for efficient and highly accurate MHC genotyping. Consistent with their phylogenetic position within the hominid family, individual gorilla MHC haplotypes share characteristics with humans and chimpanzees as well as orangutans suggesting a complex history of the MHC class I genes in humans and the great apes. However, the overall MHC class I diversity appears to be low further supporting the hypothesis that gorillas might have experienced a reduction of their MHC repertoire.
The evolutionary dynamics of repeat sequences is quite complex, with some duplicates never having differentiated from each other. Two models can explain the complex evolutionary process for repeated genes—concerted and birth-and-death, of which the latter is driven by duplications maintained by selection. Copy number variations caused by random duplications and losses in repeat regions may modulate molecular pathways and therefore affect phenotypic characteristics in a population, resulting in individuals that are able to adapt to new environments. In this study, we investigated the filaggrin gene (FLG), which codes for filaggrin—an important component of the outer layers of mammalian skin—and contains tandem repeats that exhibit copy number variation between and within species. To examine which model best fits the evolutionary pathway for the complete tandem repeats within a single exon of FLG, we determined the repeat sequences in crab-eating macaque (Macaca fascicularis), orangutan (Pongo abelii), gorilla (Gorilla gorilla), and chimpanzee (Pan troglodytes) and compared these with the sequence in human (Homo sapiens).
Genomic information has become a ubiquitous and almost essential aspect of biological research. Over the last 10-15 years, the cost of generating sequence data from DNA or RNA samples has dramatically declined and our ability to interpret those data increased just as remarkably. Although it is still possible for biologists to conduct interesting and valuable research on species for which genomic data are not available, the impact of having access to a high quality whole genome reference assembly for a given species is nothing short of transformational. Research on a species for which we have no DNA or RNA sequence data is restricted in fundamental ways. In contrast, even access to an initial draft quality genome (see below for definitions) opens a wide range of opportunities that are simply not available without that reference genome assembly. Although a complete discussion of the impact of genome sequencing and assembly is beyond the scope of this short paper, the goal of this review is to summarize the most common and highest impact contributions that whole genome sequencing and assembly has had on comparative and evolutionary biology. Copyright © 2016. Published by Elsevier Inc.
A time- and cost-effective strategy to sequence mammalian Y Chromosomes: an application to the de novo assembly of gorilla Y.
The mammalian Y Chromosome sequence, critical for studying male fertility and dispersal, is enriched in repeats and palindromes, and thus, is the most difficult component of the genome to assemble. Previously, expensive and labor-intensive BAC-based techniques were used to sequence the Y for a handful of mammalian species. Here, we present a much faster and more affordable strategy for sequencing and assembling mammalian Y Chromosomes of sufficient quality for most comparative genomics analyses and for conservation genetics applications. The strategy combines flow sorting, short- and long-read genome and transcriptome sequencing, and droplet digital PCR with novel and existing computational methods. It can be used to reconstruct sex chromosomes in a heterogametic sex of any species. We applied our strategy to produce a draft of the gorilla Y sequence. The resulting assembly allowed us to refine gene content, evaluate copy number of ampliconic gene families, locate species-specific palindromes, examine the repetitive element content, and produce sequence alignments with human and chimpanzee Y Chromosomes. Our results inform the evolution of the hominine (human, chimpanzee, and gorilla) Y Chromosomes. Surprisingly, we found the gorilla Y Chromosome to be similar to the human Y Chromosome, but not to the chimpanzee Y Chromosome. Moreover, we have utilized the assembled gorilla Y Chromosome sequence to design genetic markers for studying the male-specific dispersal of this endangered species. © 2016 Tomaszkiewicz et al.; Published by Cold Spring Harbor Laboratory Press.