New High-Resolution Genome Assemblies Expand Our Understanding of Human-Ape Differences
Monday, June 11, 2018
Ever since researchers sequenced the chimpanzee genome in 2005, they have known that we share the vast majority of our DNA sequence with chimps, our closest living relatives. So what, exactly, sets us apart?
While prior ape genome assemblies were helpful in finding single-nucleotide changes, many researchers speculate that a type of variation that is more difficult to resolve, namely structural differences in regulatory DNA or in the copy number of gene families, plays an important role in species adaptation. Large-scale efforts to sequence and assemble more ape genomes over the last 13 years have expanded our knowledge, but many structural variations (SVs) that distinguish the great apes remain unresolved. Additionally, the currently available draft ape genome assemblies, which contain tens to hundreds of thousands of gaps, are often compared against the much higher-quality human reference genome, introducing bias that “humanizes” the ape assemblies.
Now, an effort led by scientists at the University of Washington has closed most of those gaps by producing ab initio chimpanzee and orangutan genome assemblies in which most genes are complete and novel gene models are identified.
In a recently published Science paper, first author Zev N. Kronenberg, of the UW Genome Sciences department and presently at Phase Genomics, and lead author Evan E. Eichler, of UW and the Howard Hughes Medical Institute, along with a multi-institutional team, describe how they coupled PacBio long-read sequence assembly and Iso-Seq cDNA sequencing with a multi-platform scaffolding approach to characterize lineage-specific and shared great ape genetic variation ranging from single-base-pair to megabase-sized variants.
The team sequenced four genomes, two human, one chimpanzee and one orangutan, to high depth (>65-fold coverage) using SMRT Sequencing, and generated ~3 Gb assemblies for each species in which the majority of the euchromatic DNA mapped to <1,000 large contigs. They then scaffolded the chimpanzee and orangutan genomes without guidance from the human reference genome. Because exactly the same methods were used for assembly, these ape genomes, along with the Eichler group’s long-read assembly of the gorilla genome, could finally be compared to one another and to the human genome on a more level playing field.
“Recent advances in sequencing and mapping technologies now make more detailed investigations possible, not only of individual species but also entire clades of species,” the authors write. “We generated new great ape genome assemblies displaying improved sequence contiguity by orders of magnitude, leading to a more comprehensive understanding of the evolution of structural variation.”
Comparing these new high-quality genome assemblies to 86 recently sequenced great ape genomes and a diverse set of human genomes from the Simons Genome Diversity Panel, they identified 17,789 fixed human-specific structural variants, including 11,897 human-specific insertions and 5,892 human-specific deletions. These figures double the number of predicted genic and putative regulatory changes that have emerged in humans since divergence from nonhuman apes. Among this set, they focused on SVs that potentially disrupt genes or regulatory sequence, identifying 1,145 human-specific SVs with potential functional effects.
“Unbiased genome scaffolding led to the discovery of novel and more complex subcytogenetic differences between human and other great ape chromosomes that were previously missed,” the authors write. “Projecting these onto the human genome shows potential hotspots of structural variation by size or number of events.”
Among the discoveries were fixed human-specific structural variants enriched near genes that are downregulated in human cerebral organoids compared to chimpanzee cerebral organoids, particularly in cells analogous to radial glial neural progenitors.
“Differential gene expression, especially in cortical radial glia, has been hypothesized to be a critical effector of brain size and a likely target of unique aspects of human brain evolution,” they write.
The authors identify several potential avenues for future investigation, such as structural variants that alter the human versions of the genes ZNHIT6 and GLI3 and of two key cell cycle regulators, CDC25C and WEE1. The publication also offers a significant resource to the great ape research community by annotating the ape genes and identifying full-length mRNA isoforms with Iso-Seq data combined with short-read RNA-seq.
The ape genomes still have some holes in comparison to the human genome, because the human reference has received “upgrades” using BAC-based long-read sequencing to resolve difficult, biologically relevant genomic regions such as segmental duplications. Eichler has long championed this approach, and in a press release that accompanies the Science publication, he says, “Our goal is to generate multiple ape genomes with as high quality as the human genome. Only then will we be able to truly understand the genetic differences that make us uniquely human.”