In New Tibetan Genome Assembly, Variants for Living at Altitude and the Imprint of Archaic DNA
Monday, October 14, 2019
A recent bioRxiv preprint reports efforts to sequence the genome of a Tibetan individual and identify the genetic underpinnings of adaptive traits associated with tolerating high altitude. The authors used SMRT Sequencing to achieve extremely high contiguity and accuracy, and incorporated scaffolding and other complementary technologies to build a robust assembly.
The results are reported in the preprint, “De novo assembly of a Tibetan genome and identification of novel structural variants associated with high altitude adaptation.” Lead author Ouzhuluobu, senior author Bing Su, and collaborators discuss their evaluation of the new genome assembly as well as key findings from it. They chose to focus on a Tibetan person because of the population’s unique and long-term residence in “one of the most extreme environments on earth”: the Tibetan Plateau, which has an average elevation exceeding 4.5 kilometers.
The team’s genome assembly, named “ZF1”, is the first for a Tibetan individual. Using the assembly, the scientists identified 6,500 structural variants that were not detected in two other long-read Asian genome assemblies. “[Genes near] ZF1-specific SVs are enriched in GTPase activity that is required for activation of the hypoxic pathway,” the authors report. In addition, they found a “163-bp intronic deletion in the MKL1 gene showing large divergence between highland Tibetans and lowland Han Chinese.” They note, “This deletion is significantly associated with lower systolic pulmonary arterial pressure, one of the key adaptive physiological traits in Tibetans.”
Previous studies had suggested that the Tibetan population may carry more genomic content from archaic hominins, such as the Denisovans, than other modern populations do. “To take advantage of the de novo ZF1 assembly, we performed a genome-wide search of archaic sharing non-reference sequences (NRSs) and compared the results with the two de novo assembled Asian genomes (AK1 and HX1),” the authors report. “We found a total length of 39.6 Mb and 45.9 Mb sequences shared with those of Altai Neanderthal and Denisovan, corresponding to 1.32% and 1.53% of the entire ZF1 genome respectively. These archaic proportions are much higher than that in AK1 (0.82% and 0.70%) or HX1 (0.98% and 0.85%).” One of the archaic shared regions is a 662-bp insertion associated with improved lung function.
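For readers who want to check how the quoted lengths and percentages fit together, here is a minimal Python sketch. It uses only the figures quoted above; the function and variable names are our own, and the "implied assembly size" is simply what the quoted numbers work out to (roughly 3,000 Mb), not a value stated in the preprint excerpt. The fold differences are likewise just ratios of the quoted percentages.

```python
# Figures quoted from the preprint: (Neanderthal-shared, Denisovan-shared).
# Names and layout here are illustrative, not taken from the authors' code.
ZF1_SHARED_MB = (39.6, 45.9)    # total shared length in ZF1, in Mb
SHARED_PERCENT = {
    "ZF1": (1.32, 1.53),
    "AK1": (0.82, 0.70),
    "HX1": (0.98, 0.85),
}


def implied_size_mb(length_mb, percent):
    """Assembly size implied by a shared length and its quoted percentage."""
    return length_mb / (percent / 100.0)


def fold_difference(percent_a, percent_b):
    """How many times larger percent_a is than percent_b."""
    return percent_a / percent_b


for label, length_mb, percent in zip(
    ("Altai Neanderthal", "Denisovan"), ZF1_SHARED_MB, SHARED_PERCENT["ZF1"]
):
    size = implied_size_mb(length_mb, percent)
    print(f"{label}: implied ZF1 assembly size of about {size:.0f} Mb")

for other in ("AK1", "HX1"):
    nea = fold_difference(SHARED_PERCENT["ZF1"][0], SHARED_PERCENT[other][0])
    den = fold_difference(SHARED_PERCENT["ZF1"][1], SHARED_PERCENT[other][1])
    print(f"ZF1 vs {other}: {nea:.2f}x Neanderthal, {den:.2f}x Denisovan")
```

Both quoted length and percentage pairs imply the same assembly size, which is a quick sanity check that the numbers are internally consistent.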
“The high-quality genome allows us to better understand the sequences showing population-level or individual-level specificity where they are different or even absent from the human reference genome,” the scientists write. “Our study demonstrates the value of constructing a high-resolution reference genome of representative populations (e.g. native highlanders) for understanding the genetic basis of human adaptation to extreme environments as well as for future clinical applications in hypoxia-related illness.”