Seven years after the ALS Ice Bucket Challenge soaked the world, the pace of discovery in sporadic amyotrophic lateral sclerosis has increased tremendously, with more than $115 million in donations funding research that has led to the identification of several genes implicated in both familial and sporadic cases of the neurodegenerative disease.
While social campaigns have generated much-needed awareness of the disease, other challenges remain, and one of them can be addressed with long-read sequencing.
As detailed in a new, interactive case study, PacBio SMRT Sequencing is helping researchers at the University of Washington unravel repeat regions of key genes linked to ALS and other disorders.
“We’ve made great strides over the past two decades in identifying genes and loci involved in ALS, mainly via GWAS studies,” said Paul Valdmanis (@pvaldmanis), assistant professor of medical genetics.
“We’re now at an exciting time where we have these new technologies available to allow us to identify novel risk factors. Through single-cell sequencing and long-read sequencing, in particular, we can ID some of these regions that were previously hidden or intractable to short-read sequencing.”
Starting the search
Tandem repeats (including variable number tandem repeats, or VNTRs) are snippets of DNA that are repeated back to back within a gene, anywhere from a handful of times to more than a hundred. Sometimes these repeat sequences expand to long stretches, and these expanded repeats have been implicated in many diseases, including 40 neurological diseases.
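To make the idea of copy number concrete, here is a minimal Python sketch that counts the longest run of back-to-back copies of a motif in a DNA string. The function name and the toy sequences are invented for illustration and are not taken from the study. Real repeat genotyping works on sequencing reads and has to tolerate imperfect copies, which this exact-match sketch does not.

```python
def longest_tandem_run(sequence: str, motif: str) -> int:
    """Return the largest number of consecutive exact copies of `motif`
    found anywhere in `sequence`. Hypothetical helper for illustration."""
    if not motif:
        raise ValueError("motif must be a non-empty string")
    best = 0
    for start in range(len(sequence)):
        copies = 0
        pos = start
        # Walk forward one motif length at a time while copies keep matching.
        while sequence.startswith(motif, pos):
            copies += 1
            pos += len(motif)
        best = max(best, copies)
    return best


# Toy alleles (made up): a short CAG repeat and an expanded one.
short_allele = "TT" + "CAG" * 5 + "GA"
expanded_allele = "TT" + "CAG" * 60 + "GA"

print(longest_tandem_run(short_allele, "CAG"))     # 5
print(longest_tandem_run(expanded_allele, "CAG"))  # 60
```

The scan restarts at every position rather than skipping ahead, which is slower but avoids missing a longer run that begins inside a shorter one.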
Some of the open questions about these repeats include:
● How do they expand from a short repeat copy with four or five CAGs to an allele of over 60 base pairs?
● Does it make a difference where the repeat occurs: at the beginning, middle, or end of the gene?
● What role does the internal sequence of these repeats play in disease pathogenesis?
In their search for answers, Valdmanis and postdoctoral fellow Meredith Course started with a multigenerational family that had several cases of ALS. Many of the family members had variants of a gene (FUS) linked to the disease, but not all of them exhibited symptoms. Why? Is there an additional genetic modifier that influences pathogenesis?
Using old-school linkage study approaches, the Valdmanis Lab identified a region on chromosome 18 that also seemed to play a role. All members of the family affected by ALS shared a 4-megabase segment of DNA within this region. So, they took a look inside this region to see if there were other risk factors. Then, they zoomed in further, using HiFi sequencing as their magnifying glass.
Homing in on the culprit
Through HiFi sequencing, researchers pinpointed a 69-bp VNTR in the WDR7 gene that was found to be enriched in individuals with ALS. The reference genome has about 6 copies of this repeat, but each individual in the ALS family had more than 30 copies.
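A rough calculation shows why a repeat like this is hard to see with short reads. The sketch below multiplies the 69-bp motif by the copy numbers quoted above and asks whether a single read could cover the whole tract plus some flanking sequence on each side. The read lengths (150 bp for a short read, 15,000 bp for a HiFi read) and the 50-bp flank are illustrative assumptions, not figures from the study, and 6 and 30 copies stand in for "about 6" and "more than 30".

```python
MOTIF_BP = 69  # motif length reported for the WDR7 VNTR


def tract_length_bp(copies: int, motif_bp: int = MOTIF_BP) -> int:
    """Total length of a repeat tract made of `copies` motif units."""
    return copies * motif_bp


def read_spans_tract(read_bp: int, copies: int, flank_bp: int = 50) -> bool:
    """True if one read could cover the tract plus `flank_bp` of unique
    sequence on each side (flank size is an assumed value)."""
    return read_bp >= tract_length_bp(copies) + 2 * flank_bp


SHORT_READ_BP = 150     # assumed typical short-read length
HIFI_READ_BP = 15_000   # assumed HiFi read length

print(tract_length_bp(6))                    # 414
print(tract_length_bp(30))                   # 2070
print(read_spans_tract(SHORT_READ_BP, 6))    # False
print(read_spans_tract(SHORT_READ_BP, 30))   # False
print(read_spans_tract(HIFI_READ_BP, 30))    # True
```

Under these assumptions, even the reference-length tract is longer than a single short read, while a long read can cover an expanded allele end to end.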
They also performed multiplexed barcoded sequencing to resolve the complete internal structure of the WDR7 repeat in 288 geographically diverse individuals and found striking variability in both repeat length and internal nucleotide composition. Some of the 69-bp repeat motifs were specifically present or absent in certain geographic populations.
They created maps to help visualize the data, and started to notice patterns emerging.
“Every time we looked at this repeat, we learned more,” Valdmanis said.
They were able to identify features associated with repeat expansion dynamics, the mechanistic consequences of repeat expansions for ALS susceptibility, and the structure of repeats in geographically diverse populations.
Further investigation of 15 samples from the Human Genome Project that had undergone long-read phased sequencing suggested that the WDR7 gene was not alone in the extreme variability of its tandem repeat length. They explored many other genes, including NWD2, VPS53, SLC22A1 and ART, and discovered various categories of repeats.
“We truly believe that long-read sequencing of tandem repeats can provide a lot of information about both human evolutionary events as well as risk factors for neurodegenerative disease. And we believe that VNTR expansions can represent novel disease risk factors not only in ALS but in other neurodegenerative diseases as well.”
Read our newest groundbreaking case study to learn more: ‘How SMRT Sequencing Helped Researchers at University of Washington Uncover a Tandem Repeat Linked to ALS’
Interested in learning more about our technology and ALS research?
Read our blog: Scientists Use PacBio Sequencing to Discover Likely Pathogenic Structural Variants Linked to ALS
Watch Marka van Blitterswijk from the Mayo Clinic present, ‘Applying Targeted Long-read Sequencing to Assess an Expanded Repeat in C9orf72’
See Meredith Course from the University of Washington present, ‘The Evolution and Function of a Large Tandem Repeat Associated with ALS’
Visit our Neuroscience Research page to learn how PacBio sequencing provides a comprehensive understanding of the genetic basis of neurological disease.