For the first time, PacBio is enabling high-throughput, comprehensive profiling of SMN1 and SMN2 with an all-new software tool called Paraphase.
Since the first known case of spinal muscular atrophy (SMA) was described in 1891, researchers have been trying to discover the causes of, and potential treatments for, this inherited neuromuscular disorder. Currently listed as the leading genetic cause of early infant death, SMA is one of the most prevalent recessive disorders around the globe. Newborn screening and carrier screening for SMA are recommended by the American College of Medical Genetics and Genomics, and the SMA Newborn Screening Alliance is working to make screening a standard practice in all European countries by 2025. However, a complete understanding of the disease has been hindered by the unique challenges posed by the genomic regions where the disease-causing genes reside.
In a new paper in the American Journal of Human Genetics, scientists from PacBio, in collaboration with the Genomic Medicine Center at Children’s Mercy Kansas City in the U.S., the Department of Human Genetics at Radboud University Medical Center in the Netherlands, Genomics England Ltd. in the U.K., and the Genomics England Research Consortium, outline the use of a new PacBio tool, Paraphase. The paper highlights how this tool can fundamentally change the way we study SMA and what we can learn from it.
How informatics enables exploration of SMA at scale
Two genes play a role in the development and severity of SMA – SMN1 and SMN2. SMN1 is highly homologous to its paralog, SMN2. The two genes reside in a genomic region with highly complex, long repetitive sequences and are nearly identical in sequence except for a few bases, which creates challenges for sequence analysis and variant calling. Both genes have variable copy numbers across populations and are analyzed with a variety of methods (typically PCR-based dosage testing combined with sequencing) – each with its own limitations. Identifying silent carriers (individuals with two copies of SMN1 on one chromosome and zero copies on the other), for example, is virtually impossible without pedigree information, because their total SMN1 copy number looks the same as that of a non-carrier. These silent carriers account for about 27% of carriers in African populations.
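To make the silent-carrier problem concrete, here is a minimal Python sketch of why a dosage test alone cannot resolve it. It is an illustration only, not part of Paraphase or of the study's methods. The function names, the cap of two SMN1 copies per chromosome, and the simplified status labels (which ignore pathogenic point mutations) are all assumptions made for this example.

```python
from itertools import combinations_with_replacement


def carrier_status(copies_a: int, copies_b: int) -> str:
    """Classify a pair of per-chromosome SMN1 copy counts (simplified labels)."""
    if copies_a == 0 and copies_b == 0:
        return "affected"      # no functional SMN1 copy on either chromosome
    if copies_a == 0 or copies_b == 0:
        return "carrier"       # one chromosome carries no SMN1 copy
    return "non-carrier"


def configurations_for_total(total: int, max_per_chromosome: int = 2):
    """List the unordered per-chromosome splits that add up to an observed total."""
    counts = range(max_per_chromosome + 1)  # hypothetical cap, for illustration
    return [
        (a, b)
        for a, b in combinations_with_replacement(counts, 2)
        if a + b == total
    ]


# A dosage test reports only the total; show what each total could mean.
for total in range(4):
    possible = {
        f"{a}+{b}": carrier_status(a, b)
        for a, b in configurations_for_total(total)
    }
    print(f"total SMN1 copies = {total}: {possible}")
```

A total of two copies maps to both the 1+1 (non-carrier) and 0+2 (silent carrier) arrangements, which is the ambiguity that per-chromosome haplotype information is meant to resolve.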
The result? Not everyone can be screened or informed appropriately, and there is limited understanding about the variants that play a part in disease severity.
This is where PacBio’s latest tool, Paraphase, aims to change the game. The informatics method identifies full-length SMN1 and SMN2 haplotypes, determines the gene copy numbers, and calls phased variants using long-read PacBio HiFi data. In the study, the authors used Paraphase and long-read HiFi sequencing to conduct a first-of-its-kind population study across 438 samples from five ethnic populations, identifying the major SMN1 and SMN2 sequence haplogroups. This allowed them to measure the diversity of the region and better understand how it has evolved, while also shining a light on the biological forces at play. Interestingly, the team identified two SMN1 haplotypes that form a common two-copy SMN1 allele in African populations. Testing positive for these two haplotypes in an individual with two copies of SMN1 gives a silent carrier risk of 88.5%, which is significantly higher than the risk indicated by the currently used single nucleotide polymorphism (SNP) marker (1.7%–3.0%), demonstrating the potential of haplotype-based screening for silent carriers.
For years, researchers have known that SMN1 and SMN2 play roles in the onset and severity of SMA but have been unable to answer exactly why and how people develop more or less severe forms of the disease. With Paraphase, there is now the opportunity to answer some of those long-standing questions and enable more accurate carrier screening.