Michael Lutz, from the Duke University Medical Center, discussed a recently published software tool that can now be used in a pipeline with SMRT Sequencing data to find structural variant biomarkers for neurodegenerative diseases with a focus on Alzheimer’s disease, ALS, and Lewy body dementia. His team is particularly interested in short sequence repeats and short tandem repeats, which have already been implicated in neurodegenerative disease.
Targeted sequencing experiments commonly rely on either PCR or hybrid capture to enrich for targets of interest. When using short-read sequencing platforms, these amplicons or fragments are frequently limited to a few hundred base pairs to accommodate the read lengths of the platform. Given PacBio’s long read lengths, it is straightforward to sequence amplicons or captured fragments that are multiple kilobases in length. These long sequences are useful for easily visualizing variants that include SNPs, CNVs and other structural variants, often without assembly. We will review methods for the sequencing of long amplicons and provide examples using amplicons that range…
Long-read mRNA sequencing such as PacBio’s Iso-Seq method offers high-throughput transcriptome profiling that circumvents the transcript assembly problem by sequencing full-length cDNA. The Iso-Seq method has emerged as the most reliable technology for fully characterizing isoforms and, in turn, helps shed light on underlying disease mechanisms. Here, we have utilized the Iso-Seq method to sequence an Alzheimer’s disease whole brain sample. This is a devastating neurodegenerative disease that affects ~44 million people worldwide, making it the most common form of dementia. Studies looking into disease mechanism have shown that changes in gene expression due to alternative splicing likely contribute to the…
The PacBio Iso-Seq method produces high-quality, full-length transcripts and can characterize a whole transcriptome with a single SMRT Cell 8M. We sequenced an Alzheimer’s disease whole brain sample on a single SMRT Cell 8M on the Sequel II System. Using the Iso-Seq bioinformatics pipeline followed by SQANTI2 analysis, we detected 162,290 transcripts for 17,670 genes up to 14 kb in length. More than 60% of the transcripts are novel isoforms, the vast majority of which have supporting CAGE peak data and polyadenylation signals, demonstrating the utility of long-read sequencing for human disease research.
We have developed several candidate gene screening applications for both neuromuscular and neurological disorders. The power behind these applications comes from the use of long-read sequencing, which allows us to access previously unresolvable and even unsequenceable genomic regions. SMRT Sequencing offers uniform coverage, a lack of sequence context bias, and very high accuracy. It is also possible to directly detect epigenetic signatures and characterize full-length gene transcripts through assembly-free isoform sequencing. In addition to calling the bases, SMRT Sequencing uses the kinetic information from each nucleotide to distinguish between modified and native bases.
Genes associated with several neurological disorders have been shown to be highly polymorphic. Targeted sequencing of these genes using NGS technologies is a powerful way to increase the cost-effectiveness of variant discovery and detection. However, for a comprehensive view of these target genes, it is necessary to have complete and uniform coverage across regions of interest. Unfortunately, short-read sequencing technologies are not ideal for these types of studies as they are prone to mis-mapping and often fail to span repetitive regions. Targeted sequencing with PacBio long reads provides the unique advantage of single-molecule observations of complex genomic regions. PacBio long…
Alzheimer’s disease (AD) is a devastating neurodegenerative disease that is genetically complex. Although great progress has been made in identifying fully penetrant mutations in genes such as APP, PSEN1 and PSEN2 that cause early-onset AD, these still represent a very small percentage of AD cases. Large-scale, genome-wide association studies (GWAS) have identified at least 20 additional genetic risk loci for the more common form of late-onset AD. However, the identified SNPs are typically not the actual risk variants, but are in linkage disequilibrium with the presumed causative variant (Van Cauwenberghe C, et al., The genetic landscape of Alzheimer disease: clinical…
Alzheimer’s disease (AD) is a devastating neurodegenerative disease that is genetically complex. Although great progress has been made in identifying fully penetrant mutations in genes such as APP, PSEN1 and PSEN2 that cause early-onset AD, these still represent a very small percentage of AD cases. Large-scale, genome-wide association studies (GWAS) have identified at least 20 additional genetic risk loci for the more common form of late-onset AD. However, the identified SNPs are typically not the actual causal variants, but are in linkage disequilibrium with the presumed causative variant (Van Cauwenberghe C, et al., The genetic landscape of Alzheimer disease: clinical…
Over the past decades, neurological disorders have been extensively studied, producing a large number of candidate genomic regions and candidate genes. The SNPs identified in these studies rarely represent the true disease-related functional variants. However, more recently a shift in focus from SNPs to larger structural variants has yielded breakthroughs in our understanding of neurological disorders. Here we have developed candidate gene screening methods that combine enrichment of long DNA fragments with long-read sequencing that is optimized for structural variation discovery. We have also developed a novel, amplification-free enrichment technique using the CRISPR/Cas9 system to target genomic regions. We sequenced gDNA and…
Mitochondrial DNA (mtDNA) is a compact, double-stranded circular genome of 16,569 bp with a cytosine-rich light (L) chain and a guanine-rich heavy (H) chain. mtDNA mutations have been increasingly recognized as important contributors to an array of human diseases such as Parkinson’s disease, Alzheimer’s disease, colorectal cancer and Kearns–Sayre syndrome. mtDNA mutations can affect all of the 1,000 to 10,000 copies of the mitochondrial genome present in a cell (homoplasmic mutation) or only a subset of copies (heteroplasmic mutation). The ratio of normal to mutant mtDNAs within cells is a significant factor in whether mutations will result in disease, as well as…
Germline mutations of APP, PSEN1, and PSEN2 genes cause autosomal dominant Alzheimer disease (AD). Somatic variants of the same genes may underlie pathogenesis in sporadic AD, which is the most prevalent form of the disease. Importantly, such somatic variants may be present at very low allelic frequency, confined to the brain, and are thus very difficult or impossible to detect in blood-derived DNA. Ever-refined methodologies to identify mutations present in a fraction of the DNA of the original tissue are rapidly transforming our understanding of DNA mutations and their role in complex pathologies such as tumors. These methods stand poised to test to…
The methodology of Genome-Wide Association Screening (GWAS) has been applied for more than a decade. Translation to clinical utility has been limited, especially in Alzheimer’s Disease (AD). It has become standard practice in the analyses of more than two dozen AD GWAS studies to exclude the apolipoprotein E (APOE) region because of its extraordinary statistical support, unique thus far in complex human diseases. New genes associated with AD are proposed frequently based on SNPs associated with odds ratio (OR)