We have developed several candidate gene screening applications for both neuromuscular and neurological disorders. The power behind these applications comes from long-read sequencing, which allows us to access previously unresolvable and even unsequenceable genomic regions. SMRT Sequencing offers uniform coverage, a lack of sequence context bias, and very high accuracy. It is also possible to directly detect epigenetic signatures and characterize full-length gene transcripts through assembly-free isoform sequencing. In addition to calling the bases, SMRT Sequencing uses the kinetic information from each nucleotide to distinguish between modified and native bases.
Bipolar disorder (BD) is a phenotypically and genetically complex and debilitating neurological disorder that affects 1% of the worldwide population. There is compelling evidence from family, twin and adoption studies supporting the involvement of a genetic predisposition in BD, with estimated heritability up to ~80%. The risk in first-degree relatives is ten times higher than in the general population. Linkage and association studies have implicated multiple putative chromosomal loci for BD susceptibility; however, no disease genes have been identified to date.
Structural variation accounts for much of the variation among human genomes. Structural variants of all types are known to cause Mendelian disease and contribute to complex disease. Learn how long-read sequencing is enabling detection of the full spectrum of structural variants to advance the study of human disease, evolution and genetic diversity.
Tremendous flexibility is maintained in the human proteome via alternative splicing, and cancer genomes often subvert this flexibility to promote survival. Identification and annotation of cancer-specific mRNA isoforms is critical to understanding how mutations in the genome affect the biology of cancer cells. While microarrays and other NGS-based methods have become useful for studying transcriptomes, these technologies yield short, fragmented transcripts that remain a challenge for accurate, complete reconstruction of splice variants. The Iso-Seq method developed at PacBio offers the only solution for direct sequencing of full-length, single-molecule cDNA sequences needed to discover biomarkers for early detection and cancer stratification,…
In the past several years, single-molecule sequencing platforms, such as those by Pacific Biosciences and Oxford Nanopore Technologies, have become available to researchers and are currently being tested for clinical applications. They offer exceptionally long reads that permit direct sequencing through regions of the genome inaccessible or difficult to analyze by short-read platforms. This includes disease-causing long repetitive elements, extreme GC content regions, and complex gene loci. Similarly, these platforms enable structural variation characterization at previously unparalleled resolution and direct detection of epigenetic marks in native DNA. Here, we review how these technologies are opening up new clinical avenues that…
The discovery of mutations associated with human genetic disease is an exercise in comparative genomics (see Glossary). Although there are many different strategies and approaches, the central premise is that affected persons harbor a significant excess of pathogenic DNA variants as compared with a group of unaffected persons (controls) that is either clinically defined¹ or established by surveying large swaths of the general population.² The more exclusive the variant is to the disease, the greater its penetrance, the larger its effect size, and the more relevant it becomes to both disease diagnosis and future therapeutic investigation. The…
Human tuberculosis disease (TB), caused by Mycobacterium tuberculosis (Mtb), is a complex disease, with a spectrum of outcomes. Genomic, transcriptomic and methylation studies have revealed differences between Mtb lineages, likely to impact on transmission, virulence and drug resistance. However, so far no studies have integrated sequence-based genomic, transcriptomic and methylation characterisation across a common set of samples, which is critical to understand how DNA sequence and methylation affect RNA expression and, ultimately, Mtb pathogenesis. Here we perform such an integrated analysis across 22 M. tuberculosis clinical isolates, representing ancient (lineage 1) and modern (lineages 2 and 4) strains. The results confirm…
Most human protein-coding genes can be transcribed into multiple distinct mRNA isoforms. These alternative splicing patterns encourage molecular diversity, and dysregulation of isoform expression plays an important role in disease etiology. However, isoforms are difficult to characterize from short-read RNA-seq data because they share identical subsequences and occur in different frequencies across tissues and samples. Here, we develop BIISQ, a Bayesian nonparametric model for isoform discovery and individual specific quantification from short-read RNA-seq data. BIISQ does not require isoform reference sequences but instead estimates an isoform catalog shared across samples. We use stochastic variational inference for efficient posterior estimates and…
Structural variation and single-nucleotide variation of the complement factor H (CFH) gene family underlie several complex genetic diseases, including age-related macular degeneration (AMD) and atypical hemolytic uremic syndrome (AHUS). To understand its diversity and evolution, we performed high-quality sequencing of this ~360-kbp locus in six primate lineages, including multiple human haplotypes. Comparative sequence analyses reveal two distinct periods of gene duplication leading to the emergence of four CFH-related (CFHR) gene paralogs (CFHR2 and CFHR4 ~25-35 Mya and CFHR1 and CFHR3 ~7-13 Mya). Remarkably, all evolutionary breakpoints share a common ~4.8-kbp segment corresponding to an ancestral CFHR gene promoter that has…
The incidence of the autoimmune disease, type 1 diabetes (T1D), has increased dramatically over the last half century in many developed countries and is particularly high in Finland and other Nordic countries. Along with genetic predisposition, environmental factors are thought to play a critical role in this increase. As with other autoimmune diseases, the gut microbiome is thought to play a potential role in controlling progression to T1D in children with high genetic risk, but we know little about how the gut microbiome develops in children with high genetic risk for T1D. In this study, the early development of the…
Despite extensive effort to reveal the genetic basis of complex phenotypic variation, studies typically explain only a fraction of trait heritability. It has been hypothesized that individually rare hidden structural variants (SVs) could account for a significant fraction of variation in complex traits. To investigate this hypothesis, we assembled 14 Drosophila melanogaster genomes and systematically identified more than 20,000 euchromatic SVs, of which ~40% are invisible to high-specificity short-read genotyping approaches. SVs are common in Drosophila genes, with almost one third of diploid individuals harboring an SV in genes larger than 5 kb, and nearly a quarter harboring multiple…