June 1, 2021

A novel analytical pipeline for de novo haplotype phasing and amplicon analysis using SMRT Sequencing technology.

While the identification of individual SNPs has been readily available for some time, the ability to accurately phase SNPs and structural variation across a haplotype has been a challenge. With individual reads of an average length of 9 kb (P5-C3), and individual reads beyond 30 kb in length, SMRT Sequencing technology allows the identification of mutation combinations such as microdeletions, insertions, and substitutions without any predetermined reference sequence. Long-amplicon analysis is a novel protocol that identifies and reports the abundance of differing clusters of sequencing reads within a single library. Graphs generated via hierarchical clustering of individual sequencing reads are used to generate Markov models representing the consensus sequence of individual clusters found to be significantly different. Long-amplicon analysis is capable of differentiating between underlying sequences that are 99.9% similar, which is suitable for haplotyping and differentiating pseudogenes from coding transcripts. This protocol allows for the identification of structural variation in the MUC5AC gene sequence, despite the presence of a gap in the current genome assembly, and can also be used for HLA haplotyping. Clustering can also be applied to identify full-length transcripts for the purpose of estimating consensus sequences and enumerating isoform types. Long-amplicon analysis allows for the elucidation of complex regions otherwise missed by other sequencing technologies, which may contribute to the diagnosis and understanding of otherwise complex diseases.


June 1, 2021

Long Amplicon Analysis: Highly accurate, full-length, phased, allele-resolved gene sequences from multiplexed SMRT Sequencing data.

The correct phasing of genetic variations is a key challenge for many applications of DNA sequencing. Allele-level resolution is strongly preferred for histocompatibility sequencing where recombined genes can exhibit different compatibilities than their parents. In other contexts, gene complementation can provide protection if deleterious mutations are found on only one allele of a gene. These problems are especially pronounced in immunological domains given the high levels of genetic diversity and recombination seen in regions like the Major Histocompatibility Complex. A new tool for analyzing Single Molecule, Real-Time (SMRT) Sequencing data – Long Amplicon Analysis (LAA) – can generate highly accurate, phased and full-length consensus sequences for multiple genes in a single sequencing run.


June 1, 2021

Mitochondrial DNA sequencing using PacBio SMRT technology

Mitochondrial DNA (mtDNA) is a compact, double-stranded circular genome of 16,569 bp with a cytosine-rich light (L) chain and a guanine-rich heavy (H) chain. mtDNA mutations have been increasingly recognized as important contributors to an array of human diseases such as Parkinson’s disease, Alzheimer’s disease, colorectal cancer and Kearns–Sayre syndrome. mtDNA mutations can affect all of the 1,000-10,000 copies of the mitochondrial genome present in a cell (homoplasmic mutation) or only a subset of copies (heteroplasmic mutation). The ratio of normal to mutant mtDNAs within cells is a significant factor in whether mutations will result in disease, as well as the clinical presentation, penetrance, and severity of the phenotype. Over time, heteroplasmic mutations can become homoplasmic due to differential replication and random assortment. Full characterization of the mitochondrial genome would involve detection of not only homoplasmic but also heteroplasmic mutations, as well as complete phasing. Previously, we sequenced human mtDNA on the PacBio RS II System with two partially overlapping amplicons. Here, we present amplification-free, full-length sequencing of linearized mtDNA using the Sequel System. Full-length sequencing allows variant phasing along the entire mitochondrial genome, identification of heteroplasmic variants, and detection of epigenetic modifications that are lost in amplicon-based methods.


June 1, 2021

High-throughput SMRT Sequencing of clinically relevant targets

Targeted sequencing with Sanger as well as short-read-based high-throughput sequencing methods is standard practice in clinical genetic testing. However, many applications beyond SNP detection have remained somewhat obstructed due to technological challenges. With the advent of long reads and high consensus accuracy, SMRT Sequencing overcomes many of the technical hurdles faced by Sanger and NGS approaches, opening a broad range of untapped clinical sequencing opportunities. Flexible multiplexing options, a highly adaptable sample preparation method, and two well-developed, newly improved analysis methods that generate highly accurate sequencing results make SMRT Sequencing an adept method for clinical-grade targeted sequencing. The Circular Consensus Sequencing (CCS) analysis pipeline produces QV 30 data from each single intra-molecular multi-pass polymerase read, making it a reliable solution for detecting minor variant alleles with frequencies as low as 1%. Long Amplicon Analysis (LAA) makes use of insert-spanning, full-length subreads originating from multiple individual copies of the target to generate highly accurate and phased consensus sequences (>QV50), offering a unique advantage for imputation-free allele segregation and haplotype phasing. Here we present workflows and results for a range of SMRT Sequencing clinical applications. Specifically, we illustrate how the flexible multiplexing options, simple sample preparation methods and new developments in data analysis tools offered by PacBio in support of Sequel System 5.1 can come together in a variety of experimental designs to enable applications as diverse as high-throughput HLA typing, mitochondrial DNA sequencing and viral vector integrity profiling of recombinant adeno-associated viral genomes (rAAV).
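
The QV figures quoted above can be read as per-base error probabilities. The sketch below assumes the standard Phred definition, QV = -10 · log10(p); the function names are illustrative and are not part of any PacBio tool.

```python
import math


def qv_to_error_prob(qv: float) -> float:
    """Convert a Phred-scaled quality value to a per-base error probability."""
    return 10 ** (-qv / 10)


def error_prob_to_qv(p: float) -> float:
    """Convert a per-base error probability back to a Phred-scaled quality value."""
    if not 0 < p <= 1:
        raise ValueError("error probability must be in (0, 1]")
    return -10 * math.log10(p)


if __name__ == "__main__":
    # QV values mentioned in the abstract: CCS (QV 30) and LAA (>QV50).
    for qv in (30, 50):
        p = qv_to_error_prob(qv)
        print(f"QV {qv}: about 1 error in {round(1 / p):,} bases")
```

Under this definition QV 30 corresponds to an error rate of about 0.1%, an order of magnitude below the 1% minor-allele frequency mentioned for CCS, and QV 50 corresponds to about one error in 100,000 bases.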


April 21, 2020

The replication-competent HIV-1 latent reservoir is primarily established near the time of therapy initiation.

Although antiretroviral therapy (ART) is highly effective at suppressing HIV-1 replication, the virus persists as a latent reservoir in resting CD4+ T cells during therapy. This reservoir forms even when ART is initiated early after infection, but the dynamics of its formation are largely unknown. The viral reservoirs of individuals who initiate ART during chronic infection are generally larger and genetically more diverse than those of individuals who initiate therapy during acute infection, consistent with the hypothesis that the reservoir is formed continuously throughout untreated infection. To determine when viruses enter the latent reservoir, we compared sequences of replication-competent viruses from resting peripheral CD4+ T cells from nine HIV-positive women on therapy to viral sequences circulating in blood collected longitudinally before therapy. We found that, on average, 71% of the unique viruses induced from the post-therapy latent reservoir were most genetically similar to viruses replicating just before ART initiation. This proportion is far greater than would be expected if the reservoir formed continuously and was always long lived. We conclude that ART alters the host environment in a way that allows the formation or stabilization of most of the long-lived latent HIV-1 reservoir, which points to new strategies targeted at limiting the formation of the reservoir around the time of therapy initiation. Copyright © 2019 The Authors, some rights reserved; exclusive licensee American Association for the Advancement of Science. No claim to original U.S. Government Works.


April 21, 2020

PacBio amplicon sequencing for metabarcoding of mixed DNA samples from lichen herbarium specimens.

The detection and identification of species of fungi in the environment using molecular methods heavily depends on reliable reference sequence databases. However, these databases are largely incomplete in terms of taxon coverage, and a significant effort is required from herbaria and living fungal collections for the mass-barcoding of well-identified and well-curated fungal specimens or strains. Here, a PacBio amplicon sequencing approach is applied to recent lichen herbarium specimens for the sequencing of the fungal ITS barcode, allowing a higher throughput sample processing than Sanger sequencing, which often required the use of cloning. Out of 96 multiplexed samples, a full-length ITS sequence of the target lichenised fungal species was recovered for 85 specimens. In addition, sequences obtained for co-amplified fungi gave an interesting insight into the diversity of endolichenic fungi. Challenges encountered at both the laboratory and bioinformatic stages are discussed, and cost and quality are compared with Sanger sequencing. With increasing data output and reducing sequencing cost, PacBio amplicon sequencing is seen as a promising approach for the generation of reference sequences for lichenised fungi as well as the characterisation of lichen-associated fungal communities.


April 21, 2020

Construction of full-length Japanese reference panel of class I HLA genes with single-molecule, real-time sequencing.

Human leukocyte antigen (HLA) is a gene complex known for its exceptional diversity across populations, importance in organ and blood stem cell transplantation, and associations of specific alleles with various diseases. We constructed a Japanese reference panel of class I HLA genes (ToMMo HLA panel), comprising a distinct set of HLA-A, HLA-B, HLA-C, and HLA-H alleles, by single-molecule, real-time (SMRT) sequencing of 208 individuals included in the 1070 whole-genome Japanese reference panel (1KJPN). For high-quality allele reconstruction, we developed a novel pipeline, Primer-Separation Assembly and Refinement Pipeline (PSARP), in which the SMRT sequencing and additional short-read data were used. The panel consisted of 139 alleles, which were all extended from known IPD-IMGT/HLA sequences, contained 40 with novel variants, and captured more than 96.5% of allelic diversity in 1KJPN. These newly available sequences would be important resources for research and clinical applications including high-resolution HLA typing, genetic association studies, and analyses of cis-regulatory elements.


April 21, 2020

Longitudinal HIV sequencing reveals reservoir expression leading to decay which is obscured by clonal expansion.

After initiating antiretroviral therapy (ART), a rapid decline in HIV viral load is followed by a long period of undetectable viremia. Viral outgrowth assay suggests the reservoir continues to decline slowly. Here, we use full-length sequencing to longitudinally study the proviral landscape of four subjects on ART to investigate the selective pressures influencing the dynamics of the treatment-resistant HIV reservoir. We find that intact and defective proviruses that contain genetic elements favoring efficient protein expression decrease over time. Moreover, proviruses that lack these genetic elements, yet contain strong donor splice sequences, increase relative to other defective proviruses, especially among clones. Our work suggests that HIV expression occurs to a significant extent during ART and results in HIV clearance, but this is obscured by the expansion of proviral clones. Paradoxically, clonal expansion may also be enhanced by HIV expression that leads to splicing between HIV donor splice sites and downstream human exons.


September 22, 2019

Conventional and single-molecule targeted sequencing method for specific variant detection in IKBKG while bypassing the IKBKGP1 pseudogene.

In addition to Sanger sequencing, next-generation sequencing of gene panels and exomes has emerged as a standard diagnostic tool in many laboratories. However, these captures can miss regions, have poor efficiency, or capture pseudogenes, which hamper proper diagnoses. One such example is the primary immunodeficiency-associated gene IKBKG. Its pseudogene IKBKGP1 makes traditional capture methods aspecific. We therefore developed a long-range PCR method to efficiently target IKBKG, as well as two associated genes (IRAK4 and MYD88), while bypassing the IKBKGP1 pseudogene. Sequencing accuracy was evaluated using both conventional short-read technology and a newer long-read, single-molecule sequencer. Different mapping and variant calling options were evaluated in their capability to bypass the pseudogene using both sequencing platforms. Based on these evaluations, we determined a robust diagnostic application for unambiguous sequencing and variant calling in IKBKG, IRAK4, and MYD88. This method allows rapid identification of selected primary immunodeficiency diseases in patients suffering from life-threatening invasive pyogenic bacterial infections. Copyright © 2018 American Society for Investigative Pathology and the Association for Molecular Pathology. Published by Elsevier Inc. All rights reserved.


September 22, 2019

Discovery of gorilla MHC-C expressing C1 ligand for KIR.

In comparison to humans and chimpanzees, gorillas show low diversity at MHC class I genes (Gogo), as reflected by an overall reduced level of allelic variation as well as the absence of a functionally important sequence motif that interacts with killer cell immunoglobulin-like receptors (KIR). Here, we use recently generated large-scale genomic sequence data for a reassessment of allelic diversity at Gogo-C, the gorilla orthologue of HLA-C. Through the combination of long-range amplifications and long-read sequencing technology, we obtained, among the 35 gorillas reanalyzed, three novel full-length genomic sequences including a coding region sequence that has not been previously described. The newly identified Gogo-C*03:01 allele has a divergent recombinant structure that sets it apart from other Gogo-C alleles. Domain-by-domain phylogenetic analysis shows that Gogo-C*03:01 has segments in common with Gogo-B*07, the additional B-like gene that is present on some gorilla MHC haplotypes. Identified in ~50% of the gorillas analyzed, the Gogo-C*03:01 allele exclusively encodes the C1 epitope among Gogo-C allotypes, indicating its important function in controlling natural killer cell (NK cell) responses via KIR. We further explored the hypothesis that gorillas experienced a selective sweep, which may have resulted in a general reduction of the gorilla MHC class I repertoire. Our results provide little support for a selective sweep but rather suggest that the overall low Gogo class I diversity can be best explained by drastic demographic changes gorillas experienced in the ancient and recent past.


September 22, 2019

Diversity of hepatitis E virus genotype 3

Hepatitis E virus genotype 3 (HEV-3) can lead to chronic infection in immunocompromised patients, and ribavirin is the treatment of choice. Recently, mutations in the polymerase gene have been associated with ribavirin failure but their frequency before treatment according to HEV-3 subtypes has not been studied on a large data set. We used single-molecule real-time sequencing technology to sequence 115 new complete genomes of HEV-3 infecting French patients. We analyzed phylogenetic relationships, the length of the polyproline region, and mutations in the HEV polymerase gene. Eighty-five (74%) were in the clade HEV-3efg, 28 (24%) in HEV-3chi clade, and 2 (2%) in HEV-3ra clade. Using automated partitioning of maximum likelihood phylogenetic trees, complete genomes were classified into subtypes. Polyproline region length differs within HEV-3 clades (from 189 to 315 nt). Investigating mutations in the polymerase gene, distinct polymorphisms between HEV-3 subtypes were found (G1634R in 95% of HEV-3e, G1634K in 56% of HEV-3ra, and V1479I in all HEV-3efg, clade HEV-3ra, and HEV-3k strains). Subtype-specific polymorphisms in the HEV-3 polymerase have been identified. Our study provides new complete genome sequences of HEV-3 that could be useful for comparing strains circulating in humans and the animal reservoir.


September 22, 2019

Biparental Inheritance of Mitochondrial DNA in Humans.

Although there has been considerable debate about whether paternal mitochondrial DNA (mtDNA) transmission may coexist with maternal transmission of mtDNA, it is generally believed that mitochondria and mtDNA are exclusively maternally inherited in humans. Here, we identified three unrelated multigeneration families with a high level of mtDNA heteroplasmy (ranging from 24 to 76%) in a total of 17 individuals. Heteroplasmy of mtDNA was independently examined by high-depth whole mtDNA sequencing analysis in our research laboratory and in two Clinical Laboratory Improvement Amendments and College of American Pathologists-accredited laboratories using multiple approaches. A comprehensive exploration of mtDNA segregation in these families shows biparental mtDNA transmission with an autosomal dominant-like inheritance mode. Our results suggest that, although the central dogma of maternal inheritance of mtDNA remains valid, there are some exceptional cases where paternal mtDNA could be passed to the offspring. Elucidating the molecular mechanism for this unusual mode of inheritance will provide new insights into how mtDNA is passed on from parent to offspring and may even lead to the development of new avenues for the therapeutic treatment of pathogenic mtDNA transmission.

