July 19, 2019

Single molecule real-time (SMRT®) DNA sequencing of HLA genes at ultra-high resolution from 126 International HLA and Immunogenetics Workshop cell lines.

The hyperpolymorphic HLA genes play important roles in disease and transplantation and act as genetic markers of migration and evolution. A panel of 107 B-lymphoblastoid cell lines (B-LCLs) was established in 1987 at the 10th International Histocompatibility Workshop as a resource for the immunogenetics community. These B-LCLs are well characterised and represent diverse ethnicities and HLA haplotypes. Here we have applied Pacific Biosciences’ Single Molecule Real-Time (SMRT) DNA sequencing to HLA type 126 B-LCLs, including the 107 IHIW cell lines, to ultra-high resolution. Amplicon sequencing of full-length HLA class I genes (HLA-A, -B and -C) and partial-length HLA class II genes (HLA-DRB1, -DQB1 and -DPB1) was performed. We typed a total of 931 HLA alleles, 895 (96%) of which were consistent with the typing in the IPD-IMGT/HLA Database (Release 3.27.0, 2017-01-20), with 595 (64%) typed at a higher resolution. Discrepant types, including novel alleles (n=10) and changes in zygosity (n=13), as well as previously unreported types (n=34) were observed. In addition, patterns of linkage disequilibrium were distinguished by four-field resolution typing of HLA-B and HLA-C. By improving and standardising the HLA typing of these B-LCLs, we have ensured their continued usefulness as a resource for the immunogenetics community in the age of next generation DNA sequencing. This article is protected by copyright. All rights reserved.


July 19, 2019

Centromere evolution and CpG methylation during vertebrate speciation.

Centromeres and large-scale structural variants evolve and contribute to genome diversity during vertebrate speciation. Here, we perform de novo long-read genome assembly of three inbred medaka strains that are derived from geographically isolated subpopulations and undergo speciation. Using single-molecule real-time (SMRT) sequencing, we obtain three chromosome-mapped genomes of length ~734, ~678, and ~744 Mbp with a resource of twenty-two centromeric regions of length 20-345 kbp. Centromeres are positionally conserved among the three strains and even between four pairs of chromosomes that were duplicated by the teleost-specific whole-genome duplication 320-350 million years ago. The centromeres do not all evolve at a similar pace; rather, centromeric monomers in non-acrocentric chromosomes evolve significantly faster than those in acrocentric chromosomes. Using methylation-sensitive SMRT reads, we find that centromeres are mostly hypermethylated but have hypomethylated sub-regions that acquire unique sequence compositions independently. These findings reveal the potential of non-acrocentric centromere evolution to contribute to speciation.


July 19, 2019

Pacific Biosciences sequencing and IMGT/HighV-QUEST analysis of full-length single chain fragment variable from an in vivo selected phage-display combinatorial Library.

Phage-display selection of immunoglobulin (IG) or antibody single chain Fragment variable (scFv) from combinatorial libraries is widely used for identifying new antibodies for novel targets. Next-generation sequencing (NGS) has recently emerged as a new method for the high throughput characterization of IG and T cell receptor (TR) immune repertoires both in vivo and in vitro. However, challenges remain for the NGS sequencing of scFv from combinatorial libraries owing to the scFv length (>800 bp) and the presence of two variable domains [variable heavy (VH) and variable light (VL) for IG] associated by a peptide linker in a single chain. Here, we show that single-molecule real-time (SMRT) sequencing with the Pacific Biosciences RS II platform allows for the generation of full-length scFv reads obtained from an in vivo selection of scFv-phages in an animal model of atherosclerosis. We first amplified the DNA of the phagemid inserts from scFv-phages eluted from an aortic section at the third round of the in vivo selection. From this amplified DNA, 450,558 reads were obtained from 15 SMRT cells. Highly accurate circular consensus sequences from these reads were generated, filtered by quality and then analyzed by IMGT/HighV-QUEST with the functionality for scFv. Full-length scFv were identified and characterized in 348,659 reads. Full-length scFv sequencing is an absolute requirement for analyzing the associated VH and VL domains enriched during the in vivo panning rounds. In order to further validate the ability of SMRT sequencing to provide high quality, full-length scFv sequences, we tracked the reads of an scFv-phage clone P3 previously identified by biological assays and Sanger sequencing. Sixty P3 reads showed 100% identity with the full-length scFv of 767 bp, 53 of them covering the whole insert of 977 bp, which encompassed the primer sequences. The remaining seven reads were identical over a shortened length of 939 bp that excludes the vicinity of primers at both ends. Interestingly, these reads were obtained from each of the 15 SMRT cells. Thus, the SMRT sequencing method and the IMGT/HighV-QUEST functionality for scFv provide a straightforward protocol for characterization of full-length scFv from combinatorial phage libraries.


July 19, 2019

Methylation in Mycobacterium tuberculosis is lineage specific with associated mutations present globally.

DNA methylation is an epigenetic modification of the genome involved in regulating crucial cellular processes, including transcription and chromosome stability. Advances in PacBio sequencing technologies can be used to robustly reveal methylation sites. The methylome of the Mycobacterium tuberculosis complex is poorly understood but may be involved in virulence, hypoxic survival and the emergence of drug resistance. In the most extensive study to date, we characterise the methylome across the 4 major lineages of M. tuberculosis and 2 lineages of M. africanum, the leading causes of tuberculosis disease in humans. We reveal lineage-specific methylated motifs and strain-specific mutations that are abundant globally and likely to explain loss of function in the respective methyltransferases. Our work provides a set of sixteen new complete reference genomes for the Mycobacterium tuberculosis complex, including complete lineage 5 genomes. Insights into lineage-specific methylomes will further elucidate underlying biological mechanisms and other important phenotypes of the epi-genome.


July 19, 2019

Cytogenomic identification and long-read single molecule real-time (SMRT) sequencing of a Bardet-Biedl Syndrome 9 (BBS9) deletion.

Bardet-Biedl syndrome (BBS) is a recessive disorder characterized by heterogeneous clinical manifestations, including truncal obesity, rod-cone dystrophy, renal anomalies, postaxial polydactyly, and variable developmental delays. At least 20 genes have been implicated in BBS, and all are involved in primary cilia function. We report a 1-year-old male child from Guyana with obesity, postaxial polydactyly on his right foot, hypotonia, ophthalmologic abnormalities, and developmental delay, which together indicated a clinical diagnosis of BBS. Clinical chromosomal microarray (CMA) testing and high-throughput BBS gene panel sequencing detected a homozygous 7p14.3 deletion of exons 1-4 of BBS9 that was encompassed by a 17.5 Mb region of homozygosity at chromosome 7p14.2-p21.1. The precise breakpoints of the deletion were delineated to a 72.8 kb region in the proband and carrier parents by third-generation long-read single molecule real-time (SMRT) sequencing (Pacific Biosciences), which suggested non-homologous end joining as a likely mechanism of formation. Long-read SMRT sequencing of the deletion breakpoints also determined that the aberration included the neighboring RP9 gene implicated in retinitis pigmentosa; however, the clinical significance of this was considered uncertain given the paucity of reported cases with unambiguous RP9 mutations. Taken together, our study characterized a BBS9 deletion, and the identification of this shared haplotype in the parents suggests that this pathogenic aberration may be a BBS founder mutation in the Guyanese population. Importantly, this informative case also highlights the utility of long-read SMRT sequencing to map nucleotide breakpoints of clinically relevant structural variants.


July 19, 2019

Sensitive detection of mitochondrial DNA variants for analysis of mitochondrial DNA-enriched extracts from frozen tumor tissue.

Large variation exists in mitochondrial DNA (mtDNA) not only between but also within individuals. Also in human cancer, tumor-specific mtDNA variation exists. In this work, we describe the comparison of four methods to extract mtDNA as pure as possible from frozen tumor tissue. Also, three state-of-the-art methods for sensitive detection of mtDNA variants were evaluated. The main aim was to develop a procedure to detect low-frequent single-nucleotide mtDNA-specific variants in frozen tumor tissue. We show that of the methods evaluated, DNA extracted from cytosol fractions following exonuclease treatment results in highest mtDNA yield and purity from frozen tumor tissue (270-fold mtDNA enrichment). Next, we demonstrate the sensitivity of detection of low-frequent single-nucleotide mtDNA variants (≤1% allele frequency) in breast cancer cell lines MDA-MB-231 and MCF-7 by single-molecule real-time (SMRT) sequencing, UltraSEEK chemistry based mass spectrometry, and digital PCR. We also show de novo detection and allelic phasing of variants by SMRT sequencing. We conclude that our sensitive procedure to detect low-frequent single-nucleotide mtDNA variants from frozen tumor tissue is based on extraction of DNA from cytosol fractions followed by exonuclease treatment to obtain high mtDNA purity, and subsequent SMRT sequencing for (de novo) detection and allelic phasing of variants.


July 19, 2019

Genomic analysis of hospital plumbing reveals diverse reservoir of bacterial plasmids conferring carbapenem resistance.

The hospital environment is a potential reservoir of bacteria with plasmids conferring carbapenem resistance. Our Hospital Epidemiology Service routinely performs extensive sampling of high-touch surfaces, sinks, and other locations in the hospital. Over a 2-year period, additional sampling was conducted at a broader range of locations, including housekeeping closets, wastewater from hospital internal pipes, and external manholes. We compared these data with previously collected information from 5 years of patient clinical and surveillance isolates. Whole-genome sequencing and analysis of 108 isolates provided comprehensive characterization of blaKPC/blaNDM-positive isolates, enabling an in-depth genetic comparison. Strikingly, despite a very low prevalence of patient infections with blaKPC-positive organisms, all samples from the intensive care unit pipe wastewater and external manholes contained carbapenemase-producing organisms (CPOs), suggesting a vast, resilient reservoir. We observed a diverse set of species and plasmids, and we noted species and susceptibility profile differences between environmental and patient populations of CPOs. However, there were plasmid backbones common to both populations, highlighting a potential environmental reservoir of mobile elements that may contribute to the spread of resistance genes. Clear associations between patient and environmental isolates were uncommon based on sequence analysis and epidemiology, suggesting reasonable infection control compliance at our institution. Nonetheless, a probable nosocomial transmission of Leclercia sp. from the housekeeping environment to a patient was detected by this extensive surveillance. These data and analyses further our understanding of CPOs in the hospital environment and are broadly relevant to the design of infection control strategies in many infrastructure settings.

IMPORTANCE: Carbapenemase-producing organisms (CPOs) are a global concern because of the morbidity and mortality associated with these resistant Gram-negative bacteria. Horizontal plasmid transfer spreads the resistance mechanism to new bacteria, and understanding the plasmid ecology of the hospital environment can assist in the design of control strategies to prevent nosocomial infections. A 5-year genomic and epidemiological survey was undertaken to study the CPOs in the patient-accessible environment, as well as in the plumbing system removed from the patient. This comprehensive survey revealed a vast, unappreciated reservoir of CPOs in wastewater, which was in contrast to the low positivity rate in both the patient population and the patient-accessible environment. While there were few patient-environmental isolate associations, there were plasmid backbones common to both populations. These results are relevant to all hospitals for which CPO colonization may not yet be defined through extensive surveillance.


July 19, 2019

A high-throughput approach for identification of nontuberculous mycobacteria in drinking water reveals relationship between water age and Mycobacterium avium.

Nontuberculous mycobacteria (NTM) frequently detected in drinking water (DW) include species associated with human infections, as well as species rarely linked to disease. Methods for improved recovery of NTM DNA and high-throughput identification of NTM are needed for risk assessment of NTM infection through DW exposure. In this study, different methods of recovering bacterial DNA from DW were compared, revealing that a phenol-chloroform DNA extraction method yielded two to four times as much total DNA and eight times as much NTM DNA as two commercial DNA extraction kits. This method, combined with high-throughput, single-molecule real-time sequencing of NTM rpoB genes, allowed the identification of NTM to the species, subspecies, and (in some cases) strain levels. This approach was applied to DW samples collected from 15 households serviced by a chloraminated distribution system, with homes located in areas representing short (<24 h) and long (>24 h) distribution system residence times. Multivariate statistical analysis revealed that greater water age (i.e., combined distribution system residence time and home plumbing stagnation time) was associated with a greater relative abundance of Mycobacterium avium subsp. avium, one of the most prevalent NTM causing infections in humans. DW from homes closer to the treatment plant (with a shorter water age) contained more diverse NTM species, including Mycobacterium abscessus and Mycobacterium chelonae. Overall, our approach allows NTM identification to the species and subspecies levels and can be used in future studies to assess the risk of waterborne infection by providing insight into the similarity between environmental and infection-associated NTM.

IMPORTANCE: An extraction method for improved recovery of DNA from nontuberculous mycobacteria (NTM), combined with single-molecule real-time sequencing (PacBio) of NTM rpoB genes, was used for high-throughput characterization of NTM species and in some cases strains in drinking water (DW). The extraction procedure recovered, on average, eight times as much NTM DNA and three times as much total DNA from DW as two widely used commercial DNA extraction kits. The combined DNA extraction and sequencing approach allowed high-throughput screening of DW samples to identify NTM, revealing that the relative abundance of Mycobacterium avium subsp. avium increased with water age. Furthermore, the two-step barcoding approach developed as part of the PacBio sequencing method makes this procedure highly adaptable, allowing it to be used for other target genes and species. Copyright © 2018 Haig et al.


July 19, 2019

Dissecting the causal mechanism of X-linked Dystonia-Parkinsonism by integrating genome and transcriptome assembly.

X-linked Dystonia-Parkinsonism (XDP) is a Mendelian neurodegenerative disease that is endemic to the Philippines and is associated with a founder haplotype. We integrated multiple genome and transcriptome assembly technologies to narrow the causal mutation to the TAF1 locus, which included a SINE-VNTR-Alu (SVA) retrotransposition into intron 32 of the gene. Transcriptome analyses identified decreased expression of the canonical cTAF1 transcript among XDP probands, and de novo assembly across multiple pluripotent stem-cell-derived neuronal lineages discovered aberrant TAF1 transcription that involved alternative splicing and intron retention (IR) in proximity to the SVA that was anti-correlated with overall TAF1 expression. CRISPR/Cas9 excision of the SVA rescued this XDP-specific transcriptional signature and normalized TAF1 expression in probands. These data suggest an SVA-mediated aberrant transcriptional mechanism associated with XDP and may provide a roadmap for layered technologies and integrated assembly-based analyses for other unsolved Mendelian disorders. Copyright © 2018 Elsevier Inc. All rights reserved.


July 19, 2019

Firefly genomes illuminate parallel origins of bioluminescence in beetles.

Fireflies and their luminous courtships have inspired centuries of scientific study. Today firefly luciferase is widely used in biotechnology, but the evolutionary origin of bioluminescence within beetles remains unclear. To shed light on this long-standing question, we sequenced the genomes of two firefly species that diverged over 100 million years ago: the North American Photinus pyralis and Japanese Aquatica lateralis. To compare bioluminescent origins, we also sequenced the genome of a related click beetle, the Caribbean Ignelater luminosus, with bioluminescent biochemistry near-identical to fireflies, but anatomically unique light organs, suggesting the intriguing hypothesis of parallel gains of bioluminescence. Our analyses support independent gains of bioluminescence in fireflies and click beetles, and provide new insights into the genes, chemical defenses, and symbionts that evolved alongside their luminous lifestyle. © 2018, Fallon et al.


July 19, 2019

Structure and distribution of centromeric retrotransposons at diploid and allotetraploid Coffea centromeric and pericentromeric regions.

Centromeric regions of plants are generally composed of large arrays of satellites from a specific lineage of Gypsy LTR-retrotransposons, called Centromeric Retrotransposons. Repeated sequences interact with a specific H3 histone, playing a crucial role in kinetochore formation. To study the structure and composition of centromeric regions in the genus Coffea, we annotated and classified Centromeric Retrotransposon sequences from the allotetraploid C. arabica genome and its two diploid ancestors: Coffea canephora and C. eugenioides. Ten distinct CRC (Centromeric Retrotransposons in Coffea) families were found. The sequence mapping and FISH experiments of CRC Reverse Transcriptase domains in C. canephora, C. eugenioides, and C. arabica clearly indicate a strong and specific targeting mainly onto proximal chromosome regions, which can be associated also with heterochromatin. PacBio genome sequence analyses of putative centromeric regions on C. arabica and C. canephora chromosomes showed an exceptional density of one family of CRC elements, and the complete absence of satellite arrays, contrasting with the usual structure of plant centromeres. Altogether, our data suggest a specific centromere organization in Coffea, contrasting with other plant genomes.


July 19, 2019

Single molecule real time sequencing in ADTKD-MUC1 allows complete assembly of the VNTR and exact positioning of causative mutations.

Recently, the Mucin-1 (MUC1) gene has been identified as a causal gene of autosomal dominant tubulointerstitial kidney disease (ADTKD). Most causative mutations are buried within a GC-rich 60 basepair variable number of tandem repeat (VNTR), which escapes identification by massive parallel sequencing methods due to the complexity of the VNTR. We established long read single molecule real time sequencing (SMRT) targeted to the MUC1-VNTR as an alternative strategy to the snapshot assay. Our approach allows complete VNTR assembly, thereby enabling the detection of all variants residing within the VNTR and simultaneous determination of VNTR length. We present high resolution data on the VNTR architecture for a cohort of snapshot positive (n = 9) and negative (n = 7) ADTKD families. By SMRT sequencing we could confirm the diagnosis in all previously tested cases, reconstruct both VNTR alleles and determine the exact position of the causative variant in eight of nine families. This study demonstrates that precise positioning of the causative mutation(s) and identification of other coding and noncoding sequence variants in ADTKD-MUC1 is feasible. SMRT sequencing could provide a powerful tool to uncover potential factors encoded within the VNTR that associate with intra- and interfamilial phenotype variability of MUC1 related kidney disease.


July 19, 2019

Resolving the complete genome of Kuenenia stuttgartiensis from a membrane bioreactor enrichment using Single-Molecule Real-Time sequencing.

Anaerobic ammonium-oxidizing (anammox) bacteria are a group of strictly anaerobic chemolithoautotrophic microorganisms. They are capable of oxidizing ammonium to nitrogen gas using nitrite as a terminal electron acceptor, thereby facilitating the release of fixed nitrogen into the atmosphere. The anammox process is thought to exert a profound impact on the global nitrogen cycle and has been harnessed as an environment-friendly method for nitrogen removal from wastewater. In this study, we present the first closed genome sequence of an anammox bacterium, Kuenenia stuttgartiensis MBR1. It was obtained through Single-Molecule Real-Time (SMRT) sequencing of an enrichment culture constituting a mixture of at least two highly similar Kuenenia strains. The genome of the novel MBR1 strain is different from the previously reported Kuenenia KUST reference genome as it contains numerous structural variations and unique genomic regions. We find new proteins, such as a type 3b (sulf)hydrogenase and an additional copy of the hydrazine synthase gene cluster. Moreover, multiple copies of ammonium transporters and proteins regulating nitrogen uptake were identified, suggesting functional differences in metabolism. This assembly, including the genome-wide methylation profile, provides a new foundation for comparative and functional studies aiming to elucidate the biochemical and metabolic processes of these organisms.


July 19, 2019

Coupling of single molecule, long read sequencing with IMGT/HighV-QUEST analysis expedites identification of SIV gp140-specific antibodies from scFv phage display libraries.

The simian immunodeficiency virus (SIV)/macaque model of human immunodeficiency virus (HIV)/acquired immunodeficiency syndrome pathogenesis is critical for furthering our understanding of the role of antibody responses in the prevention of HIV infection, and will only increase in importance as macaque immunoglobulin (IG) gene databases are expanded. We have previously reported the construction of a phage display library from a SIV-infected rhesus macaque (Macaca mulatta) using oligonucleotide primers based on human IG gene sequences. Our previous screening relied on Sanger sequencing, which was inefficient and generated only a few dozen sequences. Here, we re-analyzed this library using single molecule, real-time (SMRT) sequencing on the Pacific Biosciences (PacBio) platform to generate thousands of highly accurate circular consensus sequencing (CCS) reads corresponding to full length single chain fragment variable. CCS data were then analyzed through the international ImMunoGeneTics information system® (IMGT®)/HighV-QUEST (www.imgt.org) to identify variable genes and perform statistical analyses. Overall the library was very diverse, with 2,569 different IMGT clonotypes called for the 5,238 IGHV sequences assigned to an IMGT clonotype. Within the library, SIV-specific antibodies represented a relatively limited number of clones, with only 135 different IMGT clonotypes called from 4,594 IGHV-assigned sequences. Our data did confirm that the IGHV4 and IGHV3 gene usage was the most abundant within the rhesus antibodies screened, and that these genes were even more enriched among SIV gp140-specific antibodies. Although a broad range of VH CDR3 amino acid (AA) lengths was observed in the unpanned library, the vast majority of SIV gp140-specific antibodies demonstrated a more uniform VH CDR3 length (20 AA). This uniformity was far less apparent when VH CDR3 were classified according to their clonotype (range: 9-25 AA), which we believe is more relevant for specific antibody identification. Only 174 IGKV and 588 IGLV clonotypes were identified within the VL sequences associated with SIV gp140-specific VH. Together, these data strongly suggest that the combination of SMRT sequencing with the IMGT/HighV-QUEST querying tool will facilitate and expedite our understanding of polyclonal antibody responses during SIV infection and may serve to rapidly expand the known scope of macaque V genes utilized during these responses.


July 19, 2019

Biomonitoring for traditional herbal medicinal products using DNA metabarcoding and single molecule, real-time sequencing.

Global concern has been raised about the potential hazards of traditional herbal medicinal products (THMPs). Substandard and counterfeit THMPs, including traditional Chinese patent medicines, health foods, and dietary supplements, are potential threats to public health. Recent marketplace studies using DNA barcoding have determined that the current quality control methods are not sufficient for ensuring the presence of authentic herbal ingredients and the detection of contaminants/adulterants. An efficient biomonitoring method for THMPs is greatly needed. Herein, metabarcoding and single-molecule, real-time (SMRT) sequencing were used to detect the multiple ingredients in Jiuwei Qianghuo Wan (JWQHW), a classical herbal prescription widely used in China for the last 800 years. Reference experimental mixtures and commercial JWQHW products from the marketplace were used to confirm the method. Successful SMRT sequencing results recovered 5416 and 4342 circular-consensus sequencing (CCS) reads belonging to the ITS2 and psbA-trnH regions. The results suggest that the combination of metabarcoding and SMRT sequencing is repeatable, reliable, and sensitive enough to detect species in THMPs, and that the error in SMRT sequencing did not affect the ability to identify multiple prescribed species and several adulterants/contaminants. The approach has the potential to become a valuable tool for the biomonitoring of multi-ingredient THMPs.

