July 19, 2019

Multiplexed highly-accurate DNA sequencing of closely-related HIV-1 variants using continuous long reads from single molecule, real-time sequencing.

Single Molecule, Real-Time (SMRT®) Sequencing (Pacific Biosciences, Menlo Park, CA, USA) provides the longest continuous DNA sequencing reads currently available. However, the relatively high error rate in the raw read data requires novel analysis methods to deconvolute sequences derived from complex samples. Here, we present a workflow of novel computer algorithms able to reconstruct viral variant genomes present in mixtures with an accuracy of >QV50. This approach relies exclusively on Continuous Long Reads (CLR), which are the raw reads generated during SMRT Sequencing. We successfully implement this workflow for simultaneous sequencing of mixtures containing up to forty different >9 kb HIV-1 full genomes. This was achieved using a single SMRT Cell for each mixture and desktop computing power. This novel approach opens the possibility of solving complex sequencing tasks that currently lack a solution. © The Author(s) 2015. Published by Oxford University Press on behalf of Nucleic Acids Research.
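As a reading aid for the ">QV50" accuracy figure, the sketch below converts a quality value to a per-base error probability. It assumes the abstract uses the standard Phred scale, QV = -10 * log10(p); the function names are illustrative, and the 9,000-base genome length is a round number taken from the ">9 kb" in the text, not a measured value.

```python
import math


def qv_to_error_prob(qv: float) -> float:
    """Per-base error probability for a Phred-scaled quality value."""
    return 10 ** (-qv / 10)


def error_prob_to_qv(p: float) -> float:
    """Phred-scaled quality value for a per-base error probability."""
    return -10 * math.log10(p)


# QV50 corresponds to one expected error per 100,000 bases.
p = qv_to_error_prob(50)

# Assumed genome length of 9,000 bases (illustrative, from ">9 kb").
expected_errors = p * 9_000

print(f"error probability at QV50: {p:.0e}")
print(f"expected errors in a 9 kb genome: {expected_errors:.2f}")
```

Under these assumptions, a consensus at QV50 is expected to contain well under one error per reconstructed 9 kb genome.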


July 19, 2019

Assembly and diploid architecture of an individual human genome via single-molecule technologies.

We present the first comprehensive analysis of a diploid human genome that combines single-molecule sequencing with single-molecule genome maps. Our hybrid assembly markedly improves upon the contiguity observed from traditional shotgun sequencing approaches, with scaffold N50 values approaching 30 Mb, and we identified complex structural variants (SVs) missed by other high-throughput approaches. Furthermore, by combining Illumina short-read data with long reads, we phased both single-nucleotide variants and SVs, generating haplotypes with over 99% consistency with previous trio-based studies. Our work shows that it is now possible to integrate single-molecule and high-throughput sequence data to generate de novo assembled genomes that approach reference quality.
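The scaffold N50 quoted above is a contiguity statistic. The following is a minimal sketch of the commonly used definition (the largest length L such that sequences of length L or longer contain at least half of the assembled bases); the function name and the toy lengths are hypothetical and are not taken from the paper.

```python
def n50(lengths):
    """Return the N50 of a collection of contig or scaffold lengths."""
    if not lengths:
        raise ValueError("lengths must not be empty")
    total = sum(lengths)
    running = 0
    # Walk from the longest sequence down until half the bases are covered.
    for length in sorted(lengths, reverse=True):
        running += length
        if running * 2 >= total:
            return length


# Toy example with made-up lengths (arbitrary units).
toy_lengths = [2, 3, 4, 5, 6, 7, 8, 9, 10]
print(n50(toy_lengths))
```

Because the statistic depends only on the sorted lengths, a few very long scaffolds can raise N50 substantially even when many short fragments remain.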


July 19, 2019

Microplitis demolitor bracovirus proviral loci and clustered replication genes exhibit distinct DNA amplification patterns during replication.

Polydnaviruses are large, double-stranded DNA viruses that are beneficial symbionts of parasitoid wasps. Polydnaviruses in the genus Bracovirus (BVs) persist in wasps as proviruses, and their genomes consist of two functional components referred to as proviral segments and nudivirus-like genes. Prior studies established that the DNA domains where proviral segments reside are amplified during replication and that segments within amplified loci are circularized before packaging into nucleocapsids. One DNA domain where nudivirus-like genes are located is also amplified but never packaged into virions. We recently sequenced the genome of the braconid Microplitis demolitor, which carries M. demolitor bracovirus (MdBV). Here, we took advantage of this resource to characterize the DNAs that are amplified during MdBV replication using a combination of Illumina and Pacific Biosciences sequencing approaches. The results showed that specific nucleotide sites identify the boundaries of amplification for proviral loci. Surprisingly, however, amplification of loci 3, 4, 6, and 8 produced head-to-tail concatemeric intermediates; loci 1, 2, and 5 produced head-to-head/tail-to-tail concatemers; and locus 7 yielded no identified concatemers. Sequence differences at amplification junctions correlated with the types of amplification intermediates the loci produced, while concatemer processing gave rise to the circularized DNAs that are packaged into nucleocapsids. The MdBV nudivirus-like gene cluster was also amplified, albeit more weakly than most proviral loci and with nondiscrete boundaries. Overall, the MdBV genome exhibited three patterns of DNA amplification during replication. Our data also suggest that PacBio sequencing could be useful in studying the replication intermediates produced by other DNA viruses. Polydnaviruses are of fundamental interest because they provide a novel example of viruses evolving into beneficial symbionts. 
All polydnaviruses are associated with insects called parasitoid wasps, which are of additional applied interest because many are biological control agents of pest insects. Polydnaviruses in the genus Bracovirus (BVs) evolved ~100 million years ago from an ancestor related to the baculovirus-nudivirus lineage but have also established many novelties due to their symbiotic lifestyle. These include the fact that BVs are transmitted only vertically as proviruses and produce replication-defective virions that package only a portion of the viral genome. Here, we studied Microplitis demolitor bracovirus (MdBV) and report that its genome exhibits three distinct patterns of DNA amplification during replication. We also identify several previously unknown features of BV genomes that correlate with these different amplification patterns. Copyright © 2015, American Society for Microbiology. All Rights Reserved.


July 19, 2019

Novel katG mutations causing isoniazid resistance in clinical M. tuberculosis isolates.

We report the discovery and confirmation of 23 novel mutations, with a previously undocumented role in isoniazid (INH) drug resistance, in the catalase-peroxidase (katG) gene of Mycobacterium tuberculosis (Mtb) isolates. With these mutations, a synonymous mutation in fabG1 (g609a), and two canonical mutations, we were able to explain 98% of the phenotypic resistance observed in 366 clinical Mtb isolates collected from four high tuberculosis (TB)-burden countries: India, Moldova, Philippines, and South Africa. We conducted overlapping targeted and whole-genome sequencing for variant discovery in all clinical isolates with a variety of INH-resistant phenotypes. Our analysis showed that just two canonical mutations (katG 315AGC-ACC and inhA promoter-15C-T) identified 89.5% of resistance phenotypes in our collection. Inclusion of the 23 novel mutations reported here, and the previously documented point mutation in fabG1, increased the sensitivity of these mutations as markers of INH resistance to 98%. Only six (2%) of the 332 resistant isolates in our collection did not harbor one or more of these mutations. The third most prevalent substitution, at inhA promoter position -8, present in 39 resistant isolates, was of no diagnostic significance since it always co-occurred with katG 315. Of our isolates harboring novel mutations, 79% belong to genetic group 1, indicating a higher tendency for this group to go down an uncommon evolutionary path and evade molecular diagnostics. The results of this study contribute to our understanding of the mechanisms of INH resistance in Mtb isolates that lack the canonical mutations and could improve the sensitivity of next generation molecular diagnostics.
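The 98% sensitivity figure follows directly from the counts given in the abstract. The short sketch below reproduces that arithmetic; only the two counts (332 resistant isolates, six without any of the listed mutations) come from the text, and the variable names are illustrative.

```python
# Counts stated in the abstract.
resistant_total = 332        # phenotypically INH-resistant isolates
resistant_unexplained = 6    # resistant isolates with none of the mutations

# Sensitivity: fraction of resistant isolates carrying at least one marker.
resistant_explained = resistant_total - resistant_unexplained
sensitivity = resistant_explained / resistant_total
missed_fraction = resistant_unexplained / resistant_total

print(f"sensitivity: {sensitivity:.1%}")
print(f"missed: {missed_fraction:.1%}")
```

This covers sensitivity only; specificity would additionally require the mutation counts among the susceptible isolates, which the abstract does not report.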


July 19, 2019

SMRT Sequencing of long tandem nucleotide repeats in SCA10 reveals unique insight of repeat expansion structure.

A large, non-coding ATTCT repeat expansion causes the neurodegenerative disorder, spinocerebellar ataxia type 10 (SCA10). In a subset of SCA10 patients, interruption motifs are present at the 5′ end of the expansion and strongly correlate with epileptic seizures. Thus, interruption motifs are a predictor of the epileptic phenotype and are hypothesized to act as a phenotypic modifier in SCA10. Yet, the exact internal sequence structure of SCA10 expansions remains unknown due to limitations in current technologies for sequencing across long extended tracts of tandem nucleotide repeats. We used the third generation sequencing technology, Single Molecule Real Time (SMRT) sequencing, to obtain full-length contiguous expansion sequences, ranging from 2.5 to 4.4 kb in length, from three SCA10 patients with different clinical presentations. We obtained sequence spanning the entire length of the expansion and identified the structure of known and novel interruption motifs within the SCA10 expansion. The exact interruption patterns in expanded SCA10 alleles will allow us to further investigate the potential contributions of these interrupting sequences to the pathogenic modification leading to the epilepsy phenotype in SCA10. Our results also demonstrate that SMRT sequencing is useful for deciphering long tandem repeats that pose as “gaps” in the human genome sequence.


July 19, 2019

The impact of next-generation sequencing technologies on HLA research.

In the past decade, the development of next-generation sequencing (NGS) has paved the way for whole-genome analysis in individuals. Research on the human leukocyte antigen (HLA), an extensively studied molecule involved in immunity, has benefitted from NGS technologies. The HLA region, a 3.6-Mb segment of the human genome at 6p21, has been associated with more than 100 different diseases, primarily autoimmune diseases. Recently, the HLA region has received much attention because severe adverse effects of various drugs are associated with particular HLA alleles. Owing to the complex nature of the HLA genes, classical direct sequencing methods cannot comprehensively elucidate the genomic makeup of HLA genes. Thus far, several high-throughput HLA-typing methods using NGS have been developed. In HLA research, NGS facilitates complete HLA sequencing and is expected to improve our understanding of the mechanisms through which HLA genes are modulated, including transcription, regulation of gene expression and epigenetics. Most importantly, NGS may also permit the analysis of HLA-omics. In this review, we summarize the impact of NGS on HLA research, with a focus on the potential for clinical applications.


July 19, 2019

Selections that isolate recombinant mitochondrial genomes in animals.

Homologous recombination is widespread and catalyzes evolution. Nonetheless, its existence in animal mitochondrial DNA is questioned. We designed selections for recombination between co-resident mitochondrial genomes in various heteroplasmic Drosophila lines. In four experimental settings, recombinant genomes became the sole or dominant genome in the progeny. Thus, selection uncovers the occurrence of homologous recombination in Drosophila mtDNA and documents its functional benefit. Double-strand breaks enhanced recombination in the germ line and revealed somatic recombination. When the recombination partner was a diverged D. melanogaster genome or a genome from a different species such as D. yakuba, sequencing revealed long continuous stretches of exchange. In addition, the distribution of sequence polymorphisms in recombinants allowed us to map a selected trait to a particular region in the Drosophila mitochondrial genome. Thus, recombination can be harnessed to dissect the function and evolution of the mitochondrial genome.


July 19, 2019

Emergence of ebola virus escape variants in infected nonhuman primates treated with the MB-003 antibody cocktail.

MB-003, a plant-derived monoclonal antibody cocktail used effectively in treatment of Ebola virus infection in non-human primates, was unable to protect two of six animals when initiated 1 or 2 days post-infection. We characterized a mechanism of viral escape in one of the animals, after observation of two clusters of genomic mutations that resulted in five nonsynonymous mutations in the monoclonal antibody target sites. These mutations were linked to a reduction in antibody binding and later confirmed to be present in a viral isolate that was not neutralized in vitro. Retrospective evaluation of a second independent study allowed the identification of a similar case. Four SNPs in previously identified positions were found in this second fatality, suggesting that genetic drift could be a potential cause for treatment failure. These findings highlight the importance of selecting different target domains for each component of the cocktail to minimize the potential for viral escape. Copyright © 2015 The Authors. Published by Elsevier Inc. All rights reserved.


July 19, 2019

Mind the gap; seven reasons to close fragmented genome assemblies.

Like other domains of life, research into the biology of filamentous microbes has greatly benefited from the advent of whole-genome sequencing. Next-generation sequencing (NGS) technologies have revolutionized sequencing, making genomic sciences accessible to many academic laboratories including those that study non-model organisms. Thus, hundreds of fungal genomes have been sequenced and are publicly available today, although these initiatives have typically yielded considerably fragmented genome assemblies that often lack large contiguous genomic regions. Many important genomic features are contained in intergenic DNA that is often missing in current genome assemblies, and recent studies underscore the significance of non-coding regions and repetitive elements for the life style, adaptability and evolution of many organisms. The study of particular types of genetic elements, such as telomeres, centromeres, repetitive elements, effectors, and clusters of co-regulated genes, but also of phenomena such as structural rearrangements, genome compartmentalization and epigenetics, greatly benefits from having a contiguous and high-quality, preferably even complete and gapless, genome assembly. Here we discuss a number of important reasons to produce gapless, finished, genome assemblies to help answer important biological questions. Copyright © 2015 Elsevier Inc. All rights reserved.


July 19, 2019

Genomic epidemiology of hypervirulent serogroup W, ST-11 Neisseria meningitidis

Neisseria meningitidis is a leading bacterial cause of sepsis and meningitis globally with dynamic strain distribution over time. Beginning with an epidemic among Hajj pilgrims in 2000, serogroup W (W) sequence type (ST) 11 emerged as a leading cause of epidemic meningitis in the African ‘meningitis belt’ and endemic cases in South America, Europe, Middle East and China. Previous genotyping studies were unable to reliably discriminate sporadic W ST-11 strains in circulation since 1970 from the Hajj outbreak strain (Hajj clone). It is also unclear what proportion of more recent W ST-11 disease clusters are caused by direct descendants of the Hajj clone. Whole genome sequences of 270 meningococcal strains isolated from patients with invasive meningococcal disease globally from 1970 to 2013 were compared using whole genome phylogenetic and major antigen-encoding gene sequence analyses. We found that all W ST-11 strains were descendants of an ancestral strain that had undergone unique capsular switching events. The Hajj clone and its descendants were distinct from other W ST-11 strains in that they shared a common antigen gene profile and had undergone recombination involving virulence genes encoding factor H binding protein, nitric oxide reductase, and nitrite reductase. These data demonstrate that recent acquisition of a distinct antigen-encoding gene profile and variations in meningococcal virulence genes was associated with the emergence of the Hajj clone. Importantly, W ST-11 strains unrelated to the Hajj outbreak contribute a significant proportion of W ST-11 cases globally. This study helps illuminate genomic factors associated with meningococcal strain emergence and evolution.


July 19, 2019

Stepwise evolution of pandrug-resistance in Klebsiella pneumoniae.

Carbapenem resistant Enterobacteriaceae (CRE) pose an urgent risk to global human health. CRE that are non-susceptible to all commercially available antibiotics threaten to return us to the pre-antibiotic era. Using Single Molecule Real Time (SMRT) sequencing we determined the complete genome of a pandrug-resistant Klebsiella pneumoniae isolate, representing the first complete genome sequence of CRE resistant to all commercially available antibiotics. The precise location of acquired antibiotic resistance elements, including mobile elements carrying genes for the OXA-181 carbapenemase, were defined. Intriguingly, we identified three chromosomal copies of an ISEcp1-blaOXA-181 mobile element, one of which has disrupted the mgrB regulatory gene, accounting for resistance to colistin. Our findings provide the first description of pandrug-resistant CRE at the genomic level, and reveal the critical role of mobile resistance elements in accelerating the emergence of resistance to other last resort antibiotics.


July 19, 2019

Variable genetic architectures produce virtually identical molecules in bacterial symbionts of fungus-growing ants.

Small molecules produced by Actinobacteria have played a prominent role in both drug discovery and organic chemistry. As part of a larger study of the actinobacterial symbionts of fungus-growing ants, we discovered a small family of three previously unreported piperazic acid-containing cyclic depsipeptides, gerumycins A-C. The gerumycins are slightly smaller versions of dentigerumycin, a cyclic depsipeptide that selectively inhibits a common fungal pathogen, Escovopsis. We had previously identified this molecule from a Pseudonocardia associated with Apterostigma dentigerum, and now we report the molecule from an associate of the more highly derived ant Trachymyrmex cornetzi. The three previously unidentified compounds, gerumycins A-C, have essentially identical structures and were produced by two different symbiotic Pseudonocardia spp. from ants in the genus Apterostigma found in both Panama and Costa Rica. To understand the similarities and differences in the biosynthetic pathways that produced these closely related molecules, the genomes of the three producing Pseudonocardia were sequenced and the biosynthetic gene clusters identified. This analysis revealed that dramatically different biosynthetic architectures, including genomic islands, a plasmid, and the use of spatially separated genetic loci, can lead to molecules with virtually identical core structures. A plausible evolutionary model that unifies these disparate architectures is presented.


July 19, 2019

Highly sensitive, non-invasive detection of colorectal cancer mutations using single molecule, third generation sequencing.

Colorectal cancer (CRC) represents one of the most prevalent and lethal malignant neoplasms and every individual of age 50 and above should undergo regular CRC screening. Currently, the most effective preventive screening procedure to detect adenomatous polyps, the precursors to CRC, is colonoscopy. Since every colorectal cancer starts as a polyp, detecting all polyps and removing them is crucial. By doing exactly that, colonoscopy reduces CRC incidence by 80%; however, it is an invasive procedure that might have unpleasant and, on rare occasions, dangerous side effects. Despite numerous efforts over the past two decades, a non-invasive screening method for the general population with detection rates for adenomas and CRC similar to that of colonoscopy has not yet been established. Recent advances in next generation sequencing technologies have yet to be successfully applied to this problem, because the detection of rare mutations has been hindered by the systematic biases due to sequencing context and the base calling quality of NGS. We present the first study that applies the high read accuracy and depth of single molecule, real time, circular consensus sequencing (SMRT-CCS) to the detection of mutations in stool DNA in order to provide a non-invasive, sensitive and accurate test for CRC. In stool DNA isolated from patients diagnosed with adenocarcinoma, we are able to detect mutations at frequencies below 0.5% with no false positives. This approach establishes a foundation for a non-invasive, highly sensitive assay to screen the population for CRC and the early stage adenomas that lead to CRC.


July 19, 2019

Pangenome analysis of Bifidobacterium longum and site-directed mutagenesis through by-pass of restriction-modification systems.

Bifidobacterial genome analysis has provided insights as to how these gut commensals adapt to and persist in the human GIT, while also revealing genetic diversity among members of a given bifidobacterial (sub)species. Bifidobacteria are notoriously recalcitrant to genetic modification, which prevents exploration of their genomic functions, including those that convey (human) health benefits. PacBio SMRT sequencing was used to determine the whole genome sequences of two B. longum subsp. longum strains. The B. longum pan-genome was computed using PGAP v1.2 and the core B. longum phylogenetic tree was constructed using a maximum-likelihood based approach in PhyML v3.0. M.blmNCII was cloned in E. coli and an internal fragment of arfBarfB was cloned into pORI19 for insertion mutagenesis. In this study we present the complete genome sequences of two Bifidobacterium longum subsp. longum strains. Comparative analysis with thirty one publicly available B. longum genomes allowed the definition of the B. longum core and dispensable genomes. This analysis also highlighted differences in particular metabolic abilities between members of the B. longum subspecies infantis, longum and suis. Furthermore, phylogenetic analysis of the B. longum core genome indicated the existence of a novel subspecies. Methylome data, coupled to the analysis of restriction-modification systems, allowed us to substantially increase the genetic accessibility of B. longum subsp. longum NCIMB 8809 to a level that was shown to permit site-directed mutagenesis. Comparative genomic analysis of thirty three B. longum representatives revealed a closed pan-genome for this bifidobacterial species. Phylogenetic analysis of the B. longum core genome also provides evidence for a novel fifth B. longum subspecies. Finally, we improved genetic accessibility for the strain B. longum subsp. longum NCIMB 8809, which allowed the generation of a mutant of this strain.


July 19, 2019

Single-molecule sequencing of the desiccation-tolerant grass Oropetium thomaeum.

Plant genomes, and eukaryotic genomes in general, are typically repetitive, polyploid and heterozygous, which complicates genome assembly. The short read lengths of early Sanger and current next-generation sequencing platforms hinder assembly through complex repeat regions, and many draft and reference genomes are fragmented, lacking skewed GC and repetitive intergenic sequences, which are gaining importance due to projects like the Encyclopedia of DNA Elements (ENCODE). Here we report the whole-genome sequencing and assembly of the desiccation-tolerant grass Oropetium thomaeum. Using only single-molecule real-time sequencing, which generates long (>16 kilobases) reads with random errors, we assembled 99% (244 megabases) of the Oropetium genome into 625 contigs with an N50 length of 2.4 megabases. Oropetium is an example of a ‘near-complete’ draft genome which includes gapless coverage over gene space as well as intergenic sequences such as centromeres, telomeres, transposable elements and rRNA clusters that are typically unassembled in draft genomes. Oropetium has 28,466 protein-coding genes and 43% repeat sequences, yet with 30% more compact euchromatic regions it is the smallest known grass genome. The Oropetium genome demonstrates the utility of single-molecule real-time sequencing for assembling high-quality plant and other eukaryotic genomes, and serves as a valuable resource for the plant comparative genomics community.

