Next-generation sequencing (NGS) technologies enable new insights into the diversity of virus populations within their hosts. Diversity estimation is currently restricted to single-nucleotide variants or to local fragments of no more than a few hundred nucleotides defined by the length of sequence reads. To study complex heterogeneous virus populations comprehensively, novel methods are required that allow for complete reconstruction of the individual viral haplotypes. Here, we show that assembly of whole viral genomes of ~8600 nucleotides in length is feasible from mixtures of heterogeneous HIV-1 strains derived from defined combinations of cloned virus strains and from clinical samples of an HIV-1 superinfected individual. Haplotype reconstruction was achieved using optimized experimental protocols and computational methods for amplification, sequencing and assembly. We comparatively assessed the performance of the three NGS platforms 454 Life Sciences/Roche, Illumina and Pacific Biosciences for this task. Our results prove and delineate the feasibility of NGS-based full-length viral haplotype reconstruction and provide new tools for studying evolution and pathogenesis of viruses. © The Author(s) 2014. Published by Oxford University Press on behalf of Nucleic Acids Research.
Epidemiological studies have suggested that consumption of beef may correlate with an increased risk of colorectal cancer. One hypothesis to explain this proposed link might be the presence of a carcinogenic infectious agent capable of withstanding cooking. Polyomaviruses are a ubiquitous family of thermostable non-enveloped DNA viruses that are known to be carcinogenic. Using virion enrichment, rolling circle amplification (RCA) and next-generation sequencing, we searched for polyomaviruses in meat samples purchased from several supermarkets. Ground beef samples were found to contain three polyomavirus species. One species, bovine polyomavirus 1 (BoPyV1), was originally discovered as a contaminant in laboratory FCS. A previously unknown species, BoPyV2, occupies the same clade as human Merkel cell polyomavirus and raccoon polyomavirus, both of which are carcinogenic in their native hosts. A third species, BoPyV3, is related to human polyomaviruses 6 and 7. Examples of additional DNA virus families, including herpesviruses, adenoviruses, circoviruses and gyroviruses were also detected either in ground beef samples or in comparison samples of ground pork and ground chicken. The results suggest that the virion enrichment/RCA approach is suitable for random detection of essentially any DNA virus with a detergent-stable capsid. It will be important for future studies to address the possibility that animal viruses commonly found in food might be associated with disease.
Towards better precision medicine: PacBio single-molecule long reads resolve the interpretation of HIV drug resistant mutation profiles at explicit quasispecies (haplotype) level.
Development of HIV-1 drug resistance mutations (HDRMs) is one of the major reasons for the clinical failure of antiretroviral therapy. Treatment success rates can be improved by applying personalized anti-HIV regimens based on a patient’s HDRM profile. However, the sensitivity and specificity of the HDRM profile are limited by the methods used for detection. Sanger-based sequencing technology has traditionally been used for determining HDRM profiles at the single nucleotide variant (SNV) level, but it detects only variants present at a frequency of ≥ 20% in a patient’s HIV population. Next Generation Sequencing (NGS) technologies offer greater detection sensitivity (~1%) and larger scope (hundreds of samples per run). However, NGS technologies produce reads that are too short to enable the detection of the physical linkages of individual SNVs across the haplotype of each HIV strain present. In this article, we demonstrate that the single-molecule long reads generated using the Third Generation Sequencer (TGS), PacBio RS II, along with the appropriate bioinformatics analysis method, can resolve the HDRM profile at a more advanced quasispecies level. The case studies on patients’ HIV samples showed that the quasispecies view produced using the PacBio method offered greater detection sensitivity and a more comprehensive understanding of HDRM situations, complementing both Sanger and NGS technologies. In conclusion, the PacBio method, providing a promising new quasispecies level of HDRM profiling, may effect an important change in the field of HIV drug resistance research.
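The idea of physical linkage can be illustrated with a minimal sketch. It assumes the long reads are already aligned to a common coordinate system with no insertions or deletions. The function name `haplotype_frequencies`, the toy reads and the chosen positions are hypothetical illustrations, not the bioinformatics method used in the study. Because each read spans every position of interest, the combination of bases it carries can be counted directly as one haplotype.

```python
from collections import Counter


def haplotype_frequencies(reads, positions):
    """Return the relative frequency of each base combination seen at `positions`.

    Assumption: reads are pre-aligned strings sharing one coordinate system.
    """
    counts = Counter()
    for read in reads:
        # Skip reads too short to cover every position of interest.
        if len(read) <= max(positions):
            continue
        counts[tuple(read[p] for p in positions)] += 1
    total = sum(counts.values())
    return {hap: n / total for hap, n in counts.items()} if total else {}


# Toy data (hypothetical): five full-length reads and one truncated read.
reads = ["ACGTAC", "ACGTAC", "ATGTAA", "ATGTAA", "ATGTAC", "ACG"]
freqs = haplotype_frequencies(reads, positions=[1, 5])
for haplotype, frequency in sorted(freqs.items()):
    print(haplotype, round(frequency, 2))
```

Short reads covering only one of the two positions would give the per-site base frequencies but not which bases occur together on the same genome, which is the limitation the abstract describes.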
Rapidly evolving RNA viruses prevail within a host as a collection of closely related variants, referred to as viral quasispecies. Advances in high-throughput sequencing (HTS) technologies have facilitated the assessment of the genetic diversity of such virus populations at an unprecedented level of detail. However, analysis of HTS data from virus populations is challenging due to short, error-prone reads. In order to account for uncertainties originating from these limitations, several computational and statistical methods have been developed for studying the genetic heterogeneity of virus populations. Here, we review methods for the analysis of HTS reads, including approaches to local diversity estimation and global haplotype reconstruction. Challenges posed by aligning reads, as well as the impact of reference biases on diversity estimates, are also discussed. In addition, we address some of the experimental approaches designed to improve the biological signal-to-noise ratio. In the future, computational methods for the analysis of heterogeneous virus populations are likely to continue being complemented by technological developments. Copyright © 2016 The Authors. Published by Elsevier B.V. All rights reserved.
The arthropod-borne Zika virus (ZIKV) is currently causing a major international public health threat in the Americas. This study describes the isolation of ZIKV from the plasma of a 29-year-old female traveler who developed typical symptoms, such as rash, fever and headache, upon return from Suriname. The complete genome sequence including the 5′ and 3′ untranslated regions was determined and phylogenetic analysis showed the isolate clustering within the Asian lineage, close to other viruses that have recently been isolated in the Americas. In addition, the viral quasispecies composition was analyzed by single molecule real time sequencing, which suggested a mutation frequency of 1.4 × 10⁻⁴ for this ZIKV isolate. Continued passaging of the virus in cell culture led to the selection of variants with mutations in NS1 and the E protein. The latter might influence virus binding to cell surface heparan sulfate.
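A mutation frequency of this kind is commonly computed as the number of observed mismatches divided by the total number of nucleotides sequenced. The sketch below assumes that definition and assumes gap-free reads aligned to the full reference. The abstract does not state how the reported value was derived, and the function name and toy sequences are hypothetical.

```python
def mutation_frequency(reference, reads):
    """Mismatches per sequenced nucleotide across all reads.

    Assumption: every read is aligned to the start of `reference` without gaps.
    """
    mismatches = 0
    total = 0
    for read in reads:
        for ref_base, read_base in zip(reference, read):
            total += 1
            mismatches += ref_base != read_base
    return mismatches / total if total else 0.0


# Toy data (hypothetical): one mismatch among 40 sequenced bases.
reference = "ACGTACGTAC"
reads = ["ACGTACGTAC", "ACGTACGTAC", "ACGTTCGTAC", "ACGTACGTAC"]
freq = mutation_frequency(reference, reads)
print(freq)
```

In practice, sequencing errors would have to be filtered out before counting, since they inflate the estimate.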
A comparative study on the characterization of hepatitis B virus quasispecies by clone-based sequencing and third-generation sequencing.
Hepatitis B virus (HBV) has a high mutation rate due to its extremely high replication rate and the lack of proofreading during reverse transcription. The resulting genetically heterogeneous variants are described as viral quasispecies (QS). Clone-based sequencing (CBS) is thought to be the ‘gold standard’ for assessing QS complexity and diversity of HBV, but CBS is laborious and not cost-effective. In this study, we investigated the utility of third-generation sequencing (TGS) for characterizing the genetic heterogeneity of HBV QS and assessed the possible contribution of TGS technology to HBV QS studies. Parallel experiments including 3 control samples, which consisted of full-length HBV genotype B and genotype C plasmids, and 10 patient samples were performed using CBS and TGS to analyze HBV whole-genome QS. Characterization of QS heterogeneity was conducted using comprehensive statistical analysis. The results showed that TGS was highly consistent with CBS when measuring the complexity and diversity of QS. In addition, TGS conferred strong advantages for detecting rare variants. In summary, TGS was considered to be practicable in HBV QS studies and it might have a relevant role in the clinical management of HBV infection in the future.
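The abstract does not name the statistics used for complexity and diversity. As a hedged illustration, the sketch below uses two measures often applied to quasispecies data: normalized Shannon entropy of haplotype counts (normalized here by the logarithm of the number of sequences, which is an assumption) and mean pairwise Hamming distance per site. The function names and the toy clone sequences are hypothetical.

```python
import math
from collections import Counter
from itertools import combinations


def normalized_shannon_entropy(seqs):
    """Complexity: Shannon entropy of haplotype frequencies divided by ln(N)."""
    total = len(seqs)
    if total <= 1:
        return 0.0
    counts = Counter(seqs).values()
    entropy = -sum((c / total) * math.log(c / total) for c in counts)
    return entropy / math.log(total)


def mean_pairwise_distance(seqs):
    """Diversity: mean per-site Hamming distance over all sequence pairs.

    Assumption: all sequences are aligned and have equal length.
    """
    pairs = list(combinations(seqs, 2))
    if not pairs:
        return 0.0
    distances = [sum(a != b for a, b in zip(s, t)) / len(s) for s, t in pairs]
    return sum(distances) / len(distances)


# Toy data (hypothetical): four aligned clone sequences, three distinct haplotypes.
clones = ["ACGT", "ACGT", "ACGA", "TCGA"]
complexity = normalized_shannon_entropy(clones)
diversity = mean_pairwise_distance(clones)
print(round(complexity, 3), round(diversity, 3))
```

Applying the same two functions to CBS clones and to TGS reads from one sample would be one simple way to compare the two methods, although the study's own analysis may differ.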
The hepatitis viruses represent a major public health problem worldwide. Procedures for characterization of the genomic composition of their populations, accurate diagnosis, identification of multiple infections, and information on inhibitor-escape mutants for treatment decisions are needed. Deep sequencing methodologies are extremely useful for these viruses since they replicate as complex and dynamic quasispecies swarms whose complexity and mutant composition are biologically relevant traits. Population complexity is a major challenge for disease prevention and control, but also an opportunity to distinguish among related but phenotypically distinct variants that might anticipate disease progression and treatment outcome. Detailed characterization of mutant spectra should permit choosing better treatment options, given the increasing number of new antiviral inhibitors available. In the present review we briefly summarize our experience with the use of deep sequencing for the management of hepatitis virus infections, particularly for hepatitis B and C viruses, and outline some possible new applications of deep sequencing for these important human pathogens. Copyright © 2016 Elsevier B.V. All rights reserved.
The quasispecies model is ubiquitous in the study of viruses. While it has led to a number of insights that have stood the test of time, the quasispecies model has mostly been discussed in a theoretical fashion with little support from data. With next-generation sequencing (NGS), this situation is changing and a wealth of data can now be produced in a time- and cost-efficient manner. NGS can, after removal of technical errors, yield an exceedingly detailed picture of the viral population structure. The widespread availability of cross-sectional data can be used to study fitness landscapes of viral populations in the quasispecies model. This chapter highlights methods that estimate the strength of selection in selective sweeps, assess marginal fitness effects of quasispecies, and finally infer the fitness landscape of a viral quasispecies, all on the basis of NGS data.
The HIV-1 envelope interacts with coreceptors CCR5 and CXCR4 in a dynamic, multi-step process whose molecular details are not clearly delineated. Use of CCR5 antagonists results in tropism shift and therapeutic failure. Here we describe a novel approach using full-length patient-derived gp160 quasispecies libraries cloned into HIV-1 molecular clones, their separation based on phenotypic tropism in vitro, and deep sequencing of the resultant variants for structure-function analyses. Analysis of functionally validated envelope sequences from patients who failed CCR5 antagonist therapy revealed determinants strongly associated with coreceptor specificity, especially at the gp120-gp41 and gp41-gp41 interaction surfaces, which invite future research on the roles of subunit interaction and envelope trimer stability in coreceptor usage. This study identifies important structure-function relationships in HIV-1 envelope, and demonstrates proof of concept for a new integrated analysis method that facilitates laboratory discovery of resistant mutants to aid in development of other therapeutic agents. Copyright © 2016 The Authors. Published by Elsevier Inc. All rights reserved.
Viruses are pathogens that cause infectious diseases. The swarm of virions is subject to the host’s immune pressure and possibly antiviral therapy. It may escape this selective pressure and gain selective advantage by acquiring one or more of the following genomic alterations: single-nucleotide variants (SNVs), loss or gain of one or more amino acids, large deletions, for example, due to alternative splicing, or recombination of different strains. Genotypic antiretroviral drug resistance testing is performed via sequencing. Next-generation sequencing (NGS) technologies have revolutionized the experimental assessment of viral genetic diversity. In viral quasispecies analysis, there are two main goals: the identification of low-frequency variants and haplotype assembly on a whole-genome scale. PacBio performs single-molecule sequencing. This chapter elaborates on human haplotyping and its relationship to probabilistic viral haplotype reconstruction methods. Viral quasispecies assembly has the potential to replace the current de facto diversity estimation by SNV calling. With advances in library preparation, increasing sensitivity of sequencing platforms, and more sophisticated models, it might be possible to detect all or most viral strains in a single individual.