Background: HIV-1 proviruses in peripheral blood mononuclear cells (PBMCs) are thought to be an important reservoir of HIV-1 infection. Given that this pool represents an archival library, it can be used to study virus evolution and CD4+ T cell survival. Accurate study of this pool is hampered by the difficulty of sequencing a full-length proviral genome, which is typically reconstructed by assembling overlapping pieces and imputing the full genome. Methodology: Cryopreserved PBMCs collected from a total of 8 HIV+ patients from 1997-2001 were used for genomic DNA extraction. Patients had been receiving cART for 2-8 years at the time samples were obtained. Seven patients had a plasma viral load (pVL) >50 copies/mL (mean: 312,282, range: 18,372-683,400) and one had pVL <50 copies/mL. Genomic DNA was subjected to limiting dilution prior to amplification of near-full-length genomes by a newly developed nested PCR. The predicted size of the PCR product was 9.0 kb, spanning from the 5’ LTR through the 3’ LTR. Single molecules were sequenced as near-full-length amplicons directly from PCR products without shearing, using commercially available P4-C2 reagents and standard protocols on a PacBio RS II instrument. Quality of the genomes was validated by clonal positive controls and synthetic mixtures. Results: Near-full-length provirus genome sequences were successfully obtained from all 8 patients as continuous long reads from single molecules. PacBio sequencing required approximately 10% of the PCR product needed for Sanger sequencing and generated 325 MB per 3-hour run, including 1,800 full-length intact genome reads on average. One patient’s sample was not at a limiting dilution, and analysis revealed multiple subspecies. Among the 8 near-full-length provirus genomes derived from the other 7 patients, large internal deletions were noted in 2 proviruses, APOBEC-mediated hypermutations were seen in 2 proviruses, and 4 proviruses appeared to be intact genomes. All of the defective proviruses showed a complete absence of resistance mutations in either RT or protease, even after 2-8 years of cART. In contrast, all of the intact proviruses contained ART-resistance-associated mutations, suggesting that they represented relatively recent variants. Conclusions: Combining a novel protocol for full-length limiting dilution amplification of proviruses with PacBio SMRT sequencing allowed for the generation of near-full-length genomes with good quality and an ability to detect minor variants at the 1-10% level. Preliminary data analyses suggest that defective proviruses may represent archival variants that persist long-term in host cells, while intact proviruses within the PBMC pool showing evidence of active virus replication may represent more recent variants.
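The role of limiting dilution in the abstract above can be illustrated with a short Poisson calculation. The abstract does not describe how dilution endpoints were chosen, so this is only a minimal sketch under the common textbook assumption that templates are distributed across PCR wells according to a Poisson distribution. The function names and the example positive-well fractions are hypothetical.

```python
import math


def mean_templates_per_well(positive_fraction: float) -> float:
    """Poisson estimate of the mean number of templates per well,
    given the observed fraction of PCR-positive wells (assumed model)."""
    if not 0.0 <= positive_fraction < 1.0:
        raise ValueError("positive_fraction must be in [0, 1)")
    # P(well is negative) = exp(-lambda), so lambda = -ln(1 - p_positive)
    return -math.log(1.0 - positive_fraction)


def single_template_probability(positive_fraction: float) -> float:
    """Probability that a PCR-positive well was seeded by exactly one template."""
    lam = mean_templates_per_well(positive_fraction)
    if lam == 0.0:
        return 1.0  # limiting value as the dilution becomes infinitely sparse
    # P(exactly one | at least one) = lambda * exp(-lambda) / (1 - exp(-lambda))
    return lam * math.exp(-lam) / (1.0 - math.exp(-lam))


if __name__ == "__main__":
    # Hypothetical dilutions: 30% versus 80% of wells PCR-positive
    for p in (0.3, 0.8):
        print(f"{p:.0%} positive wells: "
              f"P(single template) = {single_template_probability(p):.2f}")
```

Under this model, a dilution at which about 30% of wells are positive gives mostly single-template amplicons, whereas at 80% positive wells most positive wells contain more than one template. This is consistent with a sample that was not at limiting dilution yielding a mixture of sequences.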
Geminiviruses cause damaging diseases in several important crop species. However, limited progress has been made in developing crop varieties resistant to these highly diverse DNA viruses. Recently, the bacterial CRISPR-Cas9 system has been transferred to plants to target and confer immunity to geminiviruses. In this study, we use CRISPR-Cas9 interference in the staple food crop cassava with the aim of engineering resistance to African cassava mosaic virus, a member of a widespread and important family (Geminiviridae) of plant-pathogenic DNA viruses. Our results show that the CRISPR system fails to confer effective resistance to the virus during glasshouse inoculations. Further, we find that between 33 and 48% of edited virus genomes evolve a conserved single-nucleotide mutation that confers resistance to CRISPR-Cas9 cleavage. We also find that in the model plant Nicotiana benthamiana the replication of the novel, mutant virus is dependent on the presence of the wild-type virus. Our study highlights the risks associated with CRISPR-Cas9 virus immunity in eukaryotes given that the mutagenic nature of the system generates viral escapes in a short time period. Our in-depth analysis of virus populations also represents a template for future studies analyzing virus escape from anti-viral CRISPR transgenics. This is especially important for informing regulation of such actively mutagenic applications of CRISPR-Cas9 technology in agriculture.
One of the most crucial steps in the life cycle of a retrovirus is the integration of the viral DNA (vDNA) copy of the RNA genome into the genome of an infected host cell. Integration provides for efficient viral gene expression as well as for the segregation of viral genomes to daughter cells upon cell division. Some integrated viruses are not well expressed, and cells latently infected with human immunodeficiency virus type 1 (HIV-1) can resist the action of potent antiretroviral drugs and remain dormant for decades. Intensive research has been dedicated to understanding the catalytic mechanism of integration, as well as the viral and cellular determinants that influence integration site distribution throughout the host genome. In this review, we summarize the evolution of techniques that have been used to recover and map retroviral integration sites, from the early days that first indicated that integration could occur in multiple cellular DNA locations, to current technologies that map upwards of millions of unique integration sites from single in vitro integration reactions or cell culture infections. We further review important insights gained from the use of such mapping techniques, including the monitoring of cell clonal expansion in patients treated with retrovirus-based gene therapy vectors, or patients with acquired immune deficiency syndrome (AIDS) on suppressive antiretroviral therapy (ART). These insights span from integrase (IN) enzyme sequence preferences within target DNA (tDNA) at the sites of integration, to the roles of host cellular proteins in mediating global integration distribution, to the potential relationship between genomic location of vDNA integration site and retroviral latency.
Dynamic regulation of HIV-1 mRNA populations analyzed by single-molecule enrichment and long-read sequencing.
Alternative RNA splicing greatly expands the repertoire of proteins encoded by genomes. Next-generation sequencing (NGS) is attractive for studying alternative splicing because of its efficiency and low cost per base, but the short reads typical of NGS only report mRNA fragments containing one or a few splice junctions. Here, we used single-molecule amplification and long-read sequencing to study the HIV-1 provirus, which is only 9700 bp in length but encodes nine major proteins via alternative splicing. Our data showed that the clinical isolate HIV-1(89.6) produces at least 109 different spliced RNAs, including a previously unappreciated ~1 kb class of messages, two of which encode new proteins. HIV-1 message populations differed between cell types, longitudinally during infection, and among T cells from different human donors. These findings open a new window on a little-studied aspect of HIV-1 replication, suggest therapeutic opportunities, and provide advanced tools for the study of alternative splicing.
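The limitation of short reads noted in the abstract above can be made concrete with a toy example. This is a sketch only and is not the authors' analysis pipeline: the junction coordinates are invented, and each long read is assumed to have already been reduced to its list of (donor, acceptor) splice junctions. It shows two message populations that have identical per-junction counts, which is roughly what short reads report, but different full-length isoforms, which only reads spanning every junction can resolve.

```python
from collections import Counter
from typing import Iterable, Sequence, Tuple

# (donor position, acceptor position); coordinates are hypothetical
Junction = Tuple[int, int]


def count_isoforms(reads: Iterable[Sequence[Junction]]) -> Counter:
    """Count reads per isoform, defining an isoform as the full
    ordered chain of splice junctions observed on a single read."""
    counts = Counter()
    for junctions in reads:
        counts[tuple(sorted(junctions))] += 1
    return counts


def junction_usage(reads: Iterable[Sequence[Junction]]) -> Counter:
    """Count how often each junction is seen, ignoring which other
    junctions occur on the same molecule (a short-read-like view)."""
    counts = Counter()
    for junctions in reads:
        counts.update(junctions)
    return counts


# Two alternative acceptors at the first intron and two at the second
population_x = [[(100, 500), (700, 900)], [(100, 600), (700, 1000)]]
population_y = [[(100, 500), (700, 1000)], [(100, 600), (700, 900)]]

if __name__ == "__main__":
    same_junctions = junction_usage(population_x) == junction_usage(population_y)
    same_isoforms = count_isoforms(population_x) == count_isoforms(population_y)
    print("identical per-junction counts:", same_junctions)
    print("identical isoform sets:", same_isoforms)
```

Keying on the whole junction chain rather than on individual junctions is the design choice that matters here: it preserves the linkage between distant splice events on the same molecule.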
Single molecule real-time (SMRT) sequencing comes of age: applications and utilities for medical diagnostics.
Short-read massively parallel sequencing has emerged as a standard diagnostic tool in the medical setting. However, short-read technologies have inherent limitations such as GC bias, difficulties mapping to repetitive elements, trouble discriminating paralogous sequences, and difficulties in phasing alleles. Long-read single-molecule sequencers resolve these obstacles. Moreover, they offer higher consensus accuracies and can detect epigenetic modifications from native DNA. The first commercially available long-read single-molecule platform was the RS system based on PacBio’s single molecule real-time (SMRT) sequencing technology, which has since evolved into their RSII and Sequel systems. Here we capsulize how SMRT sequencing is revolutionizing constitutional, reproductive, cancer, microbial and viral genetic testing. © The Author(s) 2018. Published by Oxford University Press on behalf of Nucleic Acids Research.