In this PacBio User Group Meeting presentation, PacBio scientist Kristin Mars speaks about recent updates, such as the single-day library prep that’s now possible with the Iso-Seq Express workflow. She…
In order to provide a comprehensive resource for human structural variants (SVs), we generated long-read sequence data and analyzed SVs for fifteen human genomes. We sequence resolved 99,604 insertions, deletions, and inversions including 2,238 (1.6 Mbp) that are shared among all discovery genomes with an additional 13,053 (6.9 Mbp) present in the majority, indicating minor alleles or errors in the reference. Genotyping in 440 additional genomes confirms the most common SVs in unique euchromatin are now sequence resolved. We report a ninefold SV bias toward the last 5 Mbp of human chromosomes with nearly 55% of all VNTRs (variable number of tandem repeats) mapping to this portion of the genome. We identify SVs affecting coding and noncoding regulatory loci improving annotation and interpretation of functional variation. These data provide the framework to construct a canonical human reference and a resource for developing advanced representations capable of capturing allelic diversity. Copyright © 2018 Elsevier Inc. All rights reserved.
The macaque simian or simian/human immunodeficiency virus (SIV/SHIV) challenge model has been widely used to inform and guide human vaccine trials. Substantial advances have been made recently in the application of the repeated-low-dose (RLD) challenge approach to assess SIV/SHIV vaccine efficacies (VE). Some candidate HIV vaccines have shown protective effects in preclinical studies using the macaque SIV/SHIV model, but the model’s true predictive value for screening potential HIV vaccine candidates needs to be evaluated further. Here, we review key parameters used in the RLD approach and discuss their relevance for evaluating VE to improve preclinical studies of candidate HIV vaccines. Crown Copyright © 2019. Published by Elsevier Ltd. All rights reserved.
The development of clustered regularly interspaced short-palindromic repeat (CRISPR)-Cas systems for genome editing has transformed the way life science research is conducted and holds enormous potential for the treatment of disease as well as for many aspects of biotechnology. Here, I provide a personal perspective on the development of CRISPR-Cas9 for genome editing within the broader context of the field and discuss our work to discover novel Cas effectors and develop them into additional molecular tools. The initial demonstration of Cas9-mediated genome editing launched the development of many other technologies, enabled new lines of biological inquiry, and motivated a deeper examination of natural CRISPR-Cas systems, including the discovery of new types of CRISPR-Cas systems. These new discoveries in turn spurred further technological developments. I review these exciting discoveries and technologies as well as provide an overview of the broad array of applications of these technologies in basic research and in the improvement of human health. It is clear that we are only just beginning to unravel the potential within microbial diversity, and it is quite likely that we will continue to discover other exciting phenomena, some of which it may be possible to repurpose as molecular technologies. The transformation of mysterious natural phenomena to powerful tools, however, takes a collective effort to discover, characterize, and engineer them, and it has been a privilege to join the numerous researchers who have contributed to this transformation of CRISPR-Cas systems.
Long-read sequencing, CENP-A ChIP, and chromatin fiber imaging reveal the composition and organization of Drosophila melanogaster centromeres, which have long remained elusive despite the high quality of this species’ genome assembly.
Vertebrate genomes contain a record of retroviruses that invaded the germlines of ancestral hosts and are passed to offspring as endogenous retroviruses (ERVs). ERVs can impact host function since they contain the necessary sequences for expression within the host. Dogs are an important system for the study of disease and evolution, yet no substantiated reports of infectious retroviruses in dogs exist. Here, we utilized Illumina whole genome sequence data to assess the origin and evolution of a recently active gammaretroviral lineage in domestic and wild canids. We identified numerous recently integrated loci of a canid-specific ERV-Fc sublineage within Canis, including 58 insertions that were absent from the reference assembly. Insertions were found throughout the dog genome including within and near gene models. By comparison of orthologous occupied sites, we characterized element prevalence across 332 genomes including all nine extant canid species, revealing evolutionary patterns of ERV-Fc segregation among species as well as subpopulations. Sequence analysis revealed common disruptive mutations, suggesting a predominant form of ERV-Fc spread by trans complementation of defective proviruses. ERV-Fc activity included multiple circulating variants that infected canid ancestors from the last 20 million to within 1.6 million years, with recent bursts of germline invasion in the sublineage leading to wolves and dogs.
One of the most crucial steps in the life cycle of a retrovirus is the integration of the viral DNA (vDNA) copy of the RNA genome into the genome of an infected host cell. Integration provides for efficient viral gene expression as well as for the segregation of viral genomes to daughter cells upon cell division. Some integrated viruses are not well expressed, and cells latently infected with human immunodeficiency virus type 1 (HIV-1) can resist the action of potent antiretroviral drugs and remain dormant for decades. Intensive research has been dedicated to understanding the catalytic mechanism of integration, as well as the viral and cellular determinants that influence integration site distribution throughout the host genome. In this review, we summarize the evolution of techniques that have been used to recover and map retroviral integration sites, from the early days that first indicated that integration could occur in multiple cellular DNA locations, to current technologies that map upwards of millions of unique integration sites from single in vitro integration reactions or cell culture infections. We further review important insights gained from the use of such mapping techniques, including the monitoring of cell clonal expansion in patients treated with retrovirus-based gene therapy vectors, or patients with acquired immune deficiency syndrome (AIDS) on suppressive antiretroviral therapy (ART). These insights span from integrase (IN) enzyme sequence preferences within target DNA (tDNA) at the sites of integration, to the roles of host cellular proteins in mediating global integration distribution, to the potential relationship between genomic location of vDNA integration site and retroviral latency.
Viruses of the subfamily Orthoretrovirinae are defined by the ability to reverse transcribe an RNA genome into DNA that integrates into the host cell genome during the intracellular virus life cycle. Exogenous retroviruses (XRVs) are horizontally transmitted between host individuals, with disease outcome depending on interactions between the retrovirus and the host organism. When retroviruses infect germ line cells of the host, they may become endogenous retroviruses (ERVs), which are permanent elements in the host germ line that are subject to vertical transmission. These ERVs sometimes remain infectious and can themselves give rise to XRVs. This review integrates recent developments in the phylogenetic classification of retroviruses and the identification of retroviral receptors to elucidate the origins and evolution of XRVs and ERVs. We consider whether ERVs may recurrently pressure XRVs to shift receptor usage to sidestep ERV interference. We discuss how related retroviruses undergo alternative fates in different host lineages after endogenization, with koala retrovirus (KoRV) receiving notable interest as a recent invader of its host germ line. KoRV is heritable but also infectious, which provides insights into the early stages of germ line invasions as well as XRV generation from ERVs. The relationship of KoRV to primate and other retroviruses is placed in the context of host biogeography and the potential role of bats and rodents as vectors for interspecies viral transmission. Combining studies of extant XRVs and “fossil” endogenous retroviruses in koalas and other Australasian species has broadened our understanding of the evolution of retroviruses and host-retrovirus interactions. Copyright © 2017 American Society for Microbiology.
Here, we present the complete genome sequence of a porcine endogenous retrovirus determined by Pacific Biosciences sequencing. A comparison of the genome of this isolate with those of other strains revealed the operation of a mechanism resulting in the selective accumulation of G and C bases in the viral DNA. Copyright © 2017 Szucs et al.
HIV-1 infection of primary CD4(+) T cells regulates the expression of specific HERV-K (HML-2) elements.
Endogenous retroviruses (ERVs) occupy extensive regions of the human genome. Although many of these retroviral elements have lost their ability to replicate, those whose insertion took place more recently, such as the HML-2 group of HERV-K elements, still retain intact open reading frames and the capacity to produce certain viral RNA and/or proteins. Transcription of these ERVs is, however, tightly regulated by dedicated epigenetic control mechanisms. Nonetheless, it has been reported that some pathologic states, such as viral infections and certain cancers, coincide with ERV expression, suggesting transcriptional reawakening is possible. HML-2 elements are reportedly induced during HIV-1 infection, but the conserved nature of these elements has, until recently, rendered their expression profiling problematic. Here, we provide comprehensive HERV-K HML-2 expression profiles specific for productively HIV-1 infected primary human CD4(+) T cells. We combined enrichment of HIV-1 infected cells using a reporter virus expressing a surface reporter for gentle and efficient purification with long-read Single Molecule Real-Time sequencing. We show that three HML-2 proviruses, 6q25.1, 8q24.3, and 19q13.42, are up-regulated on average between 3- and 5-fold in HIV-1 infected CD4(+) T cells. One provirus, HML-2 12q24.33, in contrast, was repressed in the presence of active HIV replication. In conclusion, this report identifies the HERV-K HML-2 loci whose expression profiles differ upon HIV-1 infection in primary human CD4(+) T cells. These data will help pave the way for further studies on the influence of endogenous retroviruses on HIV-1 replication. Importance: Endogenous retroviruses inhabit large portions of our genome. Although they are mainly inert, some of the evolutionarily younger members maintain the ability to express both RNA and proteins.
We have developed an approach using long-read SMRT sequencing that provides us with the ability to obtain detailed and accurate HERV-K HML-2 expression profiles. We have now applied this approach to study HERV-K expression in the presence and absence of productive HIV-1 infection of primary human CD4(+) T cells. In addition to using SMRT sequencing, our strategy also includes the magnetic selection of the infected cells so that levels of background expression due to uninfected cells are kept at a minimum. The results in this manuscript provide the blueprint for in-depth studies of the interactions of the authentic upregulated HERV-K HML-2 elements and HIV-1. Copyright © 2017 American Society for Microbiology.
Recent advances in sequencing technologies have transformed the field of virus discovery and virome analysis. Individual virus discovery, once mostly confined to traditional Sanger sequencing, has now been entirely replaced by high-throughput sequencing (HTS)-based virus metagenomics that can be used to characterize the nature and composition of entire viromes. To better harness the potential of HTS for the study of viromes, sample preparation methodologies use different approaches to exclude amplification of non-viral components that can overshadow low-titer viruses. These virus-sequence enrichment approaches mostly focus on the sample preparation methods, like enzymatic digestion of non-viral nucleic acids and size exclusion of non-viral constituents by column filtration, ultrafiltration or density gradient centrifugation. However, a new approach to virus-sequence enrichment called virome-capture sequencing, focused on the amplification or HTS library preparation stage, was recently developed to improve virome characterization. This new approach has the potential to further transform the field of virus discovery and virome analysis, but its technical complexity and sequence-dependence warrant further improvements. In this review we discuss the different methods for selective sequencing-based virome analysis, their applications and evolution, and also propose refinements needed to harness the full potential of HTS for virome analysis. Copyright © 2017 Elsevier B.V. All rights reserved.
Gene activity in primary T cells infected with HIV89.6: intron retention and induction of genomic repeats.
HIV infection has been reported to alter cellular gene activity, but published studies have commonly assayed transformed cell lines and lab-adapted HIV strains, yielding inconsistent results. Here we carried out a deep RNA-Seq analysis of primary human T cells infected with the low passage HIV isolate HIV89.6. Seventeen percent of cellular genes showed altered activity 48 h after infection. In a meta-analysis including four other studies, our data differed from studies of HIV infection in cell lines but showed more parallels with infections of primary cells. We found a global trend toward retention of introns after infection, suggestive of a novel cellular response to infection. HIV89.6 infection was also associated with activation of several human endogenous retroviruses (HERVs) and retrotransposons, of interest as possible novel antigens that could serve as vaccine targets. The most highly activated group of HERVs was a subset of the ERV-9. Analysis showed that activation was associated with a particular variant of ERV-9 long terminal repeats that contains an indel near the U3-R border. These data also allowed quantification of >70 splice forms of the HIV89.6 RNA and specified the main types of chimeric HIV89.6-host RNAs. Comparison to over 100,000 integration site sequences from the same infected cell populations allowed quantification of authentic versus artifactual chimeric reads, showing that 5′ read-in, splicing out of HIV89.6 from the D4 donor and 3′ read-through were the most common HIV89.6-host cell chimeric RNA forms. Analysis of RNA abundance after infection of primary T cells with the low passage HIV89.6 isolate disclosed multiple novel features of HIV-host interactions, notably intron retention and induction of transcription of retrotransposons and endogenous retroviruses.
Genetic studies of human evolution require high-quality contiguous ape genome assemblies that are not guided by the human reference. We coupled long-read sequence assembly and full-length complementary DNA sequencing with a multiplatform scaffolding approach to produce ab initio chimpanzee and orangutan genome assemblies. By comparing these with two long-read de novo human genome assemblies and a gorilla genome assembly, we characterized lineage-specific and shared great ape genetic variation ranging from single- to mega-base pair-sized variants. We identified ~17,000 fixed human-specific structural variants identifying genic and putative regulatory changes that have emerged in humans since divergence from nonhuman apes. Interestingly, these variants are enriched near genes that are down-regulated in human compared to chimpanzee cerebral organoids, particularly in cells analogous to radial glial neural progenitors. Copyright © 2018 The Authors, some rights reserved; exclusive licensee American Association for the Advancement of Science. No claim to original U.S. Government Works.
HIV-1 interacts with human endogenous retrovirus K (HML-2) envelopes derived from human primary lymphocytes.
Human endogenous retroviruses (HERVs) are viruses that have colonized the germ line and spread through vertical passage. Only the more recently acquired HERVs, such as the HERV-K (HML-2) group, maintain coding open reading frames. Expression of HERV-Ks has been linked to different pathological conditions, including HIV infection, but our knowledge on which specific HERV-Ks are expressed in primary lymphocytes currently is very limited. To identify the most expressed HERV-Ks in an unbiased manner, we analyzed their expression patterns in peripheral blood lymphocytes using Pacific Biosciences (PacBio) single-molecule real-time (SMRT) sequencing. We observe that three HERV-Ks (KII, K102, and K18) constitute over 90% of the total HERV-K expression in primary human lymphocytes of five different donors. We also show experimentally that two of these HERV-K env sequences (K18 and K102) retain their ability to produce full-length and posttranslationally processed envelope proteins in cell culture. We show that HERV-K18 Env can be incorporated into HIV-1 but not simian immunodeficiency virus (SIV) particles. Moreover, HERV-K18 Env incorporation into HIV-1 virions is dependent on HIV-1 matrix. Taken together, we generated high-resolution HERV-K expression profiles specific for activated human lymphocytes. We found that one of the most abundantly expressed HERV-K envelopes not only makes a full-length protein but also specifically interacts with HIV-1. Our findings raise the possibility that these endogenous retroviral Env proteins could directly influence HIV-1 replication. Here, we report the HERV-K expression profile of primary lymphocytes from 5 different healthy donors. We used a novel deep-sequencing technology (PacBio SMRT) that produces the long reads necessary to discriminate the complexity of HERV-K expression.
We find that primary lymphocytes express up to 32 different HERV-K envelopes, and that at least two of the most expressed Env sequences retain their ability to make a protein. Importantly, one of them, the envelope glycoprotein of HERV-K18, is incorporated into HIV-1 in an HIV matrix-specific fashion. The ramifications of such interactions are discussed, and the possibilities of HIV-1 target tissue broadening and immune evasion are considered.
Short-read sequencing has enabled the de novo assembly of several individual human genomes, but with inherent limitations in characterizing repeat elements. Here we sequence a Chinese individual HX1 by single-molecule real-time (SMRT) long-read sequencing, construct a physical map by NanoChannel arrays and generate a de novo assembly of 2.93 Gb (contig N50: 8.3 Mb, scaffold N50: 22.0 Mb, including 39.3 Mb N-bases), together with 206 Mb of alternative haplotypes. The assembly fully or partially fills 274 (28.4%) N-gaps in the reference genome GRCh38. Comparison to GRCh38 reveals 12.8 Mb of HX1-specific sequences, including 4.1 Mb that are not present in previously reported Asian genomes. Furthermore, long-read sequencing of the transcriptome reveals novel spliced genes that are not annotated in GENCODE and are missed by short-read RNA-Seq. Our results imply that improved characterization of genome functional variation may require the use of a range of genomic technologies on diverse human populations.