April 21, 2020  |  

Transcriptional initiation of a small RNA, not R-loop stability, dictates the frequency of pilin antigenic variation in Neisseria gonorrhoeae.

Neisseria gonorrhoeae, the sole causative agent of gonorrhea, constitutively undergoes diversification of the Type IV pilus. Gene conversion occurs between one of the several donor silent copies located in distinct loci and the recipient pilE gene, encoding the major pilin subunit of the pilus. A guanine quadruplex (G4) DNA structure and a cis-acting sRNA (G4-sRNA) are located upstream of the pilE gene and both are required for pilin antigenic variation (Av). We show that the reduced sRNA transcription lowers pilin Av frequencies. Extended transcriptional elongation is not required for Av, since limiting the transcript to 32 nt allows for normal Av frequencies. Using chromatin immunoprecipitation (ChIP) assays, we show that cellular G4s are less abundant when sRNA transcription is lower. In addition, using ChIP, we demonstrate that the G4-sRNA forms a stable RNA:DNA hybrid (R-loop) with its template strand. However, modulating R-loop levels by controlling RNase HI expression does not alter G4 abundance quantified through ChIP. Since pilin Av frequencies were not altered when modulating R-loop levels by controlling RNase HI expression, we conclude that transcription of the sRNA is necessary, but stable R-loops are not required to promote pilin Av. © 2019 John Wiley & Sons Ltd.


April 21, 2020  |  

Rapid antigen diversification through mitotic recombination in the human malaria parasite Plasmodium falciparum.

Malaria parasites possess the remarkable ability to maintain chronic infections that fail to elicit a protective immune response, characteristics that have stymied vaccine development and cause people living in endemic regions to remain at risk of malaria despite previous exposure to the disease. These traits stem from the tremendous antigenic diversity displayed by parasites circulating in the field. For Plasmodium falciparum, the most virulent of the human malaria parasites, this diversity is exemplified by the variant gene family called var, which encodes the major surface antigen displayed on infected red blood cells (RBCs). This gene family exhibits virtually limitless diversity when var gene repertoires from different parasite isolates are compared. Previous studies indicated that this remarkable genome plasticity results from extensive ectopic recombination between var genes during mitotic replication; however, the molecular mechanisms that direct this process to antigen-encoding loci while the rest of the genome remains relatively stable were not determined. Using targeted DNA double-strand breaks (DSBs) and long-read whole-genome sequencing, we show that a single break within an antigen-encoding region of the genome can result in a cascade of recombination events leading to the generation of multiple chimeric var genes, a process that can greatly accelerate the generation of diversity within this family. We also found that recombinations did not occur randomly, but rather high-probability, specific recombination products were observed repeatedly. These results provide a molecular basis for previously described structured rearrangements that drive diversification of this highly polymorphic gene family.


April 21, 2020  |  

Application of long read sequencing to determine expressed antigen diversity in Trypanosoma brucei infections.

Antigenic variation is employed by many pathogens to evade the host immune response, and Trypanosoma brucei has evolved a complex system to achieve this phenotype, involving sequential use of variant surface glycoprotein (VSG) genes encoded from a large repertoire of ~2,000 genes. T. brucei express multiple, sometimes closely related, VSGs in a population at any one time, and the ability to resolve and analyse this diversity has been limited. We applied long read sequencing (PacBio) to VSG amplicons generated from blood extracted from batches of mice sacrificed at time points (days 3, 6, 10 and 12) post-infection with T. brucei TREU927. The data showed that long read sequencing is reliable for resolving variant differences between VSGs, and demonstrated that there is significant expressed diversity (449 VSGs detected across 20 mice) and across the timeframe of study there was a clear semi-reproducible pattern of expressed diversity (median of 27 VSGs per sample at day 3 post infection (p.i.), 82 VSGs at day 6 p.i., 187 VSGs at day 10 p.i. and 132 VSGs by day 12 p.i.). There was also consistent detection of one VSG dominating expression across replicates at days 3 and 6, and emergence of a second dominant VSG across replicates by day 12. The innovative application of ecological diversity analysis to VSG reads enabled characterisation of hierarchical VSG expression in the dataset, and resulted in a novel method for analysing such patterns of variation. Additionally, the long read approach allowed detection of mosaic VSG expression from very few reads-the earliest in infection that such events have been detected. Therefore, our results indicate that long read analysis is a reliable tool for resolving diverse gene expression profiles, and provides novel insights into the complexity and nature of VSG expression in trypanosomes, revealing significantly higher diversity than previously shown and the ability to identify mosaic gene formation early during the infection process.


April 21, 2020  |  

Antigenic variation in the Lyme spirochete: detailed functional assessment of recombinational switching at vlsE in the JD1 strain of Borrelia burgdorferi.

Borrelia burgdorferi is a causative agent of Lyme disease and establishes long-term infection in mammalian hosts. Persistence is promoted by the VlsE antigenic variation system, which generates combinatorial diversity of VlsE through unidirectional, segmental gene conversion from an array of silent cassettes. Here we explore the variants generated by the vls system of strain JD1, which has divergent sequence and structural elements from the type strain B31, the only B. burgdorferi strain in which recombinational switching at vlsE has been studied in detail. We first completed the sequencing of the vls region in JD1, uncovering a previously unreported 114 bp inverted repeat sequence upstream of vlsE. A five-week infection of WT and SCID mice was used for PacBio long read sequencing along with our recently developed VAST pipeline to analyze recombinational switching at vlsE from 40,000 sequences comprising 226,000 inferred recombination events. We show that antigenic variation in B31 and JD1 is highly similar, despite the lack of 17 bp direct repeats in JD1, a somewhat different arrangement of the silent cassettes, divergent inverted repeat sequences and general divergence in the vls sequences. We also present data that strongly suggest that dimerization is required for in vivo functionality of VlsE. © 2018 John Wiley & Sons Ltd.


April 21, 2020  |  

In-depth analysis of the genome of Trypanosoma evansi, an etiologic agent of surra.

Trypanosoma evansi is the causative agent of the animal trypanosomiasis surra, a disease with serious economic burden worldwide. The availability of the genome of its closely related parasite Trypanosoma brucei allows us to compare their genetic and evolutionarily shared and distinct biological features. The complete genomic sequence of the T. evansi YNB strain was obtained using a combination of genomic and transcriptomic sequencing, de novo assembly, and bioinformatic analysis. The genome size of the T. evansi YNB strain was 35.2 Mb, showing 96.59% similarity in sequence and 88.97% in scaffold alignment with T. brucei. A total of 8,617 protein-coding genes, accounting for 31% of the genome, were predicted. Approximately 1,641 alternative splicing events of 820 genes were identified, with a majority mediated by intron retention, which represented a major difference in post-transcriptional regulation between T. evansi and T. brucei. Disparities in gene copy number of the variant surface glycoprotein, expression site-associated genes, microRNAs, and RNA-binding protein were clearly observed between the two parasites. The results revealed the genomic determinants of T. evansi, which encoded specific biological characteristics that distinguished them from other related trypanosome species.


April 21, 2020  |  

Deciphering bacterial epigenomes using modern sequencing technologies.

Prokaryotic DNA contains three types of methylation: N6-methyladenine, N4-methylcytosine and 5-methylcytosine. The lack of tools to analyse the frequency and distribution of methylated residues in bacterial genomes has prevented a full understanding of their functions. Now, advances in DNA sequencing technology, including single-molecule, real-time sequencing and nanopore-based sequencing, have provided new opportunities for systematic detection of all three forms of methylated DNA at a genome-wide scale and offer unprecedented opportunities for achieving a more complete understanding of bacterial epigenomes. Indeed, as the number of mapped bacterial methylomes approaches 2,000, increasing evidence supports roles for methylation in regulation of gene expression, virulence and pathogen-host interactions.


October 23, 2019  |  

Optimized CRISPR-Cas9 genome editing for Leishmania and its use to target a multigene family, induce chromosomal translocation, and study DNA break repair mechanisms.

CRISPR-Cas9-mediated genome editing has recently been adapted for Leishmania spp. parasites, the causative agents of human leishmaniasis. We have optimized this genome-editing tool by selecting for cells with CRISPR-Cas9 activity through cotargeting the miltefosine transporter gene; mutation of this gene leads to miltefosine resistance. This cotargeting strategy integrated into a triple guide RNA (gRNA) expression vector was used to delete all 11 copies of the A2 multigene family; this was not previously possible with the traditional gene-targeting method. We found that the Leishmania donovani rRNA promoter is more efficient than the U6 promoter in driving gRNA expression, and sequential transfections of the oligonucleotide donor significantly eased the isolation of edited mutants. A gRNA and Cas9 coexpression vector was developed that was functional in all tested Leishmania species, including L. donovani, L. major, and L. mexicana. By simultaneously targeting sites from two different chromosomes, all four types of targeted chromosomal translocations were generated, regardless of the polycistronic transcription direction from the parent chromosomes. It was possible to use this CRISPR system to create a single conserved amino acid substitution (A189G) mutation for both alleles of RAD51, a DNA recombinase involved in homology-directed repair. We found that RAD51 is essential for L. donovani survival based on direct observation of the death of mutants with both RAD51 alleles disrupted, further confirming that this CRISPR system can reveal gene essentiality. Evidence is also provided that microhomology-mediated end joining (MMEJ) plays a major role in double-strand DNA break repair in L. donovani. IMPORTANCELeishmania parasites cause human leishmaniasis. To accelerate characterization of Leishmania genes for new drug and vaccine development, we optimized and simplified the CRISPR-Cas9 genome-editing tool for Leishmania. We show that co-CRISPR targeting of the miltefosine transporter gene and serial transfections of an oligonucleotide donor significantly eased isolation of edited mutants. This cotargeting strategy was efficiently used to delete all 11 members of the A2 virulence gene family. This technical advancement is valuable, since there are many gene clusters and supernumerary chromosomes in the various Leishmania species and isolates. We simplified this CRISPR system by developing a gRNA and Cas9 coexpression vector which could be used to delete genes in various Leishmania species. This CRISPR system could also be used to generate specific chromosomal translocations, which will help in the study of Leishmania gene expression and transcription control. This study also provides new information about double-strand DNA break repair mechanisms in Leishmania.


September 22, 2019  |  

Discovery of a divergent HPIV4 from respiratory secretions using second and third generation metagenomic sequencing.

Molecular detection of viruses has been aided by high-throughput sequencing, permitting the genomic characterization of emerging strains. In this study, we comprehensively screened 500 respiratory secretions from children with upper and/or lower respiratory tract infections for viral pathogens. The viruses detected are described, including a divergent human parainfluenza virus type 4 from GS FLX pyrosequencing of 92 specimens. Complete full-genome characterization of the virus followed, using Single Molecule, Real-Time (SMRT) sequencing. Subsequent “primer walking” combined with Sanger sequencing validated the RS platform’s utility in viral sequencing from complex clinical samples. Comparative genomics reveals the divergent strain clusters with the only completely sequenced HPIV4a subtype. However, it also exhibits various structural features present in one of the HPIV4b reference strains, opening questions regarding their lifecycle and evolutionary relationships among these viruses. Clinical data from patients infected with the strain, as well as viral prevalence estimates using real-time PCR, is also described.


September 22, 2019  |  

Distinguishing highly similar gene isoforms with a clustering-based bioinformatics analysis of PacBio single-molecule long reads.

Gene isoforms are commonly found in both prokaryotes and eukaryotes. Since each isoform may perform a specific function in response to changing environmental conditions, studying the dynamics of gene isoforms is important in understanding biological processes and disease conditions. However, genome-wide identification of gene isoforms is technically challenging due to the high degree of sequence identity among isoforms. Traditional targeted sequencing approach, involving Sanger sequencing of plasmid-cloned PCR products, has low throughput and is very tedious and time-consuming. Next-generation sequencing technologies such as Illumina and 454 achieve high throughput but their short read lengths are a critical barrier to accurate assembly of highly similar gene isoforms, and may result in ambiguities and false joining during sequence assembly. More recently, the third generation sequencer represented by the PacBio platform offers sufficient throughput and long reads covering the full length of typical genes, thus providing a potential to reliably profile gene isoforms. However, the PacBio long reads are error-prone and cannot be effectively analyzed by traditional assembly programs.We present a clustering-based analysis pipeline integrated with PacBio sequencing data for profiling highly similar gene isoforms. This approach was first evaluated in comparison to de novo assembly of 454 reads using a benchmark admixture containing 10 known, cloned msg genes encoding the major surface glycoprotein of Pneumocystis jirovecii. All 10 msg isoforms were successfully reconstructed with the expected length (~1.5 kb) and correct sequence by the new approach, while 454 reads could not be correctly assembled using various assembly programs. When using an additional benchmark admixture containing 22 known P. jirovecii msg isoforms, this approach accurately reconstructed all but 4 these isoforms in their full-length (~3 kb); these 4 isoforms were present in low concentrations in the admixture. Finally, when applied to the original clinical sample from which the 22 known msg isoforms were cloned, this approach successfully identified not only all known isoforms accurately (~3 kb each) but also 48 novel isoforms.PacBio sequencing integrated with the clustering-based analysis pipeline achieves high-throughput and high-resolution discrimination of highly similar sequences, and can serve as a new approach for genome-wide characterization of gene isoforms and other highly repetitive sequences.


September 22, 2019  |  

Single-cell (meta-)genomics of a dimorphic Candidatus Thiomargarita nelsonii reveals genomic plasticity.

The genus Thiomargarita includes the world’s largest bacteria. But as uncultured organisms, their physiology, metabolism, and basis for their gigantism are not well understood. Thus, a genomics approach, applied to a single Candidatus Thiomargarita nelsonii cell was employed to explore the genetic potential of one of these enigmatic giant bacteria. The Thiomargarita cell was obtained from an assemblage of budding Ca. T. nelsonii attached to a provannid gastropod shell from Hydrate Ridge, a methane seep offshore of Oregon, USA. Here we present a manually curated genome of Bud S10 resulting from a hybrid assembly of long Pacific Biosciences and short Illumina sequencing reads. With respect to inorganic carbon fixation and sulfur oxidation pathways, the Ca. T. nelsonii Hydrate Ridge Bud S10 genome was similar to marine sister taxa within the family Beggiatoaceae. However, the Bud S10 genome contains genes suggestive of the genetic potential for lithotrophic growth on arsenite and perhaps hydrogen. The genome also revealed that Bud S10 likely respires nitrate via two pathways: a complete denitrification pathway and a dissimilatory nitrate reduction to ammonia pathway. Both pathways have been predicted, but not previously fully elucidated, in the genomes of other large, vacuolated, sulfur-oxidizing bacteria. Surprisingly, the genome also had a high number of unusual features for a bacterium to include the largest number of metacaspases and introns ever reported in a bacterium. Also present, are a large number of other mobile genetic elements, such as insertion sequence (IS) transposable elements and miniature inverted-repeat transposable elements (MITEs). In some cases, mobile genetic elements disrupted key genes in metabolic pathways. For example, a MITE interrupts hupL, which encodes the large subunit of the hydrogenase in hydrogen oxidation. Moreover, we detected a group I intron in one of the most critical genes in the sulfur oxidation pathway, dsrA. The dsrA group I intron also carried a MITE sequence that, like the hupL MITE family, occurs broadly across the genome. The presence of a high degree of mobile elements in genes central to Thiomargarita’s core metabolism has not been previously reported in free-living bacteria and suggests a highly mutable genome.


September 22, 2019  |  

Evidence of the red-queen hypothesis from accelerated rates of evolution of genes involved in biotic interactions in Pneumocystis.

Pneumocystis species are ascomycete fungi adapted to live inside the lungs of mammals. These ascomycetes show extensive stenoxenism, meaning that each species of Pneumocystis infects a single species of host. Here, we study the effect exerted by natural selection on gene evolution in the genomes of three Pneumocystis species. We show that genes involved in host interaction evolve under positive selection. In the first place, we found strong evidence of episodic diversifying selection in Major surface glycoproteins (Msg). These proteins are located on the surface of Pneumocystis and are used for host attachment and probably for immune system evasion. Consistent with their function as antigens, most sites under diversifying selection in Msg code for residues with large relative surface accessibility areas. We also found evidence of positive selection in part of the cell machinery used to export Msg to the cell surface. Specifically, we found that genes participating in glycosylphosphatidylinositol (GPI) biosynthesis show an increased rate of nonsynonymous substitutions (dN) versus synonymous substitutions (dS). GPI is a molecule synthesized in the endoplasmic reticulum that is used to anchor proteins to membranes. We interpret the aforementioned findings as evidence of selective pressure exerted by the host immune system on Pneumocystis species, shaping the evolution of Msg and several proteins involved in GPI biosynthesis. We suggest that genome evolution in Pneumocystis is well described by the Red-Queen hypothesis whereby genes relevant for biotic interactions show accelerated rates of evolution.


September 22, 2019  |  

Plasmodium knowlesi: a superb in vivo nonhuman primate model of antigenic variation in malaria.

Antigenic variation in malaria was discovered in Plasmodium knowlesi studies involving longitudinal infections of rhesus macaques (M. mulatta). The variant proteins, known as the P. knowlesi Schizont Infected Cell Agglutination (SICA) antigens and the P. falciparum Erythrocyte Membrane Protein 1 (PfEMP1) antigens, expressed by the SICAvar and var multigene families, respectively, have been studied for over 30 years. Expression of the SICA antigens in P. knowlesi requires a splenic component, and specific antibodies are necessary for variant antigen switch events in vivo. Outstanding questions revolve around the role of the spleen and the mechanisms by which the expression of these variant antigen families are regulated. Importantly, the longitudinal dynamics and molecular mechanisms that govern variant antigen expression can be studied with P. knowlesi infection of its mammalian and vector hosts. Synchronous infections can be initiated with established clones and studied at multi-omic levels, with the benefit of computational tools from systems biology that permit the integration of datasets and the design of explanatory, predictive mathematical models. Here we provide an historical account of this topic, while highlighting the potential for maximizing the use of P. knowlesi – macaque model systems and summarizing exciting new progress in this area of research.


September 22, 2019  |  

The genomes of Crithidia bombi and C. expoeki, common parasites of bumblebees.

Trypanosomatids (Trypanosomatidae, Kinetoplastida) are flagellated protozoa containing many parasites of medical or agricultural importance. Among those, Crithidia bombi and C. expoeki, are common parasites in bumble bees around the world, and phylogenetically close to Leishmania and Leptomonas. They have a simple and direct life cycle with one host, and partially castrate the founding queens greatly reducing their fitness. Here, we report the nuclear genome sequences of one clone of each species, extracted from a field-collected infection. Using a combination of Roche 454 FLX Titanium, Pacific Biosciences PacBio RS, and Illumina GA2 instruments for C. bombi, and PacBio for C. expoeki, we could produce high-quality and well resolved sequences. We find that these genomes are around 32 and 34 MB, with 7,808 and 7,851 annotated genes for C. bombi and C. expoeki, respectively-which is somewhat less than reported from other trypanosomatids, with few introns, and organized in polycistronic units. A large fraction of genes received plausible functional support in comparison primarily with Leishmania and Trypanosoma. Comparing the annotated genes of the two species with those of six other trypanosomatids (C. fasciculata, L. pyrrhocoris, L. seymouri, B. ayalai, L. major, and T. brucei) shows similar gene repertoires and many orthologs. Similar to other trypanosomatids, we also find signs of concerted evolution in genes putatively involved in the interaction with the host, a high degree of synteny between C. bombi and C. expoeki, and considerable overlap with several other species in the set. A total of 86 orthologous gene groups show signatures of positive selection in the branch leading to the two Crithidia under study, mostly of unknown function. As an example, we examined the initiating glycosylation pathway of surface components in C. bombi, finding it deviates from most other eukaryotes and also from other kinetoplastids, which may indicate rapid evolution in the extracellular matrix that is involved in interactions with the host. Bumble bees are important pollinators and Crithidia-infections are suspected to cause substantial selection pressure on their host populations. These newly sequenced genomes provide tools that should help better understand host-parasite interactions in these pollinator pathogens.


September 22, 2019  |  

Comparative heterochromatin profiling reveals conserved and unique epigenome signatures linked to adaptation and development of malaria parasites.

Heterochromatin-dependent gene silencing is central to the adaptation and survival of Plasmodium falciparum malaria parasites, allowing clonally variant gene expression during blood infection in humans. By assessing genome-wide heterochromatin protein 1 (HP1) occupancy, we present a comprehensive analysis of heterochromatin landscapes across different Plasmodium species, strains, and life cycle stages. Common targets of epigenetic silencing include fast-evolving multi-gene families encoding surface antigens and a small set of conserved HP1-associated genes with regulatory potential. Many P. falciparum heterochromatic genes are marked in a strain-specific manner, increasing the parasite’s adaptive capacity. Whereas heterochromatin is strictly maintained during mitotic proliferation of asexual blood stage parasites, substantial heterochromatin reorganization occurs in differentiating gametocytes and appears crucial for the activation of key gametocyte-specific genes and adaptation of erythrocyte remodeling machinery. Collectively, these findings provide a catalog of heterochromatic genes and reveal conserved and specialized features of epigenetic control across the genus Plasmodium. Copyright © 2018 The Authors. Published by Elsevier Inc. All rights reserved.


September 22, 2019  |  

Primordial origin and diversification of plasmids in Lyme disease agent bacteria.

With approximately one-third of their genomes consisting of linear and circular plasmids, the Lyme disease agent cluster of species has the most complex genomes among known bacteria. We report here a comparative analysis of plasmids in eleven Borreliella (also known as Borrelia burgdorferi sensu lato) species.We sequenced the complete genomes of two B. afzelii, two B. garinii, and individual B. spielmanii, B. bissettiae, B. valaisiana and B. finlandensis isolates. These individual isolates carry between seven and sixteen plasmids, and together harbor 99 plasmids. We report here a comparative analysis of these plasmids, along with 70 additional Borreliella plasmids available in the public sequence databases. We identify only one new putative plasmid compatibility type (the 30th) among these 169 plasmid sequences, suggesting that all or nearly all such types have now been discovered. We find that the linear plasmids in the non-B. burgdorferi species have undergone the same kinds of apparently random, chaotic rearrangements mediated by non-homologous recombination that we previously discovered in B. burgdorferi. These rearrangements occurred independently in the different species lineages, and they, along with an expanded chromosomal phylogeny reported here, allow the identification of several whole plasmid transfer events among these species. Phylogenetic analyses of the plasmid partition genes show that a majority of the plasmid compatibility types arose early, most likely before separation of the Lyme agent Borreliella and relapsing fever Borrelia clades, and this, with occasional cross species plasmid transfers, has resulted in few if any species-specific or geographic region-specific Borreliella plasmid types.The primordial origin and persistent maintenance of the Borreliella plasmid types support their functional indispensability as well as evolutionary roles in facilitating genome diversity. The improved resolution of Borreliella plasmid phylogeny based on conserved partition-gene clusters will lead to better determination of gene orthology which is essential for prediction of biological function, and it will provide a basis for inferring detailed evolutionary mechanisms of Borreliella genomic variability including homologous gene and plasmid exchanges as well as non-homologous rearrangements.


Talk with an expert

If you have a question, need to check the status of an order, or are interested in purchasing an instrument, we're here to help.