April 21, 2020  |  

Transcriptional initiation of a small RNA, not R-loop stability, dictates the frequency of pilin antigenic variation in Neisseria gonorrhoeae.

Neisseria gonorrhoeae, the sole causative agent of gonorrhea, constitutively undergoes diversification of the Type IV pilus. Gene conversion occurs between one of the several donor silent copies located in distinct loci and the recipient pilE gene, encoding the major pilin subunit of the pilus. A guanine quadruplex (G4) DNA structure and a cis-acting sRNA (G4-sRNA) are located upstream of the pilE gene and both are required for pilin antigenic variation (Av). We show that reduced sRNA transcription lowers pilin Av frequencies. Extended transcriptional elongation is not required for Av, since limiting the transcript to 32 nt allows for normal Av frequencies. Using chromatin immunoprecipitation (ChIP) assays, we show that cellular G4s are less abundant when sRNA transcription is lower. In addition, using ChIP, we demonstrate that the G4-sRNA forms a stable RNA:DNA hybrid (R-loop) with its template strand. However, modulating R-loop levels by controlling RNase HI expression does not alter G4 abundance quantified through ChIP. Since pilin Av frequencies were not altered when modulating R-loop levels by controlling RNase HI expression, we conclude that transcription of the sRNA is necessary, but stable R-loops are not required to promote pilin Av. © 2019 John Wiley & Sons Ltd.


April 21, 2020  |  

Morphological and genomic characterisation of the hybrid schistosome infecting humans in Europe reveals a complex admixture between Schistosoma haematobium and Schistosoma bovis parasites

Schistosomes cause schistosomiasis, the world's second most important parasitic disease after malaria. A peculiar feature of schistosomes is their ability to produce viable and fertile hybrids. Originally only present in the tropics, schistosomiasis is now also endemic in Europe. Based on two genetic markers the European species had been identified as a hybrid between the ruminant-infective Schistosoma bovis and the human-infective Schistosoma haematobium. Here we describe for the first time the genomic composition of the European schistosome hybrid (77% of S. haematobium and 23% of S. bovis origins), its morphometric parameters and its compatibility with the European vector snail and intermediate host. Compatibility is a key parameter for the parasite's life cycle progression. We also show that egg morphology (a classical diagnostic parameter) does not allow for differential diagnosis while genetic tests do so. Additionally, we performed genome assembly improvement and annotation of S. bovis, the parental species for which no satisfactory genome assembly was available. For the first time since the discovery of hybrid schistosomes, these results reveal at the whole genomic level a complex admixture of parental genomes highlighting (i) the high permeability of schistosomes to other species' alleles, and (ii) the importance of hybrid formation for pushing species boundaries not only conceptually but also geographically.


April 21, 2020  |  

Harnessing long-read amplicon sequencing to uncover NRPS and Type I PKS gene sequence diversity in polar desert soils.

The severity of environmental conditions at Earth's frigid zones presents attractive opportunities for microbial biomining due to their heightened potential as reservoirs for novel secondary metabolites. Arid soil microbiomes within the Antarctic and Arctic circles are remarkably rich in Actinobacteria and Proteobacteria, bacterial phyla known to be prolific producers of natural products. Yet the diversity of secondary metabolite genes within these cold, extreme environments remains largely unknown. Here, we employed amplicon sequencing using PacBio RS II, a third generation long-read platform, to survey over 200 soils spanning twelve east Antarctic and high Arctic sites for natural product-encoding genes, specifically targeting non-ribosomal peptides (NRPS) and Type I polyketides (PKS). NRPS-encoding genes were more widespread across the Antarctic, whereas PKS genes were only recoverable from a handful of sites. Many recovered sequences were deemed novel due to their low amino acid sequence similarity to known protein sequences, particularly throughout the east Antarctic sites. Phylogenetic analysis revealed that a high proportion were most similar to antifungal and biosurfactant-type clusters. Multivariate analysis showed that soil fertility factors of carbon, nitrogen and moisture displayed significant negative relationships with natural product gene richness. Our combined results suggest that secondary metabolite production is likely to be an important physiological component of survival for microorganisms inhabiting arid, nutrient-starved soils. © FEMS 2019.


April 21, 2020  |  

Rapid antigen diversification through mitotic recombination in the human malaria parasite Plasmodium falciparum.

Malaria parasites possess the remarkable ability to maintain chronic infections that fail to elicit a protective immune response, characteristics that have stymied vaccine development and cause people living in endemic regions to remain at risk of malaria despite previous exposure to the disease. These traits stem from the tremendous antigenic diversity displayed by parasites circulating in the field. For Plasmodium falciparum, the most virulent of the human malaria parasites, this diversity is exemplified by the variant gene family called var, which encodes the major surface antigen displayed on infected red blood cells (RBCs). This gene family exhibits virtually limitless diversity when var gene repertoires from different parasite isolates are compared. Previous studies indicated that this remarkable genome plasticity results from extensive ectopic recombination between var genes during mitotic replication; however, the molecular mechanisms that direct this process to antigen-encoding loci while the rest of the genome remains relatively stable were not determined. Using targeted DNA double-strand breaks (DSBs) and long-read whole-genome sequencing, we show that a single break within an antigen-encoding region of the genome can result in a cascade of recombination events leading to the generation of multiple chimeric var genes, a process that can greatly accelerate the generation of diversity within this family. We also found that recombinations did not occur randomly, but rather high-probability, specific recombination products were observed repeatedly. These results provide a molecular basis for previously described structured rearrangements that drive diversification of this highly polymorphic gene family.


April 21, 2020  |  

Draft Genome Sequence of Trypanosoma equiperdum Strain IVM-t1.

Trypanosoma equiperdum primarily parasitizes the genital organs and causes dourine in equidae. We isolated a new T. equiperdum strain, T. equiperdum IVM-t1, from the urogenital tract of a horse definitively diagnosed as having dourine in Mongolia. Here, we report the whole-genome sequence, the predicted gene models, and their annotations.


April 21, 2020  |  

Application of long read sequencing to determine expressed antigen diversity in Trypanosoma brucei infections.

Antigenic variation is employed by many pathogens to evade the host immune response, and Trypanosoma brucei has evolved a complex system to achieve this phenotype, involving sequential use of variant surface glycoprotein (VSG) genes encoded from a large repertoire of ~2,000 genes. T. brucei express multiple, sometimes closely related, VSGs in a population at any one time, and the ability to resolve and analyse this diversity has been limited. We applied long read sequencing (PacBio) to VSG amplicons generated from blood extracted from batches of mice sacrificed at time points (days 3, 6, 10 and 12) post-infection with T. brucei TREU927. The data showed that long read sequencing is reliable for resolving variant differences between VSGs, and demonstrated that there is significant expressed diversity (449 VSGs detected across 20 mice) and across the timeframe of study there was a clear semi-reproducible pattern of expressed diversity (median of 27 VSGs per sample at day 3 post infection (p.i.), 82 VSGs at day 6 p.i., 187 VSGs at day 10 p.i. and 132 VSGs by day 12 p.i.). There was also consistent detection of one VSG dominating expression across replicates at days 3 and 6, and emergence of a second dominant VSG across replicates by day 12. The innovative application of ecological diversity analysis to VSG reads enabled characterisation of hierarchical VSG expression in the dataset, and resulted in a novel method for analysing such patterns of variation. Additionally, the long read approach allowed detection of mosaic VSG expression from very few reads, the earliest in infection that such events have been detected.
Therefore, our results indicate that long read analysis is a reliable tool for resolving diverse gene expression profiles, and provides novel insights into the complexity and nature of VSG expression in trypanosomes, revealing significantly higher diversity than previously shown and the ability to identify mosaic gene formation early during the infection process.


April 21, 2020  |  

Efficiency of PacBio long read correction by 2nd generation Illumina sequencing.

Long sequencing reads offer unprecedented opportunities in analysis and reconstruction of complex genomic regions. However, the gain in sequence length is often traded for quality. Therefore, several approaches have recently been proposed (e.g. higher sequencing coverage, hybrid assembly or sequence correction) to enhance the quality of long sequencing reads. A simple and cost-effective approach is to use high-quality 2nd generation sequencing data to improve the quality of long reads. We designed a dedicated testing procedure and selected universal programs for long read correction, which provide as the output sequences that can be used in further genomic and transcriptomic studies. Our results show that HALC is the best choice for correction of long PacBio reads when both read size and quality are the main focus of the analysis. However, the tested tools show some unexpected behaviors, including read trimming and fragmentation. Copyright © 2017 Elsevier Inc. All rights reserved.


April 21, 2020  |  

In-depth analysis of the genome of Trypanosoma evansi, an etiologic agent of surra.

Trypanosoma evansi is the causative agent of the animal trypanosomiasis surra, a disease with serious economic burden worldwide. The availability of the genome of its closely related parasite Trypanosoma brucei allows us to compare their genetic and evolutionarily shared and distinct biological features. The complete genomic sequence of the T. evansi YNB strain was obtained using a combination of genomic and transcriptomic sequencing, de novo assembly, and bioinformatic analysis. The genome size of the T. evansi YNB strain was 35.2 Mb, showing 96.59% similarity in sequence and 88.97% in scaffold alignment with T. brucei. A total of 8,617 protein-coding genes, accounting for 31% of the genome, were predicted. Approximately 1,641 alternative splicing events of 820 genes were identified, with a majority mediated by intron retention, which represented a major difference in post-transcriptional regulation between T. evansi and T. brucei. Disparities in gene copy number of the variant surface glycoprotein, expression site-associated genes, microRNAs, and RNA-binding protein were clearly observed between the two parasites. The results revealed the genomic determinants of T. evansi, which encoded specific biological characteristics that distinguished them from other related trypanosome species.


April 21, 2020  |  

A systematic review of the Trypanosoma cruzi genetic heterogeneity, host immune response and genetic factors as plausible drivers of chronic chagasic cardiomyopathy.

Chagas disease is a complex tropical pathology caused by the kinetoplastid Trypanosoma cruzi. This parasite displays massive genetic diversity and has been classified by international consensus into at least six Discrete Typing Units (DTUs) that are broadly distributed in the American continent. The main clinical manifestation of the disease is chronic chagasic cardiomyopathy (CCC), which is lethal in infected individuals. However, one intriguing feature is that only 30-40% of the infected individuals will develop CCC. Some authors have suggested that the immune response, host genetic factors, virulence factors and even the massive genetic heterogeneity of T. cruzi are responsible for this clinical pattern. To date, no conclusive data explain why only a fraction of infected individuals develop CCC. Therefore, we decided to conduct a systematic review analysing the host genetic factors, immune response, cytokine production, virulence factors and the plausible association of the parasite DTUs and CCC. The epidemiological and clinical implications are herein discussed.


April 21, 2020  |  

Meiotic sex in Chagas disease parasite Trypanosoma cruzi.

Genetic exchange enables parasites to rapidly transform disease phenotypes and exploit new host populations. Trypanosoma cruzi, the parasitic agent of Chagas disease and a public health concern throughout Latin America, has for decades been presumed to exchange genetic material rarely and without classic meiotic sex. We present compelling evidence from 45 genomes sequenced from southern Ecuador that T. cruzi in fact maintains truly sexual, panmictic groups that can occur alongside others that remain highly clonal after past hybridization events. These groups with divergent reproductive strategies appear genetically isolated despite possible co-occurrence in vectors and hosts. We propose biological explanations for the fine-scale disconnectivity we observe and discuss the epidemiological consequences of flexible reproductive modes. Our study reinvigorates the hunt for the site of genetic exchange in the T. cruzi life cycle, provides tools to define the genetic determinants of parasite virulence, and reforms longstanding theory on clonality in trypanosomatid parasites.


April 21, 2020  |  

Comprehensive characterization of T-DNA integration induced chromosomal rearrangement in a birch T-DNA mutant.

Integration of T-DNA into plant genomes via Agrobacterium may interrupt gene structure and generate numerous mutants. These T-DNA-induced mutants are valuable materials for understanding the T-DNA integration model in plant research. T-DNA integration in plants is complex and still largely unknown. In this work, we report that multiple T-DNA fragments caused chromosomal translocation and deletion in a birch (Betula platyphylla × B. pendula) T-DNA mutant yl. We performed PacBio genome resequencing for yl and the result revealed that the two ends of a T-DNA can be integrated into the plant genome independently, because the two ends can be linked to different chromosomes and cause chromosomal translocation. We also found that these T-DNAs were connected into tandem fragments regardless of direction before integrating into the plant genome. In addition, the integration of T-DNA in the yl genome also caused the deletion of several chromosomal fragments. We then summarized three cases for the T-DNA integration model in the yl genome. (1) A T-DNA fragment is linked to the two ends of a double-stranded break (DSB); (2) Only one end of a T-DNA fragment is linked to a DSB; (3) A T-DNA fragment is linked to the ends of different DSBs. All the observations in the yl genome supported the DSB repair model. In this study, we showed a comprehensive genome analysis of a T-DNA mutant and provide a new insight into T-DNA integration in plants. These findings would be helpful for the analysis of T-DNA mutants with special phenotypes.


April 21, 2020  |  

Pentatricopeptide repeat poly(A) binding protein KPAF4 stabilizes mitochondrial mRNAs in Trypanosoma brucei.

In Trypanosoma brucei, most mitochondrial mRNAs undergo editing, and 3′ adenylation and uridylation. The internal sequence changes and terminal extensions are coordinated: pre-editing addition of the short (A) tail protects the edited transcript against 3′-5′ degradation, while post-editing A/U-tailing renders mRNA competent for translation. Participation of a poly(A) binding protein (PABP) in coupling of editing and 3′ modification processes has been inferred, but its identity and mechanism of action remained elusive. We report identification of KPAF4, a pentatricopeptide repeat-containing PABP which sequesters the A-tail and impedes mRNA degradation. Conversely, KPAF4 inhibits uridylation of A-tailed transcripts and, therefore, premature A/U-tailing of partially-edited mRNAs. This quality check point likely prevents translation of incompletely edited mRNAs. We also find that RNA editing substrate binding complex (RESC) mediates the interaction between the 5′ end-bound pyrophosphohydrolase MERS1 and 3′ end-associated KPAF4 to enable mRNA circularization. This event appears to be critical for edited mRNA stability.


October 23, 2019  |  

Optimized CRISPR-Cas9 genome editing for Leishmania and its use to target a multigene family, induce chromosomal translocation, and study DNA break repair mechanisms.

CRISPR-Cas9-mediated genome editing has recently been adapted for Leishmania spp. parasites, the causative agents of human leishmaniasis. We have optimized this genome-editing tool by selecting for cells with CRISPR-Cas9 activity through cotargeting the miltefosine transporter gene; mutation of this gene leads to miltefosine resistance. This cotargeting strategy integrated into a triple guide RNA (gRNA) expression vector was used to delete all 11 copies of the A2 multigene family; this was not previously possible with the traditional gene-targeting method. We found that the Leishmania donovani rRNA promoter is more efficient than the U6 promoter in driving gRNA expression, and sequential transfections of the oligonucleotide donor significantly eased the isolation of edited mutants. A gRNA and Cas9 coexpression vector was developed that was functional in all tested Leishmania species, including L. donovani, L. major, and L. mexicana. By simultaneously targeting sites from two different chromosomes, all four types of targeted chromosomal translocations were generated, regardless of the polycistronic transcription direction from the parent chromosomes. It was possible to use this CRISPR system to create a single conserved amino acid substitution (A189G) mutation for both alleles of RAD51, a DNA recombinase involved in homology-directed repair. We found that RAD51 is essential for L. donovani survival based on direct observation of the death of mutants with both RAD51 alleles disrupted, further confirming that this CRISPR system can reveal gene essentiality. Evidence is also provided that microhomology-mediated end joining (MMEJ) plays a major role in double-strand DNA break repair in L. donovani. IMPORTANCE Leishmania parasites cause human leishmaniasis. To accelerate characterization of Leishmania genes for new drug and vaccine development, we optimized and simplified the CRISPR-Cas9 genome-editing tool for Leishmania.
We show that co-CRISPR targeting of the miltefosine transporter gene and serial transfections of an oligonucleotide donor significantly eased isolation of edited mutants. This cotargeting strategy was efficiently used to delete all 11 members of the A2 virulence gene family. This technical advancement is valuable, since there are many gene clusters and supernumerary chromosomes in the various Leishmania species and isolates. We simplified this CRISPR system by developing a gRNA and Cas9 coexpression vector which could be used to delete genes in various Leishmania species. This CRISPR system could also be used to generate specific chromosomal translocations, which will help in the study of Leishmania gene expression and transcription control. This study also provides new information about double-strand DNA break repair mechanisms in Leishmania.


September 22, 2019  |  

Comparison of the mitochondrial genomes and steady state transcriptomes of two strains of the trypanosomatid parasite, Leishmania tarentolae.

U-insertion/deletion RNA editing is a post-transcriptional mitochondrial RNA modification phenomenon required for viability of trypanosomatid parasites. Small guide RNAs encoded mainly by the thousands of catenated minicircles contain the information for this editing. We analyzed by NGS technology the mitochondrial genomes and transcriptomes of two strains, the old lab UC strain and the recently isolated LEM125 strain. PacBio sequencing provided complete minicircle sequences which avoided the assembly problem of short reads caused by the conserved regions. Minicircles were identified by a characteristic size, the presence of three short conserved sequences, a region of inherently bent DNA and the presence of single gRNA genes at a fairly defined location. The LEM125 strain contained over 114 minicircles encoding different gRNAs and the UC strain only ~24 minicircles. Some LEM125 minicircles contained no identifiable gRNAs. Approximate copy numbers of the different minicircle classes in the network were determined by the number of PacBio CCS reads that assembled to each class. Mitochondrial RNA libraries from both strains were mapped against the minicircle and maxicircle sequences. Small RNA reads mapped to the putative gRNA genes but also to multiple regions outside the genes on both strands and large RNA reads mapped in many cases over almost the entire minicircle on both strands. These data suggest that minicircle transcription is complete and bidirectional, with 3′ processing yielding the mature gRNAs. Steady state RNAs in varying abundances are derived from all maxicircle genes, including portions of the repetitive divergent region. The relative extents of editing in both strains correlated with the presence of a cascade of cognate gRNAs. These data should provide the foundation for a deeper understanding of this dynamic genetic system as well as the evolutionary variation of editing in different strains.


September 22, 2019  |  

Distinguishing highly similar gene isoforms with a clustering-based bioinformatics analysis of PacBio single-molecule long reads.

Gene isoforms are commonly found in both prokaryotes and eukaryotes. Since each isoform may perform a specific function in response to changing environmental conditions, studying the dynamics of gene isoforms is important in understanding biological processes and disease conditions. However, genome-wide identification of gene isoforms is technically challenging due to the high degree of sequence identity among isoforms. The traditional targeted sequencing approach, involving Sanger sequencing of plasmid-cloned PCR products, has low throughput and is very tedious and time-consuming. Next-generation sequencing technologies such as Illumina and 454 achieve high throughput but their short read lengths are a critical barrier to accurate assembly of highly similar gene isoforms, and may result in ambiguities and false joining during sequence assembly. More recently, the third generation sequencer represented by the PacBio platform offers sufficient throughput and long reads covering the full length of typical genes, thus providing a potential to reliably profile gene isoforms. However, the PacBio long reads are error-prone and cannot be effectively analyzed by traditional assembly programs. We present a clustering-based analysis pipeline integrated with PacBio sequencing data for profiling highly similar gene isoforms. This approach was first evaluated in comparison to de novo assembly of 454 reads using a benchmark admixture containing 10 known, cloned msg genes encoding the major surface glycoprotein of Pneumocystis jirovecii. All 10 msg isoforms were successfully reconstructed with the expected length (~1.5 kb) and correct sequence by the new approach, while 454 reads could not be correctly assembled using various assembly programs. When using an additional benchmark admixture containing 22 known P. jirovecii msg isoforms, this approach accurately reconstructed all but 4 of these isoforms in their full length (~3 kb); these 4 isoforms were present in low concentrations in the admixture. Finally, when applied to the original clinical sample from which the 22 known msg isoforms were cloned, this approach successfully identified not only all known isoforms accurately (~3 kb each) but also 48 novel isoforms. PacBio sequencing integrated with the clustering-based analysis pipeline achieves high-throughput and high-resolution discrimination of highly similar sequences, and can serve as a new approach for genome-wide characterization of gene isoforms and other highly repetitive sequences.

