The advantages of Pacific Biosciences (PacBio) single-molecule real-time (SMRT) technology include long reads, low systematic bias, and high consensus read accuracy. Here we use these attributes to improve on the genome annotation of the parasitic hookworm Ancylostoma ceylanicum using PacBio RNA-Seq. We sequenced 192,888 circular consensus sequences (CCS) derived from cDNAs generated using the CloneTech SMARTer system. These SMARTer-SMRT libraries were normalized and size-selected, providing a robust population of expressed structural genes for subsequent genome annotation. We demonstrate that genome annotation based on PacBio mRNA sequences improves on annotation using conventional sequencing-by-synthesis alone: it identified 1609 (9.2%) new genes, extended the length of 3965 (26.7%) genes, and increased the total genomic exon length by 1.9 Mb (12.4%). Non-coding sequence representation (primarily from UTRs based on dT reverse transcription priming) was particularly improved, increasing fifteen-fold in total length through gains in both the length and number of UTR exons. In addition, the UTR data provided by these CCS allowed for the identification of a novel SL2 splice leader sequence for A. ceylanicum and an increase in the number and proportion of functionally annotated genes. RNA-seq data also confirmed some of the newly annotated genes and gene features. Overall, PacBio data have supported a significant improvement in gene annotation in this genome, and are an appealing alternative or complementary technique for genome annotation relative to other transcript sequencing technologies.
Antigenic variation in malaria was discovered in Plasmodium knowlesi studies involving longitudinal infections of rhesus macaques (M. mulatta). The variant proteins, known as the P. knowlesi Schizont Infected Cell Agglutination (SICA) antigens and the P. falciparum Erythrocyte Membrane Protein 1 (PfEMP1) antigens, expressed by the SICAvar and var multigene families, respectively, have been studied for over 30 years. Expression of the SICA antigens in P. knowlesi requires a splenic component, and specific antibodies are necessary for variant antigen switch events in vivo. Outstanding questions revolve around the role of the spleen and the mechanisms by which the expression of these variant antigen families is regulated. Importantly, the longitudinal dynamics and molecular mechanisms that govern variant antigen expression can be studied with P. knowlesi infection of its mammalian and vector hosts. Synchronous infections can be initiated with established clones and studied at multi-omic levels, with the benefit of computational tools from systems biology that permit the integration of datasets and the design of explanatory, predictive mathematical models. Here we provide an historical account of this topic, while highlighting the potential for maximizing the use of P. knowlesi – macaque model systems and summarizing exciting new progress in this area of research.
Trypanosomatids (Trypanosomatidae, Kinetoplastida) are flagellated protozoa containing many parasites of medical or agricultural importance. Among those, Crithidia bombi and C. expoeki are common parasites in bumble bees around the world, and phylogenetically close to Leishmania and Leptomonas. They have a simple and direct life cycle with one host, and partially castrate the founding queens, greatly reducing their fitness. Here, we report the nuclear genome sequences of one clone of each species, extracted from a field-collected infection. Using a combination of Roche 454 FLX Titanium, Pacific Biosciences PacBio RS, and Illumina GA2 instruments for C. bombi, and PacBio for C. expoeki, we could produce high-quality and well resolved sequences. We find that these genomes are around 32 and 34 Mb, with 7,808 and 7,851 annotated genes for C. bombi and C. expoeki, respectively, which is somewhat fewer than reported from other trypanosomatids; the genes have few introns and are organized in polycistronic units. A large fraction of genes received plausible functional support in comparison primarily with Leishmania and Trypanosoma. Comparing the annotated genes of the two species with those of six other trypanosomatids (C. fasciculata, L. pyrrhocoris, L. seymouri, B. ayalai, L. major, and T. brucei) shows similar gene repertoires and many orthologs. Similar to other trypanosomatids, we also find signs of concerted evolution in genes putatively involved in the interaction with the host, a high degree of synteny between C. bombi and C. expoeki, and considerable overlap with several other species in the set. A total of 86 orthologous gene groups show signatures of positive selection in the branch leading to the two Crithidia under study, mostly of unknown function. As an example, we examined the initiating glycosylation pathway of surface components in C. bombi, finding it deviates from most other eukaryotes and also from other kinetoplastids, which may indicate rapid evolution in the extracellular matrix that is involved in interactions with the host. Bumble bees are important pollinators and Crithidia infections are suspected to cause substantial selection pressure on their host populations. These newly sequenced genomes provide tools that should help better understand host-parasite interactions in these pollinator pathogens.
Plasmodium knowlesi, a common parasite of macaques, is recognised as a significant cause of human malaria in Malaysia. The P. knowlesi A1-H.1 line has been adapted to continuous culture in human erythrocytes, successfully providing an in vitro model to study the parasite. We have assembled a reference genome for the PkA1-H.1 line using PacBio long-read combined with Illumina short-read sequence data. Compared with the H-strain reference, the new reference has improved genome coverage and a novel description of methylation sites. The PkA1-H.1 reference will enhance the capabilities of the in vitro model to improve the understanding of P. knowlesi infection in humans. Copyright © 2017 The Authors. Published by Elsevier Ltd. All rights reserved.
Comparative heterochromatin profiling reveals conserved and unique epigenome signatures linked to adaptation and development of malaria parasites.
Heterochromatin-dependent gene silencing is central to the adaptation and survival of Plasmodium falciparum malaria parasites, allowing clonally variant gene expression during blood infection in humans. By assessing genome-wide heterochromatin protein 1 (HP1) occupancy, we present a comprehensive analysis of heterochromatin landscapes across different Plasmodium species, strains, and life cycle stages. Common targets of epigenetic silencing include fast-evolving multi-gene families encoding surface antigens and a small set of conserved HP1-associated genes with regulatory potential. Many P. falciparum heterochromatic genes are marked in a strain-specific manner, increasing the parasite’s adaptive capacity. Whereas heterochromatin is strictly maintained during mitotic proliferation of asexual blood stage parasites, substantial heterochromatin reorganization occurs in differentiating gametocytes and appears crucial for the activation of key gametocyte-specific genes and adaptation of erythrocyte remodeling machinery. Collectively, these findings provide a catalog of heterochromatic genes and reveal conserved and specialized features of epigenetic control across the genus Plasmodium. Copyright © 2018 The Authors. Published by Elsevier Inc. All rights reserved.
RTS,S/AS01 malaria vaccine mismatch observed among Plasmodium falciparum isolates from southern and central Africa and globally.
The RTS,S/AS01 malaria vaccine encompasses the central repeats and C-terminus of Plasmodium falciparum circumsporozoite protein (PfCSP). Although no Phase II clinical trial studies observed evidence of strain-specific immunity, recent studies show a decrease in vaccine efficacy against non-vaccine strain parasites. In light of goals to reduce malaria morbidity, anticipating the effectiveness of RTS,S/AS01 is critical to planning widespread vaccine introduction. We deep sequenced C-terminal Pfcsp from 77 individuals living along the international border in Luapula Province, Zambia and Haut-Katanga Province, the Democratic Republic of the Congo (DRC) and compared translated amino acid haplotypes to the 3D7 vaccine strain. Only 5.2% of the 193 PfCSP sequences from the Zambia-DRC border region matched 3D7 at all 84 amino acids. To further contextualize the genetic diversity sampled in this study with global PfCSP diversity, we analyzed an additional 3,809 Pfcsp sequences from the Pf3k database and constructed a haplotype network representing 15 countries from Africa and Asia. The diversity observed in our samples was similar to the diversity observed in the global haplotype network. These observations underscore the need for additional research assessing genetic diversity in P. falciparum and the impact of PfCSP diversity on RTS,S/AS01 efficacy.
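The comparison of translated haplotypes against a vaccine-strain reference can be illustrated with a minimal sketch. The sequences below are short toy strings, not real PfCSP data, and the function names are hypothetical; the study's actual analysis pipeline is not described in the abstract.

```python
def fraction_matching_reference(haplotypes, reference):
    """Return the fraction of haplotypes identical to the reference at every position."""
    if not haplotypes:
        raise ValueError("no haplotypes supplied")
    for hap in haplotypes:
        if len(hap) != len(reference):
            raise ValueError("haplotype length differs from reference length")
    matches = sum(1 for hap in haplotypes if hap == reference)
    return matches / len(haplotypes)


def mismatch_positions(haplotype, reference):
    """List the 1-based positions where a haplotype differs from the reference."""
    return [i + 1 for i, (a, b) in enumerate(zip(haplotype, reference)) if a != b]


# Toy data (assumption): an 8-residue "reference" and four sampled haplotypes.
reference = "KPKDELDY"
haplotypes = ["KPKDELDY", "KPKDQLDY", "KPKDELNY", "KPKDELDY"]

print(fraction_matching_reference(haplotypes, reference))  # 0.5
print(mismatch_positions("KPKDQLDY", reference))           # [5]
```

A full-length match is a strict criterion: a haplotype with a single amino acid difference anywhere in the compared region counts as a mismatch.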
The genome sequence of “Candidatus Fokinia solitaria”: Insights on reductive evolution in Rickettsiales.
Candidatus Fokinia solitaria is an obligate intracellular endosymbiont of a unicellular eukaryote, a ciliate of the genus Paramecium. Here, we present the genome sequence of this bacterium and subsequent analysis. Phylogenomic analysis confirmed the previously reported positioning of the symbiont within the “Candidatus Midichloriaceae” family (order Rickettsiales), as well as its high sequence divergence from other members of the family, indicative of fast sequence evolution. Consistent with this high evolutionary rate, a comparative genomic analysis revealed that the genome of this symbiont is the smallest of the Rickettsiales to date. The reduced genome does not present flagellar genes, nor the pathway for the biosynthesis of lipopolysaccharides (present in all the other members of the family “Candidatus Midichloriaceae” sequenced so far) or genes for the Krebs cycle (present, although not always complete, in Rickettsiales). These results indicate an evolutionary trend toward a stronger dependence on the host, in comparison with other members of the family. Two alternative scenarios are compatible with our results: “Candidatus Fokinia solitaria” could be either a recently evolved, vertically transmitted mutualist, or a parasite with a high host-specificity.
Plasmodium falciparum, the most virulent agent of human malaria, shares a recent common ancestor with the gorilla parasite Plasmodium praefalciparum. Little is known about the other gorilla- and chimpanzee-infecting species in the same (Laverania) subgenus as P. falciparum, but none of them are capable of establishing repeated infection and transmission in humans. To elucidate underlying mechanisms and the evolutionary history of this subgenus, we have generated multiple genomes from all known Laverania species. The completeness of our dataset allows us to conclude that interspecific gene transfers, as well as convergent evolution, were important in the evolution of these species. Striking copy number and structural variations were observed within gene families and one, stevor, shows a host-specific sequence pattern. The complete genome sequence of the closest ancestor of P. falciparum enables us to estimate the timing of the beginning of speciation to be 40,000-60,000 years ago followed by a population bottleneck around 4,000-6,000 years ago. Our data allow us also to search in detail for the features of P. falciparum that made it the only member of the Laverania able to infect and spread in humans.
A parasitic lifestyle, where plants procure some or all of their nutrients from other living plants, has evolved independently in many dicotyledonous plant families and is a major threat for agriculture globally. Nevertheless, no genome sequence of a parasitic plant has been reported to date. Here we describe the genome sequence of the parasitic field dodder, Cuscuta campestris. The genome contains signatures of a fairly recent whole-genome duplication and lacks genes for pathways superfluous to a parasitic lifestyle. Specifically, genes needed for high photosynthetic activity are lost, explaining the low photosynthesis rates displayed by the parasite. Moreover, several genes involved in nutrient uptake processes from the soil are lost. On the other hand, evidence for horizontal gene transfer by way of genomic DNA integration from the parasite’s hosts is found. We conclude that the parasitic lifestyle has left characteristic footprints in the C. campestris genome.
Dodders (Cuscuta spp., Convolvulaceae) are root- and leafless parasitic plants. The physiology, ecology, and evolution of these obligate parasites are poorly understood. A high-quality reference genome of Cuscuta australis was assembled. Our analyses reveal that Cuscuta experienced accelerated molecular evolution, and Cuscuta and the convolvulaceous morning glory (Ipomoea) shared a common whole-genome triplication event before their divergence. The C. australis genome harbors 19,671 protein-coding genes, and importantly, 11.7% of the conserved orthologs in autotrophic plants are lost in C. australis. Many of these gene loss events likely result from its parasitic lifestyle and the massive changes of its body plan. Moreover, comparison of the gene expression patterns in Cuscuta prehaustoria/haustoria and various tissues of closely related autotrophic plants suggests that Cuscuta haustorium formation requires mostly genes normally involved in root development. The C. australis genome provides important resources for studying the evolution of parasitism, regressive evolution, and evo-devo in plant parasites.
Plasmodium vivax-like genome sequences shed new insights into Plasmodium vivax biology and evolution.
Although Plasmodium vivax is responsible for the majority of malaria infections outside Africa, little is known about its evolution and pathway to humans. Its closest genetic relative, P. vivax-like, was discovered in African great apes and is hypothesized to have given rise to P. vivax in humans. To unravel the evolutionary history and adaptation of P. vivax to different host environments, we generated 2 new P. vivax-like reference genomes and 9 additional P. vivax-like genotypes using long- and short-read sequencing technologies. Analyses show that the genomes of P. vivax and P. vivax-like are highly similar and colinear within the core regions. Phylogenetic analyses clearly show that P. vivax-like parasites form a genetically distinct clade from P. vivax. Concerning the relative divergence dating, we show that the evolution of P. vivax in humans did not occur at the same time as the other agents of human malaria, thus suggesting that the transfer of Plasmodium parasites to humans happened several times independently over the history of the Homo genus. We further identify several key genes that exhibit signatures of positive selection exclusively in the human P. vivax parasites. Two of these genes have been identified to also be under positive selection in the other main human malaria agent, P. falciparum, thus suggesting their key role in the evolution of the ability of these parasites to infect humans or their anthropophilic vectors. Finally, we demonstrate that some gene families important for red blood cell (RBC) invasion (a key step of the life cycle of these parasites) have undergone lineage-specific evolution in the human parasite (e.g., reticulocyte-binding proteins [RBPs]).
Exploring benzimidazole resistance in Haemonchus contortus by next generation sequencing and droplet digital PCR.
Anthelmintic resistance in gastrointestinal nematode (GIN) parasites of grazing ruminants is on the rise in countries across the world. Haemonchus contortus is one of the most frequently encountered drug-resistant GINs in small ruminants. This blood-sucking abomasal nematode contributes to massive treatment costs and poses a serious threat to farm animal health. To prevent the establishment of resistant strains of this parasite, up-to-date molecular techniques are needed that allow quick, cheap and accurate identification of individuals infected with resistant worms. Efforts were made in the previous decade, with the development of the pyrosequencing method to detect resistance-predicting alleles. Here we propose a novel droplet digital PCR (ddPCR) assay for rapid and precise identification of H. contortus strains as being resistant or susceptible to benzimidazole drugs based on the presence or absence of the most common resistance-conferring mutation F200Y (TAC) in the β-tubulin isotype 1 gene. The newly developed ddPCR assay was first optimized and validated utilizing DNA templates from single-worm samples, which were previously sequenced using the next generation PacBio RSII Sequencing (NGS) platform. Subsequent NGS results for faecal larval cultures were then used as a reference to compare the obtained values for fractional abundances of the resistance-determining mutant allele between ddPCR and NGS techniques in each sample. Both methods produced highly similar results, and ddPCR proved to be a reliable tool which, when utilized at full capacity, can be used to create a powerful mutation detection and quantification assay. Copyright © 2018 The Authors. Published by Elsevier Ltd. All rights reserved.
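The fractional abundance of a mutant allele in a digital PCR run can be sketched as follows. This is a minimal illustration under the standard Poisson model for droplet partitioning, not the assay's actual software; the function names and droplet counts are hypothetical, and both targets are assumed to be read from the same set of accepted droplets.

```python
import math


def copies_per_droplet(positive, total):
    """Poisson-corrected mean target copies per droplet, from droplet counts."""
    if total <= 0 or not 0 <= positive < total:
        raise ValueError("need 0 <= positive < total and total > 0")
    # Under Poisson partitioning, P(droplet is negative) = exp(-lambda).
    return math.log(total / (total - positive))


def fractional_abundance(mutant_positive, wildtype_positive, total):
    """Mutant copies as a fraction of mutant plus wild-type copies."""
    mutant = copies_per_droplet(mutant_positive, total)
    wildtype = copies_per_droplet(wildtype_positive, total)
    if mutant + wildtype == 0:
        raise ValueError("no positive droplets for either allele")
    return mutant / (mutant + wildtype)


# Hypothetical droplet counts, for illustration only.
print(fractional_abundance(mutant_positive=1500,
                           wildtype_positive=4500,
                           total=15000))
```

The Poisson correction matters when many droplets are positive, because a single positive droplet may then contain more than one template copy.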
The structure of a conserved telomeric region associated with variant antigen loci in the blood parasite Trypanosoma congolense
African trypanosomiasis is a vector-borne disease of humans and livestock caused by African trypanosomes (Trypanosoma spp.). Survival in the vertebrate bloodstream depends on antigenic variation of Variant Surface Glycoproteins (VSGs) coating the parasite surface. In T. brucei, a model for antigenic variation, monoallelic VSG expression originates from dedicated VSG expression sites (VES). Trypanosoma brucei VES have a conserved structure consisting of a telomeric VSG locus downstream of unique, repeat sequences, and an independent promoter. Additional protein-coding sequences, known as “Expression Site Associated Genes (ESAGs)”, are also often present and are implicated in diverse, bloodstream-stage functions. Trypanosoma congolense is a related veterinary pathogen, also displaying VSG-mediated antigenic variation. A T. congolense VES has not been described, making it unclear if regulation of VSG expression is conserved between species. Here, we describe a conserved telomeric region associated with VSG loci from long-read DNA sequencing of two T. congolense strains, which consists of a distal repeat, conserved noncoding elements and other genes besides the VSG; although these are not orthologous to T. brucei ESAGs. Most conserved telomeric regions are associated with accessory minichromosomes, but the same structure may also be associated with megabase chromosomes. We propose that this region represents the T. congolense VES, and through comparison with T. brucei, we discuss the parallel evolution of antigenic switching mechanisms, and unique adaptation of the T. brucei VES for developmental regulation of bloodstream-stage genes. Hence, we provide a basis for understanding antigenic switching in T. congolense and the origins of the African trypanosome VES.
The genome of tapeworm Taenia multiceps sheds light on understanding parasitic mechanism and control of coenurosis disease.
Coenurosis, caused by the larval coenurus of the tapeworm Taenia multiceps, is a fatal central nervous system disease in both sheep and humans. Though treatment and prevention options are available, the control of coenurosis still presents great challenges. Here, we present a high-quality genome sequence of T. multiceps in which 240 Mb (96%) of the genome has been successfully assembled using PacBio single-molecule real-time (SMRT) and Hi-C data with an N50 length of 44.8 Mb. In total, 49.5 Mb (20.6%) repeat sequences and 13,013 gene models were identified. We found an expansion of transposable elements and recent small-scale gene duplications in Taenia spp. following the divergence of Taenia from Echinococcus, but not in Echinococcus genomes, and the genes underlying environmental adaptability and dosage effect tend to be over-retained in the T. multiceps genome. Moreover, we identified several genes encoding proteins involved in proglottid formation and interactions with the host central nervous system, which may contribute to the adaptation of T. multiceps to its parasitic lifestyle. Our study not only provides insights into the biology and evolution of T. multiceps, but also identifies a set of species-specific gene targets for developing novel treatment and control tools for coenurosis.
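For readers unfamiliar with the N50 statistic quoted above, a minimal sketch of its usual definition follows: the length of the shortest sequence in the smallest set of longest sequences that together cover at least half of the assembly. The lengths used here are toy values, not the T. multiceps assembly, and the function name is hypothetical.

```python
def n50(lengths):
    """Return the N50 of a collection of contig or scaffold lengths."""
    if not lengths:
        raise ValueError("no sequence lengths supplied")
    total = sum(lengths)
    running = 0
    # Walk from the longest sequence down until half the assembly is covered.
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length


# Toy lengths (assumption): total is 290, so half is reached at 80 + 70 = 150.
print(n50([80, 70, 50, 40, 30, 20]))  # 70
```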
Genomic assemblies of newly sequenced Trypanosoma cruzi strains reveal new genomic expansion and greater complexity.
Chagas disease is a complex illness caused by the protozoan Trypanosoma cruzi, displaying highly diverse clinical outcomes. Elucidating and comparing the genome sequences of different strains may therefore improve understanding of the disease. Here, two new T. cruzi strains have been sequenced (Y using Illumina and Bug2148 using PacBio), assembled, analyzed, and compared with the annotated T. cruzi genomes available to date. The assembly statistics for the new sequences show an effective improvement over the current T. cruzi genomes, such as the largest contig assembled in de novo attempts (1.3 Mb in Bug2148) and the highest mean assembly coverage (71X for Y). Our analysis reveals a new genomic expansion and greater complexity for those multi-copy gene families related to the infection process and disease development, such as Trans-sialidases, Mucins and Mucin Associated Surface Proteins, among others. On the one hand, we demonstrate that multi-copy gene families are located near telomeric regions of the “chromosome-like” 1.3 Mb contig assembled from Bug2148, where they are likely subject to high evolutionary pressure. On the other hand, we identified several strain-specific single copy genes that might help to understand the differences in infectivity and physiology among strains. In summary, our results indicate that T. cruzi has a complex genomic architecture that may have promoted its evolution.