April 21, 2020  |  

Progression of the canonical reference malaria parasite genome from 2002-2019.

Here we describe the ways in which the sequence and annotation of the Plasmodium falciparum reference genome have changed since its publication in 2002. Because P. falciparum is the malaria species responsible for the most deaths worldwide, the richness of annotation and accuracy of the sequence are important resources for the P. falciparum research community as well as the basis for interpreting the genomes of subsequently sequenced species. At the time of publication in 2002, over 60% of predicted genes had unknown functions. As of March 2019, this proportion had decreased to 33%. The reduction is due to the inclusion of genes that were subsequently characterised experimentally and genes with significant similarity to others with known functions. In addition, the structural annotation of genes has been significantly refined; 27% of gene structures have been changed since 2002, comprising changes in exon-intron boundaries, addition or deletion of exons and the addition or deletion of genes. The sequence has also undergone significant improvements. In addition to the correction of a large number of single-base and insertion or deletion errors, a major misassembly between the subtelomeres of chromosomes 7 and 8 has been corrected. As the number of sequenced isolates continues to grow rapidly, a single reference genome will not be an adequate basis for interpreting intra-species sequence diversity. We therefore describe in this publication a population reference genome of P. falciparum, called Pfref1. This reference will enable the community to map to regions that are not present in the current assembly. P. falciparum 3D7 will continue to be maintained, with ongoing curation ensuring continual improvements in annotation quality.


October 23, 2019  |  

Optimized CRISPR-Cas9 genome editing for Leishmania and its use to target a multigene family, induce chromosomal translocation, and study DNA break repair mechanisms.

CRISPR-Cas9-mediated genome editing has recently been adapted for Leishmania spp. parasites, the causative agents of human leishmaniasis. We have optimized this genome-editing tool by selecting for cells with CRISPR-Cas9 activity through cotargeting the miltefosine transporter gene; mutation of this gene leads to miltefosine resistance. This cotargeting strategy integrated into a triple guide RNA (gRNA) expression vector was used to delete all 11 copies of the A2 multigene family; this was not previously possible with the traditional gene-targeting method. We found that the Leishmania donovani rRNA promoter is more efficient than the U6 promoter in driving gRNA expression, and sequential transfections of the oligonucleotide donor significantly eased the isolation of edited mutants. A gRNA and Cas9 coexpression vector was developed that was functional in all tested Leishmania species, including L. donovani, L. major, and L. mexicana. By simultaneously targeting sites from two different chromosomes, all four types of targeted chromosomal translocations were generated, regardless of the polycistronic transcription direction from the parent chromosomes. It was possible to use this CRISPR system to create a single conserved amino acid substitution (A189G) mutation for both alleles of RAD51, a DNA recombinase involved in homology-directed repair. We found that RAD51 is essential for L. donovani survival based on direct observation of the death of mutants with both RAD51 alleles disrupted, further confirming that this CRISPR system can reveal gene essentiality. Evidence is also provided that microhomology-mediated end joining (MMEJ) plays a major role in double-strand DNA break repair in L. donovani.
IMPORTANCE Leishmania parasites cause human leishmaniasis. To accelerate characterization of Leishmania genes for new drug and vaccine development, we optimized and simplified the CRISPR-Cas9 genome-editing tool for Leishmania. We show that co-CRISPR targeting of the miltefosine transporter gene and serial transfections of an oligonucleotide donor significantly eased isolation of edited mutants. This cotargeting strategy was efficiently used to delete all 11 members of the A2 virulence gene family. This technical advancement is valuable, since there are many gene clusters and supernumerary chromosomes in the various Leishmania species and isolates. We simplified this CRISPR system by developing a gRNA and Cas9 coexpression vector which could be used to delete genes in various Leishmania species. This CRISPR system could also be used to generate specific chromosomal translocations, which will help in the study of Leishmania gene expression and transcription control. This study also provides new information about double-strand DNA break repair mechanisms in Leishmania.


October 23, 2019  |  

Transmission, evolution, and endogenization: Lessons learned from recent retroviral invasions.

Viruses of the subfamily Orthoretrovirinae are defined by the ability to reverse transcribe an RNA genome into DNA that integrates into the host cell genome during the intracellular virus life cycle. Exogenous retroviruses (XRVs) are horizontally transmitted between host individuals, with disease outcome depending on interactions between the retrovirus and the host organism. When retroviruses infect germ line cells of the host, they may become endogenous retroviruses (ERVs), which are permanent elements in the host germ line that are subject to vertical transmission. These ERVs sometimes remain infectious and can themselves give rise to XRVs. This review integrates recent developments in the phylogenetic classification of retroviruses and the identification of retroviral receptors to elucidate the origins and evolution of XRVs and ERVs. We consider whether ERVs may recurrently pressure XRVs to shift receptor usage to sidestep ERV interference. We discuss how related retroviruses undergo alternative fates in different host lineages after endogenization, with koala retrovirus (KoRV) receiving notable interest as a recent invader of its host germ line. KoRV is heritable but also infectious, which provides insights into the early stages of germ line invasions as well as XRV generation from ERVs. The relationship of KoRV to primate and other retroviruses is placed in the context of host biogeography and the potential role of bats and rodents as vectors for interspecies viral transmission. Combining studies of extant XRVs and “fossil” endogenous retroviruses in koalas and other Australasian species has broadened our understanding of the evolution of retroviruses and host-retrovirus interactions. Copyright © 2017 American Society for Microbiology.


September 22, 2019  |  

Distinguishing highly similar gene isoforms with a clustering-based bioinformatics analysis of PacBio single-molecule long reads.

Gene isoforms are commonly found in both prokaryotes and eukaryotes. Since each isoform may perform a specific function in response to changing environmental conditions, studying the dynamics of gene isoforms is important in understanding biological processes and disease conditions. However, genome-wide identification of gene isoforms is technically challenging due to the high degree of sequence identity among isoforms. The traditional targeted sequencing approach, involving Sanger sequencing of plasmid-cloned PCR products, has low throughput and is very tedious and time-consuming. Next-generation sequencing technologies such as Illumina and 454 achieve high throughput, but their short read lengths are a critical barrier to accurate assembly of highly similar gene isoforms and may result in ambiguities and false joining during sequence assembly. More recently, the third-generation sequencer represented by the PacBio platform offers sufficient throughput and long reads covering the full length of typical genes, thus providing the potential to reliably profile gene isoforms. However, the PacBio long reads are error-prone and cannot be effectively analyzed by traditional assembly programs. We present a clustering-based analysis pipeline integrated with PacBio sequencing data for profiling highly similar gene isoforms. This approach was first evaluated in comparison to de novo assembly of 454 reads using a benchmark admixture containing 10 known, cloned msg genes encoding the major surface glycoprotein of Pneumocystis jirovecii. All 10 msg isoforms were successfully reconstructed with the expected length (~1.5 kb) and correct sequence by the new approach, while 454 reads could not be correctly assembled using various assembly programs. When using an additional benchmark admixture containing 22 known P. jirovecii msg isoforms, this approach accurately reconstructed all but 4 of these isoforms at full length (~3 kb); these 4 isoforms were present in low concentrations in the admixture. Finally, when applied to the original clinical sample from which the 22 known msg isoforms were cloned, this approach successfully identified not only all known isoforms accurately (~3 kb each) but also 48 novel isoforms. PacBio sequencing integrated with the clustering-based analysis pipeline achieves high-throughput and high-resolution discrimination of highly similar sequences, and can serve as a new approach for genome-wide characterization of gene isoforms and other highly repetitive sequences.
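The abstract does not specify how the clustering step works, so the following is only a toy sketch of the general idea of grouping error-prone reads by similarity before calling a consensus. The greedy strategy, the `difflib`-based similarity score, the function name `greedy_cluster`, and the 0.8 threshold are all assumptions made for illustration; they are not the authors' pipeline.

```python
from difflib import SequenceMatcher


def similarity(a, b):
    """Return a 0..1 similarity score between two sequences (difflib ratio)."""
    return SequenceMatcher(None, a, b, autojunk=False).ratio()


def greedy_cluster(reads, threshold=0.8):
    """Assign each read to the first cluster whose representative is similar enough.

    The first read placed in a cluster acts as its representative.
    The threshold is a hypothetical value chosen for this illustration.
    """
    clusters = []  # each cluster is a list of reads; clusters[i][0] is the representative
    for read in reads:
        for cluster in clusters:
            if similarity(read, cluster[0]) >= threshold:
                cluster.append(read)
                break
        else:
            # No existing cluster was close enough, so start a new one.
            clusters.append([read])
    return clusters


if __name__ == "__main__":
    reads = ["ACGTACGTAC", "ACGTACGTAA", "TTTTGGGGCC"]
    for i, cluster in enumerate(greedy_cluster(reads)):
        print(i, cluster)
```

In a real analysis each cluster would then be collapsed into a consensus sequence so that random read errors cancel out; that step is omitted here.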


September 22, 2019  |  

Analysis of aquaporins from the euryhaline barnacle Balanus improvisus reveals differential expression in response to changes in salinity.

Barnacles are sessile macro-invertebrates, found along rocky shores in coastal areas worldwide. The euryhaline bay barnacle Balanus improvisus (Darwin, 1854) (= Amphibalanus improvisus) can tolerate a wide range of salinities, but the molecular mechanisms underlying the osmoregulatory capacity of this truly brackish species are not well understood. Aquaporins are pore-forming integral membrane proteins that facilitate transport of water, small solutes and ions through cellular membranes, and that have been shown to be important for osmoregulation in many organisms. The knowledge of the function of aquaporins in crustaceans is, however, limited and nothing is known about them in barnacles. We here present the repertoire of aquaporins from a thecostracan crustacean, the barnacle B. improvisus, based on genome and transcriptome sequencing. Our analyses reveal that B. improvisus contains eight genes for aquaporins. Phylogenetic analysis showed that they represented members of the classical water aquaporins (Aqp1, Aqp2), the aquaglyceroporins (Glp1, Glp2), the unorthodox aquaporin (Aqp12) and the arthropod-specific big brain aquaporin (Bib). Interestingly, we also found two big brain-like proteins (BibL1 and BibL2) constituting a new group of aquaporins not yet described in arthropods. In addition, we found that the two water-specific aquaporins were expressed as C-terminal splice variants. Heterologous expression of some of the aquaporins followed by functional characterization showed that Aqp1 transported water and Glp2 water and glycerol, agreeing with the predictions of substrate specificity based on 3D modeling and phylogeny. To investigate a possible role for the B. improvisus aquaporins in osmoregulation, mRNA expression changes in adult barnacles were analysed after long-term acclimation to different salinities. 
The most pronounced expression difference was seen for AQP1 with a substantial (>100-fold) decrease in the mantle tissue in low salinity (3 PSU) compared to high salinity (33 PSU). Our study provides a base for future mechanistic studies on the role of aquaporins in osmoregulation.


September 22, 2019  |  

Avian transcriptomics: opportunities and challenges

Recent developments in next-generation sequencing technologies have greatly facilitated the study of whole transcriptomes in model and non-model species. Studying the transcriptome and how it changes across a variety of biological conditions has had major implications for our understanding of how the genome is regulated in different contexts, and how to interpret adaptations and the phenotype of an organism. The aim of this review is to highlight the potential of these new technologies for the study of avian transcriptomics, and to summarise how transcriptomics has been applied in ornithology. A total of 81 peer-reviewed scientific articles that used transcriptomics to answer questions within a broad range of study areas in birds are used as examples throughout the review. We further provide a quick guide to highlight the most important points that need to be taken into account when planning a transcriptomic study in birds, and discuss how researchers with little background in molecular biology can avoid potential pitfalls. Suggestions for further reading are supplied throughout. We also discuss possible future developments in the technology platforms used for ribonucleic acid sequencing. By summarising how these novel technologies can be used to answer questions that have long been asked by ornithologists, we hope to bridge the gap between traditional ornithology and genomics, and to stimulate more interdisciplinary research.


September 22, 2019  |  

Current progress in EBV-associated B-cell lymphomas.

Epstein-Barr virus (EBV), discovered more than 50 years ago, was the first human tumor virus identified. EBV-associated lymphomagenesis is still a significant viral-associated disease as it involves a diverse range of pathologies, especially B-cell lymphomas. Recent developments in high-throughput next-generation sequencing technologies and in vivo mouse models have significantly advanced our understanding of the fundamental molecular mechanisms that drive these cancers and have allowed for the development of therapeutic intervention strategies. This review will highlight the current advances in EBV-associated B-cell lymphomas, focusing on transcriptional regulation, chromosome aberrations, in vivo studies of EBV-mediated lymphomagenesis, as well as the treatment strategies to target viral-associated lymphomas.


September 22, 2019  |  

Quantitative profiling of Drosophila melanogaster Dscam1 isoforms reveals no changes in splicing after bacterial exposure.

The hypervariable Dscam1 (Down syndrome cell adhesion molecule 1) gene can produce thousands of different ectodomain isoforms via mutually exclusive alternative splicing. Dscam1 appears to be involved in the immune response of some insects and crustaceans. It has been proposed that the diverse isoforms may be involved in the recognition of, or the defence against, diverse parasite epitopes, although evidence to support this is sparse. A prediction that can be generated from this hypothesis is that the gene expression of specific exons and/or isoforms is influenced by exposure to an immune elicitor. To test this hypothesis, we use, for the first time, a long-read RNA sequencing method to directly investigate the Dscam1 splicing pattern after exposing adult Drosophila melanogaster and an S2 cell line to live Escherichia coli. After bacterial exposure, both models showed increased expression of immune-related genes, indicating that the immune system had been activated. However, there were no changes in total Dscam1 mRNA expression. RNA sequencing further showed that there were no significant changes in individual exon expression and no changes in isoform splicing patterns in response to bacterial exposure. Therefore, our studies do not support a change of D. melanogaster Dscam1 isoform diversity in response to live E. coli. Nevertheless, in future this approach could be used to identify potentially immune-related Dscam1 splicing regulation in other host species or in response to other pathogens.


September 22, 2019  |  

Improved full-length killer cell immunoglobulin-like receptor transcript discovery in Mauritian cynomolgus macaques.

Killer cell immunoglobulin-like receptors (KIRs) modulate disease progression of pathogens including HIV, malaria, and hepatitis C. Cynomolgus and rhesus macaques are widely used as nonhuman primate models to study human pathogens, and so considerable effort has been put into characterizing their KIR genetics. However, previous studies have relied on cDNA cloning and Sanger sequencing, which lack the throughput of current sequencing platforms. In this study, we present a high-throughput, full-length allele discovery method utilizing Pacific Biosciences circular consensus sequencing (CCS). We also describe a new approach to Macaque Exome Sequencing (MES) and the development of the Rhexome1.0, an adapted target capture reagent that includes macaque-specific capture probe sets. By using sequence reads generated by whole genome sequencing (WGS) and MES to inform primer design, we were able to increase the sensitivity of KIR allele discovery. We demonstrate this increased sensitivity by defining nine novel alleles within a cohort of Mauritian cynomolgus macaques (MCM), a geographically isolated population with restricted KIR genetics that was thought to be completely characterized. Finally, we describe an approach to genotyping KIRs directly from sequence reads generated using WGS/MES. The findings presented here expand our understanding of KIR genetics in MCM by associating new genes with all eight KIR haplotypes and demonstrating the existence of at least one KIR3DS gene associated with every haplotype.


September 22, 2019  |  

The gut commensal microbiome of Drosophila melanogaster is modified by the endosymbiont Wolbachia.

Endosymbiotic Wolbachia bacteria and the gut microbiome have independently been shown to affect several aspects of insect biology, including reproduction, development, life span, stem cell activity, and resistance to human pathogens, in insect vectors. This work shows that Wolbachia bacteria, which reside mainly in the fly germline, affect the microbial species present in the fly gut in a lab-reared strain. Drosophila melanogaster hosts two main genera of commensal bacteria: Acetobacter and Lactobacillus. Wolbachia-infected flies have significantly reduced titers of Acetobacter. Sampling of the microbiome of axenic flies fed with equal proportions of both bacteria shows that the presence of Wolbachia bacteria is a significant determinant of the composition of the microbiome throughout fly development. However, this effect is host genotype dependent. To investigate the mechanism of microbiome modulation, the effect of Wolbachia bacteria on Imd and reactive oxygen species pathways, the main regulators of immune response in the fly gut, was measured. The presence of Wolbachia bacteria does not induce significant changes in the expression of the genes for the effector molecules in either pathway. Furthermore, microbiome modulation is not due to direct interaction between Wolbachia bacteria and gut microbes. Confocal analysis shows that Wolbachia bacteria are absent from the gut lumen. These results indicate that the mechanistic basis of the modulation of composition of the microbiome by Wolbachia bacteria is more complex than a direct bacterial interaction or the effect of Wolbachia bacteria on fly immunity. The findings reported here highlight the importance of considering the composition of the gut microbiome and host genetic background during Wolbachia-induced phenotypic studies and when formulating microbe-based disease vector control strategies.
IMPORTANCE Wolbachia bacteria are intracellular bacteria present in the microbiome of a large fraction of insects and parasitic nematodes. They can block mosquitos’ ability to transmit several infectious disease-causing pathogens, including Zika, dengue, chikungunya, and West Nile viruses and malaria parasites. Certain extracellular bacteria present in the gut lumen of these insects can also block pathogen transmission. However, our understanding of interactions between Wolbachia and gut bacteria and how they influence each other is limited. Here we show that the presence of Wolbachia strain wMel changes the composition of gut commensal bacteria in the fruit fly. Our findings implicate interactions between bacterial species as a key factor in determining the overall composition of the microbiome and thus reveal new paradigms to consider in the development of disease control strategies.


September 22, 2019  |  

Single molecule RNA sequencing uncovers trans-splicing and improves annotations in Anopheles stephensi.

Single molecule real-time (SMRT) sequencing has recently been used to obtain full-length cDNA sequences that improve genome annotation and reveal RNA isoforms. Here, we used one such method called isoform sequencing from Pacific Biosciences (PacBio) to sequence a cDNA library from the Asian malaria mosquito Anopheles stephensi. More than 600 000 full-length cDNAs, referred to as reads of insert, were identified. Owing to the inherently high error rate of PacBio sequencing, we tested different approaches for error correction. We found that error correction using Illumina RNA sequencing (RNA-seq) generated more data than using the default SMRT pipeline. The full-length error-corrected PacBio reads greatly improved the gene annotation of Anopheles stephensi: 4867 gene models were updated and 1785 alternatively spliced isoforms were added to the annotation. In addition, six trans-splicing events, where exons from different primary transcripts were joined together, were identified in An. stephensi. All six trans-splicing events appear to be conserved in Culicidae, as they are also found in Anopheles gambiae and Aedes aegypti. The proteins encoded by trans-splicing events are also highly conserved and the orthologues of these proteins are cis-spliced in outgroup species, indicating that trans-splicing may arise as a mechanism to rescue genes that broke up during evolution. © 2017 The Royal Entomological Society.


September 22, 2019  |  

Plasmodium knowlesi: a superb in vivo nonhuman primate model of antigenic variation in malaria.

Antigenic variation in malaria was discovered in Plasmodium knowlesi studies involving longitudinal infections of rhesus macaques (M. mulatta). The variant proteins, known as the P. knowlesi Schizont Infected Cell Agglutination (SICA) antigens and the P. falciparum Erythrocyte Membrane Protein 1 (PfEMP1) antigens, expressed by the SICAvar and var multigene families, respectively, have been studied for over 30 years. Expression of the SICA antigens in P. knowlesi requires a splenic component, and specific antibodies are necessary for variant antigen switch events in vivo. Outstanding questions revolve around the role of the spleen and the mechanisms by which the expression of these variant antigen families is regulated. Importantly, the longitudinal dynamics and molecular mechanisms that govern variant antigen expression can be studied with P. knowlesi infection of its mammalian and vector hosts. Synchronous infections can be initiated with established clones and studied at multi-omic levels, with the benefit of computational tools from systems biology that permit the integration of datasets and the design of explanatory, predictive mathematical models. Here we provide an historical account of this topic, while highlighting the potential for maximizing the use of P. knowlesi-macaque model systems and summarizing exciting new progress in this area of research.


September 22, 2019  |  

Redkmer: An Assembly-Free Pipeline for the Identification of Abundant and Specific X-Chromosome Target Sequences for X-Shredding by CRISPR Endonucleases.

CRISPR-based synthetic sex ratio distorters, which operate by shredding the X-chromosome during male meiosis, are promising tools for the area-wide control of harmful insect pest or disease vector species. X-shredders have been proposed as tools to suppress insect populations by biasing the sex ratio of the wild population toward males, thus reducing its natural reproductive potential. However, to build synthetic X-shredders based on CRISPR, the selection of gRNA targets, in the form of high-copy sequence repeats on the X chromosome of a given species, is difficult, since such repeats are not accurately resolved in genome assemblies and cannot be assigned to chromosomes with confidence. We have therefore developed the redkmer computational pipeline, designed to identify short and highly abundant sequence elements occurring uniquely on the X chromosome. Redkmer was designed to use as input minimally processed whole genome sequence data from males and females. We tested redkmer with short- and long-read whole genome sequence data of Anopheles gambiae, the major vector of human malaria, in which the X-shredding paradigm was originally developed. Redkmer established long reads as chromosomal proxies with excellent correlation to the genome assembly and used them to rank X-candidate kmers for their level of X-specificity and abundance. Among these, a high-confidence set of 25-mers was identified, many belonging to previously known X-chromosome repeats of Anopheles gambiae, including the ribosomal gene array and the selfish elements harbored within it. Data from a control strain, in which these repeats are shared with the Y chromosome, confirmed the elimination of these kmers during filtering. Finally, we show that redkmer output can be linked directly to gRNA selection and off-target prediction. 
In addition, the output of redkmer, including the prediction of chromosomal origin of single-molecule long reads and chromosome specific kmers, could also be used for the characterization of other biologically relevant sex chromosome sequences, a task that is frequently hampered by the repetitiveness of sex chromosome sequence content.
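As a rough illustration of the idea behind ranking k-mers by X-specificity and abundance, the sketch below counts k-mers in female and male reads and keeps those that are clearly over-represented in females. It is not the redkmer implementation: the function names, the thresholds, and the direct female-to-male count ratio are assumptions, and it presumes an XX/XY system with both samples sequenced to equal depth, so that X-linked k-mers appear roughly twice as often in females.

```python
from collections import Counter


def count_kmers(reads, k):
    """Count every k-mer of length k across a list of read strings."""
    counts = Counter()
    for read in reads:
        for i in range(len(read) - k + 1):
            counts[read[i:i + k]] += 1
    return counts


def x_candidate_kmers(female_reads, male_reads, k=25, min_ratio=1.5, min_count=2):
    """Return (kmer, (total_count, female_to_male_ratio)) pairs, most abundant first.

    min_ratio and min_count are hypothetical cut-offs chosen for illustration.
    K-mers absent from the female sample (e.g. Y-linked) are never considered.
    """
    female = count_kmers(female_reads, k)
    male = count_kmers(male_reads, k)
    candidates = {}
    for kmer, f in female.items():
        m = male.get(kmer, 0)
        if f + m < min_count:
            continue  # too rare to be a useful high-copy target
        ratio = f / m if m else float("inf")
        if ratio >= min_ratio:
            candidates[kmer] = (f + m, ratio)
    return sorted(candidates.items(), key=lambda kv: kv[1][0], reverse=True)


if __name__ == "__main__":
    females = ["AAAACCCC", "AAAACCCC"]
    males = ["AAAACCCC", "GGGGTTTT", "GGGGTTTT"]
    for kmer, (total, ratio) in x_candidate_kmers(females, males, k=4):
        print(kmer, total, ratio)
```

Autosomal k-mers would show a ratio near 1 under these assumptions and are filtered out by `min_ratio`; real data would also need normalisation for sequencing depth and handling of reverse complements, both omitted here.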


September 22, 2019  |  

Genomes of 13 domesticated and wild rice relatives highlight genetic conservation, turnover and innovation across the genus Oryza.

The genus Oryza is a model system for the study of molecular evolution over time scales ranging from a few thousand to 15 million years. Using 13 reference genomes spanning the Oryza species tree, we show that, despite few large-scale chromosomal rearrangements, rapid species diversification is mirrored by lineage-specific emergence and turnover of many novel elements, including transposons, and potential new coding and noncoding genes. Our study resolves controversial areas of the Oryza phylogeny, showing a complex history of introgression among different chromosomes in the young ‘AA’ subclade containing the two domesticated species. This study highlights the prevalence of functionally coupled disease resistance genes and identifies many new haplotypes of potential use for future crop protection. Finally, this study marks a milestone in modern rice research with the release of a complete long-read assembly of IR 8 ‘Miracle Rice’, which relieved famine and drove the Green Revolution in Asia 50 years ago.


September 22, 2019  |  

A hybrid-hierarchical genome assembly strategy to sequence the invasive golden mussel Limnoperna fortunei.

For more than 25 years, the golden mussel Limnoperna fortunei has aggressively invaded South American freshwaters, having travelled more than 5,000 km upstream across five countries. Along the way, the golden mussel has outcompeted native species and economically harmed aquaculture, hydroelectric power generation, and ship transit. We have sequenced the complete genome of the golden mussel to understand the molecular basis of its invasiveness and search for ways to control it. We assembled the 1.6 Gb genome into 20548 scaffolds with an N50 length of 312 Kb using a hybrid and hierarchical assembly strategy from short and long DNA reads and transcriptomes. A total of 60717 coding genes were inferred from a customized transcriptome-trained AUGUSTUS run. We also compared predicted protein sets with those of complete molluscan genomes, revealing an exacerbation of protein-binding domains in L. fortunei. Conclusions: We built one of the best bivalve genome assemblies available with a cost-effective approach combining Illumina paired-end, mate pair, and PacBio long reads. We expect that the continuous and careful annotation of L. fortunei’s genome will contribute to the investigation of bivalve genetics, evolution, and invasiveness, as well as to the development of biotechnological tools for aquatic pest control. © The Authors 2017. Published by Oxford University Press.
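For readers unfamiliar with the N50 statistic quoted above, here is a minimal sketch of how it is commonly computed: sort the scaffold lengths from longest to shortest and report the length at which the running total first reaches half of the assembly size. The function name `n50` and the example lengths are illustrative only and are not taken from the golden mussel assembly.

```python
def n50(lengths):
    """Return the N50 of a collection of scaffold (or contig) lengths.

    N50 is the length L such that scaffolds of length >= L together
    contain at least half of the total assembled bases.
    """
    if not lengths:
        raise ValueError("lengths must not be empty")
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        # Compare 2 * running with total to avoid fractional halves.
        if 2 * running >= total:
            return length


if __name__ == "__main__":
    # Hypothetical scaffold lengths, in base pairs.
    print(n50([10, 8, 6, 4, 2]))
```

A higher N50 means more of the assembly sits in long scaffolds, which is why it is often reported alongside the scaffold count as a contiguity measure.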

