September 22, 2019  |  

Genome-wide identification and analysis of the ALTERNATIVE OXIDASE gene family in diploid and hexaploid wheat.

A comprehensive understanding of wheat responses to environmental stress will contribute to the long-term goal of feeding the planet. ALTERNATIVE OXIDASE (AOX) genes encode proteins involved in a bypass of the electron transport chain and are also known to be involved in stress tolerance in multiple species. Here, we report the identification and characterization of the AOX gene family in diploid and hexaploid wheat. Four genes each were found in the diploid ancestors Triticum urartu and Aegilops tauschii, and three in Aegilops speltoides. In hexaploid wheat (Triticum aestivum), 20 genes were identified, some with multiple splice variants, corresponding to a total of 24 proteins for those with observed transcription and translation. These proteins were classified as AOX1a, AOX1c, AOX1e or AOX1d via phylogenetic analysis. Proteins lacking most or all signature AOX motifs were assigned to putative regulatory roles. Analysis of protein-targeting sequences suggests mixed localization to the mitochondria and other organelles. In comparison to the most studied AOX from Trypanosoma brucei, there were amino acid substitutions at critical functional domains indicating possible role divergence in wheat or grasses in general. In hexaploid wheat, AOX genes were expressed at specific developmental stages as well as in response to both biotic and abiotic stresses such as fungal pathogens, heat and drought. These AOX expression patterns suggest a highly regulated and diverse transcription and expression system. The insights gained provide a framework for the continued and expanded study of AOX genes in wheat for stress tolerance through breeding new varieties, as well as resistance to AOX-targeted herbicides, all of which can ultimately be used synergistically to improve crop yield.


September 22, 2019  |  

Full-length transcriptome survey and expression analysis of Cassia obtusifolia to discover putative genes related to aurantio-obtusin biosynthesis, seed formation and development, and stress response.

The seed is the pharmaceutical and breeding organ of Cassia obtusifolia, a well-known medicinal herb that contains aurantio-obtusin (a kind of anthraquinone) and is also used as food and in landscaping. To understand the molecular mechanisms of aurantio-obtusin biosynthesis, seed formation and development, and stress response in C. obtusifolia, genomic information is needed. Although the seed transcriptome of C. obtusifolia has previously been sequenced with short-read next-generation sequencing (NGS) technology, the vast majority of the resulting unigenes did not represent full-length cDNA sequences or supply enough gene expression profile information for the various organs or tissues. In this study, fifteen cDNA libraries, which were constructed from the seed, root, stem, leaf, and flower (three repetitions for each organ) of C. obtusifolia, were sequenced using a hybrid approach combining single-molecule real-time (SMRT) and NGS platforms. More than 4,315,774 long reads with 9.66 Gb of sequencing data and 361,427,021 short reads with 108.13 Gb of sequencing data were generated by the SMRT and NGS platforms, respectively. 67,222 consensus isoforms were clustered from the reads, 81.73% (61,016) of which were longer than 1000 bp. Furthermore, the 67,222 consensus isoforms represented 58,106 nonredundant transcripts, 98.25% (57,092) of which were annotated and 25,573 of which were assigned to specific metabolic pathways by KEGG. CoDXS and CoDXR genes were directly used for functional characterization to validate the accuracy of sequences obtained from the transcriptome. A total of 658 seed-specific transcripts indicated their special roles in physiological processes in the seed. Analysis of transcripts involved in the early stage of anthraquinone biosynthesis suggested that the aurantio-obtusin in C. obtusifolia was mainly generated from the isochorismate and mevalonate/methylerythritol phosphate (MVA/MEP) pathways, and that three reactions catalyzed by menaquinone-specific isochorismate synthase (ICS), 1-deoxy-d-xylulose-5-phosphate synthase (DXS) and isopentenyl diphosphate (IPPS) might be the limiting steps. Several seed-specific CYPs, SAM-dependent methyltransferase, and UDP-glycosyltransferase (UDPG) supplied promising candidate genes in the late stage of anthraquinone biosynthesis. In addition, four seed-specific transcription factors (three MYB transcription factors (MYB) and one MADS-box transcription factor (MADS)) and alternative splicing might be involved in seed formation and development. Meanwhile, most members of the Hsp20 genes showed high expression levels in seed and flower, seven of which might have chaperone activities under various abiotic stresses. Finally, the expression patterns of genes of particular interest showed similar trends in both the transcriptome assay and qRT-PCR. In conclusion, this is the first full-length transcriptome sequencing reported in the Caesalpiniaceae family, and it thus provides a more complete insight into aurantio-obtusin biosynthesis, seed formation and development, and stress response in C. obtusifolia.


September 22, 2019  |  

Sixteen diverse laboratory mouse reference genomes define strain-specific haplotypes and novel functional loci.

We report full-length draft de novo genome assemblies for 16 widely used inbred mouse strains and find extensive strain-specific haplotype variation. We identify and characterize 2,567 regions on the current mouse reference genome exhibiting the greatest sequence diversity. These regions are enriched for genes involved in pathogen defence and immunity and exhibit enrichment of transposable elements and signatures of recent retrotransposition events. Combinations of alleles and genes unique to an individual strain are commonly observed at these loci, reflecting distinct strain phenotypes. We used these genomes to improve the mouse reference genome, resulting in the completion of 10 new gene structures. Also, 62 new coding loci were added to the reference genome annotation. These genomes identified a large, previously unannotated, gene (Efcab3-like) encoding 5,874 amino acids. Mutant Efcab3-like mice display anomalies in multiple brain regions, suggesting a possible role for this gene in the regulation of brain development.


September 22, 2019  |  

Order of removal of conventional and nonconventional introns from nuclear transcripts of Euglena gracilis.

Nuclear genes of euglenids and marine diplonemids harbor atypical, nonconventional introns which are not observed in the genomes of other eukaryotes. Nonconventional introns do not have the conserved borders characteristic of spliceosomal introns or the sequence complementary to U1 snRNA at the 5′ end. They form a stable secondary structure bringing together both exon/intron junctions; nevertheless, this conformation does not resemble the form of self-splicing or tRNA introns. In the genes studied so far, frequent nonconventional intron insertions at new positions have been observed, whereas conventional introns have been either found at conserved positions or simply lost. In this work, we examined the order of intron removal from Euglena gracilis transcripts of the tubA and gapC genes, which contain two types of introns: nonconventional and spliceosomal. The relative order of intron excision was compared for pairs of introns belonging to different types. Furthermore, intermediate products of splicing were analyzed using the PacBio Next Generation Sequencing system. The analysis led to the main conclusion that nonconventional introns are removed rapidly, but later than spliceosomal introns. Moreover, the observed accumulation of transcripts with conventional introns removed and nonconventional introns present may suggest the existence of a time gap between the two types of splicing.


September 22, 2019  |  

Recent insights into the tick microbiome gained through next-generation sequencing.

The tick microbiome comprises communities of microorganisms, including viruses, bacteria and eukaryotes, and is being elucidated through modern molecular techniques. The advent of next-generation sequencing (NGS) technologies has enabled the genes and genomes within these microbial communities to be explored in a rapid and cost-effective manner. The advantages of using NGS to investigate microbiomes surpass the traditional non-molecular methods that are limited in their sensitivity, and conventional molecular approaches that are limited in their scalability. In recent years the number of studies using NGS to investigate the microbial diversity and composition of ticks has expanded. Here, we provide a review of NGS strategies for tick microbiome studies and discuss the recent findings from tick NGS investigations, including the bacterial diversity and composition, influential factors, and implications of the tick microbiome.


September 22, 2019  |  

Evidence of the red-queen hypothesis from accelerated rates of evolution of genes involved in biotic interactions in Pneumocystis.

Pneumocystis species are ascomycete fungi adapted to live inside the lungs of mammals. These ascomycetes show extensive stenoxenism, meaning that each species of Pneumocystis infects a single species of host. Here, we study the effect exerted by natural selection on gene evolution in the genomes of three Pneumocystis species. We show that genes involved in host interaction evolve under positive selection. In the first place, we found strong evidence of episodic diversifying selection in Major surface glycoproteins (Msg). These proteins are located on the surface of Pneumocystis and are used for host attachment and probably for immune system evasion. Consistent with their function as antigens, most sites under diversifying selection in Msg code for residues with large relative surface accessibility areas. We also found evidence of positive selection in part of the cell machinery used to export Msg to the cell surface. Specifically, we found that genes participating in glycosylphosphatidylinositol (GPI) biosynthesis show an increased rate of nonsynonymous substitutions (dN) versus synonymous substitutions (dS). GPI is a molecule synthesized in the endoplasmic reticulum that is used to anchor proteins to membranes. We interpret the aforementioned findings as evidence of selective pressure exerted by the host immune system on Pneumocystis species, shaping the evolution of Msg and several proteins involved in GPI biosynthesis. We suggest that genome evolution in Pneumocystis is well described by the Red-Queen hypothesis whereby genes relevant for biotic interactions show accelerated rates of evolution.


September 22, 2019  |  

Identification of a biosynthetic gene cluster for the polyene macrolactam sceliphrolactam in a Streptomyces strain isolated from mangrove sediment.

Streptomyces are a genus of Actinobacteria capable of producing structurally diverse natural products. Here we report the isolation and characterization of a biosynthetically talented Streptomyces (Streptomyces sp. SD85) from tropical mangrove sediments. Whole-genome sequencing revealed that Streptomyces sp. SD85 harbors at least 52 biosynthetic gene clusters (BGCs), which constitute 21.2% of the 8.6-Mb genome. When cultivated under lab conditions, Streptomyces sp. SD85 produces sceliphrolactam, a 26-membered polyene macrolactam with unknown biosynthetic origin. Genome mining yielded a putative sceliphrolactam BGC (sce) that encodes a type I modular polyketide synthase (PKS) system, several β-amino acid starter biosynthetic enzymes, transporters, and transcriptional regulators. Using the CRISPR/Cas9-based gene knockout method, we demonstrated that the sce BGC is essential for sceliphrolactam biosynthesis. Unexpectedly, the PKS system encoded by sce is short of one module required for assembling the 26-membered macrolactam skeleton according to the collinearity rule. With experimental data disfavoring the involvement of a trans-PKS module, the biosynthesis of sceliphrolactam seems to be best rationalized by invoking a mechanism whereby the PKS system employs an iterative module to catalyze two successive chain extensions with different outcomes. The potential violation of the collinearity rule makes the mechanism distinct from those of other polyene macrolactams.


September 22, 2019  |  

The genomes of Crithidia bombi and C. expoeki, common parasites of bumblebees.

Trypanosomatids (Trypanosomatidae, Kinetoplastida) are flagellated protozoa containing many parasites of medical or agricultural importance. Among those, Crithidia bombi and C. expoeki are common parasites of bumble bees around the world, and are phylogenetically close to Leishmania and Leptomonas. They have a simple and direct life cycle with one host, and partially castrate the founding queens, greatly reducing their fitness. Here, we report the nuclear genome sequences of one clone of each species, extracted from a field-collected infection. Using a combination of Roche 454 FLX Titanium, Pacific Biosciences PacBio RS, and Illumina GA2 instruments for C. bombi, and PacBio for C. expoeki, we could produce high-quality and well-resolved sequences. We find that these genomes are around 32 and 34 Mb, with 7,808 and 7,851 annotated genes for C. bombi and C. expoeki, respectively, which is somewhat less than reported from other trypanosomatids, with few introns, and organized in polycistronic units. A large fraction of genes received plausible functional support in comparison primarily with Leishmania and Trypanosoma. Comparing the annotated genes of the two species with those of six other trypanosomatids (C. fasciculata, L. pyrrhocoris, L. seymouri, B. ayalai, L. major, and T. brucei) shows similar gene repertoires and many orthologs. Similar to other trypanosomatids, we also find signs of concerted evolution in genes putatively involved in the interaction with the host, a high degree of synteny between C. bombi and C. expoeki, and considerable overlap with several other species in the set. A total of 86 orthologous gene groups show signatures of positive selection in the branch leading to the two Crithidia under study, mostly of unknown function. As an example, we examined the initiating glycosylation pathway of surface components in C. bombi, finding that it deviates from most other eukaryotes and also from other kinetoplastids, which may indicate rapid evolution in the extracellular matrix that is involved in interactions with the host. Bumble bees are important pollinators and Crithidia infections are suspected to cause substantial selection pressure on their host populations. These newly sequenced genomes provide tools that should help better understand host-parasite interactions in these pollinator pathogens.


September 22, 2019  |  

Repeat-driven generation of antigenic diversity in a major human pathogen, Trypanosoma cruzi

Trypanosoma cruzi, a zoonotic kinetoplastid protozoan with a complex genome, is the causative agent of American trypanosomiasis (Chagas disease). The parasite uses a highly diverse repertoire of surface molecules, with roles in cell invasion, immune evasion and pathogenesis. Thus far, the genomic regions containing these genes have been impossible to resolve and it has been impossible to study the structure and function of the several thousand repetitive genes encoding the surface molecules of the parasite. We here present an improved genome assembly of a T. cruzi clade I (TcI) strain using high coverage PacBio single molecule sequencing, together with Illumina sequencing of 34 T. cruzi TcI isolates and clones from different geographic locations, sample sources and clinical outcomes. Resolution of the surface molecule gene structure reveals an unusual duality in the organisation of the parasite genome, a core genomic region syntenous with related protozoa flanked by unique and highly plastic subtelomeric regions encoding surface antigens. The presence of abundant interspersed retrotransposons in the subtelomeres suggests that these elements are involved in a recombination mechanism for the generation of antigenic variation and evasion of the host immune response. The comparative genomic analysis of the cohort of TcI strains revealed multiple cases of such recombination events involving surface molecule genes and has provided new insights into T. cruzi population structure.


September 22, 2019  |  

Comparative genomics analysis of plasmid pPV989-94 from a clinical isolate of Pantoea vagans PV989.

Pantoea vagans, a gram-negative bacterium from the genus Pantoea and family Enterobacteriaceae, is present in various natural environments and is considered to be a plant endophyte. We isolated the Pantoea vagans PV989 strain from the clinic and sequenced its whole genome. Besides a chromosomal DNA molecule, it also harboured three large plasmids. A comparative genomics analysis was performed for the smallest plasmid, pPV989-94. It can be divided into four regions, including three conserved regions related to replication (R1), transfer conjugation (R2), and transfer leading (R3), and one variable region (R4). Further analysis showed that pPV989-94 is most similar to plasmids LA637P2 and pEA68 of Erwinia amylovora strains isolated from fruit trees. These three plasmids share the three conserved regions (R1, R2, and R3). Interestingly, a fragment (R4′) in R4, mediated by phage integrase and phage integrase family site-specific recombinase and encoding 9 genes related to glycometabolism, resistance, and DNA repair, was unique to pPV989-94. Homologues of R4′ were found in other plasmids or chromosomes, suggesting that horizontal gene transfer (HGT) occurred among different bacteria of various species or genera. The acquired functional genes may play important roles in the adaptation of bacteria to different hosts or environmental conditions.


September 22, 2019  |  

A molecular window into the biology and epidemiology of Pneumocystis spp.

Pneumocystis, a unique atypical fungus with an elusive lifestyle, has had an important medical history. It came to prominence as an opportunistic pathogen that not only can cause life-threatening pneumonia in patients with HIV infection and other immunodeficiencies but also can colonize the lungs of healthy individuals from a very early age. The genus Pneumocystis includes a group of closely related but heterogeneous organisms that have a worldwide distribution, have been detected in multiple mammalian species, are highly host species specific, inhabit the lungs almost exclusively, and have never convincingly been cultured in vitro, making Pneumocystis a fascinating but difficult-to-study organism. Improved molecular biologic methodologies have opened a new window into the biology and epidemiology of Pneumocystis. Advances include an improved taxonomic classification, identification of an extremely reduced genome and concomitant inability to metabolize and grow independent of the host lungs, insights into its transmission mode, recognition of its widespread colonization in both immunocompetent and immunodeficient hosts, and utilization of strain variation to study drug resistance, epidemiology, and outbreaks of infection among transplant patients. This review summarizes these advances and also identifies some major questions and challenges that need to be addressed to better understand Pneumocystis biology and its relevance to clinical care. Copyright © 2018 American Society for Microbiology.


September 22, 2019  |  

Integrating long-range connectivity information into de Bruijn graphs.

The de Bruijn graph is a simple and efficient data structure that is used in many areas of sequence analysis including genome assembly, read error correction and variant calling. The data structure has a single parameter k, is straightforward to implement and is tractable for large genomes with high sequencing depth. It also enables representation of multiple samples simultaneously to facilitate comparison. However, unlike the string graph, a de Bruijn graph does not retain long range information that is inherent in the read data. For this reason, applications that rely on de Bruijn graphs can produce sub-optimal results given their input data. We present a novel assembly graph data structure: the Linked de Bruijn Graph (LdBG). Constructed by adding annotations on top of a de Bruijn graph, it stores long range connectivity information through the graph. We show that with error-free data it is possible to losslessly store and recover sequence from a Linked de Bruijn graph. With assembly simulations we demonstrate that the LdBG data structure outperforms both our de Bruijn graph and the String Graph Assembler (SGA). Finally we apply the LdBG to Klebsiella pneumoniae short read data to make large (12 kbp) variant calls, which we validate using PacBio sequencing data, and to characterize the genomic context of drug-resistance genes. Linked de Bruijn Graphs and associated algorithms are implemented as part of McCortex, which is available under the MIT license at https://github.com/mcveanlab/mccortex. Supplementary data are available at Bioinformatics online.
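The loss of long-range information described in this abstract can be illustrated with a toy example. The sketch below is not the McCortex implementation or the LdBG itself; the function names, reads and choice of k are assumptions made purely for illustration. It builds a plain de Bruijn graph from two short reads that share the 3-mer GCT and shows that the graph branches there, so the graph alone no longer records which prefix belongs with which suffix.

```python
from collections import defaultdict


def build_de_bruijn(reads, k):
    """Build a de Bruijn graph: each (k-1)-mer maps to its successor (k-1)-mers."""
    graph = defaultdict(set)
    for read in reads:
        # Slide a window of length k along the read; each k-mer is one edge.
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            graph[kmer[:-1]].add(kmer[1:])
    return dict(graph)


def branching_nodes(graph):
    """Return nodes with more than one successor, i.e. ambiguous junctions."""
    return sorted(node for node, succ in graph.items() if len(succ) > 1)


# Two hypothetical reads that share the 3-mer "GCT" in the middle.
reads = ["AAGCTCC", "TTGCTGG"]
graph = build_de_bruijn(reads, k=4)

# After the shared node the graph offers two continuations, and nothing in
# the graph says that "AAG..." belongs with "...TCC" rather than "...TGG".
print(branching_nodes(graph))
print(sorted(graph["GCT"]))
```

Storing, for each read, which branch it took at such junctions is the kind of annotation that would restore the lost connectivity; the details of how the LdBG does this are in the paper, not in this sketch.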


September 22, 2019  |  

The genome of Naegleria lovaniensis, the basis for a comparative approach to unravel pathogenicity factors of the human pathogenic amoeba N. fowleri.

Members of the genus Naegleria are free-living eukaryotes with the capability to transform from the amoeboid form into resting cysts or moving flagellates in response to environmental conditions. More than 40 species have been characterized, but only Naegleria fowleri (N. fowleri) is known as a human pathogen causing primary amoebic meningoencephalitis (PAM), a fast progressing and mostly fatal disease of the central nervous system. Several studies report an involvement of phospholipases and other molecular factors, but the mechanisms involved in pathogenesis are still poorly understood. To gain a better understanding of the relationships within the genus of Naegleria and to investigate pathogenicity factors of N. fowleri, we characterized the genome of its closest non-pathogenic relative N. lovaniensis. To gain insights into the taxonomy of Naegleria, we sequenced the genome of N. lovaniensis using long read sequencing technology. The assembly of the data resulted in a 30 Mb genome including the circular mitochondrial sequence. Unravelling the phylogenetic relationship using OrthoMCL protein clustering and maximum likelihood methods confirms the close relationship of N. lovaniensis and N. fowleri. To achieve an overview of the diversity of Naegleria proteins and to assess characteristics of the human pathogen N. fowleri, OrthoMCL protein clustering including data of N. fowleri, N. lovaniensis and N. gruberi was performed. GO enrichment analysis shows an association of N. fowleri specific proteins to the GO terms “Membrane” and “Protein Secretion.” In this study, we characterize the hitherto unknown genome of N. lovaniensis. With the description of the 30 Mb genome, a further piece is added to reveal the complex taxonomic relationship of Naegleria. Further, the whole genome sequencing data confirms the hypothesis of the close relationship between N. fowleri and N. lovaniensis. Therefore, the genome of N. lovaniensis provides the basis for further comparative approaches on the molecular and genomic level to unravel pathogenicity factors of its closest human pathogenic relative N. fowleri and possible treatment options for the rare but mostly fatal primary meningoencephalitis.


September 22, 2019  |  

The structure of a conserved telomeric region associated with variant antigen loci in the blood parasite Trypanosoma congolense

African trypanosomiasis is a vector-borne disease of humans and livestock caused by African trypanosomes (Trypanosoma spp.). Survival in the vertebrate bloodstream depends on antigenic variation of Variant Surface Glycoproteins (VSGs) coating the parasite surface. In T. brucei, a model for antigenic variation, monoallelic VSG expression originates from dedicated VSG expression sites (VES). Trypanosoma brucei VES have a conserved structure consisting of a telomeric VSG locus downstream of unique, repeat sequences, and an independent promoter. Additional protein-coding sequences, known as “Expression Site Associated Genes (ESAGs)”, are also often present and are implicated in diverse, bloodstream-stage functions. Trypanosoma congolense is a related veterinary pathogen, also displaying VSG-mediated antigenic variation. A T. congolense VES has not been described, making it unclear if regulation of VSG expression is conserved between species. Here, we describe a conserved telomeric region associated with VSG loci from long-read DNA sequencing of two T. congolense strains, which consists of a distal repeat, conserved noncoding elements and other genes besides the VSG; although these are not orthologous to T. brucei ESAGs. Most conserved telomeric regions are associated with accessory minichromosomes, but the same structure may also be associated with megabase chromosomes. We propose that this region represents the T. congolense VES, and through comparison with T. brucei, we discuss the parallel evolution of antigenic switching mechanisms, and unique adaptation of the T. brucei VES for developmental regulation of bloodstream-stage genes. Hence, we provide a basis for understanding antigenic switching in T. congolense and the origins of the African trypanosome VES.


September 22, 2019  |  

Genomic assemblies of newly sequenced Trypanosoma cruzi strains reveal new genomic expansion and greater complexity.

Chagas disease is a complex illness, caused by the protozoan Trypanosoma cruzi, that displays highly diverse clinical outcomes. Elucidating and comparing the genome sequences of different strains may therefore improve understanding of the disease. Here, two new T. cruzi strains were sequenced (Y using Illumina and Bug2148 using PacBio), assembled, analyzed and compared with the annotated T. cruzi genomes available to date. The assembly statistics for the new sequences show an effective improvement over the existing T. cruzi genomes, such as the largest contig assembled in de novo attempts (1.3 Mb in Bug2148) and the highest mean assembly coverage (71X for Y). Our analysis reveals a new genomic expansion and greater complexity for those multi-copy gene families related to the infection process and disease development, such as Trans-sialidases, Mucins and Mucin Associated Surface Proteins, among others. On the one hand, we demonstrate that multi-copy gene families are located near telomeric regions of the “chromosome-like” 1.3 Mb contig assembled for Bug2148, where they are likely subject to high evolutionary pressure. On the other hand, we identified several strain-specific single copy genes that might help to understand the differences in infectivity and physiology among strains. In summary, our results indicate that T. cruzi has a complex genomic architecture that may have promoted its evolution.

