Gene prediction Archives - Page 16 of 57

September 22, 2019

Intraspecific comparative genomics of isolates of the Norway spruce pathogen (Heterobasidion parviporum) and identification of its potential virulence factors.

Heterobasidion parviporum is an economically most important fungal forest pathogen in northern Europe, causing root and butt rot disease of Norway spruce (Picea abies (L.) Karst.). The mechanisms underlying the pathogenesis and virulence of this species remain elusive. No reference genome to facilitate functional analysis is available for this species.To better understand the virulence factor at both phenotypic and genomic level, we characterized 15 H. parviporum isolates originating from different locations across Finland for virulence, vegetative growth, sporulation and saprotrophic wood decay. Wood decay capability and latitude of fungal origins exerted interactive effects on their virulence and appeared important for H. parviporum virulence. We sequenced the most virulent isolate, the first full genome sequences of H. parviporum as a reference genome, and re-sequenced the remaining 14 H. parviporum isolates. Genome-wide alignments and intrinsic polymorphism analysis showed that these isolates exhibited overall high genomic similarity with an average of at least 96% nucleotide identity when compared to the reference, yet had remarkable intra-specific level of polymorphism with a bias for CpG to TpG mutations. Reads mapping coverage analysis enabled the classification of all predicted genes into five groups and uncovered two genomic regions exclusively present in the reference with putative contribution to its higher virulence. Genes enriched for copy number variations (deletions and duplications) and nucleotide polymorphism were involved in oxidation-reduction processes and encoding domains relevant to transcription factors. Some secreted protein coding genes based on the genome-wide selection pressure, or the presence of variants were proposed as potential virulence candidates.Our study reported on the first reference genome sequence for this Norway spruce pathogen (H. parviporum). Comparative genomics analysis gave insight into the overall genomic variation among this fungal species and also facilitated the identification of several secreted protein coding genes as putative virulence factors for the further functional analysis. We also analyzed and identified phenotypic traits potentially linked to its virulence.

September 22, 2019

Genomic insights into the Acidobacteria reveal strategies for their success in terrestrial environments.

Members of the phylum Acidobacteria are abundant and ubiquitous across soils. We performed a large-scale comparative genome analysis spanning subdivisions 1, 3, 4, 6, 8 and 23 (n?=?24) with the goal to identify features to help explain their prevalence in soils and understand their ecophysiology. Our analysis revealed that bacteriophage integration events along with transposable and mobile elements influenced the structure and plasticity of these genomes. Low- and high-affinity respiratory oxygen reductases were detected in multiple genomes, suggesting the capacity for growing across different oxygen gradients. Among many genomes, the capacity to use a diverse collection of carbohydrates, as well as inorganic and organic nitrogen sources (such as via extracellular peptidases), was detected – both advantageous traits in environments with fluctuating nutrient environments. We also identified multiple soil acidobacteria with the potential to scavenge atmospheric concentrations of H2 , now encompassing mesophilic soil strains within the subdivision 1 and 3, in addition to a previously identified thermophilic strain in subdivision 4. This large-scale acidobacteria genome analysis reveal traits that provide genomic, physiological and metabolic versatility, presumably allowing flexibility and versatility in the challenging and fluctuating soil environment.© 2018 The Authors. Environmental Microbiology published by Society for Applied Microbiology and John Wiley & Sons Ltd.

September 22, 2019

Insights into the evolution of host association through the isolation and characterization of a novel human periodontal pathobiont, Desulfobulbus oralis.

The human oral microbiota encompasses representatives of many bacterial lineages that have not yet been cultured. Here we describe the isolation and characterization of previously uncultured Desulfobulbus oralis, the first human-associated representative of its genus. As mammalian-associated microbes rarely have free-living close relatives, D. oralis provides opportunities to study how bacteria adapt and evolve within a host. This sulfate-reducing deltaproteobacterium has adapted to the human oral subgingival niche by curtailing its physiological repertoire, losing some biosynthetic abilities and metabolic independence, and by dramatically reducing environmental sensing and signaling capabilities. The genes that enable free-living Desulfobulbus to synthesize the potent neurotoxin methylmercury were also lost by D. oralis, a notably positive outcome of host association. However, horizontal gene acquisitions from other members of the microbiota provided novel mechanisms of interaction with the human host, including toxins like leukotoxin and hemolysins. Proteomic and transcriptomic analysis revealed that most of those factors are actively expressed, including in the subgingival environment, and some are secreted. Similar to other known oral pathobionts, D. oralis can trigger a proinflammatory response in oral epithelial cells, suggesting a direct role in the development of periodontal disease.IMPORTANCE Animal-associated microbiota likely assembled as a result of numerous independent colonization events by free-living microbes followed by coevolution with their host and other microbes. Through specific adaptation to various body sites and physiological niches, microbes have a wide range of contributions, from beneficial to disease causing. Desulfobulbus oralis provides insights into genomic and physiological transformations associated with transition from an open environment to a host-dependent lifestyle and the emergence of pathogenicity. Through a multifaceted mechanism triggering a proinflammatory response, D. oralis is a novel periodontal pathobiont. Even though culture-independent approaches can provide insights into the potential role of the human microbiome “dark matter,” cultivation and experimental characterization remain important to studying the roles of individual organisms in health and disease.

September 22, 2019

Repeat-driven generation of antigenic diversity in a major human pathogen, Trypanosoma cruzi

Trypanosoma cruzi, a zoonotic kinetoplastid protozoan with a complex genome, is the causative agent of American trypanosomiasis (Chagas disease). The parasite uses a highly diverse repertoire of surface molecules, with roles in cell invasion, immune evasion and pathogenesis. Thus far, the genomic regions containing these genes have been impossible to resolve and it has been impossible to study the structure and function of the several thousand repetitive genes encoding the surface molecules of the parasite. We here present an improved genome assembly of a T. cruzi clade I (TcI) strain using high coverage PacBio single molecule sequencing, together with Illumina sequencing of 34 T. cruzi TcI isolates and clones from different geographic locations, sample sources and clinical outcomes. Resolution of the surface molecule gene structure reveals an unusual duality in the organisation of the parasite genome, a core genomic region syntenous with related protozoa flanked by unique and highly plastic subtelomeric regions encoding surface antigens. The presence of abundant interspersed retrotransposons in the subtelomeres suggests that these elements are involved in a recombination mechanism for the generation of antigenic variation and evasion of the host immune response. The comparative genomic analysis of the cohort of TcI strains revealed multiple cases of such recombination events involving surface molecule genes and has provided new insights into T. cruzi population structure.

September 22, 2019

Long-read genome sequence and assembly of Leptopilina boulardi: a specialist Drosophila parasitoid

Background: Leptopilina boulardi is a specialist parasitoid belonging to the order Hymenoptera, which attacks the larval stages of Drosophila. The Leptopilina genus has enormous value in the biological control of pests as well as in understanding several aspects of host-parasitoid biology. However, none of the members of Figitidae family has their genomes sequenced. In order to improve the understanding of the parasitoid wasps by generating genomic resources, we sequenced the whole genome of L. boulardi. Findings: Here, we report a high quality genome of L. boulardi, assembled from 70Gb of Illumina reads and 10.5Gb of PacBio reads, forming a total coverage of 230X. The 375Mb draft genome has an N50 of 275Kb with 6315 scaffolds >500bp, and encompasses >95% complete BUSCOs. The GC% of the genome is 28.26%, and RepeatMasker identified 868105 repeat elements covering 43.9% of the assembly. A total of 25259 protein-coding genes were predicted using a combination of ab-initio and RNA-Seq based methods, with an average gene size of 3.9Kb. 78.11% of the predicted genes could be annotated with at least one function. Conclusion: Our study provides a highly reliable assembly of this parasitoid wasp, which will be a valuable resource to researchers studying parasitoids. In particular, it can help delineate the host-parasitoid mechanisms that are part of the Drosophila-Leptopilina model system.

September 22, 2019

Comparative genome analysis reveals a complex population structure of Legionella pneumophila subspecies.

The majority of Legionnaires’ disease (LD) cases are caused by Legionella pneumophila, a genetically heterogeneous species composed of at least 17 serogroups. Previously, it was demonstrated that L. pneumophila consists of three subspecies: pneumophila, fraseri and pascullei. During an LD outbreak investigation in 2012, we detected that representatives of both subspecies fraseri and pascullei colonized the same water system and that the outbreak-causing strain was a new member of the least represented subspecies pascullei. We used partial sequence based typing consensus patterns to mine an international database for additional representatives of fraseri and pascullei subspecies. As a result, we identified 46 sequence types (STs) belonging to subspecies fraseri and two STs belonging to subspecies pascullei. Moreover, a recent retrospective whole genome sequencing analysis of isolates from New York State LD clusters revealed the presence of a fourth L. pneumophila subspecies that we have termed raphaeli. This subspecies consists of 15 STs. Comparative analysis was conducted using the genomes of multiple members of all four L. pneumophila subspecies. Whereas each subspecies forms a distinct phylogenetic clade within the L. pneumophila species, they share more average nucleotide identity with each other than with other Legionella species. Unique genes for each subspecies were identified and could be used for rapid subspecies detection. Improved taxonomic classification of L. pneumophila strains may help identify environmental niches and virulence attributes associated with these genetically distinct subspecies. Published by Elsevier B.V.

September 22, 2019

Comparative genomics of smut pathogens: Insights from orphans and positively selected genes into host specialization.

Host specialization is a key evolutionary process for the diversification and emergence of new pathogens. However, the molecular determinants of host range are poorly understood. Smut fungi are biotrophic pathogens that have distinct and narrow host ranges based on largely unknown genetic determinants. Hence, we aimed to expand comparative genomics analyses of smut fungi by including more species infecting different hosts and to define orphans and positively selected genes to gain further insights into the genetics basis of host specialization. We analyzed nine lineages of smut fungi isolated from eight crop and non-crop hosts: maize, barley, sugarcane, wheat, oats, Zizania latifolia (Manchurian rice), Echinochloa colona (a wild grass), and Persicaria sp. (a wild dicot plant). We assembled two new genomes: Ustilago hordei (strain Uhor01) isolated from oats and U. tritici (strain CBS 119.19) isolated from wheat. The smut genomes were of small sizes, ranging from 18.38 to 24.63 Mb. U. hordei species experienced genome expansions due to the proliferation of transposable elements and the amount of these elements varied among the two strains. Phylogenetic analysis confirmed that Ustilago is not a monophyletic genus and, furthermore, detected misclassification of the U. tritici specimen. The comparison between smut pathogens of crop and non-crop hosts did not reveal distinct signatures, suggesting that host domestication did not play a dominant role in shaping the evolution of smuts. We found that host specialization in smut fungi likely has a complex genetic basis: different functional categories were enriched in orphans and lineage-specific selected genes. The diversification and gain/loss of effector genes are probably the most important determinants of host specificity.

September 22, 2019

Genome sequence, assembly and characterization of two Metschnikowia fructicola strains used as biocontrol agents of postharvest diseases.

The yeast Metschnikowia fructicola was reported as an efficient biological control agent of postharvest diseases of fruits and vegetables, and it is the bases of the commercial formulated product “Shemer.” Several mechanisms of action by which M. fructicola inhibits postharvest pathogens were suggested including iron-binding compounds, induction of defense signaling genes, production of fungal cell wall degrading enzymes and relatively high amounts of superoxide anions. We assembled the whole genome sequence of two strains of M. fructicola using PacBio and Illumina shotgun sequencing technologies. Using the PacBio, a high-quality draft genome consisting of 93 contigs, with an estimated genome size of approximately 26 Mb, was obtained. Comparative analysis of M. fructicola proteins with the other three available closely related genomes revealed a shared core of homologous proteins coded by 5,776 genes. Comparing the genomes of the two M. fructicola strains using a SNP calling approach resulted in the identification of 564,302 homologous SNPs with 2,004 predicted high impact mutations. The size of the genome is exceptionally high when compared with those of available closely related organisms, and the high rate of homology among M. fructicola genes points toward a recent whole-genome duplication event as the cause of this large genome. Based on the assembled genome, sequences were annotated with a gene description and gene ontology (GO term) and clustered in functional groups. Analysis of CAZymes family genes revealed 1,145 putative genes, and transcriptomic analysis of CAZyme expression levels in M. fructicola during its interaction with either grapefruit peel tissue or Penicillium digitatum revealed a high level of CAZyme gene expression when the yeast was placed in wounded fruit tissue.

September 22, 2019

A manually annotated Actinidia chinensis var. chinensis (kiwifruit) genome highlights the challenges associated with draft genomes and gene prediction in plants.

Most published genome sequences are drafts, and most are dominated by computational gene prediction. Draft genomes typically incorporate considerable sequence data that are not assigned to chromosomes, and predicted genes without quality confidence measures. The current Actinidia chinensis (kiwifruit) ‘Hongyang’ draft genome has 164 Mb of sequences unassigned to pseudo-chromosomes, and omissions have been identified in the gene models.A second genome of an A. chinensis (genotype Red5) was fully sequenced. This new sequence resulted in a 554.0 Mb assembly with all but 6 Mb assigned to pseudo-chromosomes. Pseudo-chromosomal comparisons showed a considerable number of translocation events have occurred following a whole genome duplication (WGD) event some consistent with centromeric Robertsonian-like translocations. RNA sequencing data from 12 tissues and ab initio analysis informed a genome-wide manual annotation, using the WebApollo tool. In total, 33,044 gene loci represented by 33,123 isoforms were identified, named and tagged for quality of evidential support. Of these 3114 (9.4%) were identical to a protein within ‘Hongyang’ The Kiwifruit Information Resource (KIR v2). Some proportion of the differences will be varietal polymorphisms. However, as most computationally predicted Red5 models required manual re-annotation this proportion is expected to be small. The quality of the new gene models was tested by fully sequencing 550 cloned ‘Hort16A’ cDNAs and comparing with the predicted protein models for Red5 and both the original ‘Hongyang’ assembly and the revised annotation from KIR v2. Only 48.9% and 63.5% of the cDNAs had a match with 90% identity or better to the original and revised ‘Hongyang’ annotation, respectively, compared with 90.9% to the Red5 models.Our study highlights the need to take a cautious approach to draft genomes and computationally predicted genes. Our use of the manual annotation tool WebApollo facilitated manual checking and correction of gene models enabling improvement of computational prediction. This utility was especially relevant for certain types of gene families such as the EXPANSIN like genes. Finally, this high quality gene set will supply the kiwifruit and general plant community with a new tool for genomics and other comparative analysis.

September 22, 2019

Comparative genomics of the wheat fungal pathogen Pyrenophora tritici-repentis reveals chromosomal variations and genome plasticity.

Pyrenophora tritici-repentis (Ptr) is a necrotrophic fungal pathogen that causes the major wheat disease, tan spot. We set out to provide essential genomics-based resources in order to better understand the pathogenicity mechanisms of this important pathogen.Here, we present eight new Ptr isolate genomes, assembled and annotated; representing races 1, 2 and 5, and a new race. We report a high quality Ptr reference genome, sequenced by PacBio technology with Illumina paired-end data support and optical mapping. An estimated 98% of the genome coverage was mapped to 10 chromosomal groups, using a two-enzyme hybrid approach. The final reference genome was 40.9 Mb and contained a total of 13,797 annotated genes, supported by transcriptomic and proteogenomics data sets.Whole genome comparative analysis revealed major chromosomal segmental rearrangements and fusions, highlighting intraspecific genome plasticity in this species. Furthermore, the Ptr race classification was not supported at the whole genome level, as phylogenetic analysis did not cluster the ToxA producing isolates. This expansion of available Ptr genomics resources will directly facilitate research aimed at controlling tan spot disease.

September 22, 2019

Massive lateral transfer of genes encoding plant cell wall-degrading enzymes to the mycoparasitic fungus Trichoderma from its plant-associated hosts.

Unlike most other fungi, molds of the genus Trichoderma (Hypocreales, Ascomycota) are aggressive parasites of other fungi and efficient decomposers of plant biomass. Although nutritional shifts are common among hypocrealean fungi, there are no examples of such broad substrate versatility as that observed in Trichoderma. A phylogenomic analysis of 23 hypocrealean fungi (including nine Trichoderma spp. and the related Escovopsis weberi) revealed that the genus Trichoderma has evolved from an ancestor with limited cellulolytic capability that fed on either fungi or arthropods. The evolutionary analysis of Trichoderma genes encoding plant cell wall-degrading carbohydrate-active enzymes and auxiliary proteins (pcwdCAZome, 122 gene families) based on a gene tree / species tree reconciliation demonstrated that the formation of the genus was accompanied by an unprecedented extent of lateral gene transfer (LGT). Nearly one-half of the genes in Trichoderma pcwdCAZome (41%) were obtained via LGT from plant-associated filamentous fungi belonging to different classes of Ascomycota, while no LGT was observed from other potential donors. In addition to the ability to feed on unrelated fungi (such as Basidiomycota), we also showed that Trichoderma is capable of endoparasitism on a broad range of Ascomycota, including extant LGT donors. This phenomenon was not observed in E. weberi and rarely in other mycoparasitic hypocrealean fungi. Thus, our study suggests that LGT is linked to the ability of Trichoderma to parasitize taxonomically related fungi (up to adelphoparasitism in strict sense). This may have allowed primarily mycotrophic Trichoderma fungi to evolve into decomposers of plant biomass.

September 22, 2019

Transposable element genomic fissuring in Pyrenophora teres is associated with genome expansion and dynamics of host-pathogen genetic interactions.

Pyrenophora teres, P. teres f. teres (PTT) and P. teres f. maculata (PTM) cause significant diseases in barley, but little is known about the large-scale genomic differences that may distinguish the two forms. Comprehensive genome assemblies were constructed from long DNA reads, optical and genetic maps. As repeat masking in fungal genomes influences the final gene annotations, an accurate and reproducible pipeline was developed to ensure comparability between isolates. The genomes of the two forms are highly collinear, each composed of 12 chromosomes. Genome evolution in P. teres is characterized by genome fissuring through the insertion and expansion of transposable elements (TEs), a process that isolates blocks of genic sequence. The phenomenon is particularly pronounced in PTT, which has a larger, more repetitive genome than PTM and more recent transposon activity measured by the frequency and size of genome fissures. PTT has a longer cultivated host association and, notably, a greater range of host-pathogen genetic interactions compared to other Pyrenophora spp., a property which associates better with genome size than pathogen lifestyle. The two forms possess similar complements of TE families with Tc1/Mariner and LINE-like Tad-1 elements more abundant in PTT. Tad-1 was only detectable as vestigial fragments in PTM and, within the forms, differences in genome sizes and the presence and absence of several TE families indicated recent lineage invasions. Gene differences between P. teres forms are mainly associated with gene-sparse regions near or within TE-rich regions, with many genes possessing characteristics of fungal effectors. Instances of gene interruption by transposons resulting in pseudogenization were detected in PTT. In addition, both forms have a large complement of secondary metabolite gene clusters indicating significant capacity to produce an array of different molecules. This study provides genomic resources for functional genetics to help dissect factors underlying the host-pathogen interactions.

September 22, 2019

Ginseng Genome Database: an open-access platform for genomics of Panax ginseng.

The ginseng (Panax ginseng C.A. Meyer) is a perennial herbaceous plant that has been used in traditional oriental medicine for thousands of years. Ginsenosides, which have significant pharmacological effects on human health, are the foremost bioactive constituents in this plant. Having realized the importance of this plant to humans, an integrated omics resource becomes indispensable to facilitate genomic research, molecular breeding and pharmacological study of this herb.The first draft genome sequences of P. ginseng cultivar “Chunpoong” were reported recently. Here, using the draft genome, transcriptome, and functional annotation datasets of P. ginseng, we have constructed the Ginseng Genome Database http://ginsengdb.snu.ac.kr /, the first open-access platform to provide comprehensive genomic resources of P. ginseng. The current version of this database provides the most up-to-date draft genome sequence (of approximately 3000 Mbp of scaffold sequences) along with the structural and functional annotations for 59,352 genes and digital expression of genes based on transcriptome data from different tissues, growth stages and treatments. In addition, tools for visualization and the genomic data from various analyses are provided. All data in the database were manually curated and integrated within a user-friendly query page.This database provides valuable resources for a range of research fields related to P. ginseng and other species belonging to the Apiales order as well as for plant research communities in general. Ginseng genome database can be accessed at http://ginsengdb.snu.ac.kr /.

September 22, 2019

Draft genome of the Peruvian scallop Argopecten purpuratus.

The Peruvian scallop, Argopecten purpuratus, is mainly cultured in southern Chile and Peru was introduced into China in the last century. Unlike other Argopecten scallops, the Peruvian scallop normally has a long life span of up to 7 to 10 years. Therefore, researchers have been using it to develop hybrid vigor. Here, we performed whole genome sequencing, assembly, and gene annotation of the Peruvian scallop, with an important aim to develop genomic resources for genetic breeding in scallops.A total of 463.19-Gb raw DNA reads were sequenced. A draft genome assembly of 724.78 Mb was generated (accounting for 81.87% of the estimated genome size of 885.29 Mb), with a contig N50 size of 80.11 kb and a scaffold N50 size of 1.02 Mb. Repeat sequences were calculated to reach 33.74% of the whole genome, and 26,256 protein-coding genes and 3,057 noncoding RNAs were predicted from the assembly.We generated a high-quality draft genome assembly of the Peruvian scallop, which will provide a solid resource for further genetic breeding and for the analysis of the evolutionary history of this economically important scallop.

September 22, 2019

Complete genome analysis of Gluconacetobacter xylinus CGMCC 2955 for elucidating bacterial cellulose biosynthesis and metabolic regulation.

Complete genome sequence of Gluconacetobacter xylinus CGMCC 2955 for fine control of bacterial cellulose (BC) synthesis is presented here. The genome, at 3,563,314?bp, was found to contain 3,193 predicted genes without gaps. There are four BC synthase operons (bcs), among which only bcsI is structurally complete, comprising bcsA, bcsB, bcsC, and bcsD. Genes encoding key enzymes in glycolytic, pentose phosphate, and BC biosynthetic pathways and in the tricarboxylic acid cycle were identified. G. xylinus CGMCC 2955 has a complete glycolytic pathway because sequence data analysis revealed that this strain possesses a phosphofructokinase (pfk)-encoding gene, which is absent in most BC-producing strains. Furthermore, combined with our previous results, the data on metabolism of various carbon sources (monosaccharide, ethanol, and acetate) and their regulatory mechanism of action on BC production were explained. Regulation of BC synthase (Bcs) is another effective method for precise control of BC biosynthesis, and cyclic diguanylate (c-di-GMP) is the key activator of BcsA-BcsB subunit of Bcs. The quorum sensing (QS) system was found to positively regulate phosphodiesterase, which decomposed c-di-GMP. Thus, in this study, we demonstrated the presence of QS in G. xylinus CGMCC 2955 and proposed a possible regulatory mechanism of QS action on BC production.

Auto Tag: Gene prediction

Intraspecific comparative genomics of isolates of the Norway spruce pathogen (Heterobasidion parviporum) and identification of its potential virulence factors.

Genomic insights into the Acidobacteria reveal strategies for their success in terrestrial environments.

Insights into the evolution of host association through the isolation and characterization of a novel human periodontal pathobiont, Desulfobulbus oralis.

Repeat-driven generation of antigenic diversity in a major human pathogen, Trypanosoma cruzi

Long-read genome sequence and assembly of Leptopilina boulardi: a specialist Drosophila parasitoid

Comparative genome analysis reveals a complex population structure of Legionella pneumophila subspecies.

Comparative genomics of smut pathogens: Insights from orphans and positively selected genes into host specialization.

Genome sequence, assembly and characterization of two Metschnikowia fructicola strains used as biocontrol agents of postharvest diseases.

A manually annotated Actinidia chinensis var. chinensis (kiwifruit) genome highlights the challenges associated with draft genomes and gene prediction in plants.

Comparative genomics of the wheat fungal pathogen Pyrenophora tritici-repentis reveals chromosomal variations and genome plasticity.

Massive lateral transfer of genes encoding plant cell wall-degrading enzymes to the mycoparasitic fungus Trichoderma from its plant-associated hosts.

Transposable element genomic fissuring in Pyrenophora teres is associated with genome expansion and dynamics of host-pathogen genetic interactions.

Ginseng Genome Database: an open-access platform for genomics of Panax ginseng.

Draft genome of the Peruvian scallop Argopecten purpuratus.

Complete genome analysis of Gluconacetobacter xylinus CGMCC 2955 for elucidating bacterial cellulose biosynthesis and metabolic regulation.

Subscribe for blog updates:

Filter by topic

Talk with an expert

Antimicrobial resistance research

Subscribe for blog updates:

Filter by topic

Talk with an expert