Bioinformatics Archives - Page 82 of 267

September 22, 2019

Massive lateral transfer of genes encoding plant cell wall-degrading enzymes to the mycoparasitic fungus Trichoderma from its plant-associated hosts.

Unlike most other fungi, molds of the genus Trichoderma (Hypocreales, Ascomycota) are aggressive parasites of other fungi and efficient decomposers of plant biomass. Although nutritional shifts are common among hypocrealean fungi, there are no examples of such broad substrate versatility as that observed in Trichoderma. A phylogenomic analysis of 23 hypocrealean fungi (including nine Trichoderma spp. and the related Escovopsis weberi) revealed that the genus Trichoderma has evolved from an ancestor with limited cellulolytic capability that fed on either fungi or arthropods. The evolutionary analysis of Trichoderma genes encoding plant cell wall-degrading carbohydrate-active enzymes and auxiliary proteins (pcwdCAZome, 122 gene families) based on a gene tree / species tree reconciliation demonstrated that the formation of the genus was accompanied by an unprecedented extent of lateral gene transfer (LGT). Nearly one-half of the genes in Trichoderma pcwdCAZome (41%) were obtained via LGT from plant-associated filamentous fungi belonging to different classes of Ascomycota, while no LGT was observed from other potential donors. In addition to the ability to feed on unrelated fungi (such as Basidiomycota), we also showed that Trichoderma is capable of endoparasitism on a broad range of Ascomycota, including extant LGT donors. This phenomenon was not observed in E. weberi and rarely in other mycoparasitic hypocrealean fungi. Thus, our study suggests that LGT is linked to the ability of Trichoderma to parasitize taxonomically related fungi (up to adelphoparasitism in strict sense). This may have allowed primarily mycotrophic Trichoderma fungi to evolve into decomposers of plant biomass.

September 22, 2019

Transposable element genomic fissuring in Pyrenophora teres is associated with genome expansion and dynamics of host-pathogen genetic interactions.

Pyrenophora teres, P. teres f. teres (PTT) and P. teres f. maculata (PTM) cause significant diseases in barley, but little is known about the large-scale genomic differences that may distinguish the two forms. Comprehensive genome assemblies were constructed from long DNA reads, optical and genetic maps. As repeat masking in fungal genomes influences the final gene annotations, an accurate and reproducible pipeline was developed to ensure comparability between isolates. The genomes of the two forms are highly collinear, each composed of 12 chromosomes. Genome evolution in P. teres is characterized by genome fissuring through the insertion and expansion of transposable elements (TEs), a process that isolates blocks of genic sequence. The phenomenon is particularly pronounced in PTT, which has a larger, more repetitive genome than PTM and more recent transposon activity measured by the frequency and size of genome fissures. PTT has a longer cultivated host association and, notably, a greater range of host-pathogen genetic interactions compared to other Pyrenophora spp., a property which associates better with genome size than pathogen lifestyle. The two forms possess similar complements of TE families with Tc1/Mariner and LINE-like Tad-1 elements more abundant in PTT. Tad-1 was only detectable as vestigial fragments in PTM and, within the forms, differences in genome sizes and the presence and absence of several TE families indicated recent lineage invasions. Gene differences between P. teres forms are mainly associated with gene-sparse regions near or within TE-rich regions, with many genes possessing characteristics of fungal effectors. Instances of gene interruption by transposons resulting in pseudogenization were detected in PTT. In addition, both forms have a large complement of secondary metabolite gene clusters indicating significant capacity to produce an array of different molecules. 
This study provides genomic resources for functional genetics to help dissect factors underlying the host-pathogen interactions.

September 22, 2019

Ginseng Genome Database: an open-access platform for genomics of Panax ginseng.

Ginseng (Panax ginseng C.A. Meyer) is a perennial herbaceous plant that has been used in traditional oriental medicine for thousands of years. Ginsenosides, which have significant pharmacological effects on human health, are the foremost bioactive constituents in this plant. Given the importance of this plant to humans, an integrated omics resource is indispensable to facilitate genomic research, molecular breeding and pharmacological study of this herb. The first draft genome sequences of P. ginseng cultivar “Chunpoong” were reported recently. Here, using the draft genome, transcriptome, and functional annotation datasets of P. ginseng, we have constructed the Ginseng Genome Database (http://ginsengdb.snu.ac.kr/), the first open-access platform to provide comprehensive genomic resources of P. ginseng. The current version of this database provides the most up-to-date draft genome sequence (approximately 3000 Mbp of scaffold sequences) along with the structural and functional annotations for 59,352 genes and digital expression of genes based on transcriptome data from different tissues, growth stages and treatments. In addition, tools for visualization and the genomic data from various analyses are provided. All data in the database were manually curated and integrated within a user-friendly query page. This database provides valuable resources for a range of research fields related to P. ginseng and other species belonging to the Apiales order as well as for plant research communities in general. The Ginseng Genome Database can be accessed at http://ginsengdb.snu.ac.kr/.

September 22, 2019

A survey of Type III restriction-modification systems reveals numerous, novel epigenetic regulators controlling phase-variable regulons; phasevarions.

Many bacteria utilize simple DNA sequence repeats as a mechanism to randomly switch genes on and off. This process is called phase variation. Several phase-variable N6-adenine DNA-methyltransferases from Type III restriction-modification systems have been reported in bacterial pathogens. Random switching of DNA methyltransferases changes the global DNA methylation pattern, leading to changes in gene expression. These epigenetic regulatory systems are called phasevarions – phase-variable regulons. The extent of these phase-variable genes in the bacterial kingdom is unknown. Here, we interrogated a database of restriction-modification systems, REBASE, by searching for all simple DNA sequence repeats in mod genes that encode Type III N6-adenine DNA-methyltransferases. We report that 17.4% of Type III mod genes (662/3805) contain simple sequence repeats. Of these, only one-fifth have been previously identified. The newly discovered examples are widely distributed and include many examples in opportunistic pathogens as well as in environmental species. In many cases, multiple phasevarions exist in one genome, with examples of up to 4 independent phasevarions in some species. We found several new types of phase-variable mod genes, including the first example of a phase-variable methyltransferase in pathogenic Escherichia coli. Phasevarions are a common epigenetic regulation contingency strategy used by both pathogenic and non-pathogenic bacteria.
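To make the repeat search described above concrete, here is a minimal sketch of how simple sequence repeats could be located in a gene sequence with a regular expression. The function name, the unit-length limit and the minimum copy number are illustrative assumptions; the abstract does not state the thresholds or the software the authors used.

```python
import re


def find_simple_repeats(seq, max_unit=9, min_copies=4):
    """Return (start, unit, copies) for each tandem repeat tract in seq.

    max_unit and min_copies are hypothetical thresholds chosen for
    illustration, not values taken from the study.
    """
    # A short unit (captured lazily) followed by at least min_copies - 1
    # further exact copies of itself.
    pattern = re.compile(r"([ACGT]{1,%d}?)\1{%d,}" % (max_unit, min_copies - 1))
    hits = []
    for match in pattern.finditer(seq.upper()):
        unit = match.group(1)
        copies = len(match.group(0)) // len(unit)
        hits.append((match.start(), unit, copies))
    return hits


# Toy sequence (not a real mod gene): five copies of AGCC after a start codon.
toy_gene = "ATG" + "AGCC" * 5 + "TTGA"
print(find_simple_repeats(toy_gene))

# Share of Type III mod genes with repeats, from the counts in the abstract.
print(round(100 * 662 / 3805, 1))
```

A pattern like this only finds perfect tandem copies; a real survey would likely also need to handle interrupted repeats and to check whether the tract lies inside the open reading frame.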

September 22, 2019

Draft genome of the Peruvian scallop Argopecten purpuratus.

The Peruvian scallop, Argopecten purpuratus, is mainly cultured in southern Chile and Peru and was introduced into China in the last century. Unlike other Argopecten scallops, the Peruvian scallop normally has a long life span of up to 7 to 10 years. Therefore, researchers have been using it to develop hybrid vigor. Here, we performed whole genome sequencing, assembly, and gene annotation of the Peruvian scallop, with the aim of developing genomic resources for genetic breeding in scallops. A total of 463.19 Gb of raw DNA reads were sequenced. A draft genome assembly of 724.78 Mb was generated (accounting for 81.87% of the estimated genome size of 885.29 Mb), with a contig N50 size of 80.11 kb and a scaffold N50 size of 1.02 Mb. Repeat sequences were calculated to reach 33.74% of the whole genome, and 26,256 protein-coding genes and 3,057 noncoding RNAs were predicted from the assembly. We generated a high-quality draft genome assembly of the Peruvian scallop, which will provide a solid resource for further genetic breeding and for the analysis of the evolutionary history of this economically important scallop.
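For readers unfamiliar with the N50 statistic quoted above, the following sketch shows the standard definition: the length of the shortest sequence in the smallest set of longest sequences that together cover at least half of the assembly. The function name and the toy lengths are illustrative assumptions, not data from the study; only the 724.78 Mb and 885.29 Mb figures come from the abstract.

```python
def n50(lengths):
    """Return the N50 of a collection of contig or scaffold lengths."""
    total = sum(lengths)
    running = 0
    # Walk from the longest sequence down until half the total is covered.
    for length in sorted(lengths, reverse=True):
        running += length
        if running * 2 >= total:
            return length
    raise ValueError("no lengths given")


# Toy contig lengths (hypothetical, in kb).
print(n50([2, 3, 4, 5, 6, 7, 8, 9, 10]))

# Assembly size as a share of the estimated genome size, from the abstract.
print(round(100 * 724.78 / 885.29, 2))
```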

September 22, 2019

Identification and pathogenomic analysis of an Escherichia coli strain producing a novel Shiga toxin 2 subtype.

Shiga toxin (Stx) is the key virulence factor in Shiga toxin-producing Escherichia coli (STEC). To date, three Stx1 subtypes and seven Stx2 subtypes have been described in E. coli, which differ in receptor preference and toxin potency. Here, we identified a novel Stx2 subtype designated Stx2h in E. coli strains isolated from wild marmots in the Qinghai-Tibetan plateau, China. Stx2h shares 91.9% nucleic acid sequence identity and 92.9% amino acid identity with the nearest Stx2 subtype. The expression of Stx2h in type strain STEC299 was inducible by mitomycin C, and culture supernatant from STEC299 was cytotoxic to Vero cells. The Stx2h converting prophage was unique in terms of insertion site and genetic composition. Whole genome-based phylo- and patho-genomic analysis revealed that STEC299 was closer to other pathotypes of E. coli than to STEC, and possesses virulence factors from other pathotypes. Our finding enlarges the pool of Stx2 subtypes and highlights the extraordinary genomic plasticity of E. coli strains. As the emergence of new Shiga toxin genotypes and new Stx-producing pathotypes poses a great threat to public health, Stx2h should be included in E. coli molecular typing and in epidemiological surveillance of E. coli infections.

September 22, 2019

Genome evolution across 1,011 Saccharomyces cerevisiae isolates.

Large-scale population genomic surveys are essential to explore the phenotypic diversity of natural populations. Here we report the whole-genome sequencing and phenotyping of 1,011 Saccharomyces cerevisiae isolates, which together provide an accurate evolutionary picture of the genomic variants that shape the species-wide phenotypic landscape of this yeast. Genomic analyses support a single ‘out-of-China’ origin for this species, followed by several independent domestication events. Although domesticated isolates exhibit high variation in ploidy, aneuploidy and genome content, genome evolution in wild isolates is mainly driven by the accumulation of single nucleotide polymorphisms. A common feature is the extensive loss of heterozygosity, which represents an essential source of inter-individual variation in this mainly asexual species. Most of the single nucleotide polymorphisms, including experimentally identified functional polymorphisms, are present at very low frequencies. The largest numbers of variants identified by genome-wide association are copy-number changes, which have a greater phenotypic effect than do single nucleotide polymorphisms. This resource will guide future population genomics and genotype-phenotype studies in this classic model system.

September 22, 2019

Epigenetic landscape influences the liver cancer genome architecture.

The accumulations of different types of genetic alterations such as nucleotide substitutions, structural rearrangements and viral genome integrations and epigenetic alterations contribute to carcinogenesis. Here, we report correlation between the occurrence of epigenetic features and genetic aberrations by whole-genome bisulfite, whole-genome shotgun, long-read, and virus capture sequencing of 373 liver cancers. Somatic substitutions and rearrangement breakpoints are enriched in tumor-specific hypo-methylated regions with inactive chromatin marks and actively transcribed highly methylated regions in the cancer genome. Individual mutation signatures depend on chromatin status, especially, signatures with a higher transcriptional strand bias occur within active chromatic areas. Hepatitis B virus (HBV) integration sites are frequently detected within inactive chromatin regions in cancer cells, as a consequence of negative selection for integrations in active chromatin regions. Ultra-high structural instability and preserved unmethylation of integrated HBV genomes are observed. We conclude that both precancerous and somatic epigenetic features contribute to the cancer genome architecture.

September 22, 2019

RTS,S/AS01 malaria vaccine mismatch observed among Plasmodium falciparum isolates from southern and central Africa and globally.

The RTS,S/AS01 malaria vaccine encompasses the central repeats and C-terminal of Plasmodium falciparum circumsporozoite protein (PfCSP). Although no Phase II clinical trial studies observed evidence of strain-specific immunity, recent studies show a decrease in vaccine efficacy against non-vaccine strain parasites. In light of goals to reduce malaria morbidity, anticipating the effectiveness of RTS,S/AS01 is critical to planning widespread vaccine introduction. We deep sequenced C-terminal Pfcsp from 77 individuals living along the international border in Luapula Province, Zambia and Haut-Katanga Province, the Democratic Republic of the Congo (DRC) and compared translated amino acid haplotypes to the 3D7 vaccine strain. Only 5.2% of the 193 PfCSP sequences from the Zambia-DRC border region matched 3D7 at all 84 amino acids. To further contextualize the genetic diversity sampled in this study with global PfCSP diversity, we analyzed an additional 3,809 Pfcsp sequences from the Pf3k database and constructed a haplotype network representing 15 countries from Africa and Asia. The diversity observed in our samples was similar to the diversity observed in the global haplotype network. These observations underscore the need for additional research assessing genetic diversity in P. falciparum and the impact of PfCSP diversity on RTS,S/AS01 efficacy.

September 22, 2019

IMSindel: An accurate intermediate-size indel detection tool incorporating de novo assembly and gapped global-local alignment with split read analysis.

Insertions and deletions (indels) have been implicated in dozens of human diseases through the radical alteration of gene function by short frameshift indels as well as long indels. However, the accurate detection of these indels from next-generation sequencing data is still challenging. This is particularly true for intermediate-size indels (≥50 bp), due to the short DNA sequencing reads. Here, we developed a new method that predicts intermediate-size indels using BWA soft-clipped fragments (unmatched fragments in partially mapped reads) and unmapped reads. We report the performance comparison of our method, GATK, PINDEL and ScanIndel, using whole exome sequencing data from the same samples. False positive and false negative counts were determined through Sanger sequencing of all predicted indels across these four methods. The harmonic mean of the recall and precision, F-measure, was used to measure the performance of each method. Our method achieved the highest F-measure of 0.84 in one sample, compared to 0.56 for GATK, 0.52 for PINDEL and 0.46 for ScanIndel. Similar results were obtained in additional samples, demonstrating that our method was superior to the other methods for detecting intermediate-size indels. We believe that this methodology will contribute to the discovery of intermediate-size indels associated with human disease.
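The F-measure used above is the harmonic mean of precision and recall. A minimal sketch of the calculation from validated call counts follows; the function name and the example counts are hypothetical, since the abstract reports only the resulting scores.

```python
def f_measure(true_pos, false_pos, false_neg):
    """Harmonic mean of precision and recall from validated call counts."""
    if true_pos == 0:
        # No correct calls: precision and recall are both zero (or undefined).
        return 0.0
    precision = true_pos / (true_pos + false_pos)
    recall = true_pos / (true_pos + false_neg)
    return 2 * precision * recall / (precision + recall)


# Hypothetical counts: 8 confirmed indels, 2 false calls, 2 missed indels.
print(f_measure(8, 2, 2))
```

Because the harmonic mean is dominated by the smaller of the two values, a caller with high recall but many false positives (or the reverse) scores poorly.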

September 22, 2019

Genomic analysis of oral Campylobacter concisus strains identified a potential bacterial molecular marker associated with active Crohn’s disease.

Campylobacter concisus is an oral bacterium that is associated with inflammatory bowel disease (IBD) including Crohn’s disease (CD) and ulcerative colitis (UC). C. concisus consists of two genomospecies (GS) and diverse strains. This study aimed to identify molecular markers to differentiate commensal and IBD-associated C. concisus strains. The genomes of 63 oral C. concisus strains isolated from patients with IBD and healthy controls were examined, of which 38 genomes were sequenced in this study. We identified a novel secreted enterotoxin B homologue, Csep1. The csep1 gene was found in 56% of GS2 C. concisus strains, presented in the plasmid pICON or the chromosome. A six-nucleotide insertion at the position 654-659 bp in csep1 (csep1-6bpi) was found. The presence of csep1-6bpi in oral C. concisus strains isolated from patients with active CD (47%, 7/15) was significantly higher than that in strains from healthy controls (0/29, P = 0.0002), and the prevalence of csep1-6bpi positive C. concisus strains was significantly higher in patients with active CD (67%, 4/6) as compared to healthy controls (0/23, P = 0.0006). Proteomics analysis detected the Csep1 protein. A csep1 gene hot spot in the chromosome of different C. concisus strains was found. The pICON plasmid was only found in GS2 strains isolated from the two relapsed CD patients with small bowel complications. This study reports a C. concisus molecular marker (csep1-6bpi) that is associated with active CD.
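The abstract does not name the statistical test behind the two P values. As an illustration only, the sketch below assumes a two-sided Fisher's exact test on the 2x2 counts given above (7/15 vs 0/29 strains, 4/6 vs 0/23 individuals); the function name is hypothetical and the choice of test is an assumption.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of all tables with the same
    margins that are no more likely than the observed one.
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        return comb(row1, x) * comb(row2, col1 - x) / denom

    observed = prob(a)
    low, high = max(0, col1 - row2), min(row1, col1)
    # Small relative tolerance guards against floating-point ties.
    return sum(prob(x) for x in range(low, high + 1)
               if prob(x) <= observed * (1 + 1e-9))


# Strains: 7 of 15 positive in active CD, 0 of 29 in healthy controls.
print(f"{fisher_exact_two_sided(7, 8, 0, 29):.4f}")
# Individuals: 4 of 6 positive in active CD, 0 of 23 in healthy controls.
print(f"{fisher_exact_two_sided(4, 2, 0, 23):.4f}")
```

Under this assumption the two calculations round to the P values quoted in the abstract, which suggests, but does not confirm, that a test of this kind was used.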

September 22, 2019

SvABA: genome-wide detection of structural variants and indels by local assembly.

Structural variants (SVs), including small insertion and deletion variants (indels), are challenging to detect through standard alignment-based variant calling methods. Sequence assembly offers a powerful approach to identifying SVs, but is difficult to apply at scale genome-wide for SV detection due to its computational complexity and the difficulty of extracting SVs from assembly contigs. We describe SvABA, an efficient and accurate method for detecting SVs from short-read sequencing data using genome-wide local assembly with low memory and computing requirements. We evaluated SvABA’s performance on the NA12878 human genome and in simulated and real cancer genomes. SvABA demonstrates superior sensitivity and specificity across a large spectrum of SVs and substantially improves detection performance for variants in the 20-300 bp range, compared with existing methods. SvABA also identifies complex somatic rearrangements with chains of short (<1000 bp) templated-sequence insertions copied from distant genomic regions. We applied SvABA to 344 cancer genomes from 11 cancer types and found that short templated-sequence insertions occur in ~4% of all somatic rearrangements. Finally, we demonstrate that SvABA can identify sites of viral integration and cancer driver alterations containing medium-sized (50-300 bp) SVs. © 2018 Wala et al.; Published by Cold Spring Harbor Laboratory Press.

September 22, 2019

Conserved genomic and amino acid traits of cold adaptation in subzero-growing Arctic permafrost bacteria.

Permafrost accounts for 27% of all soil ecosystems and harbors diverse microbial communities. Our understanding of microorganisms in permafrost, their activities and adaptations, remains limited. Using five subzero-growing (cryophilic) permafrost bacteria, we examined features of cold adaptation through comparative genomic analyses with mesophilic relatives. The cryophiles possess genes associated with cold adaptation, including cold shock proteins, RNA helicases, and oxidative stress and carotenoid synthesis enzymes. Higher abundances of genes associated with compatible solutes were observed, important for osmoregulation in permafrost brine veins. Most cryophiles in our study have higher transposase copy numbers than mesophiles. We investigated amino acid (AA) modifications in the cryophiles favoring increased protein flexibility at cold temperatures. Although overall there were few differences with the mesophiles, we found evidence of cold adaptation, with significant differences in proline, serine, glycine and aromaticity, in several cryophiles. The use of cold/hot AA ratios of >1, used in previous studies to indicate cold adaptation, was found to be inadequate on its own. Comparing the average of all cryophiles to all mesophiles, we found that overall cryophiles had a higher ratio of cold adapted proteins for serine (more serine), and to a lesser extent, proline and acidic residues (fewer prolines/acidic residues).

September 22, 2019

Evolution of sequence type 4821 clonal complex meningococcal strains in China from prequinolone to quinolone era, 1972-2013.

The expansion of hypervirulent sequence type 4821 clonal complex (CC4821) lineage Neisseria meningitidis bacteria has led to a shift in meningococcal disease epidemiology in China, from serogroup A (MenA) to MenC. Knowledge of the evolution and genetic origin of the emergent MenC strains is limited. In this study, we subjected 76 CC4821 isolates collected across China during 1972-1977 and 2005-2013 to phylogenetic analysis, traditional genotyping, or both. We show that successive recombination events within genes encoding surface antigens and acquisition of quinolone resistance mutations possibly played a role in the emergence of CC4821 as an epidemic clone in China. MenC and MenB CC4821 strains have spread across China and have been detected in several countries in different continents. Capsular switches involving serogroups B and C occurred among epidemic strains, raising concerns regarding possible increases in MenB disease, given that vaccines in use in China do not protect against MenB.

September 22, 2019

Complete genome sequences of seven Vibrio anguillarum strains as derived from PacBio sequencing.

We report here the complete genome sequences of seven Vibrio anguillarum strains isolated from multiple geographic locations, thus increasing the total number of genomes of finished quality to 11. The genomes were de novo assembled from long-sequence PacBio reads. Including draft genomes, a total of 44 V. anguillarum genomes are currently available in the genome databases. They represent an important resource in the study of, for example, genetic variations and for identifying virulence determinants. In this article, we present the genomes and basic genome comparisons of the 11 complete genomes, including a BRIG analysis, and pan genome calculation. We also describe some structural features of superintegrons on chromosome 2s, and associated insertion sequence (IS) elements, including 18 new ISs (ISVa3-ISVa20), both of importance in the complement of V. anguillarum genomes.
