De novo assembly Archives - Page 68 of 324

September 22, 2019

A large-scale comparative metagenomic study reveals the functional interactions in six bloom-forming Microcystis-epibiont communities.

Cyanobacterial blooms are worldwide issues of societal concern and scientific interest. Lake Taihu and Lake Dianchi, two of the largest lakes in China, have been suffering from annual Microcystis-based blooms over the past two decades. These two eutrophic lakes differ in both nutrient load and environmental parameters, where Microcystis microbiota consisting of different Microcystis morphospecies and associated bacteria (epibionts) have dominated. We conducted a comprehensive metagenomic study that analyzed species diversity, community structure, functional components, metabolic pathways and networks to investigate functional interactions among the members of six Microcystis-epibiont communities in these two lakes. Our integrated metagenomic pipeline consisted of efficient assembly, binning, annotation, and quality assurance methods that ensured high-quality genome reconstruction. This study provides a total of 68 reconstructed genomes including six complete Microcystis genomes and 28 high-quality bacterial genomes of epibionts belonging to 14 distinct taxa. This metagenomic dataset constitutes the largest reference genome catalog available for genome-centric studies of the Microcystis microbiome. Epibiont community composition appears to be dynamic rather than fixed, and the functional profiles of communities were related to the environment of origin. This study demonstrates mutualistic interactions between Microcystis and epibionts at genetic and metabolic levels. Metabolic pathway reconstruction provided evidence for functional complementation in nitrogen and sulfur cycles, fatty acid catabolism, vitamin synthesis, and aromatic compound degradation among community members. Thus, bacterial social interactions within Microcystis-epibiont communities not only shape species composition, but also stabilize the communities' functional profiles. These interactions appear to play an important role in environmental adaptation of Microcystis colonies.

September 22, 2019

Repeated evolution of self-compatibility for reproductive assurance.

Sexual reproduction in eukaryotes requires the fusion of two compatible gametes of opposite sexes or mating types. To meet the challenge of finding a mating partner with compatible gametes, evolutionary mechanisms such as hermaphroditism and self-fertilization have repeatedly evolved. Here, by combining the insights from comparative genomics, computer simulations and experimental evolution in fission yeast, we shed light on the conditions promoting separate mating types or self-compatibility by mating-type switching. Analogous to multiple independent transitions between switchers and non-switchers in natural populations mediated by structural genomic changes, novel switching genotypes readily evolved under selection in the experimental populations. Detailed fitness measurements accompanied by computer simulations show the benefits and costs of switching during sexual and asexual reproduction, governing the occurrence of both strategies in nature. Our findings illuminate the trade-off between the benefits of reproductive assurance and its fitness costs under benign conditions facilitating the evolution of self-compatibility.

September 22, 2019

Comparative genomics of smut pathogens: Insights from orphans and positively selected genes into host specialization.

Host specialization is a key evolutionary process for the diversification and emergence of new pathogens. However, the molecular determinants of host range are poorly understood. Smut fungi are biotrophic pathogens that have distinct and narrow host ranges based on largely unknown genetic determinants. Hence, we aimed to expand comparative genomics analyses of smut fungi by including more species infecting different hosts and to define orphans and positively selected genes to gain further insights into the genetic basis of host specialization. We analyzed nine lineages of smut fungi isolated from eight crop and non-crop hosts: maize, barley, sugarcane, wheat, oats, Zizania latifolia (Manchurian rice), Echinochloa colona (a wild grass), and Persicaria sp. (a wild dicot plant). We assembled two new genomes: Ustilago hordei (strain Uhor01) isolated from oats and U. tritici (strain CBS 119.19) isolated from wheat. The smut genomes were small, ranging from 18.38 to 24.63 Mb. U. hordei species experienced genome expansions due to the proliferation of transposable elements, and the amount of these elements varied between the two strains. Phylogenetic analysis confirmed that Ustilago is not a monophyletic genus and, furthermore, detected misclassification of the U. tritici specimen. The comparison between smut pathogens of crop and non-crop hosts did not reveal distinct signatures, suggesting that host domestication did not play a dominant role in shaping the evolution of smuts. We found that host specialization in smut fungi likely has a complex genetic basis: different functional categories were enriched in orphans and lineage-specific selected genes. The diversification and gain/loss of effector genes are probably the most important determinants of host specificity.

September 22, 2019

Genome sequence, assembly and characterization of two Metschnikowia fructicola strains used as biocontrol agents of postharvest diseases.

The yeast Metschnikowia fructicola was reported as an efficient biological control agent of postharvest diseases of fruits and vegetables, and it is the basis of the commercial formulated product “Shemer.” Several mechanisms of action by which M. fructicola inhibits postharvest pathogens were suggested including iron-binding compounds, induction of defense signaling genes, production of fungal cell wall degrading enzymes and relatively high amounts of superoxide anions. We assembled the whole genome sequence of two strains of M. fructicola using PacBio and Illumina shotgun sequencing technologies. Using PacBio, a high-quality draft genome consisting of 93 contigs, with an estimated genome size of approximately 26 Mb, was obtained. Comparative analysis of M. fructicola proteins with the other three available closely related genomes revealed a shared core of homologous proteins coded by 5,776 genes. Comparing the genomes of the two M. fructicola strains using a SNP calling approach resulted in the identification of 564,302 homologous SNPs with 2,004 predicted high impact mutations. The size of the genome is exceptionally high when compared with those of available closely related organisms, and the high rate of homology among M. fructicola genes points toward a recent whole-genome duplication event as the cause of this large genome. Based on the assembled genome, sequences were annotated with a gene description and gene ontology (GO term) and clustered in functional groups. Analysis of CAZymes family genes revealed 1,145 putative genes, and transcriptomic analysis of CAZyme expression levels in M. fructicola during its interaction with either grapefruit peel tissue or Penicillium digitatum revealed a high level of CAZyme gene expression when the yeast was placed in wounded fruit tissue.

September 22, 2019

Plasmid-mediated quinolone resistance in Shigella flexneri isolated from macaques.

Non-human primates (NHPs) for biomedical research are commonly infected with Shigella spp. that can cause acute dysentery or chronic episodic diarrhea. These animals are often prophylactically and clinically treated with quinolone antibiotics to eradicate these possible infections. However, chromosomally- and plasmid-mediated antibiotic resistance has become an emerging concern for species in the family Enterobacteriaceae. In this study, five individual isolates of multi-drug resistant Shigella flexneri were isolated from the feces of three macaques. Antibiotic susceptibility testing confirmed resistance or decreased susceptibility to ampicillin, amoxicillin-clavulanic acid, cephalosporins, gentamicin, tetracycline, ciprofloxacin, enrofloxacin, levofloxacin, and nalidixic acid. S. flexneri isolates were susceptible to trimethoprim-sulfamethoxazole, and this drug was used to eradicate infection in two of the macaques. Plasmid DNA from all isolates was positive for the plasmid-encoded quinolone resistance gene qnrS, but not qnrA and qnrB. Conjugation and transformation of plasmid DNA from several S. flexneri isolates into antibiotic-susceptible Escherichia coli strains conferred the recipients with resistance or decreased susceptibility to quinolones and beta-lactams. Genome sequencing of two representative S. flexneri isolates identified the qnrS gene on a plasmid-like contig. These contigs showed >99% homology to plasmid sequences previously characterized from quinolone-resistant Shigella flexneri 2a and Salmonella enterica strains. Other antibiotic resistance genes and virulence factor genes were also identified in chromosome and plasmid sequences in these genomes. The findings from this study indicate macaques harbor pathogenic S. flexneri strains with chromosomally- and plasmid-encoded antibiotic resistance genes. To our knowledge, this is the first report of plasmid-mediated quinolone resistance in S. flexneri isolated from NHPs and warrants isolation and antibiotic testing of enteric pathogens before treating macaques with quinolones prophylactically or therapeutically.

September 22, 2019

A manually annotated Actinidia chinensis var. chinensis (kiwifruit) genome highlights the challenges associated with draft genomes and gene prediction in plants.

Most published genome sequences are drafts, and most are dominated by computational gene prediction. Draft genomes typically incorporate considerable sequence data that are not assigned to chromosomes, and predicted genes without quality confidence measures. The current Actinidia chinensis (kiwifruit) ‘Hongyang’ draft genome has 164 Mb of sequences unassigned to pseudo-chromosomes, and omissions have been identified in the gene models. A second genome of an A. chinensis (genotype Red5) was fully sequenced. This new sequence resulted in a 554.0 Mb assembly with all but 6 Mb assigned to pseudo-chromosomes. Pseudo-chromosomal comparisons showed a considerable number of translocation events have occurred following a whole genome duplication (WGD) event, some consistent with centromeric Robertsonian-like translocations. RNA sequencing data from 12 tissues and ab initio analysis informed a genome-wide manual annotation, using the WebApollo tool. In total, 33,044 gene loci represented by 33,123 isoforms were identified, named and tagged for quality of evidential support. Of these, 3114 (9.4%) were identical to a protein within ‘Hongyang’ The Kiwifruit Information Resource (KIR v2). Some proportion of the differences will be varietal polymorphisms. However, as most computationally predicted Red5 models required manual re-annotation, this proportion is expected to be small. The quality of the new gene models was tested by fully sequencing 550 cloned ‘Hort16A’ cDNAs and comparing with the predicted protein models for Red5 and both the original ‘Hongyang’ assembly and the revised annotation from KIR v2. Only 48.9% and 63.5% of the cDNAs had a match with 90% identity or better to the original and revised ‘Hongyang’ annotation, respectively, compared with 90.9% to the Red5 models. Our study highlights the need to take a cautious approach to draft genomes and computationally predicted genes. 
Our use of the manual annotation tool WebApollo facilitated manual checking and correction of gene models enabling improvement of computational prediction. This utility was especially relevant for certain types of gene families such as the EXPANSIN like genes. Finally, this high quality gene set will supply the kiwifruit and general plant community with a new tool for genomics and other comparative analysis.

September 22, 2019

Comparative genomics of the wheat fungal pathogen Pyrenophora tritici-repentis reveals chromosomal variations and genome plasticity.

Pyrenophora tritici-repentis (Ptr) is a necrotrophic fungal pathogen that causes the major wheat disease, tan spot. We set out to provide essential genomics-based resources in order to better understand the pathogenicity mechanisms of this important pathogen. Here, we present eight new Ptr isolate genomes, assembled and annotated, representing races 1, 2 and 5, and a new race. We report a high quality Ptr reference genome, sequenced by PacBio technology with Illumina paired-end data support and optical mapping. An estimated 98% of the genome coverage was mapped to 10 chromosomal groups, using a two-enzyme hybrid approach. The final reference genome was 40.9 Mb and contained a total of 13,797 annotated genes, supported by transcriptomic and proteogenomics data sets. Whole genome comparative analysis revealed major chromosomal segmental rearrangements and fusions, highlighting intraspecific genome plasticity in this species. Furthermore, the Ptr race classification was not supported at the whole genome level, as phylogenetic analysis did not cluster the ToxA producing isolates. This expansion of available Ptr genomics resources will directly facilitate research aimed at controlling tan spot disease.

September 22, 2019

Genome and transcriptome of the natural isopropanol producer Clostridium beijerinckii DSM6423.

There is worldwide interest in sustainable and environmentally friendly ways to produce fuels and chemicals from renewable resources. Among them, the production of acetone, butanol and ethanol (ABE) or isopropanol, butanol and ethanol (IBE) by anaerobic fermentation already has a long industrial history. Isopropanol has recently received specific interest, and the best studied natural isopropanol producer is C. beijerinckii DSM 6423 (NRRL B-593). This strain metabolizes sugars into a mix of IBE with only low concentrations of ethanol produced (

September 22, 2019

Comparison of phasing strategies for whole human genomes.

Humans are a diploid species that inherit one set of chromosomes paternally and one homologous set of chromosomes maternally. Unfortunately, most human sequencing initiatives ignore this fact in that they do not directly delineate the nucleotide content of the maternal and paternal copies of the 23 chromosomes individuals possess (i.e., they do not ‘phase’ the genome) often because of the costs and complexities of doing so. We compared 11 different widely-used approaches to phasing human genomes using the publicly available ‘Genome-In-A-Bottle’ (GIAB) phased version of the NA12878 genome as a gold standard. The phasing strategies we compared included laboratory-based assays that prepare DNA in unique ways to facilitate phasing as well as purely computational approaches that seek to reconstruct phase information from general sequencing reads and constructs or population-level haplotype frequency information obtained through a reference panel of haplotypes. To assess the performance of the 11 approaches, we used metrics that included, among others, switch error rates, haplotype block lengths, the proportion of fully phase-resolved genes, phasing accuracy and yield between pairs of SNVs. Our comparisons suggest that a hybrid or combined approach that leverages: 1. population-based phasing using the SHAPEIT software suite, 2. either genome-wide sequencing read data or parental genotypes, and 3. a large reference panel of variant and haplotype frequencies, provides a fast and efficient way to produce highly accurate phase-resolved individual human genomes. We found that for population-based approaches, phasing performance is enhanced with the addition of genome-wide read data; e.g., whole genome shotgun and/or RNA sequencing reads. Further, we found that the inclusion of parental genotype data within a population-based phasing strategy can provide as much as a ten-fold reduction in phasing errors. 
We also considered a majority voting scheme for the construction of a consensus haplotype combining multiple predictions for enhanced performance and site coverage. Finally, we also identified DNA sequence signatures associated with the genomic regions harboring phasing switch errors, which included regions of low polymorphism or SNV density.
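The switch error rate named among the metrics above can be illustrated with a small sketch. It assumes one common definition (the fraction of adjacent heterozygous-site pairs at which the predicted phase flips relative to the truth); the abstract does not state the exact formula used, and the function name and toy haplotypes are hypothetical.

```python
def switch_error_rate(truth, predicted):
    """Fraction of adjacent het-site pairs where the phase flips.

    `truth` and `predicted` are equal-length strings of '0'/'1' giving the
    allele carried by one haplotype at each heterozygous site (an assumed,
    simplified encoding).
    """
    if len(truth) != len(predicted):
        raise ValueError("haplotypes must cover the same sites")
    if len(truth) < 2:
        raise ValueError("need at least two heterozygous sites")
    # True where the predicted haplotype matches the truth at that site.
    agree = [t == p for t, p in zip(truth, predicted)]
    # A switch occurs whenever agreement changes between neighbouring sites.
    switches = sum(1 for a, b in zip(agree, agree[1:]) if a != b)
    return switches / (len(truth) - 1)


# Toy example: the prediction flips phase at site 4 and flips back at site 7.
print(switch_error_rate("00000000", "00011100"))
```

Under this definition, a prediction that is the exact complement of the truth has no switch errors, because haplotype labels are arbitrary; only changes of phase along the chromosome are counted.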

September 22, 2019

Massive lateral transfer of genes encoding plant cell wall-degrading enzymes to the mycoparasitic fungus Trichoderma from its plant-associated hosts.

Unlike most other fungi, molds of the genus Trichoderma (Hypocreales, Ascomycota) are aggressive parasites of other fungi and efficient decomposers of plant biomass. Although nutritional shifts are common among hypocrealean fungi, there are no examples of such broad substrate versatility as that observed in Trichoderma. A phylogenomic analysis of 23 hypocrealean fungi (including nine Trichoderma spp. and the related Escovopsis weberi) revealed that the genus Trichoderma has evolved from an ancestor with limited cellulolytic capability that fed on either fungi or arthropods. The evolutionary analysis of Trichoderma genes encoding plant cell wall-degrading carbohydrate-active enzymes and auxiliary proteins (pcwdCAZome, 122 gene families) based on a gene tree / species tree reconciliation demonstrated that the formation of the genus was accompanied by an unprecedented extent of lateral gene transfer (LGT). Nearly one-half of the genes in Trichoderma pcwdCAZome (41%) were obtained via LGT from plant-associated filamentous fungi belonging to different classes of Ascomycota, while no LGT was observed from other potential donors. In addition to the ability to feed on unrelated fungi (such as Basidiomycota), we also showed that Trichoderma is capable of endoparasitism on a broad range of Ascomycota, including extant LGT donors. This phenomenon was not observed in E. weberi and rarely in other mycoparasitic hypocrealean fungi. Thus, our study suggests that LGT is linked to the ability of Trichoderma to parasitize taxonomically related fungi (up to adelphoparasitism in strict sense). This may have allowed primarily mycotrophic Trichoderma fungi to evolve into decomposers of plant biomass.

September 22, 2019

Ginseng Genome Database: an open-access platform for genomics of Panax ginseng.

The ginseng (Panax ginseng C.A. Meyer) is a perennial herbaceous plant that has been used in traditional oriental medicine for thousands of years. Ginsenosides, which have significant pharmacological effects on human health, are the foremost bioactive constituents in this plant. Having realized the importance of this plant to humans, an integrated omics resource becomes indispensable to facilitate genomic research, molecular breeding and pharmacological study of this herb. The first draft genome sequences of P. ginseng cultivar “Chunpoong” were reported recently. Here, using the draft genome, transcriptome, and functional annotation datasets of P. ginseng, we have constructed the Ginseng Genome Database (http://ginsengdb.snu.ac.kr/), the first open-access platform to provide comprehensive genomic resources of P. ginseng. The current version of this database provides the most up-to-date draft genome sequence (of approximately 3000 Mbp of scaffold sequences) along with the structural and functional annotations for 59,352 genes and digital expression of genes based on transcriptome data from different tissues, growth stages and treatments. In addition, tools for visualization and the genomic data from various analyses are provided. All data in the database were manually curated and integrated within a user-friendly query page. This database provides valuable resources for a range of research fields related to P. ginseng and other species belonging to the Apiales order as well as for plant research communities in general. The Ginseng Genome Database can be accessed at http://ginsengdb.snu.ac.kr/.

September 22, 2019

A survey of Type III restriction-modification systems reveals numerous, novel epigenetic regulators controlling phase-variable regulons; phasevarions.

Many bacteria utilize simple DNA sequence repeats as a mechanism to randomly switch genes on and off. This process is called phase variation. Several phase-variable N6-adenine DNA-methyltransferases from Type III restriction-modification systems have been reported in bacterial pathogens. Random switching of DNA methyltransferases changes the global DNA methylation pattern, leading to changes in gene expression. These epigenetic regulatory systems are called phasevarions – phase-variable regulons. The extent of these phase-variable genes in the bacterial kingdom is unknown. Here, we interrogated a database of restriction-modification systems, REBASE, by searching for all simple DNA sequence repeats in mod genes that encode Type III N6-adenine DNA-methyltransferases. We report that 17.4% of Type III mod genes (662/3805) contain simple sequence repeats. Of these, only one-fifth have been previously identified. The newly discovered examples are widely distributed and include many examples in opportunistic pathogens as well as in environmental species. In many cases, multiple phasevarions exist in one genome, with examples of up to 4 independent phasevarions in some species. We found several new types of phase-variable mod genes, including the first example of a phase-variable methyltransferase in pathogenic Escherichia coli. Phasevarions are a common epigenetic regulation contingency strategy used by both pathogenic and non-pathogenic bacteria.
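As a rough illustration of what searching a gene for simple DNA sequence repeats can look like, the sketch below scans a sequence for tandem repeats with a regular expression. The unit length limit, the minimum copy number, the function name and the toy sequence are all assumptions for illustration; the abstract does not give the criteria applied to REBASE.

```python
import re


def find_simple_repeats(seq, max_unit=5, min_copies=4):
    """Return (start, unit, copies) for each tandem repeat found in `seq`.

    A repeat is a unit of 1..max_unit bases occurring at least `min_copies`
    times in a row. Both thresholds are illustrative assumptions.
    """
    # Lazy unit match so the shortest repeating unit is preferred.
    pattern = re.compile(
        r"([ACGT]{1,%d}?)\1{%d,}" % (max_unit, min_copies - 1)
    )
    hits = []
    for m in pattern.finditer(seq.upper()):
        unit = m.group(1)
        copies = len(m.group(0)) // len(unit)
        hits.append((m.start(), unit, copies))
    return hits


toy_gene = "TTAGCCAGCCAGCCAGCCTTGGGGGGGGAT"  # hypothetical sequence
print(find_simple_repeats(toy_gene))

# The reported proportion of repeat-containing mod genes, 662 of 3805:
print(round(100 * 662 / 3805, 1))
```

On the toy sequence this reports a tetranucleotide AGCC tract and a poly-G tract, the two kinds of repeat whose copy number can change by slipped-strand mispairing and so shift a gene in or out of frame.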

September 22, 2019

Draft genome of the Peruvian scallop Argopecten purpuratus.

The Peruvian scallop, Argopecten purpuratus, is mainly cultured in southern Chile and Peru and was introduced into China in the last century. Unlike other Argopecten scallops, the Peruvian scallop normally has a long life span of up to 7 to 10 years. Therefore, researchers have been using it to develop hybrid vigor. Here, we performed whole genome sequencing, assembly, and gene annotation of the Peruvian scallop, with an important aim to develop genomic resources for genetic breeding in scallops. A total of 463.19 Gb of raw DNA reads were sequenced. A draft genome assembly of 724.78 Mb was generated (accounting for 81.87% of the estimated genome size of 885.29 Mb), with a contig N50 size of 80.11 kb and a scaffold N50 size of 1.02 Mb. Repeat sequences were calculated to reach 33.74% of the whole genome, and 26,256 protein-coding genes and 3,057 noncoding RNAs were predicted from the assembly. We generated a high-quality draft genome assembly of the Peruvian scallop, which will provide a solid resource for further genetic breeding and for the analysis of the evolutionary history of this economically important scallop.
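The contig and scaffold N50 values quoted above are standard contiguity statistics. A minimal sketch of the usual definition follows (the largest length L such that sequences of length at least L together cover half or more of the assembly); the function name and toy contig lengths are illustrative and not taken from the study.

```python
def n50(lengths):
    """Return the N50 of a collection of contig or scaffold lengths."""
    if not lengths:
        raise ValueError("no sequence lengths given")
    total = sum(lengths)
    running = 0
    # Walk from the longest sequence down until half the assembly is covered.
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length


toy_contigs = [80, 70, 50, 40, 30, 20, 10]  # hypothetical lengths, total 300
print(n50(toy_contigs))

# Share of the estimated genome size captured by the draft assembly (Mb),
# using the two figures reported in the abstract.
print(round(100 * 724.78 / 885.29, 2))
```

The second calculation simply reproduces the reported coverage figure from the assembly size and the estimated genome size.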

September 22, 2019

Identification and pathogenomic analysis of an Escherichia coli strain producing a novel Shiga toxin 2 subtype.

Shiga toxin (Stx) is the key virulence factor in Shiga toxin-producing Escherichia coli (STEC). To date, three Stx1 subtypes and seven Stx2 subtypes have been described in E. coli, which differ in receptor preference and toxin potency. Here, we identified a novel Stx2 subtype designated Stx2h in E. coli strains isolated from wild marmots in the Qinghai-Tibetan plateau, China. Stx2h shares 91.9% nucleic acid sequence identity and 92.9% amino acid identity to the nearest Stx2 subtype. The expression of Stx2h in type strain STEC299 was inducible by mitomycin C, and culture supernatant from STEC299 was cytotoxic to Vero cells. The Stx2h converting prophage was unique in terms of insertion site and genetic composition. Whole genome-based phylo- and patho-genomic analysis revealed STEC299 was closer to other pathotypes of E. coli than STEC, and possesses virulence factors from other pathotypes. Our finding enlarges the pool of Stx2 subtypes and highlights the extraordinary genomic plasticity of E. coli strains. As the emergence of new Shiga toxin genotypes and new Stx-producing pathotypes poses a great threat to public health, Stx2h should be further included in E. coli molecular typing, and in epidemiological surveillance of E. coli infections.

September 22, 2019

Genome evolution across 1,011 Saccharomyces cerevisiae isolates.

Large-scale population genomic surveys are essential to explore the phenotypic diversity of natural populations. Here we report the whole-genome sequencing and phenotyping of 1,011 Saccharomyces cerevisiae isolates, which together provide an accurate evolutionary picture of the genomic variants that shape the species-wide phenotypic landscape of this yeast. Genomic analyses support a single ‘out-of-China’ origin for this species, followed by several independent domestication events. Although domesticated isolates exhibit high variation in ploidy, aneuploidy and genome content, genome evolution in wild isolates is mainly driven by the accumulation of single nucleotide polymorphisms. A common feature is the extensive loss of heterozygosity, which represents an essential source of inter-individual variation in this mainly asexual species. Most of the single nucleotide polymorphisms, including experimentally identified functional polymorphisms, are present at very low frequencies. The largest numbers of variants identified by genome-wide association are copy-number changes, which have a greater phenotypic effect than do single nucleotide polymorphisms. This resource will guide future population genomics and genotype-phenotype studies in this classic model system.
