September 22, 2019

Phenotypic diversification by enhanced genome restructuring after induction of multiple DNA double-strand breaks.

DNA double-strand break (DSB)-mediated genome rearrangements are assumed to provide diverse raw genetic materials enabling accelerated adaptive evolution; however, the consequences of massive simultaneous DSB formation in cells and the resulting phenotypic impact remain unclear. Here, we establish an artificial genome-restructuring technology by conditionally introducing multiple genomic DSBs in vivo using a temperature-dependent endonuclease TaqI. Application in yeast and Arabidopsis thaliana generates strains with phenotypes, including improved ethanol production from xylose at higher temperature and increased plant biomass, that are stably inherited by offspring after multiple passages. High-throughput genome resequencing revealed that these strains harbor diverse rearrangements, including copy number variations, translocations in retrotransposons, and direct end-joinings at TaqI-cleavage sites. Furthermore, large-scale rearrangements occur frequently in diploid yeasts (28.1%) and tetraploid plants (46.3%), whereas haploid yeasts and diploid plants undergo minimal rearrangement. This genome-restructuring system (TAQing system) will enable rapid genome breeding and aid genome-evolution studies.


September 22, 2019

Integrated proteomics, genomics, metabolomics approaches reveal oxalic acid as pathogenicity factor in Tilletia indica inciting Karnal bunt disease of wheat.

Tilletia indica incites Karnal bunt (KB) disease in wheat. To date, no KB-resistant wheat cultivar has been developed because potential biomarkers related to pathogenicity/virulence are not available for screening resistant wheat genotypes. The present study was carried out to compare the proteomes of highly virulent (TiK) and low-virulence (TiP) T. indica isolates. Twenty-one protein spots consistently observed as up-regulated/differential in the TiK proteome were selected for identification by MALDI-TOF/TOF. Identified sequences showed homology with fungal proteins playing essential roles in plant infection and pathogen survival, including stress response, adhesion, fungal penetration, invasion, colonization, degradation of host cell wall, and signal transduction pathways. These results were integrated with the T. indica genome sequence for identification of homologs of candidate pathogenicity/virulence-related proteins. One protein identified in the TiK isolate was malate dehydrogenase, which converts malate to oxaloacetate, a precursor of oxalic acid. Oxalic acid is a key pathogenicity factor in phytopathogenic fungi. These results were validated by GC-MS-based metabolic profiling of T. indica isolates, which showed that oxalic acid was detected exclusively in the TiK isolate. Thus, integrated omics approaches lead to identification of pathogenicity/virulence factor(s) that would provide insights into pathogenic mechanisms of fungi and aid in devising effective disease management strategies.


September 22, 2019

In vitro DNA SCRaMbLE.

The power of synthetic biology has enabled the expression of heterologous pathways in cells, as well as genome-scale synthesis projects. The complexity of biological networks makes rational de novo design a grand challenge. Introducing features that confer genetic flexibility is a powerful strategy for downstream engineering. Here we develop an in vitro method of DNA library construction based on structural variation to accomplish this goal. The “in vitro SCRaMbLE system” uses Cre recombinase mixed in a test tube with purified DNA encoding multiple loxPsym sites. Using a β-carotene pathway designed for expression in yeast as an example, we demonstrate top-down and bottom-up in vitro SCRaMbLE, enabling optimization of biosynthetic pathway flux via the rearrangement of relevant transcription units. We show that our system provides a straightforward way to correlate phenotype and genotype and is potentially amenable to biochemical optimization in ways that the in vivo system cannot achieve.


September 22, 2019

Precise control of SCRaMbLE in synthetic haploid and diploid yeast.

Compatibility between host cells and heterologous pathways is a challenge for constructing organisms with high productivity or gain of function. Designer yeast cells incorporating the Synthetic Chromosome Rearrangement and Modification by LoxP-mediated Evolution (SCRaMbLE) system provide a platform for generating genotype diversity. Here we construct a genetic AND gate to enable precise control of the SCRaMbLE method to generate synthetic haploid and diploid yeast with desired phenotypes. The yield of carotenoids is increased to 1.5-fold by SCRaMbLEing haploid strains and we determine that the deletion of YEL013W is responsible for the increase. Based on the SCRaMbLEing in diploid strains, we develop a strategy called Multiplex SCRaMbLE Iterative Cycling (MuSIC) to increase the production of carotenoids up to 38.8-fold through 5 iterative cycles of SCRaMbLE. This strategy is potentially a powerful tool for increasing the production of bio-based chemicals and for mining deep knowledge.


September 22, 2019

Mutant phenotypes for thousands of bacterial genes of unknown function.

One-third of all protein-coding genes from bacterial genomes cannot be annotated with a function. Here, to investigate the functions of these genes, we present genome-wide mutant fitness data from 32 diverse bacteria across dozens of growth conditions. We identified mutant phenotypes for 11,779 protein-coding genes that had not been annotated with a specific function. Many genes could be associated with a specific condition because the gene affected fitness only in that condition, or with another gene in the same bacterium because they had similar mutant phenotypes. Of the poorly annotated genes, 2,316 had associations that have high confidence because they are conserved in other bacteria. By combining these conserved associations with comparative genomics, we identified putative DNA repair proteins; in addition, we propose specific functions for poorly annotated enzymes and transporters and for uncharacterized protein families. Our study demonstrates the scalability of microbial genetics and its utility for improving gene annotations.


September 22, 2019

NPBSS: a new PacBio sequencing simulator for generating the continuous long reads with an empirical model.

The PacBio sequencing platform offers longer read lengths than second-generation sequencing technologies. It has revolutionized de novo genome assembly and enabled the automated reconstruction of reference-quality genomes. Because of its extremely wide range of applications, fast, high-fidelity sequencing simulators are in great demand to facilitate the development and comparison of downstream analysis tools. Although several simulators are available (e.g., PBSIM, SimLoRD and FASTQSim) that specifically target the generation of PacBio libraries, the error rate of their simulated sequences is not well matched to the quality values of raw PacBio datasets, especially for PacBio’s continuous long reads (CLR). By analyzing the characteristic features of CLR data from PacBio SMRT (single molecule real time) sequencing, we developed a new PacBio sequencing simulator (called NPBSS) for producing CLR reads. The NPBSS simulator first samples read sequences according to a log-normal read length distribution and chooses different base quality values in different proportions. Then, NPBSS computes the overall error probability of each base in the read sequence with an empirical model, and derives the deletion, substitution and insertion probabilities from the overall error probability to generate the PacBio CLR reads. Alignment results demonstrate that NPBSS fits the error rate of PacBio CLR reads better than PBSIM and FASTQSim. In addition, assembly results show that sequences simulated by NPBSS are more like real PacBio CLR data. The NPBSS simulator is convenient to use, with efficient computation and flexible parameter settings. The PacBio CLR reads it generates are more like real PacBio datasets.
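The simulation steps described in this abstract can be sketched roughly as follows. This is a minimal illustration, not the NPBSS implementation: the log-normal parameters, quality values and their proportions, and the split of the error probability into deletions, substitutions and insertions are all assumed values, and the standard Phred relation stands in for the empirical error model, which the abstract does not specify.

```python
import random

# All numeric parameters below are illustrative assumptions, not NPBSS values.
LOG_MEAN, LOG_SIGMA = 8.5, 0.6           # log-normal read-length parameters (assumed)
QV_CHOICES = [5, 8, 10, 12]              # candidate base quality values (assumed)
QV_WEIGHTS = [0.2, 0.3, 0.3, 0.2]        # sampling proportions for those values (assumed)
ERROR_SPLIT = {"del": 0.3, "sub": 0.1, "ins": 0.6}  # assumed split of the total error


def phred_error(qv):
    """Standard Phred relation, used here as a stand-in for the empirical model."""
    return 10 ** (-qv / 10)


def simulate_read(reference, rng):
    """Return one simulated read taken from a random position in `reference`."""
    # Step 1: sample the read length from a log-normal distribution.
    length = int(rng.lognormvariate(LOG_MEAN, LOG_SIGMA))
    length = min(len(reference), max(1, length))
    start = rng.randrange(len(reference) - length + 1)
    read = []
    for base in reference[start:start + length]:
        # Step 2: pick a quality value according to the fixed proportions.
        qv = rng.choices(QV_CHOICES, weights=QV_WEIGHTS)[0]
        # Step 3: overall error probability, then split it by error type.
        p_err = phred_error(qv)
        r = rng.random()
        if r < p_err * ERROR_SPLIT["del"]:
            continue  # deletion: the base is dropped
        elif r < p_err * (ERROR_SPLIT["del"] + ERROR_SPLIT["sub"]):
            read.append(rng.choice([b for b in "ACGT" if b != base]))  # substitution
        elif r < p_err:
            read.append(base)
            read.append(rng.choice("ACGT"))  # insertion after the base
        else:
            read.append(base)  # error-free base
    return "".join(read)
```

For example, `simulate_read("ACGT" * 500, random.Random(0))` returns a noisy read drawn from a 2,000-base toy reference; passing a seeded `random.Random` keeps runs reproducible.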


September 22, 2019

Genome-wide analysis of Mycoplasma bovirhinis GS01 reveals potential virulence factors and phylogenetic relationships.

Mycoplasma bovirhinis is a significant etiological agent of bovine pneumonia and mastitis, but our knowledge of its genetic and pathogenic mechanisms is very limited. In this study, we sequenced the complete genome of M. bovirhinis strain GS01 isolated from the nasal swab of pneumonic calves in Gansu, China, and we found that its genome forms an 847,985 bp single circular chromosome with a GC content of 27.57% and with 707 protein-coding genes. The putative virulence determinants of M. bovirhinis were then analyzed. Results showed that three genomic islands and 16 putative virulence genes, including one adhesion gene (enolase), seven surface lipoproteins, proteins involved in glycerol metabolism, and cation transporters, might be potential virulence factors. Glycerol and pyruvate metabolic pathways were defective. Comparative analysis revealed remarkable genome variations between GS01 and a recently reported HAZ141_2 strain, and extremely low homology with other Mycoplasma species. Phylogenetic analysis demonstrated that M. bovirhinis was genetically closest to M. canis and distant from other bovine Mycoplasma species. Genomic dissection may provide useful information on the pathogenic mechanisms and genetics of M. bovirhinis. Copyright © 2018 Chen et al.


September 22, 2019

Whole genome analysis reveals the diversity and evolutionary relationships between necrotic enteritis-causing strains of Clostridium perfringens.

Clostridium perfringens causes a range of diseases in animals and humans, including necrotic enteritis in chickens and food poisoning and gas gangrene in humans. Necrotic enteritis is of concern in commercial chicken production due to the cost of implementing infection control measures and to productivity losses. This study focused on the genomic analysis of a range of chicken-derived C. perfringens isolates from around the world and from different years. The genomes were sequenced and compared with 20 genomes available from public databases, which were from a diverse collection of isolates from chickens, other animals, and humans. We used a distance-based phylogeny that was constructed based on gene content rather than sequence identity. Similarity between strains was defined as the number of genes that they have in common divided by their total number of genes. In this type of phylogenetic analysis, evolutionary distance can be interpreted in terms of evolutionary events such as acquisition and loss of genes, whereas the underlying properties (the gene content) can be interpreted in terms of function. We also compared these methods to the sequence-based phylogeny of the core genome. Distinct pathogenic clades of necrotic enteritis-causing C. perfringens were identified. They were characterised by variable regions encoded on the chromosome, with predicted roles in capsule production, adhesion, inhibition of related strains, phage integration, and metabolism. Some strains have almost identical genomes, even though they were isolated from different geographic regions at various times, while other highly distant genomes appear to result in similar outcomes with regard to virulence and pathogenesis. The high level of diversity in chicken isolates suggests there is no reliable factor that defines a chicken strain of C. perfringens; however, disease-causing strains can be defined by the presence of netB-encoding plasmids. This study reveals that horizontal gene transfer appears to play a significant role in genetic variation of the C. perfringens chromosome as well as the plasmid content within strains.
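The gene-content similarity defined in this abstract can be illustrated with a short sketch. It assumes one reading of the definition, namely shared genes divided by the size of the combined gene set of the two strains (a Jaccard index), and it takes distance as one minus similarity; the study may have used a different normalisation. The function names and the toy gene identifiers are hypothetical.

```python
def gene_content_similarity(genes_a, genes_b):
    """Shared genes divided by all genes found in either strain (assumed reading)."""
    a, b = set(genes_a), set(genes_b)
    union = a | b
    if not union:
        return 1.0  # two empty gene sets are treated as identical
    return len(a & b) / len(union)


def gene_content_distances(strains):
    """Pairwise distances (1 - similarity) for a dict of strain name -> gene set."""
    names = sorted(strains)
    return {
        (x, y): 1.0 - gene_content_similarity(strains[x], strains[y])
        for x in names
        for y in names
    }


# Toy example with made-up gene identifiers.
toy_strains = {
    "strain1": {"g1", "g2", "g3"},
    "strain2": {"g2", "g3", "g4"},
}
toy_distances = gene_content_distances(toy_strains)
```

A distance matrix of this kind can then be passed to any distance-based tree-building method; which method the study used is not stated in the abstract.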


September 22, 2019

Coordinated regulation of core and accessory genes in the multipartite genome of Sinorhizobium fredii.

Prokaryotes benefit from having accessory genes, but it is unclear how accessory genes can be linked with the core regulatory network when developing adaptations to new niches. Here we determined hierarchical core/accessory subsets in the multipartite pangenome (composed of genes from the chromosome, chromid and plasmids) of the soybean microsymbiont Sinorhizobium fredii by comparing twelve Sinorhizobium genomes. Transcriptomes of two S. fredii strains at mid-log and stationary growth phases and in symbiotic conditions were obtained. The average level of gene expression, variation of expression between different conditions, and gene connectivity within the co-expression network were positively correlated with the gene conservation level from strain-specific accessory genes to genus core. Condition-dependent transcriptomes exhibited adaptive transcriptional changes in pangenome subsets shared by the two strains, while strain-dependent transcriptomes were enriched with accessory genes on the chromid. Proportionally more chromid genes than plasmid genes were co-expressed with chromosomal genes, while plasmid genes had a higher within-replicon connectivity in expression than chromid ones. However, key nitrogen fixation genes on the symbiosis plasmid were characterized by high connectivity in both within- and between-replicon analyses. Among those genes with host-specific upregulation patterns, chromosomal znu and mdt operons, encoding a conserved high-affinity zinc transporter and an accessory multi-drug efflux system, respectively, were experimentally demonstrated to be involved in host-specific symbiotic adaptation. These findings highlight the importance of integrative regulation of hierarchical core/accessory components in the multipartite genome of bacteria during niche adaptation and in shaping the prokaryotic pangenome in the long run.


September 22, 2019

Inpactor, integrated and parallel analyzer and classifier of LTR retrotransposons and its application for pineapple LTR retrotransposons diversity and dynamics.

One particular class of Transposable Elements (TEs), called Long Terminal Repeat (LTR) retrotransposons, comprises the most abundant mobile elements in plant genomes. Their copy number can vary from several hundred to a few million copies per genome, deeply affecting genome organization and function. The detailed classification of LTR retrotransposons is an essential step to precisely understand their effect at the genome level, but remains challenging in large-sized genomes, requiring the use of optimized bioinformatics tools that can take advantage of supercomputers. Here, we propose a new tool: Inpactor, a parallel and scalable pipeline designed to classify LTR retrotransposons, to identify autonomous and non-autonomous elements, to build RT-based phylogenetic trees and to analyze their insertion times using High Performance Computing (HPC) techniques. Inpactor was tested on the classification and annotation of LTR retrotransposons in pineapple, a recently sequenced genome. Transposable elements make up 44% of the pineapple genome assembly, of which 23% were classified as LTR retrotransposons. Exceptionally, 16.4% of the pineapple genome assembly corresponded to only one lineage of the Gypsy superfamily, Del, suggesting that this particular lineage has undergone a significant increase in its copy number. As demonstrated for the pineapple genome, Inpactor provides comprehensive data on LTR retrotransposon classification and dynamics, allowing a fine understanding of their contribution to genome structure and evolution. Inpactor is available at https://github.com/simonorozcoarias/Inpactor.
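The abstract does not say how insertion times are estimated. A commonly used approach for LTR retrotransposons, sketched here as an assumption rather than as Inpactor's method, compares the two LTRs of an element: they are identical at insertion, so their divergence K gives an age of T = K / (2r) for a substitution rate r. The sketch uses the Jukes-Cantor correction for K, and the default rate of 1.3e-8 substitutions per site per year is an illustrative value; both function names are hypothetical.

```python
import math


def jukes_cantor_distance(ltr5, ltr3):
    """Jukes-Cantor distance between two aligned LTR sequences of equal length."""
    if len(ltr5) != len(ltr3):
        raise ValueError("sequences must be aligned to the same length")
    # Ignore alignment columns that contain a gap in either sequence.
    pairs = [(x, y) for x, y in zip(ltr5.upper(), ltr3.upper())
             if x != "-" and y != "-"]
    if not pairs:
        raise ValueError("no ungapped alignment columns")
    p = sum(x != y for x, y in pairs) / len(pairs)  # proportion of differing sites
    if p >= 0.75:
        raise ValueError("divergence too high for the Jukes-Cantor correction")
    return -0.75 * math.log(1 - 4 * p / 3)


def insertion_time_years(ltr5, ltr3, rate=1.3e-8):
    """Estimate insertion age as T = K / (2 * rate); the rate is an assumed value."""
    return jukes_cantor_distance(ltr5, ltr3) / (2 * rate)
```

With one mismatch in 100 aligned sites, this gives an age of roughly 390,000 years under the assumed rate; the result scales inversely with whatever rate is chosen.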


September 22, 2019

Whole-genome analysis of three yeast strains used for production of sherry-like wines revealed genetic traits specific to Flor yeasts.

Flor yeast strains represent a specialized group of Saccharomyces cerevisiae yeasts used for biological wine aging. We have sequenced the genomes of three flor strains originating from different geographic regions and used for production of sherry-like wines in Russia. According to the obtained phylogeny of 118 yeast strains, flor strains form a very tight cluster adjacent to the main wine clade. SNP analysis versus available genomes of wine and flor strains revealed 2,270 genetic variants in 1,337 loci specific to flor strains. Gene ontology analysis in combination with gene content evaluation revealed a complex landscape of possibly adaptive genetic changes in flor yeast, related to genes associated with cell morphology, mitotic cell cycle, ion homeostasis, DNA repair, carbohydrate metabolism, lipid metabolism, and cell wall biogenesis. Pangenomic analysis discovered the presence of several well-known “non-reference” loci of potential industrial importance. Events of gene loss included deletions of asparaginase genes, the maltose utilization locus, and the FRE-FIT locus involved in iron transport. The latter, in combination with a flor-yeast-specific mutation in the Aft1 transcription factor gene, is likely to be responsible for the discovered phenotype of increased iron sensitivity and improved iron uptake of the analyzed strains. Expansion of the coding region of the FLO11 flocculin gene and alteration of the balance between members of the FLO gene family are likely to positively affect the well-known propensity of flor strains for velum formation. Our study provides new insights into the nature of genetic variation in flor yeast strains and demonstrates that different adaptive properties of flor yeast strains could have evolved through different mechanisms of genetic variation.


September 22, 2019

Convergent loss of ABC transporter genes from Clostridioides difficile genomes is associated with impaired tyrosine uptake and p-cresol production.

We report the frequent, convergent loss of two genes encoding the substrate-binding protein and the ATP-binding protein of an ATP-binding cassette (ABC) transporter from the genomes of unrelated Clostridioides difficile strains. This specific genomic deletion was strongly associated with the reduced uptake of tyrosine and phenylalanine and production of derived Stickland fermentation products, including p-cresol, suggesting that the affected ABC transporter had been responsible for the import of aromatic amino acids. In contrast, the transporter gene loss did not measurably affect bacterial growth or production of enterotoxins. Phylogenomic analysis of publicly available genome sequences indicated that this transporter gene deletion had occurred multiple times in diverse clonal lineages of C. difficile, with a particularly high prevalence in ribotype 027 isolates, where 48 of 195 genomes (25%) were affected. The transporter gene deletion likely was facilitated by the repetitive structure of its genomic location. While at least some of the observed transporter gene deletions are likely to have occurred during the natural life cycle of C. difficile, we also provide evidence for the emergence of this mutation during long-term laboratory cultivation of reference strain R20291.


September 22, 2019

Nucleotide-binding resistance gene signatures in sugar beet, insights from a new reference genome.

Nucleotide-binding (NB-ARC), leucine-rich-repeat genes (NLRs) account for 60.8% of resistance (R) genes molecularly characterized from plants. NLRs exist as large gene families prone to tandem duplication and transposition, with high sequence diversity among crops and their wild relatives. This diversity can be a source of new disease resistance, but difficulty in distinguishing specific sequences from homologous gene family members hinders characterization of resistance for improving crop varieties. Current genome sequencing and assembly technologies, especially those using long-read sequencing, are improving resolution of repeat-rich genomic regions and clarifying locations of duplicated genes, such as NLRs. Using the conserved NB-ARC domain as a model, 231 tentative NB-ARC loci were identified in a highly contiguous genome assembly of sugar beet, revealing diverged and truncated NB-ARC signatures as well as full-length sequences. The NB-ARC-associated proteins contained NLR resistance gene domains, including TIR, CC, and LRR, as well as other integrated domains. Phylogenetic relationships of partial and complete domains were determined, and patterns of physical clustering in the genome were evaluated. Comparison of sugar beet NB-ARC domains to validated R genes from monocots and eudicots suggested extensive B. vulgaris-specific subfamily expansions. The NLR landscape in the rhizomania resistance-conferring Rz region of Chromosome 3 was characterized, identifying 26 NLR-like sequences spanning 20 Mb. This work presents the first detailed view of NLR family composition in a member of the Caryophyllales, builds a foundation for additional disease resistance work in B. vulgaris, and demonstrates an additional nucleic-acid-based method for NLR prediction in non-model plant species. This article is protected by copyright. All rights reserved.


September 22, 2019

Comparative genomics analysis of plasmid pPV989-94 from a clinical isolate of Pantoea vagans PV989.

Pantoea vagans, a gram-negative bacterium from the genus Pantoea and family Enterobacteriaceae, is present in various natural environments and is considered to be a plant endophyte. We isolated the Pantoea vagans PV989 strain from the clinic and sequenced its whole genome. Besides a chromosomal DNA molecule, the strain also harboured three large plasmids. A comparative genomics analysis was performed for the smallest plasmid, pPV989-94. It can be divided into four regions, including three conserved regions related to replication (R1), transfer conjugation (R2), and transfer leading (R3), and one variable region (R4). Further analysis showed that pPV989-94 is most similar to plasmids LA637P2 and pEA68 of Erwinia amylovora strains isolated from fruit trees. These three plasmids share the three conserved regions (R1, R2, and R3). Interestingly, a fragment (R4′) in R4, mediated by phage integrase and phage integrase family site-specific recombinase and encoding 9 genes related to glycometabolism, resistance, and DNA repair, was unique to pPV989-94. Homologues of R4′ were found in other plasmids or chromosomes, suggesting that horizontal gene transfer (HGT) occurred among different bacteria of various species or genera. The acquired functional genes may play important roles in the adaptation of bacteria to different hosts or environmental conditions.


September 22, 2019

Genus-wide assessment of lignocellulose utilization in the extremely thermophilic Caldicellulosiruptor by genomic, pan-genomic and metagenomic analysis

Metagenomic data from Obsidian Pool (Yellowstone National Park, USA) and 13 genome sequences were used to reassess genus-wide biodiversity for the extremely thermophilic Caldicellulosiruptor. The updated core genome contains 1,401 ortholog groups (average genome size for 13 species = 2,516 genes). The pangenome, which remains open with a revised total of 3,493 ortholog groups, encodes a variety of multidomain glycoside hydrolases (GHs). These include three cellulases with GH48 domains that are colocated in the glucan degradation locus (GDL) and are specific determinants for microcrystalline cellulose utilization. Three recently sequenced species, Caldicellulosiruptor sp. strain Rt8.B8 (renamed here Caldicellulosiruptor morganii), Thermoanaerobacter cellulolyticus strain NA10 (renamed here Caldicellulosiruptor naganoensis), and Caldicellulosiruptor sp. strain Wai35.B1 (renamed here Caldicellulosiruptor danielii), degraded Avicel and lignocellulose (switchgrass). C. morganii was more efficient than Caldicellulosiruptor bescii in this regard and differed from the other 12 species examined, both based on genome content and organization and in the specific domain features of conserved GHs. Metagenomic analysis of lignocellulose-enriched samples from Obsidian Pool revealed limited new information on genus biodiversity. Enrichments yielded genomic signatures closely related to that of Caldicellulosiruptor obsidiansis, but there was also evidence for other thermophilic fermentative anaerobes (Caldanaerobacter, Fervidobacterium, Caloramator, and Clostridium). One enrichment, containing 89.8% Caldicellulosiruptor and 9.7% Caloramator, had a capacity for switchgrass solubilization comparable to that of C. bescii. These results refine the known biodiversity of Caldicellulosiruptor and indicate that microcrystalline cellulose degradation at temperatures above 70°C, based on current information, is limited to certain members of this genus that produce GH48 domain-containing enzymes. IMPORTANCE The genus Caldicellulosiruptor contains the most thermophilic bacteria capable of lignocellulose deconstruction, which are promising candidates for consolidated bioprocessing for the production of biofuels and bio-based chemicals. The focus here is on the extant capability of this genus for plant biomass degradation and the extent to which this can be inferred from the core and pangenomes, based on analysis of 13 species and metagenomic sequence information from environmental samples. Key to microcrystalline hydrolysis is the content of the glucan degradation locus (GDL), a set of genes encoding glycoside hydrolases (GHs), several of which have GH48 and family 3 carbohydrate binding module domains, that function as primary cellulases. Resolving the relationship between the GDL and lignocellulose degradation will inform efforts to identify more prolific members of the genus and to develop metabolic engineering strategies to improve this characteristic. Copyright © 2018 American Society for Microbiology.
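The core genome and pangenome counts reported in this abstract follow the usual set definitions: the core is the set of ortholog groups present in every genome, and the pangenome is the set present in at least one. A minimal sketch of that bookkeeping, assuming ortholog groups have already been assigned to each genome (the function name, species labels and group identifiers below are made up for illustration):

```python
def core_and_pangenome(genomes):
    """Return (core, pan) ortholog-group sets for a dict of genome -> groups."""
    group_sets = [set(groups) for groups in genomes.values()]
    if not group_sets:
        return set(), set()
    core = set.intersection(*group_sets)  # groups shared by every genome
    pan = set.union(*group_sets)          # groups found in at least one genome
    return core, pan


# Toy input with hypothetical ortholog-group identifiers.
toy_genomes = {
    "species_A": {"OG1", "OG2", "OG3"},
    "species_B": {"OG1", "OG2", "OG4"},
    "species_C": {"OG1", "OG2", "OG5"},
}
toy_core, toy_pan = core_and_pangenome(toy_genomes)
```

In this toy input the core holds two groups and the pangenome five. A pangenome is described as open when adding further genomes keeps increasing the size of the union.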

