The performance of RNA sequencing (RNA-seq) aligners and assemblers varies greatly across different organisms and experiments, and often the optimal approach is not known beforehand. Here, we show that the accuracy of transcript reconstruction can be boosted by combining multiple methods, and we present a novel algorithm to integrate multiple RNA-seq assemblies into a coherent transcript annotation. Our algorithm can remove redundancies and select the best transcript models according to user-specified metrics, while solving common artifacts such as erroneous transcript chimerisms. We have implemented this method in an open-source Python3 and Cython program, Mikado, available on GitHub.
An improved assembly and annotation of the allohexaploid wheat genome identifies complete families of agronomic genes and provides genomic evidence for chromosomal translocations.
Advances in genome sequencing and assembly technologies are generating many high-quality genome sequences, but assemblies of large, repeat-rich polyploid genomes, such as that of bread wheat, remain fragmented and incomplete. We have generated a new wheat whole-genome shotgun sequence assembly using a combination of optimized data types and an assembly algorithm designed to deal with large and complex genomes. The new assembly represents >78% of the genome with a scaffold N50 of 88.8 kb and has high fidelity to the input data. Our new annotation combines strand-specific Illumina RNA-seq and Pacific Biosciences (PacBio) full-length cDNAs to identify 104,091 high-confidence protein-coding genes and 10,156 noncoding RNA genes. We confirmed three known genome rearrangements and identified one novel rearrangement. Our approach enables the rapid and scalable assembly of wheat genomes, the identification of structural variants, and the definition of complete gene models, all powerful resources for trait analysis and breeding of this key global crop. © 2017 Clavijo et al.; Published by Cold Spring Harbor Laboratory Press.
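For readers unfamiliar with the scaffold N50 statistic quoted above, the following is a minimal sketch of how it is conventionally computed: the length of the shortest scaffold in the smallest set of longest scaffolds that together cover at least half of the total assembly length. The function name `n50` and the example lengths are illustrative assumptions and are not taken from the paper.

```python
def n50(lengths):
    """Return the N50 of a collection of scaffold (or contig) lengths.

    Hypothetical helper for illustration only; not code from the
    assembly pipeline described in the abstract.
    """
    # Sort from longest to shortest so we accumulate the largest scaffolds first.
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        # The first scaffold that pushes the running sum to at least half
        # of the total assembly length defines the N50.
        if running >= half_total:
            return length
    raise ValueError("lengths must contain at least one value")


# Illustrative lengths (arbitrary units), not data from the wheat assembly.
example_lengths = [10, 8, 6, 4, 2]
print(n50(example_lengths))
```

A higher N50 indicates that more of the assembly sits in long scaffolds, but the statistic says nothing about correctness, which is why the abstract reports fidelity to the input data separately.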
Cells are a fundamental unit of life, and the ability to study the phenotypes and behaviors of individual cells is crucial to understanding the workings of complex biological systems. Cell phenotypes (epigenomic, transcriptomic, proteomic, and metabolomic) exhibit dramatic heterogeneity between and within the different cell types and states underlying cellular functional diversity. Cell genotypes can also display heterogeneity throughout an organism, in the form of somatic genetic variation, most notably in the emergence and evolution of tumors. Recent technical advances in single-cell isolation and the development of omics approaches sensitive enough to reveal these aspects of cell identity have enabled a revolution in the study of multicellular systems. In this review, we discuss the technologies available to resolve the genomes, epigenomes, transcriptomes, proteomes, and metabolomes of single cells from a wide variety of living systems. © 2018 The Authors. Proteomics Published by WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
Mucosal surfaces represent critical routes for entry and exit of pathogens. As such, animals have evolved strategies to combat infection at these sites, in particular the production of mucus to prevent attachment and to promote subsequent movement of the mucus/microbe away from the underlying epithelial surface. Using biochemical, biophysical, and infection studies, we have investigated the host protective properties of the skin mucus barrier of the Xenopus tropicalis tadpole. Specifically, we have characterized the major structural component of the barrier and shown that it is a mucin glycoprotein (Otogelin-like or Otogl) with similar sequence, domain organization, and structural properties to human gel-forming mucins. This mucin forms the structural basis of a surface barrier (~6 µm thick), which is depleted through knockdown of Otogl. Crucially, Otogl knockdown leads to susceptibility to infection by the opportunistic pathogen Aeromonas hydrophila. To more accurately reflect its structure, tissue localization, and function, we have renamed Otogl as Xenopus Skin Mucin, or MucXS. Our findings characterize an accessible and tractable model system to define mucus barrier function and host-microbe interactions. Copyright © 2018 the Author(s). Published by PNAS.
Plant immune receptors are under constant selective pressure to maintain resistance to plant pathogens. Nucleotide-binding leucine-rich repeat (NLR) proteins are one class of cytoplasmic immune receptors whose genes commonly show signatures of adaptive evolution. While it is known that balancing selection contributes to maintaining high intraspecific allelic diversity, the evolutionary mechanism that influences the transmission of alleles during speciation remains unclear. The barley Mla locus has over 30 described alleles conferring isolate-specific resistance to barley powdery mildew and contains three NLR families (RGH1, RGH2, and RGH3). We discovered (using sequence capture and RNAseq) the presence of a novel integrated Exo70 domain in RGH2 in the Mla3 haplotype. Allelic variation across barley accessions includes presence/absence of the integrated domain in RGH2. Expanding our search to several Poaceae species, we found shared interspecific conservation in the RGH2-Exo70 integration. We hypothesise that balancing selection has maintained allelic variation at Mla as a trans-species polymorphism over 24 My, thus contributing to and preserving interspecific allelic diversity during speciation.
Identification of candidate genes at the Dp-fl locus conferring resistance against the rosy apple aphid Dysaphis plantaginea
The cultivated apple is susceptible to several pests including the rosy apple aphid (RAA; Dysaphis plantaginea Passerini), control of which is mainly based on chemical treatments. A few cases of resistance to aphids have been described in apple germplasm resources, laying the basis for the development of new resistant cultivars by breeding. The cultivar ‘Florina’ is resistant to RAA, and recently, the Dp-fl locus responsible for its resistance was mapped on linkage group 8 of the apple genome. In this paper, a chromosome walking approach was performed by using a ‘Florina’ bacterial artificial chromosome (BAC) library. The walking started from the available tightly linked molecular markers flanking the resistance region. Various walking steps were performed in order to identify the minimum tiling path of BAC clones covering the Dp-fl region from both the “resistant” and “susceptible” chromosomes of ‘Florina’. A genomic region of about 279 Kb encompassing the Dp-fl resistance locus was fully sequenced by the PacBio technology. Through the development of new polymorphic markers, the mapping interval around the resistance locus was narrowed down to a physical region of 95 Kb. The annotation of this sequence resulted in the identification of four candidate genes putatively involved in the RAA resistance response.
Intraspecific diversity promotes evolutionary change, and when partitioned among geographic regions or habitats can form the basis for speciation. Marine species live in an environment that can provide as much scope for diversification in the vertical as in the horizontal dimension. Understanding the relevant mechanisms will contribute significantly to our understanding of eco-evolutionary processes and effective biodiversity conservation. Here, we provide an annotated genome assembly for the deep-sea fish Coryphaenoides rupestris and re-sequencing data to show that differentiation at non-synonymous sites in functional loci distinguishes individuals living at different depths, independent of horizontal spatial distance. Our data indicate disruptive selection at these loci; however, we find no clear evidence for differentiation at neutral loci that may indicate assortative mating. We propose that individuals with distinct genotypes at relevant loci segregate by depth as they mature (supported by survey data), which may be associated with ecotype differentiation linked to distinct phenotypic requirements at different depths.
Bdelloid rotifers are a class of microscopic invertebrates that have existed for millions of years apparently without sex or meiosis. They inhabit a variety of temporary and permanent freshwater habitats globally, and many species are remarkably tolerant of desiccation. Bdelloids offer an opportunity to better understand the evolution of sex and recombination, but previous work has emphasised desiccation as the cause of several unusual genomic features in this group. Here, we present high-quality whole-genome sequences of 3 bdelloid species: Rotaria macrura and R. magnacalcarata, which are both desiccation intolerant, and Adineta ricciae, which is desiccation tolerant. In combination with the published assembly of A. vaga, which is also desiccation tolerant, we apply a comparative genomics approach to evaluate the potential effects of desiccation tolerance and asexuality on genome evolution in bdelloids. We find that ancestral tetraploidy is conserved among all 4 bdelloid species, but homologous divergence in obligately aquatic Rotaria genomes is unexpectedly low. This finding is contrary to current models regarding the role of desiccation in shaping bdelloid genomes. In addition, we find that homologous regions in A. ricciae are largely collinear and do not form palindromic repeats as observed in the published A. vaga assembly. Consequently, several features interpreted as genomic evidence for long-term ameiotic evolution are not general to all bdelloid species, even within the same genus. Finally, we substantiate previous findings of high levels of horizontally transferred nonmetazoan genes in both desiccating and nondesiccating bdelloid species and show that this unusual feature is not shared by other animal phyla, even those with desiccation-tolerant representatives. These comparisons call into question the proposed role of desiccation in mediating horizontal genetic transfer.
Antimycins are a family of natural products possessing outstanding biological activities and unique structures, which have intrigued chemists for over a half century. Of particular interest are the ring-expanded antimycins that show promising anticancer potential and whose biosynthesis remains uncharacterized. Specifically, neoantimycin and its analogs have been shown to be effective regulators of the oncogenic proteins GRP78/BiP and K-Ras. The neoantimycin structural skeleton is built on a 15-membered tetralactone ring containing one methyl, one hydroxy, one benzyl, and three alkyl moieties, as well as an amide linkage to a conserved 3-formamidosalicylic acid moiety. Although the biosynthetic gene cluster for neoantimycins was recently identified, the enzymatic logic that governs the synthesis of neoantimycins has not yet been revealed. In this work, the neoantimycin gene cluster is identified, and an updated sequence and annotation is provided delineating a nonribosomal peptide synthetase/polyketide synthase (NRPS/PKS) hybrid scaffold. Using cosmid expression and CRISPR/Cas-based genome editing, several heterologous expression strains for neoantimycin production are constructed in two separate Streptomyces species. A combination of in vivo and in vitro analysis is further used to completely characterize the biosynthesis of neoantimycins including the megasynthases and trans-acting domains. This work establishes a set of highly tractable hosts for producing and engineering neoantimycins and their C11 oxidized analogs, paving the way for neoantimycin-based drug discovery and development.
Signatures of host specialization and a recent transposable element burst in the dynamic one-speed genome of the fungal barley powdery mildew pathogen.
Powdery mildews are biotrophic pathogenic fungi infecting a number of economically important plants. The grass powdery mildew, Blumeria graminis, has become a model organism to study host specialization of obligate biotrophic fungal pathogens. We resolved the large-scale genomic architecture of B. graminis forma specialis hordei (Bgh) to explore the potential influence of its genome organization on the co-evolutionary process with its host plant, barley (Hordeum vulgare). The near-chromosome level assemblies of the Bgh reference isolate DH14 and one of the most diversified isolates, RACE1, enabled a comparative analysis of these haploid genomes, which are highly enriched with transposable elements (TEs). We found largely retained genome synteny and gene repertoires, yet detected copy number variation (CNV) of secretion signal peptide-containing protein-coding genes (SPs) and locally disrupted synteny blocks. Genes coding for sequence-related SPs are often locally clustered, but neither the SPs nor the TEs reside preferentially in genomic regions with unique features. Extended comparative analysis with different host-specific B. graminis formae speciales revealed the existence of a core suite of SPs, but also isolate-specific SP sets as well as congruence of SP CNV and phylogenetic relationship. We further detected evidence for a recent, lineage-specific expansion of TEs in the Bgh genome. The characteristics of the Bgh genome (largely retained synteny, CNV of SP genes, recently proliferated TEs and a lack of significant compartmentalization) are consistent with a “one-speed” genome that differs in its architecture and (co-)evolutionary pattern from the “two-speed” genomes reported for several other filamentous phytopathogens.
Comparative genomics of Pseudomonas syringae reveals convergent gene gain and loss associated with specialization onto cherry (Prunus avium).
Genome-wide analyses of the effector- and toxin-encoding genes were used to examine the phylogenetics and evolution of pathogenicity amongst diverse strains of Pseudomonas syringae causing bacterial canker of cherry (Prunus avium), including pathovars P. syringae pv morsprunorum (Psm) races 1 and 2, P. syringae pv syringae (Pss) and P. syringae pv avii. Phylogenetic analyses revealed Psm races and P. syringae pv avii clades were distinct and were each monophyletic, whereas cherry-pathogenic strains of Pss were interspersed amongst strains from other host species. A maximum likelihood approach was used to predict effectors associated with pathogenicity on cherry. Pss possesses a smaller repertoire of type III effectors but has more toxin biosynthesis clusters than Psm and P. syringae pv avii. Evolution of cherry pathogenicity was correlated with gain of genes such as hopAR1 and hopBB1 through putative phage transfer and horizontal transfer respectively. By contrast, loss of the avrPto/hopAB redundant effector group was observed in cherry-pathogenic clades. Ectopic expression of hopAB and hopC1 triggered the hypersensitive reaction in cherry leaves, confirming computational predictions. Cherry canker provides a fascinating example of convergent evolution of pathogenicity that is explained by the mix of effector and toxin repertoires acting on a common host. © 2018 The Authors. New Phytologist © 2018 New Phytologist Trust.
The structure of a conserved telomeric region associated with variant antigen loci in the blood parasite Trypanosoma congolense
African trypanosomiasis is a vector-borne disease of humans and livestock caused by African trypanosomes (Trypanosoma spp.). Survival in the vertebrate bloodstream depends on antigenic variation of Variant Surface Glycoproteins (VSGs) coating the parasite surface. In T. brucei, a model for antigenic variation, monoallelic VSG expression originates from dedicated VSG expression sites (VES). Trypanosoma brucei VES have a conserved structure consisting of a telomeric VSG locus downstream of unique, repeat sequences, and an independent promoter. Additional protein-coding sequences, known as “Expression Site Associated Genes (ESAGs)”, are also often present and are implicated in diverse, bloodstream-stage functions. Trypanosoma congolense is a related veterinary pathogen, also displaying VSG-mediated antigenic variation. A T. congolense VES has not been described, making it unclear if regulation of VSG expression is conserved between species. Here, we describe a conserved telomeric region associated with VSG loci from long-read DNA sequencing of two T. congolense strains, which consists of a distal repeat, conserved noncoding elements and other genes besides the VSG; although these are not orthologous to T. brucei ESAGs. Most conserved telomeric regions are associated with accessory minichromosomes, but the same structure may also be associated with megabase chromosomes. We propose that this region represents the T. congolense VES, and through comparison with T. brucei, we discuss the parallel evolution of antigenic switching mechanisms, and unique adaptation of the T. brucei VES for developmental regulation of bloodstream-stage genes. Hence, we provide a basis for understanding antigenic switching in T. congolense and the origins of the African trypanosome VES.
Characterisation of pathogen-specific regions and novel effector candidates in Fusarium oxysporum f. sp. cepae.
A reference-quality assembly of Fusarium oxysporum f. sp. cepae (Foc), the causative agent of onion basal rot, has been generated along with genomes of additional pathogenic and non-pathogenic isolates of onion. Phylogenetic analysis confirmed a single origin of the Foc pathogenic lineage. Genome alignments with other F. oxysporum ff. spp. and non-pathogens revealed high levels of syntenic conservation of core chromosomes but little synteny between lineage specific (LS) chromosomes. Four LS contigs in Foc totaling 3.9 Mb were designated as pathogen-specific (PS). A two-fold increase in segmental duplication events was observed between LS regions of the genome compared to within core regions or from LS regions to the core. RNA-seq expression studies identified candidate effectors expressed in planta, consisting of both known effector homologs and novel candidates. FTF1 and a subset of other transcription factors implicated in regulation of effector expression were found to be expressed in planta.
Many animal species comprise discrete phenotypic forms. A common example in natural populations of insects is the occurrence of different color patterns, which has motivated a rich body of ecological and genetic research [1-6]. The occurrence of dark, i.e., melanic, forms displaying discrete color patterns is found across multiple taxa, but the underlying genomic basis remains poorly characterized. In numerous ladybird species (Coccinellidae), the spatial arrangement of black and red patches on adult elytra varies wildly within species, forming strikingly different complex color patterns [7, 8]. In the harlequin ladybird, Harmonia axyridis, more than 200 distinct color forms have been described, which classic genetic studies suggest result from allelic variation at a single, unknown, locus [9, 10]. Here, we combined whole-genome sequencing, population-based genome-wide association studies, gene expression, and functional analyses to establish that the transcription factor Pannier controls melanic pattern polymorphism in H. axyridis. We show that pannier is necessary for the formation of melanic elements on the elytra. Allelic variation in pannier leads to protein expression in distinct domains on the elytra and thus determines the distinct color patterns in H. axyridis. Recombination between pannier alleles may be reduced by a highly divergent sequence of ~170 kb in the cis-regulatory regions of pannier, with a 50 kb inversion between color forms. This most likely helps maintain the distinct alleles found in natural populations. Thus, we propose that highly variable discrete color forms can arise in natural populations through cis-regulatory allelic variation of a single gene. Copyright © 2018 The Authors. Published by Elsevier Ltd. All rights reserved.
Molecular epidemiology of isolates with multiple mcr plasmids from a pig farm in Great Britain: the effects of colistin withdrawal in the short and long term.
The environment, including farms, might act as a reservoir for mobile colistin resistance (mcr) genes, which has led to calls for reduction of usage in livestock of colistin, an antibiotic of last resort for humans. To establish the molecular epidemiology of mcr Enterobacteriaceae from faeces of two cohorts of pigs, where one group had initially been treated with colistin and the other not, over a 5 month period following stoppage of colistin usage on a farm in Great Britain; faecal samples were also taken at ~20 months. mcr-1 Enterobacteriaceae were isolated from positive faeces and WGS was performed; conjugation was performed on selected Escherichia coli and colistin MICs were determined. E. coli of diverse ST harbouring mcr-1 and multiple resistance genes were isolated over 5 months from both cohorts. Two STs, from treated cohorts, contained both mcr-1 and mcr-3 plasmids, with some isolates also harbouring multiple copies of mcr-1 on different plasmids. The mcr-1 plasmids grouped into four Inc types (X4, pO111, I2 and HI2), with mcr-3 found in IncP. Multiple copies of mcr plasmids did not have a noticeable effect on colistin MIC, but they could be transferred simultaneously to a Salmonella host in vitro. Neither mcr-1 nor mcr-3 was detected in samples collected ~20 months after colistin cessation. We report for the first known time on the presence in Great Britain of mcr-3 from MDR Enterobacteriaceae, which might concurrently harbour multiple copies of mcr-1 on different plasmids. However, control measures, including stoppage of colistin, can successfully mitigate long-term on-farm persistence.