July 7, 2019  |  

Generality of toxins in defensive symbiosis: Ribosome-inactivating proteins and defense against parasitic wasps in Drosophila.

While it has become increasingly clear that multicellular organisms often harbor microbial symbionts that protect their hosts against natural enemies, the mechanisms underlying most defensive symbioses are largely unknown. Spiroplasma bacteria are widespread associates of terrestrial arthropods, and include strains that protect diverse Drosophila flies against parasitic wasps and nematodes. Recent work implicated a ribosome-inactivating protein (RIP) encoded by Spiroplasma, and related to Shiga-like toxins in enterohemorrhagic Escherichia coli, in defense against a virulent parasitic nematode in the woodland fly, Drosophila neotestacea. Here we test the generality of RIP-mediated protection by examining whether Spiroplasma RIPs also play a role in wasp protection, in D. melanogaster and D. neotestacea. We find strong evidence for a major role of RIPs, with ribosomal RNA (rRNA) from the larval endoparasitic wasps, Leptopilina heterotoma and Leptopilina boulardi, exhibiting the hallmarks of RIP activity. In Spiroplasma-containing hosts, parasitic wasp ribosomes show abundant site-specific depurination in the α-sarcin/ricin loop of the 28S rRNA, with depurination occurring soon after wasp eggs hatch inside fly larvae. Interestingly, we found that the pupal ectoparasitic wasp, Pachycrepoideus vindemmiae, escapes protection by Spiroplasma, and its ribosomes do not show high levels of depurination. We also show that fly ribosomes show little evidence of targeting by RIPs. Finally, we find that the genome of D. neotestacea’s defensive Spiroplasma encodes a diverse repertoire of RIP genes, which differ in abundance. This work suggests that specificity of defensive symbionts against different natural enemies may be driven by the evolution of toxin repertoires, and that toxin diversity may play a role in shaping host-symbiont-enemy interactions.


July 7, 2019  |  

Convergent evolution of Y chromosome gene content in flies.

Sex-chromosomes have formed repeatedly across Diptera from ordinary autosomes, and X-chromosomes mostly conserve their ancestral genes. Y-chromosomes are characterized by abundant gene-loss and an accumulation of repetitive DNA, yet the nature of the gene repertoire of fly Y-chromosomes is largely unknown. Here we trace gene-content evolution of Y-chromosomes across 22 Diptera species, using a subtraction pipeline that infers Y genes from male and female genome and transcriptome data. Few genes remain on old Y-chromosomes, but the number of inferred Y-genes varies substantially between species. Young Y-chromosomes still show clear evidence of their autosomal origins, but most genes on old Y-chromosomes are not simply remnants of genes originally present on the proto-sex-chromosome that escaped degeneration, but instead were recruited secondarily from autosomes. Despite almost no overlap in Y-linked gene content in different species with independently formed sex-chromosomes, we find that Y-linked genes have evolved convergent gene functions associated with testis expression. Thus, male-specific selection appears as a dominant force shaping gene-content evolution of Y-chromosomes across fly species. While X-chromosome gene content tends to be conserved, Y-chromosome evolution is dynamic and difficult to reconstruct. Here, Mahajan and Bachtrog use a subtraction pipeline to identify Y-linked genes in 22 Diptera species, revealing patterns of Y-chromosome gene-content evolution.


July 7, 2019  |  

Lightweight BWT and LCP merging via the gap algorithm

Recently, Holt and McMillan [Bioinformatics 2014, ACM-BCB 2014] have proposed a simple and elegant algorithm to merge the Burrows-Wheeler transforms of a collection of strings. In this paper we show that their algorithm can be improved so that, in addition to the BWTs, it also merges the Longest Common Prefix (LCP) arrays. Because of its small memory footprint, this new algorithm can be used for the final merge of BWT and LCP arrays computed by a faster but memory-intensive construction algorithm.
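As a rough illustration of the kind of BWT merge the abstract refers to, the sketch below refines an interleave bit vector until it stops changing, then reads the merged BWT off it. It is a minimal reconstruction under stated assumptions, not the authors' implementation: it covers only the BWT part (no LCP arrays), assumes each input BWT marks string ends with `$`, assumes `$` sorts before every other character, and assumes all terminators of the first collection sort before those of the second. The function names `bwt` and `merge_bwts` are hypothetical.

```python
from collections import Counter


def bwt(s):
    """Naive BWT of s + '$' (illustration only; quadratic in practice)."""
    t = s + "$"
    order = sorted(range(len(t)), key=lambda i: t[i:])
    # t[i - 1] wraps around to '$' for the suffix that is the whole string.
    return "".join(t[i - 1] for i in order)


def merge_bwts(bwt0, bwt1):
    """Merge two BWTs by iteratively refining an interleave vector z.

    z[k] == 0 means row k of the merged BWT comes from bwt0, 1 from bwt1.
    Assumption: terminators of collection 0 sort before those of collection 1.
    """
    n0, n1 = len(bwt0), len(bwt1)
    m0, m1 = bwt0.count("$"), bwt1.count("$")
    counts = Counter(bwt0) + Counter(bwt1)

    # First row of each character bucket in the merged suffix order.
    starts, pos = {}, 0
    for c in sorted(counts):
        starts[c] = pos
        pos += counts[c]

    z = [0] * n0 + [1] * n1
    while True:
        nxt = [0] * (n0 + n1)
        # The '$' bucket is fixed: collection 0 terminators, then collection 1.
        for j in range(m0, m0 + m1):
            nxt[j] = 1
        free = dict(starts)
        used = [0, 0]
        for b in z:
            c = (bwt0, bwt1)[b][used[b]]
            used[b] += 1
            if c != "$":
                # Prepending c to this row's suffix lands it in bucket c,
                # in the same relative order as the rows visited so far.
                nxt[free[c]] = b
                free[c] += 1
        if nxt == z:
            break
        z = nxt

    sources = [iter(bwt0), iter(bwt1)]
    return "".join(next(sources[b]) for b in z)


merged = merge_bwts(bwt("ab"), bwt("a"))
print(merged)  # ba$$a
```

Each pass sorts the suffixes by one more leading character, so the number of passes grows with the longest common prefix between suffixes of the two inputs. Only the two BWTs and the bit vector are held in memory, which is the property the abstract highlights.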


July 7, 2019  |  

Genomics of parallel adaptation at two timescales in Drosophila.

Two interesting unanswered questions are the extent to which both the broad patterns and genetic details of adaptive divergence are repeatable across species, and the timescales over which parallel adaptation may be observed. Drosophila melanogaster is a key model system for population and evolutionary genomics. Findings from genetics and genomics suggest that recent adaptation to latitudinal environmental variation (on the timescale of hundreds or thousands of years) associated with Out-of-Africa colonization plays an important role in maintaining biological variation in the species. Additionally, studies of interspecific differences between D. melanogaster and its sister species D. simulans have revealed that a substantial proportion of proteins and amino acid residues exhibit adaptive divergence on a timescale of roughly a few million years. Here we use population genomic approaches to attack the problem of parallelism between D. melanogaster and a highly diverged congener, D. hydei, on two timescales. D. hydei, a member of the repleta group of Drosophila, is similar to D. melanogaster in that it too appears to be a recently cosmopolitan species and recent colonizer of high latitude environments. We observed parallelism both for genes exhibiting latitudinal allele frequency differentiation within species and for genes exhibiting recurrent adaptive protein divergence between species. Greater parallelism was observed for long-term adaptive protein evolution and this parallelism includes not only the specific genes/proteins that exhibit adaptive evolution, but extends even to the magnitudes of the selective effects on interspecific protein differences. Thus, despite the roughly 50 million years of time separating D. melanogaster and D. hydei, and despite their considerably divergent biology, they exhibit substantial parallelism, suggesting the existence of a fundamental predictability of adaptive evolution in the genus.


July 7, 2019  |  

Hidden genetic variation shapes the structure of functional elements in Drosophila.

Mutations that add, subtract, rearrange, or otherwise refashion genome structure often affect phenotypes, although the fragmented nature of most contemporary assemblies obscures them. To discover such mutations, we assembled the first new reference-quality genome of Drosophila melanogaster since its initial sequencing. By comparing this new genome to the existing D. melanogaster assembly, we created a structural variant map of unprecedented resolution and identified extensive genetic variation that has remained hidden until now. Many of these variants constitute candidates underlying phenotypic variation, including tandem duplications and a transposable element insertion that amplifies the expression of detoxification-related genes associated with nicotine resistance. The abundance of important genetic variation that still evades discovery highlights how crucial high-quality reference genomes are to deciphering phenotypes.


July 7, 2019  |  

HISEA: HIerarchical SEed Aligner for PacBio data.

The next generation sequencing (NGS) techniques have been around for over a decade. Many of their fundamental applications rely on the ability to compute good genome assemblies. As the technology evolves, the assembly algorithms and tools have to continuously adjust and improve. The currently dominant technology of Illumina produces reads that are too short to bridge many repeats, setting limits on what can be successfully assembled. The emerging SMRT (Single Molecule, Real-Time) sequencing technique from Pacific Biosciences produces uniform coverage and long reads of length up to sixty thousand base pairs, enabling significantly better genome assemblies. However, SMRT reads are much more expensive and have a much higher error rate than Illumina’s – around 10-15% – mostly due to indels. New algorithms are very much needed to take advantage of the long reads while mitigating the effect of high error rate and lowering the required coverage. An essential step in assembling SMRT data is the detection of alignments, or overlaps, between reads. High error rate and very long reads make this a much more challenging problem than for Illumina data. We present a new pairwise read aligner, or overlapper, HISEA (Hierarchical SEed Aligner) for SMRT sequencing data. HISEA uses a novel two-step k-mer search, employing consistent clustering, k-mer filtering, and read alignment extension. We compare HISEA against several state-of-the-art programs – BLASR, DALIGNER, GraphMap, MHAP, and Minimap – on real datasets from five organisms. We compare their sensitivity, precision, specificity, F1-score, as well as time and memory usage. We also introduce a new, more precise, evaluation method. Finally, we compare the two leading programs, MHAP and HISEA, for their genome assembly performance in the Canu pipeline. Our algorithm has the best alignment detection sensitivity among all programs for SMRT data, significantly higher than the current best. The current best assembler for SMRT data is the Canu program, which uses the MHAP aligner in its pipeline. We have incorporated our new HISEA aligner in the Canu pipeline and benchmarked it against the best pipeline for multiple datasets at two relevant coverage levels: 30x and 50x. Our assemblies are better than those using MHAP for both coverage levels. Moreover, Canu+HISEA assemblies for 30x coverage are comparable with Canu+MHAP assemblies for 50x coverage, while being faster and cheaper. The HISEA algorithm produces alignments with the highest sensitivity compared with the current state-of-the-art algorithms. Integrated in the Canu pipeline, currently the best for assembling PacBio data, it produces better assemblies than Canu+MHAP.
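To make the idea of seed-based overlap detection concrete, here is a minimal sketch of the generic first step that k-mer overlappers share: index every k-mer, then count how many distinct k-mers each pair of reads has in common. This is not HISEA's two-step search, clustering, or extension logic, none of which the abstract specifies in detail. The function names, the toy reads, and the parameters `k` and `min_shared` are all assumptions chosen for illustration.

```python
from collections import defaultdict
from itertools import combinations


def kmers(seq, k):
    """Set of distinct k-mers in seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def candidate_overlaps(reads, k=4, min_shared=2):
    """Return {(read_a, read_b): shared_kmer_count} for likely overlaps.

    reads: dict mapping a read id to its sequence (hypothetical input shape).
    """
    # Inverted index: k-mer -> ids of the reads that contain it.
    index = defaultdict(set)
    for rid, seq in reads.items():
        for km in kmers(seq, k):
            index[km].add(rid)

    # Every k-mer shared by two reads is one vote for that pair.
    shared = defaultdict(int)
    for rids in index.values():
        for a, b in combinations(sorted(rids), 2):
            shared[(a, b)] += 1

    return {pair: n for pair, n in shared.items() if n >= min_shared}


reads = {"r1": "ACGTACGGTC", "r2": "ACGGTCTTAA", "r3": "GGGGGGGGGG"}
print(candidate_overlaps(reads))  # {('r1', 'r2'): 3}
```

With reads carrying 10-15% errors, exact k-mer matches become sparse, so a real overlapper has to pick k and the vote threshold carefully and then confirm candidates by alignment.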


July 7, 2019  |  

Whole-genome sequencing recommendations

Recent technological developments have revolutionized the way we perform genetic analyses. In particular, whole-genome sequencing provides access to the entire genetic makeup of an individual, and it is now an affordable approach for many research groups. As a consequence, genome sequencing is pervading many fields of biological research. Sequencing technologies are evolving rapidly, and so are their applications. Here we provide a first primer on whole-genome sequencing, focusing on two of the most popular applications: (1) de novo genome sequencing, in which the objective is obtaining a high-quality genome assembly that can serve as a reference for a species or variety, and (2) genome resequencing, when there is an available reference genome and the objective is to map sequence variation of an individual or a set of individuals. It is not our intention to provide a comprehensive overview of current methodologies that will likely soon become obsolete, but rather to focus on general principles that will have a more general applicability.


July 7, 2019  |  

Comparative evaluation of the genomes of three common Drosophila-associated bacteria.

Drosophila melanogaster is an excellent model to explore the molecular exchanges that occur between an animal intestine and associated microbes. Previous studies in Drosophila uncovered a sophisticated web of host responses to intestinal bacteria. The outcomes of these responses define critical events in the host, such as the establishment of immune responses, access to nutrients, and the rate of larval development. Despite our steady march towards illuminating the host machinery that responds to bacterial presence in the gut, there are significant gaps in our understanding of the microbial products that influence bacterial association with a fly host. We sequenced and characterized the genomes of three common Drosophila-associated microbes: Lactobacillus plantarum, Lactobacillus brevis and Acetobacter pasteurianus. For each species, we compared the genomes of Drosophila-associated strains to the genomes of strains isolated from alternative sources. We found that environmental Lactobacillus strains readily associated with adult Drosophila and were similar to fly isolates in terms of genome organization. In contrast, we identified a strain of A. pasteurianus that apparently fails to associate with adult Drosophila due to an inability to grow on fly nutrient food. Comparisons between association competent and incompetent A. pasteurianus strains identified a short list of candidate genes that may contribute to survival on fly medium. Many of the gene products unique to fly-associated strains have established roles in the stabilization of host-microbe interactions. These data add to a growing body of literature that examines the microbial perspective of host-microbe relationships. © 2016. Published by The Company of Biologists Ltd.


July 7, 2019  |  

Contiguous and accurate de novo assembly of metazoan genomes with modest long read coverage.

Genome assemblies that are accurate, complete and contiguous are essential for identifying important structural and functional elements of genomes and for identifying genetic variation. Nevertheless, most recent genome assemblies remain incomplete and fragmented. While long molecule sequencing promises to deliver more complete genome assemblies with fewer gaps, concerns about error rates, low yields, stringent DNA requirements and uncertainty about best practices may discourage many investigators from adopting this technology. Here, in conjunction with the platinum standard Drosophila melanogaster reference genome, we analyze recently published long molecule sequencing data to identify what governs completeness and contiguity of genome assemblies. We also present a hybrid meta-assembly approach that achieves remarkable assembly contiguity for both Drosophila and human assemblies with only modest long molecule sequencing coverage. Our results motivate a set of preliminary best practices for obtaining accurate and contiguous assemblies, a ‘missing manual’ that guides key decisions in building high quality de novo genome assemblies, from DNA isolation to polishing the assembly. © The Author(s) 2016. Published by Oxford University Press on behalf of Nucleic Acids Research.


July 7, 2019  |  

Jabba: hybrid error correction for long sequencing reads.

Third generation sequencing platforms produce longer reads with higher error rates than second generation technologies. While the improved read length can provide useful information for downstream analysis, underlying algorithms are challenged by the high error rate. Error correction methods in which accurate short reads are used to correct noisy long reads appear to be attractive to generate high-quality long reads. Methods that align short reads to long reads do not optimally use the information contained in the second generation data, and suffer from large runtimes. Recently, a new hybrid error correcting method has been proposed, where the second generation data is first assembled into a de Bruijn graph, on which the long reads are then aligned. In this context we present Jabba, a hybrid method to correct long third generation reads by mapping them on a corrected de Bruijn graph that was constructed from second generation data. Unique to our method is the use of a pseudo alignment approach with a seed-and-extend methodology, using maximal exact matches (MEMs) as seeds. In addition to benchmark results, certain theoretical results concerning the possibilities and limitations of the use of MEMs in the context of third generation reads are presented. Jabba produces highly reliable corrected reads: almost all corrected reads align to the reference, and these alignments have a very high identity. Many of the aligned reads are error-free. Additionally, Jabba corrects reads using a very low amount of CPU time. From this we conclude that pseudo alignment with MEMs is a fast and reliable method to map long highly erroneous sequences on a de Bruijn graph.
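For readers unfamiliar with maximal exact matches, the sketch below shows what a MEM is by brute force: an exact match between two sequences that cannot be extended to the left or to the right. It is a quadratic-time illustration between two plain strings, assumed here for clarity; it is not Jabba's method, which maps reads onto a de Bruijn graph and would need an index to be practical. The function name and the `min_len` parameter are hypothetical.

```python
def maximal_exact_matches(a, b, min_len=3):
    """Return (start_in_a, start_in_b, length) for every MEM of a and b.

    Naive dynamic programming over common-suffix lengths, O(len(a) * len(b)).
    """
    mems = []
    prev = [0] * (len(b) + 1)
    for i in range(1, len(a) + 1):
        cur = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            if a[i - 1] == b[j - 1]:
                # Longest common suffix ending at a[i-1], b[j-1]; being the
                # longest, it is already maximal on the left.
                cur[j] = prev[j - 1] + 1
                # Maximal on the right if either string ends or the next
                # characters differ.
                right_maximal = i == len(a) or j == len(b) or a[i] != b[j]
                if right_maximal and cur[j] >= min_len:
                    length = cur[j]
                    mems.append((i - length, j - length, length))
        prev = cur
    return mems


print(maximal_exact_matches("GGACGTCC", "TACGTA"))  # [(2, 1, 4)]
```

In a seed-and-extend scheme, matches like these act as anchors, and the error-prone stretches between consecutive anchors are what the alignment step has to resolve.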


July 7, 2019  |  

Susan Celniker: Foundational resources to study a dynamic genome.

The Genetics Society of America’s George W. Beadle Award honors individuals who have made outstanding contributions to the community of genetics researchers and who exemplify the qualities of its namesake. The 2016 recipient, Susan E. Celniker, played a key role in the sequencing, annotation, and characterization of the Drosophila genome. She participated in early sequencing efforts at the Lawrence Berkeley National Laboratory and led the modENCODE Fly Transcriptome Consortium. Her efforts were critical to ensuring that the Drosophila genome was well-annotated, making it one of the best curated animal genomes available. As the Principal Investigator for the BDGP, Celniker has enabled the study of proteomes by creating a collection of over 13,000 clones that match annotated genes for protein expression in cells or transgenic flies, and she has established the most comprehensive spatial gene expression atlas in any organism, with in situ imaging of more than 80% of the Drosophila protein-coding transcriptome through embryogenesis. In addition to providing the research community with these invaluable resources and reagents, she continues to develop new tools and datasets for genetics researchers to explore the spatial and temporal control of gene expression.


July 7, 2019  |  

The recombination landscape of Drosophila virilis is robust to transposon activation in hybrid dysgenesis

DNA damage in the germline is a double-edged sword. Induced double-strand breaks establish the foundation for meiotic recombination and proper chromosome segregation but can also pose a significant challenge for genome stability. Within the germline, transposable elements are powerful agents of double-strand break formation. How different types of DNA damage are resolved within the germline is poorly understood. For example, little is known about the relationship between the frequency of double-stranded breaks, both endogenous and exogenous, and the decision to repair DNA through one of the many pathways, including crossing over and gene conversion. Here we use the Drosophila virilis hybrid dysgenesis model to determine how recombination landscapes change under transposable element activation. In this system, a cross between two strains of D. virilis with divergent transposable element profiles results in the hybrid dysgenesis phenotype, which includes the germline activation of diverse transposable elements, reduced fertility, and male recombination. However, only one direction of the cross results in hybrid dysgenesis. This allows the study of recombination in genetically identical F1 females: those with baseline levels of programmed DNA damage and those with an increased level of DNA damage resulting from transposable element proliferation. Using multiplexed shotgun genotyping to map crossover events, we compared the recombination landscapes of hybrid dysgenic and non-hybrid dysgenic individuals. The frequency and distribution of meiotic recombination appear to be robust during hybrid dysgenesis. However, hybrid dysgenesis is also associated with occasional clusters of recombination derived from single dysgenic F1 mothers. The clusters of recombination are hypothesized to be the result of mitotic crossovers during early germline development. Overall, these results show that meiotic recombination in D. virilis is robust to the damage caused by transposable elements during early development.


July 7, 2019  |  

An investigation of Y chromosome incorporations in 400 species of Drosophila and related genera.

Y chromosomes are widely believed to evolve from a normal autosome through a process of massive gene loss (with preservation of some male genes), shaped by sex-antagonistic selection and complemented by occasional gains of male-related genes. The net result of these processes is a male-specialized chromosome. This might be expected to be an irreversible process, but it was found in 2005 that the Drosophila pseudoobscura Y chromosome was incorporated into an autosome. Y chromosome incorporations have important consequences: a formerly male-restricted chromosome reverts to autosomal inheritance, and the species may shift from an XY/XX to X0/XX sex-chromosome system. In order to assess the frequency and causes of this phenomenon we searched for Y chromosome incorporations in 400 species from Drosophila and related genera. We found one additional large scale event of Y chromosome incorporation, affecting the whole montium subgroup (40 species in our sample); overall 13% of the sampled species (52/400) have Y incorporations. While previous data indicated that after the Y incorporation the ancestral Y disappeared as a free chromosome, the much larger data set analyzed here indicates that a copy of the Y survived as a free chromosome both in montium and pseudoobscura species, and that the current Y of the pseudoobscura lineage results from a fusion between this free Y and the neoY. The 400 species sample also showed that the previously suggested causal connection between X-autosome fusions and Y incorporations is, at best, weak: the new case of Y incorporation (montium) does not have X-autosome fusion, whereas nine independent cases of X-autosome fusions were not followed by Y incorporations. Y incorporation is an underappreciated mechanism affecting Y chromosome evolution; our results show that at least in Drosophila it plays a relevant role and highlight the need for similar studies in other groups.

