July 7, 2019

NOVOPlasty: de novo assembly of organelle genomes from whole genome data.

The evolution in next-generation sequencing (NGS) technology has led to the development of many different assembly algorithms, but few of them focus on assembling the organelle genomes. These genomes are used in phylogenetic studies, food identification and are the most deposited eukaryotic genomes in GenBank. Producing organelle genome assembly from whole genome sequencing (WGS) data would be the most accurate and least laborious approach, but a tool specifically designed for this task is lacking. We developed a seed-and-extend algorithm that assembles organelle genomes from whole genome sequencing (WGS) data, starting from a related or distant single seed sequence. The algorithm has been tested on several new (Gonioctena intermedia and Avicennia marina) and public (Arabidopsis thaliana and Oryza sativa) whole genome Illumina data sets where it outperforms known assemblers in assembly accuracy and coverage. In our benchmark, NOVOPlasty assembled all tested circular genomes in less than 30 min with a maximum memory requirement of 16 GB and an accuracy over 99.99%. In conclusion, NOVOPlasty is the sole de novo assembler that provides a fast and straightforward extraction of the extranuclear genomes from WGS data in one circular high quality contig. The software is open source and can be downloaded at https://github.com/ndierckx/NOVOPlasty.
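The seed-and-extend idea described in the abstract can be illustrated with a toy sketch. This is not NOVOPlasty's actual implementation: the function name `extend_seed`, the `overlap` parameter, the error-free reads and the simple circularity check are all assumptions made for illustration only.

```python
from collections import Counter


def extend_seed(seed, reads, overlap=5, max_len=1000):
    """Greedily extend `seed` to the right, one base at a time (toy example)."""
    contig = seed
    while len(contig) < max_len:
        suffix = contig[-overlap:]
        votes = Counter()
        for read in reads:
            # Each occurrence of the contig suffix inside a read votes
            # for the base that follows it in that read.
            start = read.find(suffix)
            while start != -1:
                nxt = start + overlap
                if nxt < len(read):
                    votes[read[nxt]] += 1
                start = read.find(suffix, start + 1)
        if not votes:
            break  # no read supports a further extension
        contig += votes.most_common(1)[0][0]
        # Circularity check: once the tail of the contig repeats its head,
        # the molecule has been traversed once; trim the repeated part.
        if len(contig) > 2 * overlap and contig.endswith(contig[:overlap]):
            contig = contig[:-overlap]
            break
    return contig


# Hypothetical 20 bp circular "genome" with error-free 10 bp reads.
genome = "ACGTTGCAAGGCTTAACCGA"
reads = [(genome * 2)[i:i + 10] for i in range(len(genome))]
assembled = extend_seed(genome[:6], reads)
print(assembled == genome)
```

A real assembler must additionally handle sequencing errors, repeats, paired-end information and extension in both directions, none of which this sketch attempts.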


July 7, 2019

LoRTE: Detecting transposon-induced genomic variants using low coverage PacBio long read sequences.

Population genomic analysis of transposable elements has greatly benefited from recent advances in sequencing technologies. However, the short size of the reads and the propensity of transposable elements to nest in highly repeated regions of genomes limit the efficiency of bioinformatic tools when Illumina or 454 technologies are used. Fortunately, long read sequencing technologies generating read lengths that may span the entire length of full transposons are now available. However, existing TE population genomic software was not designed to handle long reads and the development of new dedicated tools is needed. LoRTE is the first tool able to use PacBio long read sequences to identify transposon deletions and insertions between a reference genome and genomes of different strains or populations. Tested against simulated and genuine Drosophila melanogaster PacBio datasets, LoRTE appears to be a reliable and broadly applicable tool to study the dynamic and evolutionary impact of transposable elements using low coverage, long read sequences. LoRTE is an efficient and accurate tool to identify structural genomic variants caused by TE insertion or deletion. LoRTE is available for download at http://www.egce.cnrs-gif.fr/?p=6422.


July 7, 2019

The comparative landscape of duplications in Heliconius melpomene and Heliconius cydno.

Gene duplications can facilitate adaptation and may lead to interpopulation divergence, causing reproductive isolation. We used whole-genome resequencing data from 34 butterflies to detect duplications in two Heliconius species, Heliconius cydno and Heliconius melpomene. Taking advantage of three distinctive signals of duplication in short-read sequencing data, we identified 744 duplicated loci in H. cydno and H. melpomene and evaluated the accuracy of our approach using single-molecule sequencing. We have found that duplications overlap genes significantly less than expected at random in H. melpomene, consistent with the action of background selection against duplicates in functional regions of the genome. Duplicate loci that are highly differentiated between H. melpomene and H. cydno map to four different chromosomes. Four duplications were identified with a strong signal of divergent selection, including an odorant binding protein and another in close proximity with a known wing colour pattern locus that differs between the two species. Heredity advance online publication, 7 December 2016; doi:10.1038/hdy.2016.107.


July 7, 2019

Comparative genomics of extrachromosomal elements in Bacillus thuringiensis subsp. israelensis.

Bacillus thuringiensis subsp. israelensis is one of the most important microorganisms used against mosquitoes. It was intensively studied following its discovery and became a model bacterium of the B. thuringiensis species. Those studies focused on toxin genes, aggregation-associated conjugation, linear genome phages, etc. Recent announcements of genomic sequences of different strains have not been explicitly related to the biological properties studied. We report data on plasmid content analysis of four strains using ultra-high-throughput sequencing. The strains were commercial product isolates, with their putative ancestor and type B. thuringiensis subsp. israelensis strain sequenced earlier. The assembled contigs corresponding to published and novel data were assigned to plasmids described earlier in B. thuringiensis subsp. israelensis and other B. thuringiensis strains. A new 360 kb plasmid was identified, encoding multiple transporters, also found in most of the earlier sequenced strains. Our genomic data show the presence of two toxin-coding plasmids of 128 and 100 kb instead of the reported 225 kb plasmid, a co-integrate of the former two. In two of the sequenced strains, only a 100 kb plasmid was present. Some heterogeneity exists in the small plasmid content and structure between strains. These data support the perception of active plasmid exchange among B. thuringiensis subsp. israelensis strains in nature. Copyright © 2016 Institut Pasteur. Published by Elsevier Masson SAS. All rights reserved.


July 7, 2019

Comparative mitogenomic analysis of three species of periwinkles: Littorina fabalis, L. obtusata and L. saxatilis.

The flat periwinkles, Littorina fabalis and L. obtusata, offer an interesting system for local adaptation and ecological speciation studies. In order to provide genomic resources for these species, we sequenced their mitogenomes together with that of the rough periwinkle L. saxatilis by means of next-generation sequencing technologies. The three mitogenomes present the typical repertoire of 13 protein-coding genes, 22 transfer RNA genes, two ribosomal RNA genes and a putative control region. Although the latter could not be fully recovered in flat periwinkles using short-reads due to a highly repetitive fragment, in L. saxatilis this problem was overcome with additional long-reads and we were able to assemble the complete mitogenome. Both gene order and nucleotide composition are similar between the three species as well as compared to other Littorinimorpha. A large variance in divergence was observed across mitochondrial regions, with six- to ten-fold difference between the highest and the lowest divergence rates. Based on nucleotide changes on the whole molecule and assuming a molecular clock, L. fabalis and L. obtusata started to diverge around 0.8 Mya (0.4-1.1 Mya). The evolution of the mitochondrial protein-coding genes in the three Littorina species appears mainly influenced by purifying selection as revealed by phylogenetic tests based on dN/dS ratios that did not detect any evidence for positive selection, although some caution is required given the limited power of the dataset and the implemented approaches. Copyright © 2016 Elsevier B.V. All rights reserved.


July 7, 2019

Competition assays and physiological experiments of soil and phyllosphere yeasts identify Candida subhashii as a novel antagonist of filamentous fungi.

While recent advances in next generation sequencing technologies have enabled researchers to readily identify countless microbial species in soil, rhizosphere, and phyllosphere microbiomes, the biological functions of the majority of these species are unknown. Functional studies are therefore urgently needed in order to characterize the plethora of microorganisms that are being identified and to point out species that may be used for biotechnology or plant protection. Here, we used a dual culture assay and growth analyses to characterise yeasts (40 different isolates) and their antagonistic effect on 16 filamentous fungi, comprising plant pathogens, antagonists, and saprophytes. Overall, this competition screen of 640 pairwise combinations revealed a broad range of outcomes, ranging from small stimulatory effects of some yeasts up to a growth inhibition of more than 80% by individual species. On average, yeasts isolated from soil suppressed filamentous fungi more strongly than phyllosphere yeasts and the antagonistic activity was a species-/isolate-specific property and not dependent on the filamentous fungus a yeast was interacting with. The isolates with the strongest antagonistic activity were Metschnikowia pulcherrima, Hanseniaspora sp., Cyberlindnera sargentensis, Aureobasidium pullulans, Candida subhashii, and Pichia kluyveri. Among these, the soil yeasts (C. sargentensis, A. pullulans, C. subhashii) assimilated and/or oxidized more di-, tri- and tetrasaccharides and organic acids than yeasts from the phyllosphere. Only the two yeasts C. subhashii and M. pulcherrima were able to grow with N-acetyl-glucosamine as carbon source. The competition assays and physiological experiments described here identified known antagonists that have been implicated in the biological control of plant pathogenic fungi in the past, but also little characterised species such as C. subhashii. Overall, soil yeasts were more antagonistic and metabolically versatile than yeasts from the phyllosphere. Noteworthy was the strong antagonistic activity of the soil yeast C. subhashii, which had so far only been described from a clinical sample and not been studied with respect to biocontrol. Based on binary competition assays and growth analyses (e.g., on different carbon sources, growth in root exudates), C. subhashii was identified as a competitive and antagonistic soil yeast with potential as a novel biocontrol agent against plant pathogenic fungi.


July 7, 2019

Wild tobacco genomes reveal the evolution of nicotine biosynthesis.

Nicotine, the signature alkaloid of Nicotiana species responsible for the addictive properties of human tobacco smoking, functions as a defensive neurotoxin against attacking herbivores. However, the evolution of the genetic features that contributed to the assembly of the nicotine biosynthetic pathway remains unknown. We sequenced and assembled genomes of two wild tobaccos, Nicotiana attenuata (2.5 Gb) and Nicotiana obtusifolia (1.5 Gb), two ecological models for investigating adaptive traits in nature. We show that after the Solanaceae whole-genome triplication event, a repertoire of rapidly expanding transposable elements (TEs) bloated these Nicotiana genomes, promoted expression divergences among duplicated genes, and contributed to the evolution of herbivory-induced signaling and defenses, including nicotine biosynthesis. The biosynthetic machinery that allows for nicotine synthesis in the roots evolved from the stepwise duplications of two ancient primary metabolic pathways: the polyamine and nicotinamide adenine dinucleotide (NAD) pathways. In contrast to the duplication of the polyamine pathway that is shared among several solanaceous genera producing polyamine-derived tropane alkaloids, we found that lineage-specific duplications within the NAD pathway and the evolution of root-specific expression of the duplicated Solanaceae-specific ethylene response factor that activates the expression of all nicotine biosynthetic genes resulted in the innovative and efficient production of nicotine in the genus Nicotiana. Transcription factor binding motifs derived from TEs may have contributed to the coexpression of nicotine biosynthetic pathway genes and coordinated the metabolic flux. Together, these results provide evidence that TEs and gene duplications facilitated the emergence of a key metabolic innovation relevant to plant fitness.


July 7, 2019

Complete genome analysis of Serratia marcescens RSC-14: A plant growth-promoting bacterium that alleviates cadmium stress in host plants.

Serratia marcescens RSC-14 is a Gram-negative bacterium that was previously isolated from the surface-sterilized roots of the Cd-hyperaccumulator Solanum nigrum. The strain stimulates plant growth and alleviates Cd stress in host plants. To investigate the genetic basis for these traits, the complete genome of RSC-14 was obtained by single-molecule real-time sequencing. The genome of S. marcescens RSC-14 comprised a 5.12-Mbp-long circular chromosome containing 4,593 predicted protein-coding genes, 22 rRNA genes, 88 tRNA genes, and 41 pseudogenes. It contained genes with potential functions in plant growth promotion, including genes involved in indole-3-acetic acid (IAA) biosynthesis, acetoin synthesis, and phosphate solubilization. Moreover, annotation using NCBI and Rapid Annotation using Subsystem Technology identified several genes that encode antioxidant enzymes as well as genes involved in antioxidant production, supporting the observed resistance towards heavy metals, such as Cd. The presence of IAA pathway-related genes and oxidative stress-responsive enzyme genes may explain the plant growth-promoting potential and Cd tolerance, respectively. This is the first report of a complete genome sequence of Cd-tolerant S. marcescens and its plant growth promotion pathway. The whole-genome analysis of this strain clarified the genetic basis underlying its phenotypic and biochemical characteristics, underpinning the beneficial interactions between RSC-14 and plants.


July 7, 2019

Genome of the pitcher plant Cephalotus reveals genetic changes associated with carnivory

Carnivorous plants exploit animals as a nutritional source and have inspired long-standing questions about the origin and evolution of carnivory-related traits. To investigate the molecular bases of carnivory, we sequenced the genome of the heterophyllous pitcher plant Cephalotus follicularis, in which we succeeded in regulating the developmental switch between carnivorous and non-carnivorous leaves. Transcriptome comparison of the two leaf types and gene repertoire analysis identified genetic changes associated with prey attraction, capture, digestion and nutrient absorption. Analysis of digestive fluid proteins from C. follicularis and three other carnivorous plants with independent carnivorous origins revealed repeated co-options of stress-responsive protein lineages coupled with convergent amino acid substitutions to acquire digestive physiology. These results imply constraints on the available routes to evolve plant carnivory.


July 7, 2019

Complex modular architecture around a simple toolkit of wing pattern genes

Identifying the genomic changes that control morphological variation and understanding how they generate diversity is a major goal of evolutionary biology. In Heliconius butterflies, a small number of genes control the development of diverse wing colour patterns. Here, we used full-genome sequencing of individuals across the Heliconius erato radiation and closely related species to characterize genomic variation associated with wing pattern diversity. We show that variation around colour pattern genes is highly modular, with narrow genomic intervals associated with specific differences in colour and pattern. This modular architecture explains the diversity of colour patterns and provides a flexible mechanism for rapid morphological diversification.


July 7, 2019

The value of new genome references.

Genomic information has become a ubiquitous and almost essential aspect of biological research. Over the last 10-15 years, the cost of generating sequence data from DNA or RNA samples has dramatically declined and our ability to interpret those data increased just as remarkably. Although it is still possible for biologists to conduct interesting and valuable research on species for which genomic data are not available, the impact of having access to a high quality whole genome reference assembly for a given species is nothing short of transformational. Research on a species for which we have no DNA or RNA sequence data is restricted in fundamental ways. In contrast, even access to an initial draft quality genome (see below for definitions) opens a wide range of opportunities that are simply not available without that reference genome assembly. Although a complete discussion of the impact of genome sequencing and assembly is beyond the scope of this short paper, the goal of this review is to summarize the most common and highest impact contributions that whole genome sequencing and assembly has had on comparative and evolutionary biology. Copyright © 2016. Published by Elsevier Inc.


July 7, 2019

The genome sequence of Barbarea vulgaris facilitates the study of ecological biochemistry.

The genus Barbarea has emerged as a model for evolution and ecology of plant defense compounds, due to its unusual glucosinolate profile and production of saponins, unique to the Brassicaceae. One species, B. vulgaris, includes two ‘types’, G-type and P-type, that differ in trichome density and in their glucosinolate and saponin profiles. A key difference is the stereochemistry of hydroxylation of their common phenethylglucosinolate backbone, leading to epimeric glucobarbarins. Here we report a draft genome sequence of the G-type, and re-sequencing of the P-type for comparison. This enables us to identify candidate genes underlying glucosinolate diversity and trichome density, and to study the genetics of biochemical variation for glucosinolates and saponins. B. vulgaris is resistant to the diamondback moth, and may be exploited for “dead-end” trap cropping where glucosinolates stimulate oviposition and saponins deter larvae to the extent that they die. The B. vulgaris genome will promote the study of mechanisms in ecological biochemistry to benefit crop resistance breeding.


July 7, 2019

Genomic sequence of ‘Candidatus Liberibacter solanacearum’ haplotype C and its comparison with haplotype A and B genomes.

Haplotypes A and B of ‘Candidatus Liberibacter solanacearum’ (CLso) are associated with diseases of solanaceous plants, especially Zebra chip disease of potato, and haplotypes C, D and E are associated with symptoms on apiaceous plants. To date, one complete genome of haplotype B and two high quality draft genomes of haplotype A have been obtained for these unculturable bacteria using metagenomics from the psyllid vector Bactericera cockerelli. Here, we present the first genomic sequences obtained for the carrot-associated CLso. These two genomic sequences of haplotype C, FIN114 (1.24 Mbp) and FIN111 (1.20 Mbp), were obtained from carrot psyllids (Trioza apicalis) harboring CLso. Genomic comparisons between haplotypes A, B and C revealed that the genome organization differs between these haplotypes, due to large inversions and other recombinations. Comparison of protein-coding genes indicated that the core genome of CLso consists of 885 ortholog groups, with the pan-genome consisting of 1327 ortholog groups. Twenty-seven ortholog groups are unique to CLso haplotype C, whilst 11 ortholog groups shared by haplotypes A and B are not found in haplotype C. Some of the ortholog groups that are not part of the core genome may encode functions related to interactions with the different host plant and psyllid species.
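The core genome, pan-genome and haplotype-specific counts reported above are, in essence, set operations on ortholog group membership. The sketch below shows those operations on hypothetical toy data; the group identifiers (`og1` to `og5`) are invented for illustration and are not taken from the study.

```python
# Hypothetical toy data: ortholog group IDs found in each haplotype.
groups = {
    "A": {"og1", "og2", "og3"},
    "B": {"og1", "og2", "og4"},
    "C": {"og1", "og5"},
}

# Core genome: groups present in every haplotype.
core = set.intersection(*groups.values())

# Pan-genome: groups present in at least one haplotype.
pan = set.union(*groups.values())

# Groups unique to haplotype C, and groups shared by A and B but absent from C.
unique_to_c = groups["C"] - (groups["A"] | groups["B"])
shared_ab_not_c = (groups["A"] & groups["B"]) - groups["C"]

print(len(core), len(pan), sorted(unique_to_c), sorted(shared_ab_not_c))
```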


July 7, 2019

Genome scaffolding and annotation for the pathogen vector Ixodes ricinus by ultra-long single molecule sequencing.

Global warming and other ecological changes have facilitated the expansion of Ixodes ricinus tick populations. Ixodes ricinus is the most important carrier of vector-borne pathogens in Europe, transmitting viruses, protozoa and bacteria, in particular Borrelia burgdorferi (sensu lato), the causative agent of Lyme borreliosis, the most prevalent vector-borne disease in humans in the Northern hemisphere. To control this disease vector more rapidly, a better understanding of the I. ricinus tick is necessary. To facilitate such studies, we recently published the first reference genome of this highly prevalent pathogen vector. Here, we further extend these studies by scaffolding and annotating the first reference genome by using ultra-long sequencing reads from third generation single molecule sequencing. In addition, we present the first genome size estimation for I. ricinus ticks and the embryo-derived cell line IRE/CTVM19. 235,953 contigs were integrated into 204,904 scaffolds, extending the currently known genome length by more than 30% from 393 to 516 Mb and the N50 contig value by 87% from 1643 bp to an N50 scaffold value of 3067 bp. In addition, 25,263 sequences were annotated by comparison to the tick’s North American relative Ixodes scapularis. After (conserved) hypothetical proteins, zinc finger proteins, secreted proteins and P450 coding proteins were the most prevalent protein categories annotated. Interestingly, more than 50% of the amino acid sequences matching the homology threshold had 95-100% identity to the corresponding I. scapularis gene models. The sequence information was complemented by the first genome size estimation for this species. Flow cytometry-based genome size analysis revealed a haploid genome size of 2.65 Gb for I. ricinus ticks and 3.80 Gb for the cell line. We present a first draft sequence map of the I. ricinus genome based on a PacBio-Illumina assembly. The I. ricinus genome was shown to be 26% (500 Mb) larger than the genome of its American relative I. scapularis. Based on the genome size of 2.65 Gb we estimated that we covered about 67% of the non-repetitive sequences. Genome annotation will facilitate screening for specific molecular pathways in I. ricinus cells and provides an overview of characteristics and functions.
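For readers unfamiliar with the N50 statistic quoted above, the following minimal sketch computes it under the common definition (the length of the shortest sequence in the smallest set of longest sequences that together cover at least half of the total assembly length) and re-derives the two relative increases from the figures given in the abstract. The helper names `n50` and `percent_increase` and the toy length list are assumptions for illustration.

```python
def n50(lengths):
    """Return the N50 of a collection of contig or scaffold lengths."""
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        # Stop at the first sequence where the running sum reaches half the total.
        if 2 * running >= total:
            return length
    raise ValueError("lengths must be non-empty")


def percent_increase(old, new):
    """Relative increase from `old` to `new`, in percent."""
    return 100.0 * (new - old) / old


# Hypothetical toy assembly: total 20, half 10; 6 + 5 = 11 reaches it.
print(n50([2, 3, 4, 5, 6]))

# Figures taken from the abstract above.
print(round(percent_increase(1643, 3067)))    # N50: 1643 bp to 3067 bp
print(round(percent_increase(393, 516), 1))   # assembly length: 393 Mb to 516 Mb
```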


July 7, 2019

Draft genome sequence of Karnal bunt pathogen (Tilletia indica) of wheat provides insights into the pathogenic mechanisms of quarantined fungus.

Karnal bunt disease in wheat is caused by the hemibiotrophic fungus Tilletia indica, which has been designated a quarantine pest in more than 70 countries. Despite its economic importance, little is known about the molecular components of fungal pathogenesis. In this study, the genome sequence of T. indica has been deciphered for the first time to unravel the functions of effectors in the molecular pathogenesis of Karnal bunt disease. The T. indica genome was sequenced employing a hybrid approach of PacBio Single Molecule Real Time (SMRT) and Illumina HiSEQ 2000 sequencing platforms. The genome was assembled into 10,957 contigs (N50 contig length 3 kb) with a total size of 26.7 Mb and a GC content of 53.99%. The number of predicted putative genes was 11,535, which were annotated with Gene Ontology databases. Functional annotation of the Karnal bunt pathogen genome and classification of identified effectors into protein families revealed interesting functions related to pathogenesis. A search for effector genes using the pathogen-host interaction database identified 135 genes. The T. indica genome sequence and putative genes involved in molecular pathogenesis would further help in devising novel and effective disease management strategies including development of resistant wheat genotypes, novel biomarkers for pathogen detection and new targets for fungicide development.

