July 7, 2019

Ten steps to get started in Genome Assembly and Annotation.

As a part of the ELIXIR-EXCELERATE efforts in capacity building, we present here 10 steps to facilitate researchers getting started in genome assembly and genome annotation. The guidelines given are broadly applicable, intended to be stable over time, and cover all aspects from start to finish of a general assembly and annotation project. Intrinsic properties of genomes are discussed, as is the importance of using high quality DNA. Different sequencing technologies and generally applicable workflows for genome assembly are also detailed. We cover structural and functional annotation and encourage readers to also annotate transposable elements, something that is often omitted from annotation workflows. The importance of data management is stressed, and we give advice on where to submit data and how to make your results Findable, Accessible, Interoperable, and Reusable (FAIR).


July 7, 2019

FMLRC: Hybrid long read error correction using an FM-index.

Long read sequencing is changing the landscape of genomic research, especially de novo assembly. Despite the high error rate inherent to long read technologies, increased read lengths dramatically improve the continuity and accuracy of genome assemblies. However, the cost and throughput of these technologies limit their application to complex genomes. One solution is to decrease the cost and time to assemble novel genomes by leveraging “hybrid” assemblies that use long reads for scaffolding and short reads for accuracy. We describe a novel method leveraging a multi-string Burrows-Wheeler Transform with auxiliary FM-index to correct errors in long read sequences using a set of complementary short reads. We demonstrate that our method efficiently produces significantly more high quality corrected sequence than existing hybrid error-correction methods. We also show that our method produces more contiguous assemblies, in many cases, than existing state-of-the-art hybrid and long-read only de novo assembly methods. Our method accurately corrects long read sequence data using complementary short reads. We demonstrate higher total throughput of corrected long reads and a corresponding increase in contiguity of the resulting de novo assemblies. Higher throughput and better computational efficiency than existing methods will help make more economical use of emerging long read sequencing technologies.
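
To make the idea of querying short reads through an FM-index concrete, here is a minimal sketch. It is not the FMLRC implementation. It builds a toy multi-string Burrows-Wheeler Transform by naively sorting rotations (real tools use far more efficient construction) and counts how many times a k-mer occurs across the short reads using backward search. The function names `build_fm_index` and `count_kmer` and the example reads are assumptions made for illustration.

```python
from collections import Counter


def build_fm_index(reads):
    """Build a toy multi-string BWT with FM-index tables (hypothetical helper)."""
    # Terminate every read with '$' so a query cannot match across two reads.
    text = "".join(read + "$" for read in reads)
    n = len(text)
    # Naive construction: sort all rotations of the concatenated text.
    order = sorted(range(n), key=lambda i: text[i:] + text[:i])
    # The BWT is the last column: the character preceding each rotation.
    bwt = "".join(text[i - 1] for i in order)
    # first[c]: number of characters in the text that sort before c.
    counts = Counter(text)
    first = {}
    total = 0
    for c in sorted(counts):
        first[c] = total
        total += counts[c]
    # occ[c][i]: number of occurrences of c in bwt[:i].
    occ = {c: [0] * (n + 1) for c in counts}
    for i, ch in enumerate(bwt):
        for c in counts:
            occ[c][i + 1] = occ[c][i] + (1 if c == ch else 0)
    return first, occ, n


def count_kmer(index, kmer):
    """Count occurrences of kmer in the indexed reads by backward search."""
    first, occ, n = index
    lo, hi = 0, n
    # Extend the match one character at a time, from the end of the k-mer.
    for c in reversed(kmer):
        if c not in first:
            return 0
        lo = first[c] + occ[c][lo]
        hi = first[c] + occ[c][hi]
        if lo >= hi:
            return 0
    return hi - lo


# Toy short reads (made up for this example).
short_reads = ["ACGTAC", "GTACGT"]
index = build_fm_index(short_reads)
print(count_kmer(index, "TAC"))
print(count_kmer(index, "AAA"))
```

A hybrid corrector could consult counts like these to judge whether a k-mer taken from a long read is supported by the short reads; how FMLRC itself uses the index is described in the paper, not here.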


July 7, 2019

The odyssey of the ancestral Escherich strain through culture collections: an example of allopatric diversification.

More than a century ago, Theodor Escherich isolated the bacterium that was to become Escherichia coli, one of the most studied organisms. Not long after, the strain began an odyssey and landed in many laboratories across the world. As laboratory culture conditions could be responsible for major changes in bacterial strains, we conducted a genome analysis of isolates of this emblematic strain from different culture collections (England, France, the United States, Germany). Strikingly, many discrepancies between the isolates were observed, as revealed by multilocus sequence typing (MLST), the presence of virulence-associated genes, core genome MLST, and single nucleotide polymorphism/indel analyses. These differences are correlated with the phylogeographic history of the strain and were due to an unprecedented number of mutations in genes coding for DNA repair functions such as mismatch repair (MutL) and oxidized guanine nucleotide pool cleaning (MutT), conferring a specific mutational spectrum and leading to a mutator phenotype. The mutator phenotype was probably acquired during subculturing and corresponded to second-order selection. Furthermore, all of the isolates exhibited hypersusceptibility to antibiotics due to mutations in efflux pump- and porin-encoding genes, as well as a specific mutation in the sigma factor-encoding gene rpoS. These defects reflect a self-preservation and nutritional competence tradeoff allowing survival under the starvation conditions imposed by storage. From a clinical point of view, dealing with such mutator strains can lead microbiologists to draw false conclusions about isolate relatedness and may impact therapeutic effectiveness.

IMPORTANCE Mutator phenotypes have been described in laboratory-evolved bacteria, as well as in natural isolates. Several genes can be impacted, each of them being associated with a typical mutational spectrum. By studying one of the oldest strains available, the ancestral Escherich strain, we were able to identify its mutator status leading to tremendous genetic diversity among the isolates from various collections and allowing us to reconstruct the phylogeographic history of the strain. This mutator phenotype was probably acquired during the storage of the strain, promoting adaptation to a specific environment. Other mutations in rpoS and efflux pump- and porin-encoding genes highlight the acclimatization of the strain through self-preservation and nutritional competence regulation. This strain history can be viewed as unintentional experimental evolution in culture collections all over the world since 1885, mimicking the long-term experimental evolution of E. coli of Lenski et al. (O. Tenaillon, J. E. Barrick, N. Ribeck, D. E. Deatherage, J. L. Blanchard, A. Dasgupta, G. C. Wu, S. Wielgoss, S. Cruveiller, C. Médigue, D. Schneider, and R. E. Lenski, Nature 536:165-170, 2016, https://doi.org/10.1038/nature18959) that shares numerous molecular features.


July 7, 2019

Oryza rufipogon Griff.

Oryza rufipogon, the progenitor of present-day cultivated rice, O. sativa, is one of the most studied wild species of rice. It is a perennial plant commonly found in marsh or aquatic habitats of eastern and southern Asia. It has partial outcrossing behavior and is photoperiod sensitive. The flowering time usually ranges between September and November. It has been and is being exploited as a source of valuable genes and QTLs for yield components as well as resistance against biotic and abiotic stresses. A number of populations like chromosome segment substitution lines, backcross inbred lines, near-isogenic lines, and recombinant inbred lines have been developed from crosses between O. rufipogon and O. sativa as a prebreeding resource. These are being employed for broadening the genetic base of cultivated rice and diversifying the breeder’s pool. With the advent of sequencing technologies, a number of phylogenetic studies have been conducted to reveal the evolutionary relationship of O. rufipogon with cultivated rice O. sativa. Further, transcriptomic studies characterizing the effect of various abiotic stresses have been conducted on this wild species. The role of miRNA in the stress response has also been studied. Though the genetic, genomic, and transcriptomic resources are abundant, the proteomic resources for O. rufipogon are limited.


July 7, 2019

IWTomics: testing high-resolution sequence-based ‘Omics’ data at multiple locations and scales.

With increased generation of high-resolution sequence-based ‘Omics’ data, detecting statistically significant effects at different genomic locations and scales has become key to addressing several scientific questions. IWTomics is an R/Bioconductor package (integrated in Galaxy) that, exploiting sophisticated Functional Data Analysis techniques (i.e. statistical techniques that deal with the analysis of curves), allows users to pre-process, visualize and test these data at multiple locations and scales. The package provides a friendly, flexible and complete workflow that can be employed in many genomic and epigenomic applications. IWTomics is freely available at the Bioconductor website (http://bioconductor.org/packages/IWTomics) and on the main Galaxy instance (https://usegalaxy.org/). Supplementary data are available at Bioinformatics online.


July 7, 2019

The ‘gifted’ actinomycete Streptomyces leeuwenhoekii.

Streptomyces leeuwenhoekii strains C34T, C38, C58 and C79 were isolated from a soil sample collected from the Chaxa Lagoon, located in the Salar de Atacama in northern Chile. These streptomycetes produce a variety of new specialised metabolites with antibiotic, anti-cancer and anti-inflammatory activities. Moreover, genome mining performed on two of these strains has revealed the presence of biosynthetic gene clusters with the potential to produce new specialised metabolites. This review focusses on this new clade of Streptomyces strains, summarises the literature and presents new information on strain C34T.


July 7, 2019

Natural rubber and the Russian dandelion genome

The world needs rubber. Rubber is crucial for the tires on the cars, trucks and airplanes that propel modern transportation. It is equally important for daily tasks: latex gloves in the lab, balloons in angioplasty and wetsuits that warm a cold dip in the ocean. Rubber can be made synthetically from petroleum derivatives, but synthetic rubber is not as strong as rubber isolated from plants. The principal plant source for natural rubber (NR) is the sap of the Pará tree (Hevea brasiliensis), which is grown throughout Southeast Asia. Unfortunately, the production capacity of the Pará tree is limited by the availability of suitable land and by labor-intensive harvesting methods. The sustainability of the Pará crop is also constrained by its narrow genetic base, which may make the crop susceptible to disease.


July 7, 2019

Rooting for new sources of natural rubber

Global production of natural rubber (NR) depends overwhelmingly on the Pará rubber tree (Hevea brasiliensis), a slow-growing tropical tree that is threatened by low genetic diversity and high susceptibility to fungal blight [1]. Alternative rubber sources have been sought for more than a century, but very few species have been found that produce rubber of comparable quality [2]. One of the brightest candidates, first noticed by breeders in Soviet-era Russia, is Taraxacum kok-saghyz (commonly called TKS). This close relative of the common weedy dandelion has a number of attractive features. As a native of central Asia, TKS can be cultivated as a hardy, annual field crop in temperate climates. Its natural latex, produced at highest levels in the roots, yields a high-molecular-weight NR that is chemically similar to the rubber tree and far superior to synthetic rubber. And, as an added bonus, TKS produces inulin, a dietary fiber and low-glycemic-index sweetener that can be fermented for industrial bioethanol production. What TKS has lacked—until now—is an assembled reference genome that could be used for genome-enabled crop improvement and elucidation of the pathways for rubber and inulin biosynthesis. In their paper published in this issue, Jiayang Li, Hong Yu and colleagues [3] have taken a major step in rectifying that problem.


July 7, 2019

Gapless genome assembly of the potato and tomato early blight pathogen Alternaria solani.

The Alternaria genus consists of saprophytic fungi as well as plant-pathogenic species that have significant economic impact. To date, the genomes of multiple Alternaria species have been sequenced. These studies have yielded valuable data for molecular studies on Alternaria fungi. However, most of the current Alternaria genome assemblies are highly fragmented, thereby hampering the identification of genes that are involved in causing disease. Here, we report a gapless genome assembly of A. solani, the causal agent of early blight in tomato and potato. The genome assembly is a significant step toward a better understanding of pathogenicity of A. solani.


July 7, 2019

Genome sequencing to develop Paenibacillus donghaensis strain JH8T (KCTC 13049T=LMG 23780T) as a microbial fertilizer and correlation to its plant growth-promoting phenotype

Paenibacillus donghaensis JH8T (KCTC 13049T=LMG 23780T) is a Gram-positive, mesophilic, endospore-forming bacterium isolated from East Sea sediment at a depth of 500 m in Korea. The strain exhibited plant cell wall hydrolytic and plant growth-promoting abilities. The complete genome of P. donghaensis strain JH8T contains 7602 protein-coding sequences, and its chromosome (8.54 Mbp) has an average GC content of 49.7%. Genes encoding proteins related to plant cell wall degradation, nitrogen fixation, phosphate solubilization, and siderophore synthesis are present in the P. donghaensis strain JH8T genome, indicating that this strain can be used as an eco-friendly microbial agent for increasing agricultural productivity.
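
For readers unfamiliar with the GC content figure quoted above, the following is a minimal sketch of how such a percentage is commonly computed from a sequence. The function name `gc_content` and the toy sequence are assumptions for illustration. Ignoring ambiguous bases such as N is one common convention, not necessarily the one used in the paper.

```python
def gc_content(sequence):
    """Return the GC percentage of a DNA sequence, ignoring non-ACGT symbols."""
    seq = sequence.upper()
    # Count only unambiguous bases so that N and gap characters do not skew the result.
    acgt = sum(seq.count(base) for base in "ACGT")
    if acgt == 0:
        raise ValueError("sequence contains no A, C, G or T bases")
    gc = seq.count("G") + seq.count("C")
    return 100.0 * gc / acgt


# Toy sequence (made up for this example): 4 of 8 bases are G or C.
print(gc_content("GGCCAATT"))
```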


July 7, 2019

Complete genome sequence of Tsukamurella sp. MH1: A wide-chain length alkane-degrading actinomycete.

Tsukamurella sp. strain MH1, capable of using a wide range of n-alkanes as the only carbon source, was isolated from petroleum-contaminated soil (Pitești, Romania) and its complete genome was sequenced. The 4,922,396 bp genome contains only one circular chromosome with a G + C content of 71.12%, much higher than that of the type strains of this genus (68.4%). Based on 16S rRNA gene sequence similarity, strain MH1 was taxonomically identified as Tsukamurella carboxydivorans. Genome analyses revealed that strain MH1 harbors only one gene encoding an alkB-like hydroxylase, arranged in a complete alkane monooxygenase operon. This is the first complete genome of the species T. carboxydivorans, which will provide insights into the potential of Tsukamurella sp. MH1 and related strains for bioremediation of sites contaminated with petroleum hydrocarbons and into the environmental role of these bacteria. Copyright © 2017. Published by Elsevier B.V.


July 7, 2019

Synthetic biology, genome mining, and combinatorial biosynthesis of NRPS-derived antibiotics: a perspective.

Combinatorial biosynthesis of novel secondary metabolites derived from nonribosomal peptide synthetases (NRPSs) has been in slow development for about a quarter of a century. Progress has been hampered by the complexity of the giant multimodular multienzymes. More recently, advances have been made on understanding the chemical and structural biology of these complex megaenzymes, and on learning the design rules for engineering functional hybrid enzymes. In this perspective, I address what has been learned about successful engineering of complex lipopeptides related to daptomycin, and discuss how synthetic biology and microbial genome mining can converge to broaden the scope and enhance the speed and robustness of combinatorial biosynthesis of NRPS-derived natural products for drug discovery.


July 7, 2019

Sustaining global agriculture through rapid detection and deployment of genetic resistance to deadly crop diseases.

Contents: Summary; I. Introduction; II. Targeted chromosome-based cloning via long-range assembly (TACCA); III. Resistance gene cloning through mutational mapping (MutMap); IV. Cloning through mutant chromosome sequencing (MutChromSeq); V. Rapid cloning through resistance gene enrichment and sequencing (RenSeq); VI. Cloning resistance genes through transcriptome profiling (RNAseq); VII. Resistance gene deployment strategies; VIII. Conclusions; Acknowledgements; References.

Summary: Genetically encoded resistance is a major component of crop disease management. Historically, gene loci conferring resistance to pathogens have been identified through classical genetic methods. In recent years, accelerated gene cloning strategies have become available through advances in sequencing, gene capture and strategies for reducing genome complexity. Here, I describe these approaches with key emphasis on the isolation of resistance genes to the cereal crop diseases that are an ongoing threat to global food security. Rapid gene isolation enables their efficient deployment through marker-assisted selection and transgenic technology. Together with innovations in genome editing and progress in pathogen virulence studies, this creates further opportunities to engineer long-lasting resistance. These approaches will speed progress towards a future of farming using fewer pesticides. © 2017 Commonwealth of Australia. New Phytologist © 2017 New Phytologist Trust.

