July 7, 2019

The fast changing landscape of sequencing technologies and their impact on microbial genome assemblies and annotation.

The emergence of next generation sequencing (NGS) has provided the means for rapid and high throughput sequencing and data generation at low cost, while concomitantly creating a new set of challenges. The number of available assembled microbial genomes continues to grow rapidly, and their quality reflects the quality not only of the sequencing technology used but also of the analysis software employed for assembly and annotation. In this work, we have explored the quality of the microbial draft genomes across various sequencing technologies. We have compared the draft and finished assemblies of 133 microbial genomes sequenced at the Department of Energy-Joint Genome Institute and finished at the Los Alamos National Laboratory using a variety of combinations of sequencing technologies, reflecting the transition of the institute from Sanger-based sequencing platforms to NGS platforms. The quality of the public assemblies and of the associated gene annotations was evaluated using various metrics. Results obtained with the different sequencing technologies, as well as their effects on downstream processes, were analyzed. Our results demonstrate that the Illumina HiSeq 2000 sequencing system, the primary sequencing technology currently used for de novo genome sequencing and assembly at JGI, has various advantages in terms of total sequence throughput and cost, but it also introduces challenges for the downstream analyses. In all cases, assembly results, although of high quality on average, need to be viewed critically, and potential sources of error in them should be considered prior to analysis. These data follow the evolution of microbial sequencing and downstream processing at the JGI from draft genome sequences with large gaps corresponding to missing genes of significant biological role to assemblies with multiple small gaps (Illumina) and finally to assemblies that generate almost complete genomes (Illumina+PacBio).


July 7, 2019

Absence of genome reduction in diverse, facultative endohyphal bacteria.

Fungi interact closely with bacteria, both on the surfaces of the hyphae and within their living tissues (i.e. endohyphal bacteria, EHB). These EHB can be obligate or facultative symbionts and can mediate diverse phenotypic traits in their hosts. Although EHB have been observed in many lineages of fungi, it remains unclear how widespread and general these associations are, and whether unifying ecological and genomic features can be found across EHB strains as a whole. We cultured 11 bacterial strains after they emerged from the hyphae of diverse Ascomycota that were isolated as foliar endophytes of cupressaceous trees, and generated nearly complete genome sequences for all. Unlike the genomes of largely obligate EHB, the genomes of these facultative EHB resembled those of closely related strains isolated from environmental sources. Although all analysed genomes encoded structures that could be used to interact with eukaryotic hosts, pathways previously implicated in maintenance and establishment of EHB symbiosis were not universally present across all strains. Independent isolation of two nearly identical pairs of strains from different classes of fungi, coupled with recent experimental evidence, suggests horizontal transfer of EHB across endophytic hosts. Given the potential for EHB to influence fungal phenotypes, these genomes could shed light on the mechanisms of plant growth promotion or stress mitigation by fungal endophytes during the symbiotic phase, as well as degradation of plant material during the saprotrophic phase. As such, these findings contribute to the illumination of a new dimension of functional biodiversity in fungi.


July 7, 2019

Enabling the democratization of the genomics revolution with a fully integrated web-based bioinformatics platform.

Continued advancements in sequencing technologies have fueled the development of new sequencing applications and promise to flood current databases with raw data. A number of factors prevent the seamless and easy use of these data, including the breadth of project goals, the wide array of tools that individually perform fractions of any given analysis, the large number of associated software/hardware dependencies, and the detailed expertise required to perform these analyses. To address these issues, we have developed an intuitive web-based environment with a wide assortment of integrated and cutting-edge bioinformatics tools in pre-configured workflows. These workflows, coupled with the ease of use of the environment, provide even novice next-generation sequencing users with the ability to perform many complex analyses with only a few mouse clicks and, within the context of the same environment, to visualize and further interrogate their results. This bioinformatics platform is an initial attempt at Empowering the Development of Genomics Expertise (EDGE) in a wide range of applications for microbial research. © The Author(s) 2016. Published by Oxford University Press on behalf of Nucleic Acids Research.


July 7, 2019

Implementation and data analysis of Tn-seq, whole genome resequencing, and single-molecule real time sequencing for bacterial genetics.

Few discoveries have been more transformative to the biological sciences than the development of DNA sequencing technologies. The rapid advancement of sequencing and bioinformatics tools has revolutionized bacterial genetics, deepening our understanding of model and clinically relevant organisms. Although application of newer sequencing technologies to studies in bacterial genetics is increasing, the implementation of DNA sequencing technologies and development of the bioinformatics tools required for analyzing the large data sets generated remains a challenge for many. In this minireview, we have chosen to summarize three sequencing approaches that are particularly useful for bacterial genetics. We provide resources for scientists new to and interested in their application. Herein, we discuss the analysis of Tn-seq data to determine gene disruptions differentially represented in a mutant population, Illumina sequencing for identification of suppressor or other mutations, and we summarize single-molecule real time (SMRT) sequencing for de novo genome assembly and the use of the output data for detection of DNA base modifications. Copyright © 2016, American Society for Microbiology. All Rights Reserved.


July 7, 2019

Genomic analysis of the multi-drug-resistant clinical isolate Myroides odoratimimus PR63039.

Myroides odoratimimus (M. odoratimimus) has been gradually implicated as an important nosocomial pathogen that poses a serious health threat to immunocompromised patients owing to its multi-drug resistance. However, the resistance mechanism is currently unclear. To clarify the antibiotic resistance and infectivity mechanisms of M. odoratimimus, whole genome sequencing was performed on the multi-drug-resistant M. odoratimimus strain PR63039. The genome sequence was completed with single molecule real-time (SMRT) technologies. Then, annotation was performed using RAST and IMG-ER. A number of databases and software programs were used to analyze the genomic characteristics, including GC-Profile, ISfinder, CG viewer, ARDB, CARD, ResFinder, the VFDB database, PHAST and Progressive Mauve. The M. odoratimimus PR63039 genome consisted of a chromosome and a plasmid. The genome contained a large number of resistance genes and virulence factors. The distribution of the resistance genes was distinctive, and a resistance region named MY63039-RR was found. The subsystem features generated by RAST indicated that the annotated genome had 108 genes that were potentially involved in virulence, disease and defense, all of which had strong associations with resistance and pathogenicity. The prophage analysis showed two incomplete prophages in the genome. The genomic analysis of M. odoratimimus PR63039 partially clarified its antibiotic resistance mechanisms and virulence factors. Obtaining a clear understanding of its genomic characteristics will be conducive to the management of multidrug-resistant M. odoratimimus.


July 7, 2019

Current status of genome sequencing and its applications in aquaculture

Aquaculture is the fastest-growing food production sector in agriculture, with great potential to meet projected protein needs of human beings. Aquaculture is facing several challenges, including lack of a sufficient number of genetically improved species, lack of species-specific feeds, high mortality due to diseases and pollution of ecosystems. The rapid development of sequencing technologies has revolutionized biological sciences, and supplied necessary tools to tackle these challenges in aquaculture and thus ensure its sustainability and profitability. So far, draft genomes have been published in over 24 aquaculture species, and used to address important issues related to aquaculture. We briefly review the advances of next generation sequencing technologies, and summarize the status of whole genome sequencing and its general applications (i.e. establishing reference genomes and discovering DNA markers) and specific applications in tackling some important issues (e.g. breeding, diseases, sex determination & maturation) related to aquaculture. For sequencing a new genome, we recommend the use of 100–200 × short reads using Illumina and 50–60 × long reads with PacBio sequencing technologies. For identification of a large number of SNPs, resequencing pooled DNA samples from different populations is the most cost-effective way. We also discuss the challenges and future directions of whole genome sequencing in aquaculture.
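
The coverage recommendation above can be turned into a rough sequencing budget. The following is a minimal sketch, not the authors' method: it assumes the usual relation coverage = total sequenced bases / genome size, and the function names, the 1 Gb genome size, and the read lengths are hypothetical values chosen only for illustration.

```python
import math


def required_bases(genome_size_bp: int, coverage: int) -> int:
    """Total bases to sequence for a target depth (coverage x genome size)."""
    return genome_size_bp * coverage


def reads_needed(genome_size_bp: int, coverage: int, mean_read_length_bp: int) -> int:
    """Approximate read count for a target depth, rounded up to a whole read."""
    return math.ceil(required_bases(genome_size_bp, coverage) / mean_read_length_bp)


if __name__ == "__main__":
    genome_size = 1_000_000_000  # hypothetical 1 Gb genome (assumption)

    # Short reads at the low end of the recommended 100-200x range,
    # assuming 150 bp reads (assumption).
    print(required_bases(genome_size, 100), "bases of short reads")
    print(reads_needed(genome_size, 100, 150), "short reads")

    # Long reads at the low end of the recommended 50-60x range,
    # assuming a 10 kb mean read length (assumption).
    print(required_bases(genome_size, 50), "bases of long reads")
    print(reads_needed(genome_size, 50, 10_000), "long reads")
```

Real projects would also need to allow for read filtering, uneven coverage, and contamination, so these figures are best treated as lower bounds.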


July 7, 2019

Genome organization of the vg1 and nodal3 gene clusters in the allotetraploid frog Xenopus laevis.

Extracellular factors belonging to the TGF-β family play pivotal roles in the formation and patterning of germ layers during early Xenopus embryogenesis. Here, we show that the vg1 and nodal3 genes of Xenopus laevis are present in gene clusters on chromosomes XLA1L and XLA3L, respectively, and that both gene clusters have been completely lost from the syntenic S chromosome regions. The presence of gene clusters and chromosome-specific gene loss were confirmed by cDNA FISH analyses. Sequence and expression analyses revealed that paralogous genes in the vg1 and nodal3 clusters on the L chromosomes were also altered compared to their Xenopus tropicalis orthologs. X. laevis vg1 and nodal3 paralogs have potentially become pseudogenes or sub-functionalized genes and are expressed at different levels. As X. tropicalis has a single vg1 gene on chromosome XTR1, the ancestral vg1 gene in X. laevis appears to have been expanded on XLA1L. Of note, two reported vg1 genes, vg1(S20) and vg1(P20), reside in the cluster on XLA1L. The nodal3 gene cluster is also present on X. tropicalis chromosome XTR3, but phylogenetic analysis indicates that nodal3 genes in X. laevis and X. tropicalis were independently expanded and/or evolved in concert within each cluster by gene conversion. These findings provide insights into the function and molecular evolution of TGF-β family genes in response to allotetraploidization. Copyright © 2016 Elsevier Inc. All rights reserved.


July 7, 2019

A gapless genome sequence of the fungus Botrytis cinerea.

Following earlier incomplete and fragmented versions of a genome sequence for the grey mould Botrytis cinerea, we here report a gapless, near-finished genome sequence for B. cinerea strain B05.10. The assembly comprises 18 chromosomes and was confirmed by an optical map and a genetic map based on ~75 000 SNP markers. All chromosomes contain fully assembled centromeric regions, and 10 chromosomes have telomeres on both ends. The genetic map consisted of 4153 cM and comparison of genetic distances with the physical distances identified 40 recombination hotspots. The linkage map also identified two mutations, located in the previously described genes Bos1 and BcsdhB, that confer resistance to the fungicides boscalid and iprodione. The genome was predicted to encode 11 701 proteins. RNAseq data from >20 different samples were used to validate and improve gene models. Manual curation of chromosome 1 revealed interesting features, such as the occurrence of a dicistronic transcript and fully overlapping genes in opposite orientations, as well as many spliced antisense transcripts. Manual curation also revealed that UTRs of genes can be complex and long, with many UTRs exceeding lengths of 1 kb and possessing multiple introns. Community annotation is in progress. This article is protected by copyright. All rights reserved. © 2016 BSPP AND JOHN WILEY & SONS LTD.


July 7, 2019

What distinguishes cyanobacteria able to revive after desiccation from those that cannot: the genome aspect.

Filamentous cyanobacteria are the main founders and primary producers in biological desert soil crusts (BSCs) and are likely equipped to cope with some of the harshest environmental conditions on earth, including daily hydration/dehydration cycles, high irradiance and extreme temperatures. Here, we resolved and report on the genome sequence of Leptolyngbya ohadii, an important constituent of the BSC. Comparative genomics identified a set of genes present in desiccation-tolerant but not in dehydration-sensitive cyanobacteria. RT qPCR analyses showed that the transcript abundance of many of them is upregulated during desiccation in L. ohadii. In addition, we identified genes where the orthologs detected in desiccation-tolerant cyanobacteria differ substantially from those found in desiccation-sensitive cells. We present two examples, treS and fbpA (encoding trehalose synthase and fructose 1,6-bisphosphate aldolase respectively) where, in addition to the orthologs present in the desiccation-sensitive strains, the resistant cyanobacteria also possess genes with different predicted structures. We show that in both cases the two orthologs are transcribed during controlled dehydration of L. ohadii and discuss the genetic basis for the acclimation of cyanobacteria to the desiccation conditions in desert BSC. © 2016 Society for Applied Microbiology and John Wiley & Sons Ltd.


July 7, 2019

Strategies for complete plastid genome sequencing.

Plastid sequencing is an essential tool in the study of plant evolution. This high-copy organelle is one of the most technically accessible regions of the genome, and its sequence conservation makes it a valuable region for comparative genome evolution, phylogenetic analysis and population studies. Here, we discuss recent innovations and approaches for de novo plastid assembly that harness genomic tools. We focus on technical developments including low-cost sequence library preparation approaches for genome skimming, enrichment via hybrid baits and methylation-sensitive capture, sequence platforms with higher read outputs and longer read lengths, and automated tools for assembly. These developments allow for a much more streamlined assembly than via conventional short-range PCR. Although newer methods make complete plastid sequencing possible for any land plant or green alga, there are still challenges for producing finished plastomes particularly from herbarium material or from structurally divergent plastids such as those of parasitic plants. © 2016 The Authors. Molecular Ecology Resources Published by John Wiley & Sons Ltd.


July 7, 2019

Draft genome assembly and annotation of Glycyrrhiza uralensis, a medicinal legume.

Chinese liquorice/licorice (Glycyrrhiza uralensis) is a leguminous plant species whose roots and rhizomes have been widely used as a herbal medicine and natural sweetener. Whole-genome sequencing is essential for gene discovery studies and molecular breeding in liquorice. Here, we report a draft assembly of the approximately 379-Mb whole-genome sequence of strain 308-19 of G. uralensis; this assembly contains 34 445 predicted protein-coding genes. Comparative analyses suggested well-conserved genomic components and collinearity of gene loci (synteny) between the genome of liquorice and those of other legumes such as Medicago and chickpea. We observed that three genes involved in isoflavonoid biosynthesis, namely, 2-hydroxyisoflavanone synthase (CYP93C), 2,7,4′-trihydroxyisoflavanone 4′-O-methyltransferase/isoflavone 4′-O-methyltransferase (HI4OMT) and isoflavone-7-O-methyltransferase (7-IOMT) formed a cluster on the scaffold of the liquorice genome and showed conserved microsynteny with Medicago and chickpea. Based on the liquorice genome annotation, we predicted genes in the P450 and UDP-dependent glycosyltransferase (UGT) superfamilies, some of which are involved in triterpenoid saponin biosynthesis, and characterised their gene expression with the reference genome sequence. The genome sequencing and its annotations provide an essential resource for liquorice improvement through molecular breeding and the discovery of useful genes for engineering bioactive components through synthetic biology approaches. © 2016 The Authors The Plant Journal © 2016 John Wiley & Sons Ltd.


July 7, 2019

Turkey meat as source of CC9/CC398 methicillin-resistant Staphylococcus aureus in humans?

Livestock-associated methicillin-resistant Staphylococcus aureus (MRSA) of clonal complex (CC) 398 were first reported to cause severe infections in humans in 2005 [1]. Direct animal exposure is considered the most effective means of MRSA CC398 transmission from livestock to humans. However, about 20%–38% of MRSA CC398 cases among humans cannot be epidemiologically linked to direct livestock contact, indicating other transmission pathways [2]. As recently reported in this journal by Larsen et al [3], poultry meat may serve as a vehicle for livestock-to-human transmission. Here, we present similar findings for CC9/CC398 MRSA (displaying spa type t899 and related), which shares unique characteristics with human clinical isolates in Denmark as shown by Larsen et al [3], strongly supporting the implication of poultry, especially turkey meat, as the source of CC9/CC398.


July 7, 2019

Draft genome sequence of Mentha longifolia (L.) and development of resources for mint cultivar improvement.

The genus Mentha encompasses mint species cultivated for their essential oils, which are formulated into a vast array of consumer products. Desirable oil characteristics and resistance to the fungal disease Verticillium wilt are top priorities for the mint industry. However, cultivated mints have complex polyploid genomes and are sterile. Breeding efforts, therefore, require the development of genomic resources for fertile mint species. Here, we present draft de novo genome and plastome assemblies for a wilt-resistant South African accession of Mentha longifolia (L.) Huds., a diploid species ancestral to cultivated peppermint and spearmint. The 353 Mb genome contains 35 597 predicted protein-coding genes, including 292 disease resistance gene homologs, and nine genes determining essential oil characteristics. A genetic linkage map ordered 1397 genome scaffolds on 12 pseudochromosomes. More than two million simple sequence repeats were identified, which will facilitate molecular marker development. The M. longifolia genome is a valuable resource for both metabolic engineering and molecular breeding. This is exemplified by employing the genome sequence to clone and functionally characterize the promoters in a peppermint cultivar, and demonstrating the utility of a glandular trichome-specific promoter to increase expression of a biosynthetic gene, thereby modulating essential oil composition. Copyright © 2017 The Author. Published by Elsevier Inc. All rights reserved.


July 7, 2019

Comparative genomics of extrachromosomal elements in Bacillus thuringiensis subsp. israelensis.

Bacillus thuringiensis subsp. israelensis is one of the most important microorganisms used against mosquitoes. It was intensively studied following its discovery and became a model bacterium of the B. thuringiensis species. Those studies focused on toxin genes, aggregation-associated conjugation, linear genome phages, etc. Recent announcements of genomic sequences of different strains have not been explicitly related to the biological properties studied. We report data on plasmid content analysis of four strains using ultra-high-throughput sequencing. The strains were commercial product isolates, with their putative ancestor and type B. thuringiensis subsp. israelensis strain sequenced earlier. The assembled contigs corresponding to published and novel data were assigned to plasmids described earlier in B. thuringiensis subsp. israelensis and other B. thuringiensis strains. A new 360 kb plasmid was identified, encoding multiple transporters, also found in most of the earlier sequenced strains. Our genomic data show the presence of two toxin-coding plasmids of 128 and 100 kb instead of the reported 225 kb plasmid, a co-integrate of the former two. In two of the sequenced strains, only a 100 kb plasmid was present. Some heterogeneity exists in the small plasmid content and structure between strains. These data support the perception of active plasmid exchange among B. thuringiensis subsp. israelensis strains in nature. Copyright © 2016 Institut Pasteur. Published by Elsevier Masson SAS. All rights reserved.


July 7, 2019

Brassica rapa genome 2.0: a reference upgrade through sequence re-assembly and gene re-annotation.

Brassica rapa includes many important crops that are cultivated as vegetables, condiments, and oilseeds. Recently, the Brassica genomes have been sequenced extensively: a B. rapa draft reference genome in 2011 (Wang et al., 2011), a Brassica oleracea in 2014 (Liu et al., 2014), a Brassica napus in 2014 (Chalhoub et al., 2014), and Brassica nigra and Brassica juncea in 2016 (Yang et al., 2016). The first released B. rapa genome reference served as a valuable resource in the genome assembly and annotation of the other Brassicas (Chalhoub et al., 2014, Liu et al., 2014, Parkin et al., 2014). B. rapa has been used widely in Brassica comparative and evolutionary genomics among the Brassicaceae (Cheng et al., 2013). However, the first B. rapa genome assembly (version 1.5) is only about 283.8 Mb, 58.52% of the estimated genome size (485 Mb) (Wang et al., 2011). Considering that much of the genome assembly is still missing (41.48%), there is a considerable possibility that important genes have been missed.
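
The completeness figures quoted above follow directly from the two sizes given in the abstract. As a small check, the sketch below recomputes them; the function name is hypothetical, and only the 283.8 Mb and 485 Mb values come from the text.

```python
def assembly_completeness(assembled_mb: float, estimated_genome_mb: float) -> tuple[float, float]:
    """Return (percent assembled, percent missing), each rounded to two decimals."""
    fraction = assembled_mb / estimated_genome_mb
    return round(fraction * 100, 2), round((1 - fraction) * 100, 2)


if __name__ == "__main__":
    # Values stated in the abstract for B. rapa assembly version 1.5.
    assembled, missing = assembly_completeness(283.8, 485.0)
    print(f"assembled: {assembled}%  missing: {missing}%")
```

Note that this measures assembled length against an estimated genome size, so it says how much sequence is absent, not how many genes are.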

