July 7, 2019

Genome sequence of “Candidatus Microthrix parvicella” Bio17-1, a long-chain-fatty-acid-accumulating filamentous actinobacterium from a biological wastewater treatment plant.

Candidatus Microthrix bacteria are deeply branching filamentous actinobacteria which occur at the water-air interface of biological wastewater treatment plants, where they are often responsible for foaming and bulking. Here, we report the first draft genome sequence of a strain from this genus: “Candidatus Microthrix parvicella” strain Bio17-1.


July 7, 2019

The fast changing landscape of sequencing technologies and their impact on microbial genome assemblies and annotation.

The emergence of next generation sequencing (NGS) has provided the means for rapid and high throughput sequencing and data generation at low cost, while concomitantly creating a new set of challenges. The number of available assembled microbial genomes continues to grow rapidly and their quality reflects the quality of the sequencing technology used, but also of the analysis software employed for assembly and annotation. In this work, we have explored the quality of the microbial draft genomes across various sequencing technologies. We have compared the draft and finished assemblies of 133 microbial genomes sequenced at the Department of Energy-Joint Genome Institute and finished at the Los Alamos National Laboratory using a variety of combinations of sequencing technologies, reflecting the transition of the institute from Sanger-based sequencing platforms to NGS platforms. The quality of the public assemblies and of the associated gene annotations was evaluated using various metrics. Results obtained with the different sequencing technologies, as well as their effects on downstream processes, were analyzed. Our results demonstrate that the Illumina HiSeq 2000 sequencing system, the primary sequencing technology currently used for de novo genome sequencing and assembly at JGI, has various advantages in terms of total sequence throughput and cost, but it also introduces challenges for the downstream analyses. In all cases, assembly results, although of high quality on average, need to be viewed critically, and their potential sources of error should be considered prior to analysis. These data follow the evolution of microbial sequencing and downstream processing at the JGI from draft genome sequences with large gaps corresponding to missing genes of significant biological role to assemblies with multiple small gaps (Illumina) and finally to assemblies that generate almost complete genomes (Illumina+PacBio).


July 7, 2019

Draft genome assembly and annotation of Glycyrrhiza uralensis, a medicinal legume.

Chinese liquorice/licorice (Glycyrrhiza uralensis) is a leguminous plant species whose roots and rhizomes have been widely used as a herbal medicine and natural sweetener. Whole-genome sequencing is essential for gene discovery studies and molecular breeding in liquorice. Here, we report a draft assembly of the approximately 379-Mb whole-genome sequence of strain 308-19 of G. uralensis; this assembly contains 34 445 predicted protein-coding genes. Comparative analyses suggested well-conserved genomic components and collinearity of gene loci (synteny) between the genome of liquorice and those of other legumes such as Medicago and chickpea. We observed that three genes involved in isoflavonoid biosynthesis, namely, 2-hydroxyisoflavanone synthase (CYP93C), 2,7,4′-trihydroxyisoflavanone 4′-O-methyltransferase/isoflavone 4′-O-methyltransferase (HI4OMT) and isoflavone-7-O-methyltransferase (7-IOMT) formed a cluster on the scaffold of the liquorice genome and showed conserved microsynteny with Medicago and chickpea. Based on the liquorice genome annotation, we predicted genes in the P450 and UDP-dependent glycosyltransferase (UGT) superfamilies, some of which are involved in triterpenoid saponin biosynthesis, and characterised their gene expression with the reference genome sequence. The genome sequencing and its annotations provide an essential resource for liquorice improvement through molecular breeding and the discovery of useful genes for engineering bioactive components through synthetic biology approaches. © 2016 The Authors. The Plant Journal © 2016 John Wiley & Sons Ltd.


July 7, 2019

Draft genome sequence of Mentha longifolia (L.) and development of resources for mint cultivar improvement.

The genus Mentha encompasses mint species cultivated for their essential oils, which are formulated into a vast array of consumer products. Desirable oil characteristics and resistance to the fungal disease Verticillium wilt are top priorities for the mint industry. However, cultivated mints have complex polyploid genomes and are sterile. Breeding efforts, therefore, require the development of genomic resources for fertile mint species. Here, we present draft de novo genome and plastome assemblies for a wilt-resistant South African accession of Mentha longifolia (L.) Huds., a diploid species ancestral to cultivated peppermint and spearmint. The 353 Mb genome contains 35 597 predicted protein-coding genes, including 292 disease resistance gene homologs, and nine genes determining essential oil characteristics. A genetic linkage map ordered 1397 genome scaffolds on 12 pseudochromosomes. More than two million simple sequence repeats were identified, which will facilitate molecular marker development. The M. longifolia genome is a valuable resource for both metabolic engineering and molecular breeding. This is exemplified by employing the genome sequence to clone and functionally characterize the promoters in a peppermint cultivar, and demonstrating the utility of a glandular trichome-specific promoter to increase expression of a biosynthetic gene, thereby modulating essential oil composition. Copyright © 2017 The Author. Published by Elsevier Inc. All rights reserved.


July 7, 2019

Brassica rapa genome 2.0: a reference upgrade through sequence re-assembly and gene re-annotation.

Brassica rapa includes many important crops that are cultivated as vegetables, condiments, and oilseeds. Recently, the Brassica genomes have been sequenced extensively: a B. rapa draft reference genome in 2011 (Wang et al., 2011), a Brassica oleracea in 2014 (Liu et al., 2014), a Brassica napus in 2014 (Chalhoub et al., 2014), and Brassica nigra and Brassica juncea in 2016 (Yang et al., 2016). The first released B. rapa genome reference served as a valuable resource in the genome assembly and annotation of the other Brassicas (Chalhoub et al., 2014, Liu et al., 2014, Parkin et al., 2014). B. rapa has been used widely in Brassica comparative and evolutionary genomics among the Brassicaceae (Cheng et al., 2013). However, the first B. rapa genome assembly (version 1.5) is only about 283.8 Mb, 58.52% of the estimated genome size (485 Mb) (Wang et al., 2011). Considering that much of the genome assembly is still missing (41.48%), there is a considerable possibility that important genes have been missed.
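The completeness figures quoted above follow directly from the assembly size and the estimated genome size. A minimal sketch of that arithmetic, where the function name `assembled_percent` is a hypothetical helper introduced only for illustration:

```python
def assembled_percent(assembled_mb: float, estimated_mb: float) -> float:
    """Percentage of the estimated genome size covered by the assembly."""
    if estimated_mb <= 0:
        raise ValueError("estimated genome size must be positive")
    return 100.0 * assembled_mb / estimated_mb


# Values taken from the abstract: version 1.5 assembly vs. estimated genome size.
covered = assembled_percent(283.8, 485.0)
missing = 100.0 - covered

print(f"assembled: {covered:.2f}%")
print(f"missing:   {missing:.2f}%")
```

The same calculation applies to any draft assembly for which an independent genome size estimate is available.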


July 7, 2019

The genome sequence of Barbarea vulgaris facilitates the study of ecological biochemistry.

The genus Barbarea has emerged as a model for evolution and ecology of plant defense compounds, due to its unusual glucosinolate profile and production of saponins, unique to the Brassicaceae. One species, B. vulgaris, includes two ‘types’, G-type and P-type, that differ in trichome density and in their glucosinolate and saponin profiles. A key difference is the stereochemistry of hydroxylation of their common phenethylglucosinolate backbone, leading to epimeric glucobarbarins. Here we report a draft genome sequence of the G-type, and re-sequencing of the P-type for comparison. This enables us to identify candidate genes underlying glucosinolate diversity and trichome density, and to study the genetics of biochemical variation in glucosinolates and saponins. B. vulgaris is resistant to the diamondback moth, and may be exploited for “dead-end” trap cropping where glucosinolates stimulate oviposition and saponins deter larvae to the extent that they die. The B. vulgaris genome will promote the study of mechanisms in ecological biochemistry to benefit crop resistance breeding.


July 7, 2019

Draft genome sequence of the acidophilic, halotolerant, and iron/sulfur-oxidizing Acidihalobacter prosperus DSM 14174 (strain V6).

The principal genomic features of Acidihalobacter prosperus DSM 14174 (strain V6) are presented here. This is a mesophilic, halotolerant, and iron/sulfur-oxidizing acidophile that was isolated from seawater at Vulcano, Italy. It has potential for use in biomining applications in regions where high salinity exists in the source water and ores. Copyright © 2017 Khaleque et al.


July 7, 2019

Genome sequence of Streptomyces sp. H-KF8, a marine actinobacterium isolated from a northern Chilean Patagonian fjord.

Streptomyces sp. H-KF8 is a fjord-derived marine actinobacterium capable of producing antimicrobial activity. Streptomyces sp. H-KF8 was isolated from sediments of the Comau fjord, located in the northern Chilean Patagonia. Here, we report the 7.7-Mb genome assembly, which represents the first genome of a Chilean marine actinobacterium. Copyright © 2017 Undabarrena et al.


July 7, 2019

Improving and correcting the contiguity of long-read genome assemblies of three plant species using optical mapping and chromosome conformation capture data.

Long-read sequencing can overcome the weaknesses of short reads in the assembly of eukaryotic genomes; however, at present additional scaffolding is needed to achieve chromosome-level assemblies. We generated PacBio long-read data of the genomes of three relatives of the model plant Arabidopsis thaliana and assembled all three genomes into only a few hundred contigs. To improve the contiguities of these assemblies, we generated BioNano Genomics optical mapping and Dovetail Genomics chromosome conformation capture data for genome scaffolding. Despite their technical differences, optical mapping and chromosome conformation capture performed similarly and doubled N50 values. After improving both integration methods, assembly contiguity reached chromosome-arm-levels. We rigorously assessed the quality of contigs and scaffolds using Illumina mate-pair libraries and genetic map information. This showed that PacBio assemblies have high sequence accuracy but can contain several misassemblies, which join unlinked regions of the genome. Most, but not all, of these mis-joints were removed during the integration of the optical mapping and chromosome conformation capture data. Even though none of the centromeres was fully assembled, the scaffolds revealed large parts of some centromeric regions, even including some of the heterochromatic regions, which are not present in gold standard reference sequences. Published by Cold Spring Harbor Laboratory Press.
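For readers unfamiliar with the N50 metric mentioned above, the following is a minimal sketch of the standard definition: the length of the shortest sequence in the smallest set of longest sequences that together cover at least half of the total assembly length. The function name `n50` and the toy lengths are illustrative assumptions, not data or code from the study.

```python
def n50(lengths):
    """Return the N50 of a collection of contig or scaffold lengths."""
    if not lengths:
        raise ValueError("need at least one sequence length")
    total = sum(lengths)
    running = 0
    # Walk from the longest sequence down until half the assembly is covered.
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length


# Hypothetical toy assembly: five contigs, then the same bases after
# scaffolding joins them into three longer sequences.
contigs = [100, 200, 300, 400, 500]
scaffolds = [900, 500, 100]

print(n50(contigs))
print(n50(scaffolds))
```

Because scaffolding joins contigs without adding sequence, the total length in this toy case is unchanged while the N50 rises, which is how such data can increase contiguity.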


July 7, 2019

Genome scaffolding and annotation for the pathogen vector Ixodes ricinus by ultra-long single molecule sequencing.

Global warming and other ecological changes have facilitated the expansion of Ixodes ricinus tick populations. Ixodes ricinus is the most important carrier of vector-borne pathogens in Europe, transmitting viruses, protozoa and bacteria, in particular Borrelia burgdorferi (sensu lato), the causative agent of Lyme borreliosis, the most prevalent vector-borne disease in humans in the Northern hemisphere. To control this disease vector more rapidly, a better understanding of the I. ricinus tick is necessary. To facilitate such studies, we recently published the first reference genome of this highly prevalent pathogen vector. Here, we further extend these studies by scaffolding and annotating the first reference genome by using ultra-long sequencing reads from third generation single molecule sequencing. In addition, we present the first genome size estimation for I. ricinus ticks and the embryo-derived cell line IRE/CTVM19. 235,953 contigs were integrated into 204,904 scaffolds, extending the currently known genome lengths by more than 30% from 393 to 516 Mb and the N50 contig value by 87% from 1643 bp to an N50 scaffold value of 3067 bp. In addition, 25,263 sequences were annotated by comparison to the tick’s North American relative Ixodes scapularis. After (conserved) hypothetical proteins, zinc finger proteins, secreted proteins and P450 coding proteins were the most prevalent protein categories annotated. Interestingly, more than 50% of the amino acid sequences matching the homology threshold had 95-100% identity to the corresponding I. scapularis gene models. The sequence information was complemented by the first genome size estimation for this species. Flow cytometry-based genome size analysis revealed a haploid genome size of 2.65 Gb for I. ricinus ticks and 3.80 Gb for the cell line. We present a first draft sequence map of the I. ricinus genome based on a PacBio-Illumina assembly. The I. ricinus genome was shown to be 26% (500 Mb) larger than the genome of its American relative I. scapularis. Based on the genome size of 2.65 Gb we estimated that we covered about 67% of the non-repetitive sequences. Genome annotation will facilitate screening for specific molecular pathways in I. ricinus cells and provides an overview of characteristics and functions.


July 7, 2019

Toward a complete North American Borrelia miyamotoi genome.

Borrelia miyamotoi, of the relapsing-fever spirochete group, is an emerging tick-borne pathogen causing human illness in the northern hemisphere. Here, we present the chromosome, eight extrachromosomal linear plasmids, and a draft sequence for five circular and one linear plasmid of a Borrelia miyamotoi strain isolated from an Ixodes sp. tick from Connecticut, USA. Copyright © 2017 Kingry et al.


July 7, 2019

Draft genome sequence of Halolamina pelagica CDK2 isolated from natural salterns from Rann of Kutch, Gujarat, India.

Halolamina pelagica strain CDK2, a halophilic archaeon (growth range 1.36 to 5.12 M NaCl), was isolated from the rhizosphere of wild grasses in hypersaline soil of the Rann of Kutch, Gujarat, India. Its draft genome contains 2,972,542 bp and 3,485 coding sequences, depicting genes for halophilic serine proteases and trehalose synthesis. Copyright © 2017 Gaba et al.


July 7, 2019

Genome sequence of enterotoxigenic Escherichia coli strain FMU073332.

Enterotoxigenic Escherichia coli (ETEC) is an important cause of bacterial diarrheal illness, affecting practically every population worldwide, and was estimated to cause 120,800 deaths in 2010. Here, we report the genome sequence of ETEC strain FMU073332, isolated from a 25-month-old girl from Tlaltizapán, Morelos, México. Copyright © 2017 Saldaña-Ahuactzi et al.

