July 7, 2019

Improved long read correction for de novo assembly using an FM-index

Long read sequencing is changing the landscape of genomic research, especially de novo assembly. Despite the high error rate inherent to long read technologies, increased read lengths dramatically improve the continuity and accuracy of genome assemblies. However, the cost and throughput of these technologies limit their application to complex genomes. One solution is to decrease the cost and time to assemble novel genomes by leveraging “hybrid” assemblies that use long reads for scaffolding and short reads for accuracy. To this end, we describe a novel application of a multi-string Burrows-Wheeler transform with auxiliary FM-index to correct errors in long read sequences using a set of complementary short reads. We show that our method efficiently produces significantly higher quality corrected sequence than existing hybrid error-correction methods. We demonstrate the effectiveness of our method compared to state-of-the-art hybrid and long-read only de novo assembly methods.


July 7, 2019

The nuclear genome of Rhazya stricta and the evolution of alkaloid diversity in a medically relevant clade of Apocynaceae.

Alkaloid accumulation in plants is activated in response to stress, is limited in distribution and specific alkaloid repertoires are variable across taxa. Rauvolfioideae (Apocynaceae, Gentianales) represents a major center of structural expansion in the monoterpenoid indole alkaloids (MIAs) yielding thousands of unique molecules including highly valuable chemotherapeutics. The paucity of genome-level data for Apocynaceae precludes a deeper understanding of MIA pathway evolution hindering the elucidation of remaining pathway enzymes and the improvement of MIA availability in planta or in vitro. We sequenced the nuclear genome of Rhazya stricta (Apocynaceae, Rauvolfioideae) and present this high quality assembly in comparison with that of coffee (Rubiaceae, Coffea canephora, Gentianales) and others to investigate the evolution of genome-scale features. The annotated Rhazya genome was used to develop the community resource, RhaCyc, a metabolic pathway database. Gene family trees were constructed to identify homologs of MIA pathway genes and to examine their evolutionary history. We found that, unlike Coffea, the Rhazya lineage has experienced many structural rearrangements. Gene tree analyses suggest recent, lineage-specific expansion and diversification among homologs encoding MIA pathway genes in Gentianales and provide candidate sequences with the potential to close gaps in characterized pathways and support prospecting for new MIA production avenues.


July 7, 2019

Ectomycorrhizal ecology is imprinted in the genome of the dominant symbiotic fungus Cenococcum geophilum.

The most frequently encountered symbiont on tree roots is the ascomycete Cenococcum geophilum, the only mycorrhizal species within the largest fungal class Dothideomycetes, a class known for devastating plant pathogens. Here we show that the symbiotic genomic idiosyncrasies of ectomycorrhizal basidiomycetes are also present in C. geophilum with symbiosis-induced, taxon-specific genes of unknown function and reduced numbers of plant cell wall-degrading enzymes. C. geophilum still holds a significant set of genes in categories known to be involved in pathogenesis and shows an increased genome size due to transposable elements proliferation. Transcript profiling revealed a striking upregulation of membrane transporters, including aquaporin water channels and sugar transporters, and mycorrhiza-induced small secreted proteins (MiSSPs) in ectomycorrhiza compared with free-living mycelium. The frequency with which this symbiont is found on tree roots and its possible role in water and nutrient transport in symbiosis calls for further studies on mechanisms of host and environmental adaptation.


July 7, 2019

Genomic, physiologic, and proteomic insights into metabolic versatility in Roseobacter clade bacteria isolated from deep-sea water.

Roseobacter clade bacteria are ubiquitous in marine environments and are now thought to be significant contributors to carbon and sulfur cycling. However, only a few strains of roseobacters have been isolated from the deep-sea water column, and these have not been thoroughly investigated. Here, we present the complete genomes of the phylogenetically closely related Thiobacimonas profunda JLT2016 and Pelagibaca abyssi JLT2014 isolated from deep-sea water of the Southeastern Pacific. The genome sequences showed that the two deep-sea roseobacters carry genes for versatile metabolisms with functional capabilities such as ribulose bisphosphate carboxylase-mediated carbon fixation and inorganic sulfur oxidation. Physiological and biochemical analysis showed that T. profunda JLT2016 was capable of autotrophy, heterotrophy, and mixotrophy accompanied by the production of exopolysaccharide. Heterotrophic carbon fixation via anaplerotic reactions contributed minimally to bacterial biomass. Comparative proteomics experiments showed significant up-regulation of proteins associated with carbon fixation and inorganic sulfur oxidation under chemolithotrophic conditions compared to heterotrophic conditions. Collectively, roseobacters show a high metabolic flexibility, suggesting a considerable capacity for adaptation to the marine environment.


July 7, 2019

Sequence assembly of Yarrowia lipolytica strain W29/CLIB89 shows transposable element diversity.

Yarrowia lipolytica, an oleaginous yeast, is capable of accumulating significant cellular mass in lipid making it an important source of biosustainable hydrocarbon-based chemicals. In spite of a similar number of protein-coding genes to that in other Hemiascomycetes, the Y. lipolytica genome is almost double that of model yeasts. Despite its economic importance and several distinct strains in common use, an independent genome assembly exists for only one strain. We report here a de novo annotated assembly of the chromosomal genome of an industrially-relevant strain, W29/CLIB89, determined by hybrid next-generation sequencing. For the first time, each Y. lipolytica chromosome is represented by a single contig. The telomeric rDNA repeats were localized by Irys long-range genome mapping and one complete copy of the rDNA sequence is reported. Two large structural variants and retroelement differences with reference strain CLIB122 including a full-length, novel Ty3/Gypsy long terminal repeat (LTR) retrotransposon and multiple LTR-like sequences are described. Strikingly, several of these are adjacent to RNA polymerase III-transcribed genes, which are almost double in number in Y. lipolytica compared to other Hemiascomycetes. In addition to previously-reported dimeric RNA polymerase III-transcribed genes, tRNA pseudogenes were identified. Multiple full-length and truncated LINE elements are also present. Therefore, although identified transposons do not constitute a significant fraction of the Y. lipolytica genome, they could have played an active role in its evolution. Differences between the sequence of this strain and of the existing reference strain underscore the utility of an additional independent genome assembly for this economically important organism.


July 7, 2019

BAC-pool sequencing and analysis confirms growth-associated QTLs in the Asian seabass genome.

The Asian seabass is an important marine food fish that has been cultured for several decades in Asia Pacific. However, the lack of a high quality reference genome has hampered efforts to improve its selective breeding. A 3D BAC pool set generated in this study was screened using 22 SSR markers located on linkage group 2 which contains a growth-related QTL region. Seventy-two clones corresponding to 22 FPC contigs were sequenced by Illumina MiSeq technology. We co-assembled the MiSeq-derived scaffolds from each FPC contig with error-corrected PacBio reads, resulting in 187 sequences covering 9.7 Mb. Eleven genes annotated within this region were found to be potentially associated with growth and their tissue-specific expression was investigated. Correlation analysis demonstrated that SNPs in ctsb, skp1 and ppp2ca can be potentially used as markers for selecting fast-growing fingerlings. Conserved syntenies between seabass LG2 and five other teleosts were identified. This study i) provided a 10 Mb targeted genome assembly; ii) demonstrated NGS of BAC pools as a potential approach for mining candidates underlying QTLs of this species; iii) detected eleven genes potentially responsible for growth in the QTL region; and iv) identified useful SNP markers for selective breeding programs of Asian seabass.


July 7, 2019

Complete genome sequence of Lactobacillus plantarum LZ206, a potential probiotic strain with antimicrobial activity against food-borne pathogenic microorganisms.

Lactobacilli strains have been considered as important candidates for manufacturing “natural food”, due to their antimicrobial properties and generally regarded as safe (GRAS) status. Lactobacillus plantarum LZ206 is a potential probiotic strain isolated from raw cow milk, with antimicrobial activity against various pathogens, including Gram-positive bacteria (Staphylococcus aureus and Listeria monocytogenes), Gram-negative bacteria (Escherichia coli and Salmonella enterica), and the fungus Candida albicans. To better understand the molecular basis for its antimicrobial activity, the entire genome of LZ206 was sequenced. The genome of LZ206 was found to contain a circular 3,212,951-bp chromosome, two circular plasmids and one predicted linear plasmid. A plantaricin gene cluster, which is responsible for bacteriocin biosynthesis and could be associated with its broad-spectrum antimicrobial activity, was identified based on comparative genomic analysis. Whole genome sequencing of L. plantarum LZ206 might facilitate its application in protecting food products from pathogen contamination in the dairy industry. Copyright © 2016 Elsevier B.V. All rights reserved.


July 7, 2019

Contiguous and accurate de novo assembly of metazoan genomes with modest long read coverage.

Genome assemblies that are accurate, complete and contiguous are essential for identifying important structural and functional elements of genomes and for identifying genetic variation. Nevertheless, most recent genome assemblies remain incomplete and fragmented. While long molecule sequencing promises to deliver more complete genome assemblies with fewer gaps, concerns about error rates, low yields, stringent DNA requirements and uncertainty about best practices may discourage many investigators from adopting this technology. Here, in conjunction with the platinum standard Drosophila melanogaster reference genome, we analyze recently published long molecule sequencing data to identify what governs completeness and contiguity of genome assemblies. We also present a hybrid meta-assembly approach that achieves remarkable assembly contiguity for both Drosophila and human assemblies with only modest long molecule sequencing coverage. Our results motivate a set of preliminary best practices for obtaining accurate and contiguous assemblies, a ‘missing manual’ that guides key decisions in building high quality de novo genome assemblies, from DNA isolation to polishing the assembly. © The Author(s) 2016. Published by Oxford University Press on behalf of Nucleic Acids Research.


July 7, 2019

Complete genome sequence of Halomonas sp. R5-57.

The marine Arctic isolate Halomonas sp. R5-57 was sequenced as part of a bioprospecting project which aims to discover novel enzymes and organisms from low-temperature environments, with potential uses in biotechnological applications. Phenotypically, Halomonas sp. R5-57 exhibits high salt tolerance over a wide range of temperatures and has extra-cellular hydrolytic activities with several substrates, indicating it secretes enzymes which may function in high salinity conditions. Genome sequencing identified the genes involved in the biosynthesis of the osmoprotectant ectoine, which has applications in food processing and pharmacy, as well as those involved in production of polyhydroxyalkanoates, which can serve as precursors to bioplastics. The percentage identity of these biosynthetic genes from Halomonas sp. R5-57 and current production strains varies between 99 % for some to 69 % for others, thus it is plausible that R5-57 may have a different production capacity to currently used strains, or that in the case of PHAs, the properties of the final product may vary. Here we present the finished genome sequence (LN813019) of Halomonas sp. R5-57 which will facilitate exploitation of this bacterium; either as a whole-cell production host, or by recombinant expression of its individual enzymes.


July 7, 2019

Use of WGS data for investigation of a long-term NDM-1-producing Citrobacter freundii outbreak and secondary in vivo spread of blaNDM-1 to Escherichia coli, Klebsiella pneumoniae and Klebsiella oxytoca.

An outbreak of NDM-1-producing Citrobacter freundii and possible secondary in vivo spread of blaNDM-1 to other Enterobacteriaceae were investigated. From October 2012 to March 2015, meropenem-resistant Enterobacteriaceae were detected in 45 samples from seven patients at Aalborg University Hospital, Aalborg, Denmark. In silico resistance genes, Inc plasmid types and STs (MLST) were obtained from WGS data from 24 meropenem-resistant isolates (13 C. freundii, 6 Klebsiella pneumoniae, 4 Escherichia coli and 1 Klebsiella oxytoca) and 1 meropenem-susceptible K. oxytoca. The sequences of the meropenem-resistant C. freundii isolates were compared by phylogenetic analyses. In vitro susceptibility to 21 antimicrobial agents was tested. Furthermore, in vitro conjugation and plasmid characterization was performed. From the seven patients, 13 highly clonal ST18 NDM-1-producing C. freundii were isolated. The ST18 NDM-1-producing C. freundii isolates were only susceptible to tetracycline, tigecycline, colistin and fosfomycin (except for the C. freundii isolates from Patient 2 and Patient 7, which were additionally resistant to tetracycline). The E. coli and K. pneumoniae from different patients belonged to different STs, indicating in vivo transfer of blaNDM-1 in the individual patients. This was further supported by in vitro conjugation and detection of a 154 kb IncA/C2 plasmid with blaNDM-1. Patient screenings failed to reveal any additional cases. None of the patients had a history of recent travel abroad and the source of the blaNDM-1 plasmid was unknown. To our knowledge, this is the first report of an NDM-1-producing C. freundii outbreak and secondary in vivo spread of an IncA/C2 plasmid with blaNDM-1 to other Enterobacteriaceae. © The Author 2016. Published by Oxford University Press on behalf of the British Society for Antimicrobial Chemotherapy. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.


July 7, 2019

Genomic studies of nitrogen-fixing rhizobial strains from Phaseolus vulgaris seeds and nodules.

Rhizobia are soil bacteria that establish symbiotic relationships with legumes and fix nitrogen in root nodules. We recently reported that several nitrogen-fixing rhizobial strains, belonging to Rhizobium phaseoli, R. trifolii, R. grahamii and Sinorhizobium americanum, were able to colonize Phaseolus vulgaris (common bean) seeds. To gain further insight into the traits that support this ability, we analyzed the genomic sequences and proteomes of R. phaseoli (CCGM1) and S. americanum (CCGM7) strains from seeds and compared them with those of the closely related strains CIAT652 and CFNEI73, respectively, isolated only from nodules. In a fine structural study of the S. americanum genomes, the chromosomes, megaplasmids and symbiotic plasmids were highly conserved and syntenic, with the exception of the smaller plasmid, which appeared unrelated. The symbiotic tract of CCGM7 appeared more dispersed, possibly due to the action of transposases. The chromosomes of seed strains had fewer transposases and strain-specific genes. The seed strains CCGM1 and CCGM7 shared about half of their genomes with their closest strains (3353 and 3472 orthologs respectively), but a large fraction of the rest also had homology with other rhizobia. They contained 315 and 204 strain-specific genes, respectively, particularly abundant in the functions of transcription, motility, energy generation and cofactor biosynthesis. The proteomes of seed and nodule strains were obtained and showed a particular profile for each of the strains. About 82 % of the proteins in the comparisons appeared similar. Forty of the most abundant proteins in each strain were identified; these proteins in seed strains were involved in stress responses and coenzyme and cofactor biosynthesis and in the nodule strains mainly in central processes. Only 3 % of the abundant proteins had hypothetical functions. Functions that were enriched in the genomes and proteomes of seed strains possibly participate in the successful occupancy of the new niche. The genomes of the strains had features possibly related to their presence in the seeds. This study helps to understand traits of rhizobia involved in seed adaptation.


July 7, 2019

Genome sequence of Phormia regina Meigen (Diptera: Calliphoridae): implications for medical, veterinary and forensic research.

Blow flies (Diptera: Calliphoridae) are important medical, veterinary and forensic insects encompassing 8 % of the species diversity observed in the calyptrate insects. Few genomic resources exist to understand the diversity and evolution of this group. We present the hybrid (short and long reads) draft assemblies of the male and female genomes of the common North American blow fly, Phormia regina (Diptera: Calliphoridae). The 550 and 534 Mb draft assemblies contained 8312 and 9490 predicted genes in the female and male genomes, respectively; including >93 % conserved eukaryotic genes. Putative X and Y chromosomes (21 and 14 Mb, respectively) were assembled and annotated. The P. regina genomes appear to contain few mobile genetic elements, an almost complete absence of SINEs, and most of the repetitive landscape consists of simple repetitive sequences. Candidate gene approaches were undertaken to annotate insecticide resistance, sex-determining, chemoreceptors, and antimicrobial peptides. This work yielded a robust, reliable reference calliphorid genome from a species located in the middle of a calliphorid phylogeny. By adding an additional blow fly genome, the ability to tease apart what might be true of general calliphorids vs. what is specific of two distinct lineages now exists. This resource will provide a strong foundation for future studies into the evolution, population structure, behavior, and physiology of all blow flies.


July 7, 2019

Function and phylogeny of bacterial butyryl coenzyme A: acetate transferases and their diversity in the proximal colon of swine.

Studying the host-associated butyrate-producing bacterial community is important, because butyrate is essential for colonic homeostasis and gut health. Previous research has identified the butyryl coenzyme A (CoA):acetate-CoA transferase (EC 2.3.8.3) as a gene of primary importance for butyrate production in intestinal ecosystems; however, this gene family (but) remains poorly defined. We developed tools for the analysis of butyrate-producing bacteria based on 12 putative but genes identified in the genomes of nine butyrate-producing bacteria obtained from the swine intestinal tract. Functional analyses revealed that eight of these genes had strong But enzyme activity. When but paralogues were found within a genome, only one gene per genome encoded strong activity, with the exception of one strain in which no gene encoded strong But activity. Degenerate primers were designed to amplify the functional but genes and were tested by amplifying environmental but sequences from DNA and RNA extracted from swine colonic contents. The results show diverse but sequences from swine-associated butyrate-producing bacteria, most of which clustered near functionally confirmed sequences. Here, we describe tools and a framework that allow the bacterial butyrate-producing community to be profiled in the context of animal health and disease. Butyrate is a compound produced by the microbiota in the intestinal tracts of animals. This compound is of critical importance for intestinal health, and yet studying its production by diverse intestinal bacteria is technically challenging. Here, we present an additional way to study the butyrate-producing community of bacteria using one degenerate primer set that selectively targets genes experimentally demonstrated to encode butyrate production. This work will enable researchers to more easily study this very important bacterial function that has implications for host health and resistance to disease.
Copyright © 2016, American Society for Microbiology. All Rights Reserved.

