April 21, 2020

The complete genome sequence of Ethanoligenens harbinense reveals the metabolic pathway of acetate-ethanol fermentation: A novel understanding of the principles of anaerobic biotechnology.

Ethanol-type fermentation is one of three main fermentation types in the acidogenesis of anaerobic treatment systems. Non-spore-forming Ethanoligenens is a typical genus capable of ethanol-type fermentation in mixed culture (i.e., acetate-ethanol fermentation). This genus can produce ethanol, acetate, CO2, and H2 using carbohydrates, and has application potential in anaerobic bioprocesses. Here, the complete genome sequences and methylome of Ethanoligenens harbinense strains with different autoaggregative and coaggregative abilities were obtained using the PacBio single-molecule real-time sequencing platform. The genome size of E. harbinense strains was about 2.97-3.10 Mb with 55.5% G+C content. Between 3020 and 3153 genes were annotated, most of which were methylated at specific sites or motifs. The methylation types included 6mA, 4mC, and unknown types. Comparative genomic analysis demonstrated low levels of genetic similarity between E. harbinense and other well-known hydrogen-producing bacteria (i.e., Clostridium and Thermoanaerobacter) in phylogenesis. Hydrogen production of E. harbinense was catalyzed by genes that encode [FeFe]-hydrogenases and that were synthesized by three maturases of [FeFe]-H2ase. The metabolic mechanism of H2-ethanol co-production fermentation, catalyzed by pyruvate ferredoxin oxidoreductase, was proposed. This study provides genetic and evolutionary information of a model genus for the further investigation of the metabolic pathway and regulatory network of ethanol-type fermentation and anaerobic bioprocesses for waste or wastewater treatment. Copyright © 2019. Published by Elsevier Ltd.


April 21, 2020

Extensive intraspecific gene order and gene structural variations in upland cotton cultivars.

Multiple cotton genomes (diploid and tetraploid) have been assembled. However, genomic variations between cultivars of allotetraploid upland cotton (Gossypium hirsutum L.), the most widely planted cotton species in the world, remain unexplored. Here, we use single-molecule long read and Hi-C sequencing technologies to assemble genomes of the two upland cotton cultivars TM-1 and zhongmiansuo24 (ZM24). Comparisons among TM-1 and ZM24 assemblies and the genomes of the diploid ancestors reveal a large number of genetic variations. Among them, the three longest structural variations are located on chromosome A08 of the tetraploid upland cotton and account for ~30% of the total length of this chromosome. Haplotype analyses of the mapping population derived from these two cultivars and the germplasm panel show suppressed recombination rates in this region. This study provides additional genomic resources for the community, and the identified genetic variations, especially the reduced meiotic recombination on chromosome A08, will help future breeding.


April 21, 2020

Multi-platform discovery of haplotype-resolved structural variation in human genomes.

The incomplete identification of structural variants (SVs) from whole-genome sequencing data limits studies of human genetic diversity and disease association. Here, we apply a suite of long-read, short-read, strand-specific sequencing technologies, optical mapping, and variant discovery algorithms to comprehensively analyze three trios to define the full spectrum of human genetic variation in a haplotype-resolved manner. We identify 818,054 indel variants (<50 bp) and 27,622 SVs (≥50 bp) per genome. We also discover 156 inversions per genome and 58 of the inversions intersect with the critical regions of recurrent microdeletion and microduplication syndromes. Taken together, our SV callsets represent a three to sevenfold increase in SV detection compared to most standard high-throughput sequencing studies, including those from the 1000 Genomes Project. The methods and the dataset presented serve as a gold standard for the scientific community allowing us to make recommendations for maximizing structural variation sensitivity for future genome sequencing studies.


April 21, 2020

The Impact of cDNA Normalization on Long-Read Sequencing of a Complex Transcriptome

Normalization of cDNA is widely used to improve the coverage of rare transcripts in analysis of transcriptomes employing next-generation sequencing. Recently, long-read technology has been emerging as a powerful tool for sequencing and construction of transcriptomes, especially for complex genomes containing highly similar transcripts and transcript-spliced isoforms. Here, we analyzed the transcriptome of sugarcane, a plant with a highly polyploid genome, by PacBio isoform sequencing (Iso-Seq) of two different cDNA library preparations, with and without a normalization step. The results demonstrated that, while the two libraries included many of the same transcripts, many longer transcripts were removed and many new, generally shorter transcripts were detected by normalization. For the same input cDNA and the same data yield, the normalized library recovered more total transcript isoforms, predicted gene families, and orthologous groups, resulting in a higher representation for the sugarcane transcriptome, compared to the non-normalized library. The non-normalized library, on the other hand, included a wider transcript length range with more longer transcripts above ~1.25 kb, more transcript isoforms per gene family and gene ontology terms per transcript. A large proportion of the unique transcripts comprising ~52% of the normalized library were expressed at a lower level than the unique transcripts from the non-normalized library, across three tissue types tested including leaf, stalk and root. About 83% of the total 5,348 predicted long noncoding transcripts were derived from the normalized library, of which ~80% were derived from the lowly expressed fraction. Functional annotation of the unique transcripts suggested that each library enriched different functional transcript fractions. This demonstrated the complementation of the two approaches in obtaining a complete transcriptome of a complex genome at the sequencing depth used in this study.


April 21, 2020

Closing the Yield Gap for Cannabis: A Meta-Analysis of Factors Determining Cannabis Yield.

Until recently, the commercial production of Cannabis sativa was restricted to varieties that yielded high-quality fiber while producing low levels of the psychoactive cannabinoid tetrahydrocannabinol (THC). In the last few years, a number of jurisdictions have legalized the production of medical and/or recreational cannabis with higher levels of THC, and other jurisdictions seem poised to follow suit. Consequently, demand for industrial-scale production of high yield cannabis with consistent cannabinoid profiles is expected to increase. In this paper we highlight that currently, projected annual production of cannabis is based largely on facility size, not yield per square meter. This meta-analysis of cannabis yields reported in scientific literature aimed to identify the main factors contributing to cannabis yield per plant, per square meter, and per W of lighting electricity. In line with previous research we found that variety, plant density, light intensity and fertilization influence cannabis yield and cannabinoid content; we also identified pot size, light type and duration of the flowering period as predictors of yield and THC accumulation. We provide insight into the critical role of light intensity, quality, and photoperiod in determining cannabis yields, with particular focus on the potential for light-emitting diodes (LEDs) to improve growth and reduce energy requirements. We propose that the vast amount of genomics data currently available for cannabis can be used to better understand the effect of genotype on yield. Finally, we describe diversification that is likely to emerge in cannabis growing systems and examine the potential role of plant-growth promoting rhizobacteria (PGPR) for growth promotion, regulation of cannabinoid biosynthesis, and biocontrol.


April 21, 2020

Long-Read Sequencing Emerging in Medical Genetics

The wide implementation of next-generation sequencing (NGS) technologies has revolutionized the field of medical genetics. However, the short read lengths of currently used sequencing approaches pose a limitation for identification of structural variants, sequencing repetitive regions, phasing alleles and distinguishing highly homologous genomic regions. These limitations may significantly contribute to the diagnostic gap in patients with genetic disorders who have undergone standard NGS, like whole exome or even genome sequencing. Now, the emerging long-read sequencing (LRS) technologies may offer improvements in the characterization of genetic variation and regions that are difficult to assess with the currently prevailing NGS approaches. LRS has so far mainly been used to investigate genetic disorders with previously known or strongly suspected disease loci. While these targeted approaches already show the potential of LRS, it remains to be seen whether LRS technologies can soon enable true whole genome sequencing routinely. Ultimately, this could allow the de novo assembly of individual whole genomes used as a generic test for genetic disorders. In this article, we summarize the current LRS-based research on human genetic disorders and discuss the potential of these technologies to facilitate the next major advancements in medical genetics.


April 21, 2020

A First Study of the Virulence Potential of a Bacillus subtilis Isolate From Deep-Sea Hydrothermal Vent.

Bacillus subtilis is the best studied Gram-positive bacterium, primarily as a model of cell differentiation and industrial exploitation. To date, little is known about the virulence of B. subtilis. In this study, we examined the virulence potential of a B. subtilis strain (G7) isolated from the Iheya North hydrothermal field of Okinawa Trough. G7 is aerobic, motile, endospore-forming, and requires NaCl for growth. The genome of G7 is composed of one circular chromosome of 4,216,133 base pairs with an average GC content of 43.72%. G7 contains 4,416 coding genes, 27.5% of which could not be annotated, and the remaining 72.5% were annotated with known or predicted functions in 25 different COG categories. Ten sets of 23S, 5S, and 16S ribosomal RNA operons, 86 tRNA and 14 sRNA genes, 50 tandem repeats, 41 mini-satellites, one microsatellite, and 42 transposons were identified in G7. Compared to the genome of the B. subtilis wild type strain NCIB 3610T, the G7 genome contains many genomic translocations, inversions, and insertions, and twice the number of genomic islands (GIs), with 42.5% of GI genes encoding hypothetical proteins. G7 possesses abundant putative virulence genes associated with adhesion, invasion, dissemination, anti-phagocytosis, and intracellular survival. Experimental studies showed that G7 was able to cause mortality in fish and mice following intramuscular/intraperitoneal injection, resist the killing effect of serum complement, and replicate in mouse macrophages and fish peripheral blood leukocytes. Taken together, our study indicates that G7 is a B. subtilis isolate with unique genetic features and can be lethal to vertebrate animals once introduced into the animals by artificial means. These results provide the first insight into the potential harmfulness of deep-sea B. subtilis.


April 21, 2020

Comparative genomics and pathogenicity potential of members of the Pseudomonas syringae species complex on Prunus spp.

Diseases on Prunus spp. have been associated with a large number of phylogenetically different pathovars and species within the P. syringae species complex. Despite their economic significance, there is a severe lack of genomic information of these pathogens. The high phylogenetic diversity observed within strains causing disease on Prunus spp. in nature raised the question of whether other strains or species within the P. syringae species complex were potentially pathogenic on Prunus spp. To gain insight into the genomic potential of adaptation and virulence in Prunus spp., a total of twelve de novo whole genome sequences of P. syringae pathovars and species found in association with diseases on cherry (sweet, sour and ornamental-cherry) and peach were sequenced. Strains sequenced in this study covered three phylogroups and four clades. These strains were screened in vitro for pathogenicity on Prunus spp. together with additional genome sequenced strains thus covering nine out of thirteen of the currently defined P. syringae phylogroups. Pathogenicity tests revealed that most of the strains caused symptoms in vitro and no obvious link was found between presence of known virulence factors and the observed pathogenicity pattern based on comparative genomics. Non-pathogenic strains displayed a two to three times higher generation time when grown in rich medium. In this study, the first set of complete genomes of cherry associated P. syringae strains as well as the draft genome of the quarantine peach pathogen P. syringae pv. persicae were generated. The obtained genomic data were matched with phenotypic data in order to determine factors related to pathogenicity to Prunus spp. Results of this study suggest that the inability to cause disease on Prunus spp. in vitro is not the result of host specialization but rather linked to metabolic impairments of individual strains.


April 21, 2020

Metaepigenomic analysis reveals the unexplored diversity of DNA methylation in an environmental prokaryotic community.

DNA methylation plays important roles in prokaryotes, and their genomic landscapes (prokaryotic epigenomes) have recently begun to be disclosed. However, our knowledge of prokaryotic methylation systems is focused on those of culturable microbes, which are rare in nature. Here, we used single-molecule real-time and circular consensus sequencing techniques to reveal the ‘metaepigenomes’ of a microbial community in the largest lake in Japan, Lake Biwa. We reconstructed 19 draft genomes from diverse bacterial and archaeal groups, most of which are yet to be cultured. The analysis of DNA chemical modifications in those genomes revealed 22 methylated motifs, nine of which were novel. We identified methyltransferase genes likely responsible for methylation of the novel motifs, and confirmed the catalytic specificities of four of them via transformation experiments using synthetic genes. Our study highlights metaepigenomics as a powerful approach for identification of the vast unexplored variety of prokaryotic DNA methylation systems in nature.


April 21, 2020

Cichorium intybus L. × Cicerbita alpina Walbr.: doubled haploid chicory induction and CENH3 characterization

Intergeneric hybridization between industrial chicory (Cichorium intybus L.) and Cicerbita alpina Walbr. induces interspecific hybrids and haploid chicory plants after in vitro embryo rescue. The protocol yielded haploids in 5 out of 12 cultivars pollinated; altogether 18 haploids were regenerated from 2836 embryos, with a maximum efficiency of 1.96% haploids per cross. Obtained haploids were chromosome doubled with mitosis inhibitors trifluralin and oryzalin; exposure to 0.05 g L⁻¹ oryzalin during one week was the most efficient treatment to regenerate doubled haploids. Inbreeding effects in vitro were limited, but the ploidy level affects morphology. Transcriptome sequencing revealed two unique copies of CENH3 in Cicerbita alpina Walbr. Comparison of CENH3.1 protein sequences of Cicerbita and Cichorium obtained through transcriptome and whole shotgun genome sequencing revealed two amino-acid substitutions at critical residues of the histone fold domain. These particular changes cause chromosome elimination and reduced centromere loading in several other species and might indicate a CENH3-dependent mechanism causing chromosome elimination of parental chromosomes during Cichorium × Cicerbita intergeneric hybridization. Our results provide insights into chromosome elimination and might increase the efficiency of haploid induction in Cichorium.


April 21, 2020

Systematic analysis of dark and camouflaged genes reveals disease-relevant genes hiding in plain sight.

The human genome contains “dark” gene regions that cannot be adequately assembled or aligned using standard short-read sequencing technologies, preventing researchers from identifying mutations within these gene regions that may be relevant to human disease. Here, we identify regions with few mappable reads that we call dark by depth, and others that have ambiguous alignment, called camouflaged. We assess how well long-read or linked-read technologies resolve these regions. Based on standard whole-genome Illumina sequencing data, we identify 36,794 dark regions in 6054 gene bodies from pathways important to human health, development, and reproduction. Of these gene bodies, 8.7% are completely dark and 35.2% are ≥ 5% dark. We identify dark regions that are present in protein-coding exons across 748 genes. Linked-read or long-read sequencing technologies from 10x Genomics, PacBio, and Oxford Nanopore Technologies reduce dark protein-coding regions to approximately 50.5%, 35.6%, and 9.6%, respectively. We present an algorithm to resolve most camouflaged regions and apply it to the Alzheimer’s Disease Sequencing Project. We rescue a rare ten-nucleotide frameshift deletion in CR1, a top Alzheimer’s disease gene, found in disease cases but not in controls. While we could not formally assess the association of the CR1 frameshift mutation with Alzheimer’s disease due to insufficient sample-size, we believe it merits investigating in a larger cohort. There remain thousands of potentially important genomic regions overlooked by short-read sequencing that are largely resolved by long-read technologies.


April 21, 2020

Molecular evolutionary trends and feeding ecology diversification in the Hemiptera, anchored by the milkweed bug genome.

The Hemiptera (aphids, cicadas, and true bugs) are a key insect order, with high diversity for feeding ecology and excellent experimental tractability for molecular genetics. Building upon recent sequencing of hemipteran pests such as phloem-feeding aphids and blood-feeding bed bugs, we present the genome sequence and comparative analyses centered on the milkweed bug Oncopeltus fasciatus, a seed feeder of the family Lygaeidae. The 926-Mb Oncopeltus genome is well represented by the current assembly and official gene set. We use our genomic and RNA-seq data not only to characterize the protein-coding gene repertoire and perform isoform-specific RNAi, but also to elucidate patterns of molecular evolution and physiology. We find ongoing, lineage-specific expansion and diversification of repressive C2H2 zinc finger proteins. The discovery of intron gain and turnover specific to the Hemiptera also prompted the evaluation of lineage and genome size as predictors of gene structure evolution. Furthermore, we identify enzymatic gains and losses that correlate with feeding biology, particularly for reductions associated with derived, fluid nutrition feeding. With the milkweed bug, we now have a critical mass of sequenced species for a hemimetabolous insect order and close outgroup to the Holometabola, substantially improving the diversity of insect genomics. We thereby define commonalities among the Hemiptera and delve into how hemipteran genomes reflect distinct feeding ecologies. Given Oncopeltus’s strength as an experimental model, these new sequence resources bolster the foundation for molecular research and highlight technical considerations for the analysis of medium-sized invertebrate genomes.


April 21, 2020

Resource Concentration Modulates the Fate of Dissimilated Nitrogen in a Dual-Pathway Actinobacterium.

Respiratory ammonification and denitrification are two evolutionarily unrelated dissimilatory nitrogen (N) processes central to the global N cycle, the activity of which is thought to be controlled by carbon (C) to nitrate (NO3-) ratio. Here we find that Intrasporangium calvum C5, a novel dual-pathway denitrifier/respiratory ammonifier, disproportionately utilizes ammonification rather than denitrification when grown under low C concentrations, even at low C:NO3- ratios. This finding is in conflict with the paradigm that high C:NO3- ratios promote ammonification and low C:NO3- ratios promote denitrification. We find that the protein atomic composition for denitrification modules (NirK) is significantly cost minimized for C and N compared to ammonification modules (NrfA), indicating that limitation for C and N is a major evolutionary selective pressure imprinted in the architecture of these proteins. The evolutionary precedent for these findings suggests ecological importance for microbial activity as evidenced by higher growth rates when I. calvum grows predominantly using its ammonification pathway and by assimilating its end-product (ammonium) for growth under ammonium-free conditions. Genomic analysis of I. calvum further reveals a versatile ecophysiology to cope with nutrient stress and redox conditions. Metabolite and transcriptional profiles during growth indicate that enzyme modules, NrfAH and NirK, are not constitutively expressed but rather induced by nitrite production via NarG. Mechanistically, our results suggest that pathway selection is driven by intracellular redox potential (redox poise), which may be lowered when resource concentrations are low, thereby decreasing catalytic activity of upstream electron transport steps (i.e., the bc1 complex) needed for denitrification enzymes. Our work advances our understanding of the biogeochemical flexibility of N-cycling organisms, pathway evolution, and ecological food-webs.


October 23, 2019

Efficient genome editing of a facultative thermophile using mesophilic spCas9.

Well-developed genetic tools for thermophilic microorganisms are scarce, despite their industrial and scientific relevance. Whereas highly efficient CRISPR/Cas9-based genome editing is on the rise in prokaryotes, it has never been employed in a thermophile. Here, we apply Streptococcus pyogenes Cas9 (spCas9)-based genome editing to a moderate thermophile, i.e., Bacillus smithii, including a gene deletion, gene knockout via insertion of premature stop codons, and gene insertion. We show that spCas9 is inactive in vivo above 42 °C, and we employ the wide temperature growth range of B. smithii as an induction system for spCas9 expression. Homologous recombination with plasmid-borne editing templates is performed at 45-55 °C, when spCas9 is inactive. Subsequent transfer to 37 °C allows for counterselection through production of active spCas9, which introduces lethal double-stranded DNA breaks to the nonedited cells. The developed method takes 4 days with 90, 100, and 20% efficiencies for gene deletion, knockout, and insertion, respectively. The major advantage of our system is the limited requirement for genetic parts: only one plasmid, one selectable marker, and a promoter are needed, and the promoter does not need to be inducible or well-characterized. Hence, it can be easily applied for genome editing purposes in both mesophilic and thermophilic nonmodel organisms with a limited genetic toolbox and ability to grow at, or tolerate, temperatures of 37 and at or above 42 °C.


October 23, 2019

Improved production of propionic acid using genome shuffling.

Propionic acid is traditionally derived from fossil fuels, but its biological production has recently gained interest. Propionibacterium species produce propionic acid as their main fermentation product. Production of other organic acids reduces propionic acid yield and productivity, pointing to by-product gene-knockout strategies as a logical solution to increase yield. However, removing by-product formation has seen limited success due to our inability to genetically engineer the best producing strains (i.e., Propionibacterium acidipropionici). To overcome this limitation, random mutagenesis continues to be the best path towards improving strains for biological propionic acid production. Recent advances in next generation sequencing opened new avenues to understand improved strains. In this work, we use genome shuffling on two wild type strains to generate a better propionic acid producing strain. Using next generation sequencing, we mapped the genomic changes leading to the improved phenotype. The best strain produced 25% more propionic acid than the wild type strain. Sequencing of the strains showed that genomic changes were restricted to single point mutations and gene duplications in well-conserved regions in the genomes. Such results confirm the involvement of gene conversion in genome shuffling as opposed to long genomic insertions. © 2016 The Authors. Biotechnology Journal published by WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.

