About 64% of the total aboveground biomass in sugarcane production is from the culm, of which ~90% is present in fiber and sugars. Understanding the transcriptome in the sugarcane culm, and the transcripts that are associated with the accumulation of the sugar and fiber components, would facilitate the modification of biomass composition for enhanced biofuel and biomaterial production. The Sugarcane Iso-Seq Transcriptome (SUGIT) database was used as a reference for RNA-Seq analysis of variation in gene expression between young and mature tissues, and between 10 genotypes with varying fiber content. Global expression analysis suggests that each genotype displayed a unique expression pattern, possibly due to different chromosome combinations and maturation amongst these genotypes. Apart from direct sugar- and fiber-related transcripts, the differentially expressed (DE) transcripts in this study belonged to various supporting pathways that are not obviously involved in the accumulation of these major biomass components. The analysis revealed 1,649 DE transcripts between the young and mature tissues, while 555 DE transcripts were found between the low and high fiber genotypes. Of these, 151 and 23 transcripts, respectively, were directly involved in sugar and fiber accumulation. Most of the transcripts identified were up-regulated in the young tissues (2 to 22-fold, FDR adjusted p-value <0.05), which could be explained by the more active metabolism in the young tissues compared to the mature tissues in the sugarcane culm. The results of the analysis of the contrasting genotypes suggest that, due to the large number of genes contributing to these traits, some of the critical DE transcripts could display less than 2-fold differences in expression and might not be easily identified. However, this transcript profiling analysis identified full-length candidate transcripts and pathways that were likely to determine the differences in sugar and fiber accumulation between tissue types and contrasting genotypes.
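The selection criteria quoted above (at least 2-fold change, FDR adjusted p-value <0.05) can be illustrated with a minimal sketch. It is not the pipeline used in the study: the Benjamini-Hochberg procedure is assumed as the FDR adjustment, and the function names, record layout, and example values are hypothetical. Expression values are assumed to be strictly positive.

```python
import math

def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    n = len(pvalues)
    order = sorted(range(n), key=lambda i: pvalues[i])
    adjusted = [0.0] * n
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(n, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * n / rank)
        adjusted[i] = running_min
    return adjusted

def select_de(records, min_fold=2.0, max_fdr=0.05):
    """Keep transcripts passing both the fold-change and FDR cutoffs.

    records: list of (transcript_id, mean_young, mean_mature, pvalue);
    this layout is an assumption made for the illustration.
    """
    adjusted = benjamini_hochberg([r[3] for r in records])
    threshold = math.log2(min_fold)
    hits = []
    for (tid, young, mature, _), q in zip(records, adjusted):
        log2_fc = math.log2(young / mature)
        if abs(log2_fc) >= threshold and q < max_fdr:
            hits.append((tid, log2_fc, q))
    return hits

# Hypothetical values, not data from the study.
records = [
    ("t1", 8.0, 2.0, 0.005),  # 4-fold up in young tissue
    ("t2", 3.0, 2.0, 0.010),  # 1.5-fold: significant but below the cutoff
    ("t3", 1.0, 4.0, 0.030),  # 4-fold down in young tissue
    ("t4", 5.0, 5.0, 0.040),  # no change
]
selected = select_de(records)
```

In this toy input, `t2` is dropped despite a small adjusted p-value, which mirrors the point that transcripts with less than 2-fold differences may escape detection under such a cutoff.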
Arabica coffee (Coffea arabica) has a small gene pool limiting genetic improvement. Selection for caffeine content within this gene pool would be assisted by identification of the genes controlling this important trait. Sequencing of DNA bulks from 18 genotypes with extreme high- or low-caffeine content from a population of 232 genotypes was used to identify linked polymorphisms. To obtain a reference genome, a whole genome assembly of arabica coffee (variety K7) was achieved by sequencing using short-read (Illumina) and long-read (PacBio) technology. Assembly was performed using a range of assembly tools, resulting in 76 409 scaffolds with a scaffold N50 of 54 544 bp and a total scaffold length of 1448 Mb. Validation of the genome assembly using different tools showed high completeness of the genome. More than 99% of transcriptome sequences mapped to the C. arabica draft genome, and 89% of BUSCOs were present. Annotation of the assembled genome using AUGUSTUS yielded 99 829 gene models. Using the draft arabica genome as reference in mapping and variant calling allowed the detection of 1444 nonsynonymous single nucleotide polymorphisms (SNPs) associated with caffeine content. Based on Kyoto Encyclopaedia of Genes and Genomes pathway-based analysis, 65 caffeine-associated SNPs were discovered, among which 11 SNPs were associated with genes encoding enzymes involved in the conversion of substrates that participate in the caffeine biosynthesis pathways. This analysis demonstrated the complex genetic control of this key trait in coffee. © 2018 The Authors. Plant Biotechnology Journal published by Society for Experimental Biology and The Association of Applied Biologists and John Wiley & Sons Ltd.
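For readers unfamiliar with the scaffold N50 statistic quoted above, the following is a minimal sketch of how it is conventionally computed: the length of the scaffold at which the running total, taken from longest to shortest, first reaches half of the assembly length. The function name and the example lengths are illustrative and are not taken from the study.

```python
def n50(lengths):
    """Return the N50 of a collection of scaffold (or contig) lengths."""
    if not lengths:
        raise ValueError("no lengths given")
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        # The first scaffold that carries the running total to half
        # of the assembly length defines the N50.
        if running >= half_total:
            return length

# Toy lengths in bp, not from the arabica assembly.
example_n50 = n50([10, 20, 30, 40])
```

With the toy lengths the total is 100 bp, and the running sum 40 + 30 = 70 first reaches 50, so the N50 is 30.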
Molecular detection methods, such as quantitative PCR (qPCR), have found their way into clinical microbiology laboratories for the detection of an array of pathogens. Most routinely used methods, however, are directed at specific species. Thus, anything that is not explicitly searched for will be missed. This greatly limits the flexibility and universal application of these techniques. We investigated the application of a rapid universal bacterial molecular identification method, IS-pro, to routine patient samples received in a clinical microbiology laboratory. IS-pro is a eubacterial technique based on the detection and categorization of 16S-23S rRNA gene interspace regions with lengths that are specific for each microbial species. As this is an open technique, clinicians do not need to decide in advance what to look for. We compared routine culture to IS-pro using 66 samples sent in for routine bacterial diagnostic testing. The samples were obtained from patients with infections in normally sterile sites (without a resident microbiota). The results were identical in 20 (30%) samples, IS-pro detected more bacterial species than culture in 31 (47%) samples, and five of the 10 culture-negative samples were positive with IS-pro. The case histories of the five patients from whom these culture-negative/IS-pro-positive samples were obtained suggest that the IS-pro findings are highly clinically relevant. Our findings indicate that an open molecular approach, such as IS-pro, may have a high added value for clinical practice. Copyright © 2016, American Society for Microbiology. All Rights Reserved.
Acquisition of genes through horizontal gene transfer (HGT) allows microbes to rapidly gain new capabilities and adapt to new or changing environments. Identifying widespread HGT regions within multispecies microbiomes can pinpoint the molecular mechanisms that play key roles in microbiome assembly. We sought to identify horizontally transferred genes within a model microbiome, the cheese rind. Comparing 31 newly sequenced and 134 previously sequenced bacterial isolates from cheese rinds, we identified over 200 putative horizontally transferred genomic regions containing 4733 protein coding genes. The largest of these regions are enriched for genes involved in siderophore acquisition, and are widely distributed in cheese rinds in both Europe and the US. These results suggest that HGT is prevalent in cheese rind microbiomes, and that identification of genes that are frequently transferred in a particular environment may provide insight into the selective forces shaping microbial communities.
The genomes of two fungi isolated from soil (MEA-2) and sediment (SUP5-1) were sequenced. Both were members of the order Hypocreales, closely related to Tolypocladium inflatum, and capable of producing novel secondary metabolites. The draft genomes enabled the characterization of key biosynthetic pathways. Copyright © 2015 Stamps et al.
Caught in the middle with multiple displacement amplification: the myth of pooling for avoiding multiple displacement amplification bias in a metagenome.
Shotgun metagenomics has become an important tool for investigating the ecology of microorganisms. Underlying these investigations is the assumption that metagenome sequence data accurately estimates the census of microbial populations. Multiple displacement amplification (MDA) of microbial community DNA is often used in cases where it is difficult to obtain enough DNA for sequencing; however, MDA can result in amplification biases that may impact subsequent estimates of population census from metagenome data. Some have posited that pooling replicate MDA reactions negates these biases and restores the accuracy of population analyses. This assumption has not been empirically tested. Using mock viral communities, we examined the influence of pooling on population-scale analyses. In pooled and single reaction MDA treatments, sequence coverage of viral populations was highly variable and coverage patterns across viral genomes were nearly identical, indicating that initial priming biases were reproducible and that pooling did not alleviate biases. In contrast, control unamplified sequence libraries showed relatively even coverage across phage genomes. MDA should be avoided for metagenomic investigations that require quantitative estimates of microbial taxa and gene functional groups. While MDA is an indispensable technique in applications such as single-cell genomics, amplification biases cannot be overcome by combining replicate MDA reactions. Alternative library preparation techniques should be utilized for quantitative microbial ecology studies utilizing metagenomic sequencing approaches.
To obtain intact and full-length RNA transcripts of onion (Allium cepa), long-read sequencing technology was first applied. Total RNAs extracted from four tissues (flowers, leaves, bulbs and roots) of red–purple and yellow-colored onions (A. cepa) were sequenced using long-read sequencing (RSII platform, P4-C2 chemistry). The 99,247 polished high-quality isoforms were produced by sequence correction processes of consensus calling, quality filtering, orientation verification, misread-nucleotide correction and dot-matrix view. The dot-matrix view was subsequently used to identify artificial inverted repeats (IRs), and 421 IRs were removed as a result. The remaining 98,826 isoforms were condensed to 35,505 by removing redundant isoforms. To assess the completeness of the 35,505 isoforms, the ratio of full-length isoforms, short-read mapping to the isoforms, and differentially expressed genes among the four tissues were analyzed along with the gene ontology across the tissues. As a result, the 35,505 isoforms were verified as a collection of isoforms with high completeness, and designated as draft reference transcripts (DRTs, ver 1.0) constructed by long-read sequencing.
De novo assembly and characterizing of the culm-derived meta-transcriptome from the polyploid sugarcane genome based on coding transcripts
Sugarcane biomass has been used for sugar, bioenergy and biomaterial production. The majority of the sugarcane biomass comes from the culm, which makes it important to understand the genetic control of biomass production in this part of the plant. A meta-transcriptome of the culm was obtained in an earlier study by using about one billion paired-end (150 bp) reads of deep RNA sequencing of samples from 20 diverse sugarcane genotypes and combining de novo assemblies from different assemblers and different settings. Although many genes could be recovered, this resulted in a large combined assembly which created the need for clustering to reduce transcript redundancy while maintaining gene content. Here, we present a comprehensive analysis of the effect of different assembly settings and clustering methods on de novo assembly, annotation and transcript profiling focusing especially on the coding transcripts from the highly polyploid sugarcane genome. The new coding sequence-based transcript clustering resulted in a better representation of transcripts compared to the earlier approach, having 121,987 contigs, which included 78,052 main and 43,935 alternative transcripts. About 73%, 67%, 61% and 10% of the transcriptome was annotated against the NCBI NR protein database, GO terms, orthologous groups and KEGG orthologies, respectively. Using this set for a differential gene expression analysis between the young and mature sugarcane culm tissues, a total of 822 transcripts were found to be differentially expressed, including key transcripts involved in sugar/fiber accumulation in sugarcane. In the context of the lack of a whole genome sequence for sugarcane, the availability of a well annotated culm-derived meta-transcriptome through deep sequencing provides useful information on coding genes specific to the sugarcane culm and will certainly contribute to understanding the process of carbon partitioning, and biomass accumulation in the sugarcane culm.
Whole genome sequencing of “Faecalibaculum rodentium” ALO17, isolated from C57BL/6J laboratory mouse feces.
Intestinal microorganisms affect host physiology, including ageing. Given the difficulty in controlling for human studies of the gut microbiome, mouse models provide an alternative avenue to study such relationships. In this study, we report on the complete genome of “Faecalibaculum rodentium” ALO17, a bacterium that was isolated from the faeces of a 9-month-old female C57BL/6J mouse. This strain will be utilized in future in vivo studies detailing the relationships between the gut microbiome and ageing. The whole genome sequence of “F. rodentium” ALO17 was obtained using single-molecule, real-time (SMRT) technique on a PacBio instrument. The assembled genome consisted of 2,542,486 base pairs of double-stranded DNA with a GC content of 54.0 % and no plasmids. The genome was predicted to contain 2794 open reading frames, 55 tRNA genes, and 38 rRNA genes. The 16S rRNA gene of ALO17 was 86.9 % similar to that of Allobaculum stercoricanis DSM 13633(T), and the average overall nucleotide identity between strains ALO17 and DSM 13633(T) was 66.8 %. After confirming the phylogenetic relationship between “F. rodentium” ALO17 and A. stercoricanis DSM 13633(T), their whole genome sequences were compared, revealing that “F. rodentium” ALO17 contains more fermentation-related genes than A. stercoricanis DSM 13633(T). Furthermore, “F. rodentium” ALO17 produces higher levels of lactic acid than A. stercoricanis DSM 13633(T) as determined by high-performance liquid chromatography. The availability of the “F. rodentium” ALO17 whole genome sequence will enhance studies concerning the gut microbiota and host physiology, especially when investigating the molecular relationships between gut microbiota and ageing.
RNAi-based treatment of chronically infected patients and chimpanzees reveals that integrated hepatitis B virus DNA is a source of HBsAg.
Chronic hepatitis B virus (HBV) infection is a major health concern worldwide, frequently leading to liver cirrhosis, liver failure, and hepatocellular carcinoma. Evidence suggests that high viral antigen load may play a role in chronicity. Production of viral proteins is thought to depend on transcription of viral covalently closed circular DNA (cccDNA). In a human clinical trial with an RNA interference (RNAi)-based therapeutic targeting HBV transcripts, ARC-520, HBV S antigen (HBsAg) was strongly reduced in treatment-naïve patients positive for HBV e antigen (HBeAg) but was reduced significantly less in patients who were HBeAg-negative or had received long-term therapy with nucleos(t)ide viral replication inhibitors (NUCs). HBeAg positivity is associated with greater disease risk that may be moderately reduced upon HBeAg loss. The molecular basis for this unexpected differential response was investigated in chimpanzees chronically infected with HBV. Several lines of evidence demonstrated that HBsAg was expressed not only from the episomal cccDNA minichromosome but also from transcripts arising from HBV DNA integrated into the host genome, which was the dominant source in HBeAg-negative chimpanzees. Many of the integrants detected in chimpanzees lacked target sites for the small interfering RNAs in ARC-520, explaining the reduced response in HBeAg-negative chimpanzees and, by extension, in HBeAg-negative patients. Our results uncover a heretofore underrecognized source of HBsAg that may represent a strategy adopted by HBV to maintain chronicity in the presence of host immunosurveillance. These results could alter trial design and endpoint expectations of new therapies for chronic HBV. Copyright © 2017 The Authors, some rights reserved; exclusive licensee American Association for the Advancement of Science. No claim to original U.S. Government Works.
Comparative genomic analysis of Sulfurospirillum cavolei MES reconstructed from the metagenome of an electrosynthetic microbiome.
Sulfurospirillum spp. play an important role in sulfur and nitrogen cycling, and contain metabolic versatility that enables reduction of a wide range of electron acceptors, including thiosulfate, tetrathionate, polysulfide, nitrate, and nitrite. Here we describe the assembly of a Sulfurospirillum genome obtained from the metagenome of an electrosynthetic microbiome. The ubiquity and persistence of this organism in microbial electrosynthesis systems suggest it plays an important role in reactor stability and performance. Understanding why this organism is present and elucidating its genetic repertoire provide a genomic and ecological foundation for future studies where Sulfurospirillum are found, especially in electrode-associated communities. Metabolic comparisons and in-depth analysis of unique genes revealed potential ecological niche-specific capabilities within the Sulfurospirillum genus. The functional similarities common to all genomes, i.e., core genome, and unique gene clusters found only in a single genome were identified. Based upon 16S rRNA gene phylogenetic analysis and average nucleotide identity, the Sulfurospirillum draft genome was found to be most closely related to Sulfurospirillum cavolei. Characterization of the draft genome described herein provides pathway-specific details of the metabolic significance of the newly described Sulfurospirillum cavolei MES and, importantly, yields insight into the ecology of the genus as a whole. Comparison of eleven sequenced Sulfurospirillum genomes revealed a total of 6246 gene clusters in the pan-genome. Of the total gene clusters, 18.5% were shared among all eleven genomes and 50% were unique to a single genome. While most Sulfurospirillum spp. reduce nitrate to ammonium, five of the eleven Sulfurospirillum strains encode a nitrous oxide reductase (nos) cluster with an atypical nitrous-oxide reductase, suggesting a utility for this genus in the reduction of nitrous oxide, and as a potential sink for this potent greenhouse gas.
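The core and unique fractions of a pan-genome, as reported above, can be derived from a presence/absence table of gene clusters. The sketch below shows one simple way to do so. It assumes clusters have already been computed by some orthology tool, and the function name, input layout, and toy data are hypothetical rather than taken from the study.

```python
from collections import Counter

def summarize_pangenome(cluster_members, n_genomes):
    """Classify gene clusters as core, accessory, or unique.

    cluster_members: dict mapping a cluster id to the set of genome
    names in which that cluster occurs (assumed input layout).
    Returns {category: (count, percent of all clusters)}.
    """
    if not cluster_members:
        raise ValueError("no clusters given")
    counts = Counter()
    for genomes in cluster_members.values():
        if len(genomes) == n_genomes:
            counts["core"] += 1        # present in every genome
        elif len(genomes) == 1:
            counts["unique"] += 1      # present in exactly one genome
        else:
            counts["accessory"] += 1   # present in some but not all
    total = len(cluster_members)
    return {
        category: (counts[category], 100 * counts[category] / total)
        for category in ("core", "accessory", "unique")
    }

# Toy input with three genomes A, B, C; not data from the study.
toy_clusters = {
    "c1": {"A", "B", "C"},
    "c2": {"A"},
    "c3": {"A", "B"},
    "c4": {"C"},
}
summary = summarize_pangenome(toy_clusters, n_genomes=3)
```

In the toy input one of four clusters is core (25%) and two are unique (50%), the same kind of breakdown as the 18.5% shared and 50% unique figures reported for the eleven genomes.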
Piezo2 is a mechanically activated ion channel required for touch discrimination, vibration detection, and proprioception. Here, we discovered that Piezo2 is extensively spliced, producing different Piezo2 isoforms with distinct properties. Sensory neurons from both mice and humans express a large repertoire of Piezo2 variants, whereas non-neuronal tissues express predominantly a single isoform. Notably, even within sensory ganglia, we demonstrate the splicing of Piezo2 to be cell type specific. Biophysical characterization revealed substantial differences in ion permeability, sensitivity to calcium modulation, and inactivation kinetics among Piezo2 splice variants. Together, our results describe, at the molecular level, a potential mechanism by which transduction is tuned, permitting the detection of a variety of mechanosensory stimuli. Published by Elsevier Inc.
A high-resolution genetic map of the cereal crown rot pathogen Fusarium pseudograminearum provides a near-complete genome assembly.
Fusarium pseudograminearum is an important pathogen of wheat and barley, particularly in semi-arid environments. Previous genome assemblies for this organism were based entirely on short read data and are highly fragmented. In this work, a genetic map of F. pseudograminearum has been constructed for the first time based on a mapping population of 178 individuals. The genetic map, together with long read scaffolding of a short read-based genome assembly, was used to give a near-complete assembly of the four F. pseudograminearum chromosomes. Large regions of synteny between F. pseudograminearum and F. graminearum, the related pathogen that is the primary causal agent of cereal head blight disease, were previously proposed in the core conserved genome, but the construction of a genetic map to order and orient contigs is critical to the validation of synteny and the placing of species-specific regions. Indeed, our comparative analyses of the genomes of these two related pathogens suggest that rearrangements in the F. pseudograminearum genome have occurred in the chromosome ends. One of these rearrangements includes the transposition of an entire gene cluster involved in the detoxification of the benzoxazolinone (BOA) class of plant phytoalexins. This work provides an important genomic and genetic resource for F. pseudograminearum, which is less well characterized than F. graminearum. In addition, this study provides new insights into a better understanding of the sexual reproduction process in F. pseudograminearum, which informs us of the potential of this pathogen to evolve. © 2016 BSPP AND JOHN WILEY & SONS LTD.
Complete genome sequence of Geobacillus thermodenitrificans T12, a potential host for biotechnological applications.
In an attempt to obtain a thermophilic host for the conversion of lignocellulose-derived substrates into lactic acid, Geobacillus thermodenitrificans T12 was isolated from a compost heap. It was selected from over 500 isolates as a genetically tractable hemicellulolytic lactic acid producer requiring few nutrients. The strain is able to ferment glucose and xylose simultaneously and can produce lactic acid from xylan, making it a potential host for biotechnological applications. The genome of strain T12 consists of a 3.64 Mb chromosome and two plasmids of 59 and 56 kb. It has a total of 3,676 genes with an average genomic GC content of 48.7%. The T12 genome encodes a denitrification pathway, allowing for anaerobic respiration. The identity and localization of the responsible genes are similar to those of the denitrification pathways found in strain NG80-2. The hemicellulose utilization (HUS) locus was identified based on sequence homology against G. stearothermophilus T-6. It appeared that T12 has all the genes that are present in strain T-6 except for the arabinan degradation cluster. Instead, the HUS locus of strain T12 contains genes for both an inositol and a pectate degradation pathway. Strain T12 has complete pathways for the synthesis of purine and pyrimidine, all 20 amino acids and several vitamins except D-biotin. The host-defense systems present comprise a Type II and a Type III restriction-modification system, as well as a CRISPR-Cas Type II system. It is concluded that G. thermodenitrificans T12 is a potentially interesting candidate for industrial applications.
Identification of a biosynthetic gene cluster for the polyene macrolactam sceliphrolactam in a Streptomyces strain isolated from mangrove sediment.
Streptomyces are a genus of Actinobacteria capable of producing structurally diverse natural products. Here we report the isolation and characterization of a biosynthetically talented Streptomyces (Streptomyces sp. SD85) from tropical mangrove sediments. Whole-genome sequencing revealed that Streptomyces sp. SD85 harbors at least 52 biosynthetic gene clusters (BGCs), which constitute 21.2% of the 8.6-Mb genome. When cultivated under lab conditions, Streptomyces sp. SD85 produces sceliphrolactam, a 26-membered polyene macrolactam with unknown biosynthetic origin. Genome mining yielded a putative sceliphrolactam BGC (sce) that encodes a type I modular polyketide synthase (PKS) system, several β-amino acid starter biosynthetic enzymes, transporters, and transcriptional regulators. Using the CRISPR/Cas9-based gene knockout method, we demonstrated that the sce BGC is essential for sceliphrolactam biosynthesis. Unexpectedly, the PKS system encoded by sce is short of one module required for assembling the 26-membered macrolactam skeleton according to the collinearity rule. With experimental data disfavoring the involvement of a trans-PKS module, the biosynthesis of sceliphrolactam seems to be best rationalized by invoking a mechanism whereby the PKS system employs an iterative module to catalyze two successive chain extensions with different outcomes. The potential violation of the collinearity rule makes the mechanism distinct from those of other polyene macrolactams.