September 22, 2019  |  

Analysis of microbial community structure of pit mud for Chinese strong-flavor liquor fermentation using next generation DNA sequencing of full-length 16S rRNA

The pit is the necessary bioreactor for the brewing process of Chinese strong-flavor liquor. Pit mud contains a large number of microorganisms and is a complex ecosystem. Analysis of the bacterial flora in pit mud is of great significance for understanding liquor fermentation mechanisms. To overcome the taxonomic limitations of short reads in 16S rRNA variable-region sequencing, we used high-throughput DNA sequencing of the near full-length 16S rRNA gene to analyze the microbial composition of different types of pit mud that produce different qualities of strong-flavor liquor. The results showed that the main species in pit mud were Pseudomonas extremaustralis 14-3, Pseudomonas veronii, Serratia marcescens WW4, and Clostridium leptum in Ruminiclostridium. The microbial diversity of pit mud differed significantly with quality. From poor to good quality of pit mud (and thus of liquor), the relative abundances of Ruminiclostridium and Syntrophomonas in Firmicutes increased, and the relative abundance of Olsenella in Actinobacteria also increased, but the relative abundances of Pseudomonas and Serratia in Proteobacteria decreased. Surprisingly, the diversity of the intermediate-quality pit mud N was the lowest, and the diversity levels of high-quality pit mud G and poor-quality pit mud B were similar. Correlation analysis showed high positive correlations (r > 0.8) among different microbial groups in the flora. Based on the analysis of the microbial structures of pit mud of different quality, the good-quality pit mud has a higher microbial diversity, but how this higher diversity and the differential microbial composition contribute to better liquor fermentation remains obscure.


September 22, 2019  |  

Long-read sequencing revealed an extensive transcript complexity in herpesviruses.

Long-read sequencing (LRS) techniques are very recent advancements, but they have already been used for transcriptome research in all of the three subfamilies of herpesviruses. These techniques have multiplied the number of known transcripts in each of the examined viruses. Meanwhile, they have revealed a so far hidden complexity of the herpesvirus transcriptome with the discovery of a large number of novel RNA molecules, including coding and non-coding RNAs, as well as transcript isoforms, and polycistronic RNAs. Additionally, LRS techniques have uncovered an intricate meshwork of transcriptional overlaps between adjacent and distally located genes. Here, we review the contribution of LRS to herpesvirus transcriptomics and present the complexity revealed by this technology, while also discussing the functional significance of this phenomenon.


September 22, 2019  |  

CRISPR/Cas9 deletions in a conserved exon of Distal-less generates gains and losses in a recently acquired morphological novelty in flies.

Distal-less has been repeatedly co-opted for the development of many novel traits. Here, we document its curious role in the development of a novel abdominal appendage (“sternite brushes”) in sepsid flies. CRISPR/Cas9 deletions in the homeodomain result in losses of sternite brushes, demonstrating that Distal-less is necessary for their development. However, deletions in the upstream coding exon (Exon 2) produce losses or gains of brushes. A dissection of Exon 2 reveals that the likely mechanism for gains involves a deletion in an exon-splicing enhancer site that leads to exon skipping. Such contradictory phenotypes are also observed in butterflies, suggesting that mutations in the conserved upstream regions have the potential to generate phenotypic variability in insects that diverged 300 million years ago. Our results demonstrate the importance of Distal-less for the development of a novel abdominal appendage in insects and highlight how site-specific mutations in the same exon can produce contradictory phenotypes. Copyright © 2018 The Authors. Published by Elsevier Inc. All rights reserved.


September 22, 2019  |  

Comprehensive genomic analysis of malignant pleural mesothelioma identifies recurrent mutations, gene fusions and splicing alterations.

We analyzed transcriptomes (n = 211), whole exomes (n = 99) and targeted exomes (n = 103) from 216 malignant pleural mesothelioma (MPM) tumors. Using RNA-seq data, we identified four distinct molecular subtypes: sarcomatoid, epithelioid, biphasic-epithelioid (biphasic-E) and biphasic-sarcomatoid (biphasic-S). Through exome analysis, we found BAP1, NF2, TP53, SETD2, DDX3X, ULK2, RYR2, CFAP45, SETDB1 and DDX51 to be significantly mutated (q-score ≥ 0.8) in MPMs. We identified recurrent mutations in several genes, including SF3B1 (~2%; 4/216) and TRAF7 (~2%; 5/216). SF3B1-mutant samples showed a splicing profile distinct from that of wild-type tumors. TRAF7 alterations occurred primarily in the WD40 domain and were, except in one case, mutually exclusive with NF2 alterations. We found recurrent gene fusions and splice alterations to be frequent mechanisms for inactivation of NF2, BAP1 and SETD2. Through integrated analyses, we identified alterations in Hippo, mTOR, histone methylation, RNA helicase and p53 signaling pathways in MPMs.


September 22, 2019  |  

Neural circular RNAs are derived from synaptic genes and regulated by development and plasticity.

Circular RNAs (circRNAs) have re-emerged as an interesting RNA species. Using deep RNA profiling in different mouse tissues, we observed that circRNAs were substantially enriched in brain and a disproportionate fraction of them were derived from host genes that encode synaptic proteins. Moreover, on the basis of separate profiling of the RNAs localized in neuronal cell bodies and neuropil, circRNAs were, on average, more enriched in the neuropil than their host gene mRNA isoforms. Using high-resolution in situ hybridization, we visualized circRNA punctae in the dendrites of neurons. Consistent with the idea that circRNAs might regulate synaptic function during development, many circRNAs changed their abundance abruptly at a time corresponding to synaptogenesis. In addition, following a homeostatic downscaling of neuronal activity many circRNAs exhibited substantial up- or downregulation. Together, our data indicate that brain circRNAs are positioned to respond to and regulate synaptic function.


September 22, 2019  |  

Computational analysis of alternative splicing in plant genomes.

Computational analyses play crucial roles in characterizing splicing isoforms in plant genomes. In this review, we provide a survey of computational tools used in recently published, genome-scale splicing analyses in plants. We summarize the commonly used software and pipelines for read mapping, isoform reconstruction, isoform quantification, and differential expression analysis. We also discuss methods for analyzing long reads and the strategies to combine long and short reads in identifying splicing isoforms. We review several tools for characterizing local splicing events, splicing graphs, and coding potential, and for visualizing splicing isoforms. We further discuss the procedures for identifying conserved splicing isoforms across plant species. Finally, we discuss the outlook for integrating other genomic data with splicing analyses to identify regulatory mechanisms of alternative splicing (AS) on a genome-wide scale. Copyright © 2018 Elsevier B.V. All rights reserved.


September 22, 2019  |  

Revealing missing human protein isoforms based on Ab initio prediction, RNA-seq and proteomics.

Biological and biomedical research relies on a comprehensive understanding of protein-coding transcripts. However, the total number of human proteins is still unknown due to the prevalence of alternative splicing. In this paper, we detected 31,566 novel transcripts with coding potential by filtering our ab initio predictions with 50 RNA-seq datasets from diverse tissues/cell lines. PCR followed by MiSeq sequencing showed that at least 84.1% of these predicted novel splice sites could be validated. In contrast to known transcripts, the expression of these novel transcripts was highly tissue-specific. Based on these novel transcripts, at least 36 novel proteins were detected from shotgun proteomics data of 41 breast samples. We also showed that L1 retrotransposons have a more significant impact on the origin of new transcripts/genes than previously thought. Furthermore, we found that alternative splicing is extraordinarily widespread for genes involved in specific biological functions like protein binding, nucleoside binding, neuron projection, membrane organization and cell adhesion. In the end, the total number of human transcripts with protein-coding potential was estimated to be at least 204,950.


September 22, 2019  |  

SQANTI: extensive characterization of long-read transcript sequences for quality control in full-length transcriptome identification and quantification.

High-throughput sequencing of full-length transcripts using long reads has paved the way for the discovery of thousands of novel transcripts, even in well-annotated mammalian species. The advances in sequencing technology have created a need for studies and tools that can characterize these novel variants. Here, we present SQANTI, an automated pipeline for the classification of long-read transcripts that can assess the quality of data and the preprocessing pipeline using 47 unique descriptors. We apply SQANTI to a neuronal mouse transcriptome using Pacific Biosciences (PacBio) long reads and illustrate how the tool is effective in characterizing and describing the composition of the full-length transcriptome. We perform extensive evaluation of ToFU PacBio transcripts by PCR to reveal that an important number of the novel transcripts are technical artifacts of the sequencing approach and that SQANTI quality descriptors can be used to engineer a filtering strategy to remove them. Most novel transcripts in this curated transcriptome are novel combinations of existing splice sites, resulting more frequently in novel ORFs than novel UTRs, and are enriched in both general metabolic and neural-specific functions. We show that these new transcripts have a major impact in the correct quantification of transcript levels by state-of-the-art short-read-based quantification algorithms. By comparing our iso-transcriptome with public proteomics databases, we find that alternative isoforms are elusive to proteogenomics detection. SQANTI allows the user to maximize the analytical outcome of long-read technologies by providing the tools to deliver quality-evaluated and curated full-length transcriptomes. © 2018 Tardaguila et al.; Published by Cold Spring Harbor Laboratory Press.


September 22, 2019  |  

Extensive allele-specific translational regulation in hybrid mice.

Translational regulation is mediated through the interaction between diffusible trans-factors and cis-elements residing within mRNA transcripts. In contrast to extensively studied transcriptional regulation, cis-regulation on translation remains underexplored. Using deep sequencing-based transcriptome and polysome profiling, we globally profiled allele-specific translational efficiency for the first time in an F1 hybrid mouse. Out of 7,156 genes with reliable quantification of both alleles, we found 1,008 (14.1%) exhibiting significant allelic divergence in translational efficiency. Systematic analysis of sequence features of the genes with biased allelic translation revealed that local RNA secondary structure surrounding the start codon and proximal out-of-frame upstream AUGs could affect translational efficiency. Finally, we observed that the cis-effect was quantitatively comparable between transcriptional and translational regulation. Such effects in the two regulatory processes were more frequently compensatory, suggesting that the regulation at the two levels could be coordinated in maintaining robustness of protein expression. © 2015 The Authors. Published under the terms of the CC BY 4.0 license.


September 22, 2019  |  

Proteogenomic analysis reveals alternative splicing and translation as part of the abscisic acid response in Arabidopsis seedlings.

In eukaryotes, mechanisms such as alternative splicing (AS) and alternative translation initiation (ATI) contribute to organismal protein diversity. Specifically, splicing factors play crucial roles in responses to environmental and developmental cues; however, the underlying mechanisms are not well investigated in plants. Here, we report the parallel employment of short-read RNA sequencing, single-molecule long-read sequencing and proteomic identification to unravel AS isoforms and previously unannotated proteins in response to abscisic acid (ABA) treatment. Combining the data from the two sequencing methods, approximately 83.4% of intron-containing genes were alternatively spliced. Two AS types, which are referred to as alternative first exon (AFE) and alternative last exon (ALE), were more abundant than intron retention (IR); however, in contrast to AS events detected under normal conditions, differentially expressed AS isoforms were more likely to be translated. ABA extensively affects the AS pattern, as indicated by the increasing number of non-conventional splicing sites. This work also identified thousands of unannotated peptides and proteins by ATI based on mass spectrometry and a virtual peptide library deduced from both strands of coding regions within the Arabidopsis genome. The results enhance our understanding of AS and alternative translation mechanisms under normal conditions, and in response to ABA treatment. © 2017 The Authors. The Plant Journal © 2017 John Wiley & Sons Ltd.


September 22, 2019  |  

Next generation sequencing technology: Advances and applications.

Impressive progress has been made in the field of Next Generation Sequencing (NGS). Through advancements in the fields of molecular biology and technical engineering, parallelization of the sequencing reaction has profoundly increased the total number of sequence reads produced per run. Current sequencing platforms allow for an unprecedented view into complex mixtures of RNA and DNA samples. NGS is currently evolving into a molecular microscope finding its way into virtually every field of biomedical research. In this chapter we review the technical background of the different commercially available NGS platforms with respect to template generation and the sequencing reaction and take a small step towards what the upcoming NGS technologies will bring. We close with an overview of different implementations of NGS into biomedical research. This article is part of a Special Issue entitled: From Genome to Function. Copyright © 2014 Elsevier B.V. All rights reserved.


September 22, 2019  |  

Characterization of the human ESC transcriptome by hybrid sequencing.

Although transcriptional and posttranscriptional events are detected in RNA-Seq data from second-generation sequencing, full-length mRNA isoforms are not captured. On the other hand, third-generation sequencing, which yields much longer reads, has current limitations of lower raw accuracy and throughput. Here, we combine second-generation sequencing and third-generation sequencing with a custom-designed method for isoform identification and quantification to generate a high-confidence isoform dataset for human embryonic stem cells (hESCs). We report 8,084 RefSeq-annotated isoforms detected as full-length and an additional 5,459 isoforms predicted through statistical inference. Over one-third of these are novel isoforms, including 273 RNAs from gene loci that have not previously been identified. Further characterization of the novel loci indicates that a subset is expressed in pluripotent cells but not in diverse fetal and adult tissues; moreover, their reduced expression perturbs the network of pluripotency-associated genes. Results suggest that gene identification, even in well-characterized human cell lines and tissues, is likely far from complete.


September 22, 2019  |  

Long-read sequencing of nascent RNA reveals coupling among RNA processing events.

Pre-mRNA splicing is accomplished by the spliceosome, a megadalton complex that assembles de novo on each intron. Because spliceosome assembly and catalysis occur cotranscriptionally, we hypothesized that introns are removed in the order of their transcription in genomes dominated by constitutive splicing. Remarkably little is known about splicing order and the regulatory potential of nascent transcript remodeling by splicing, due to the limitations of existing methods that focus on analysis of mature splicing products (mRNAs) rather than substrates and intermediates. Here, we overcome this obstacle through long-read RNA sequencing of nascent, multi-intron transcripts in the fission yeast Schizosaccharomyces pombe. Most multi-intron transcripts were fully spliced, consistent with rapid cotranscriptional splicing. However, an unexpectedly high proportion of transcripts were either fully spliced or fully unspliced, suggesting that splicing of any given intron is dependent on the splicing status of other introns in the transcript. Supporting this, mild inhibition of splicing by a temperature-sensitive mutation in prp2, the homolog of vertebrate U2AF65, increased the frequency of fully unspliced transcripts. Importantly, fully unspliced transcripts displayed transcriptional read-through at the polyA site and were degraded cotranscriptionally by the nuclear exosome. Finally, we show that cellular mRNA levels were reduced in genes with a high number of unspliced nascent transcripts during caffeine treatment, showing regulatory significance of cotranscriptional splicing. Therefore, overall splicing of individual nascent transcripts, 3′ end formation, and mRNA half-life depend on the splicing status of neighboring introns, suggesting crosstalk among spliceosomes and the polyA cleavage machinery during transcription elongation. © 2018 Herzel et al.; Published by Cold Spring Harbor Laboratory Press.


September 22, 2019  |  

Bypassing the Restriction System To Improve Transformation of Staphylococcus epidermidis.

Staphylococcus epidermidis is the leading cause of infections on indwelling medical devices worldwide. Intrinsic antibiotic resistance and vigorous biofilm production have rendered these infections difficult to treat and, in some cases, require the removal of the offending medical prosthesis. With the exception of two widely passaged isolates, RP62A and 1457, the pathogenesis of infections caused by clinical S. epidermidis strains is poorly understood due to the strong genetic barrier that precludes the efficient transformation of foreign DNA into clinical isolates. The difficulty in transforming clinical S. epidermidis isolates is primarily due to the type I and IV restriction-modification systems, which act as genetic barriers. Here, we show that efficient plasmid transformation of clinical S. epidermidis isolates from clonal complexes 2, 10, and 89 can be realized by employing a plasmid artificial modification (PAM) in Escherichia coli DC10B containing a Δdcm mutation. This transformative technique should facilitate our ability to genetically modify clinical isolates of S. epidermidis and hence improve our understanding of their pathogenesis in human infections. IMPORTANCE: Staphylococcus epidermidis is a source of considerable morbidity worldwide. The underlying mechanisms contributing to the commensal and pathogenic lifestyles of S. epidermidis are poorly understood. Genetic manipulations of clinically relevant strains of S. epidermidis are largely prohibited due to the presence of a strong restriction barrier. With the introduction of the tools presented here, genetic manipulation of clinically relevant S. epidermidis isolates has now become possible, thus improving our understanding of S. epidermidis as a pathogen. Copyright © 2017 American Society for Microbiology.


September 22, 2019  |  

A survey of the complex transcriptome from the highly polyploid sugarcane genome using full-length isoform sequencing and de novo assembly from short read sequencing.

Despite the economic importance of sugarcane in sugar and bioenergy production, there is not yet a reference genome available. Most of the sugarcane transcriptomic studies have been based on Saccharum officinarum gene indices (SoGI), expressed sequence tags (ESTs) and de novo assembled transcript contigs from short-reads; hence knowledge of the sugarcane transcriptome is limited in relation to transcript length and number of transcript isoforms. The sugarcane transcriptome was sequenced using PacBio isoform sequencing (Iso-Seq) of a pooled RNA sample derived from leaf, internode and root tissues, of different developmental stages, from 22 varieties, to explore the potential for capturing full-length transcript isoforms. A total of 107,598 unique transcript isoforms were obtained, representing about 71% of the total number of predicted sugarcane genes. The majority of this dataset (92%) matched the plant protein database, while just over 2% was novel transcripts, and over 2% was putative long non-coding RNAs. About 56% and 23% of total sequences were annotated against the gene ontology and KEGG pathway databases, respectively. Comparison with de novo contigs from Illumina RNA-Sequencing (RNA-Seq) of the internode samples from the same experiment and public databases showed that the Iso-Seq method recovered more full-length transcript isoforms, had a higher N50 and average length of largest 1,000 proteins; whereas a greater representation of the gene content and RNA diversity was captured in RNA-Seq. Only 62% of PacBio transcript isoforms matched 67% of de novo contigs, while the non-matched proportions were attributed to the inclusion of leaf/root tissues and the normalization in PacBio, and the representation of more gene content and RNA classes in the de novo assembly, respectively. About 69% of PacBio transcript isoforms and 41% of de novo contigs aligned with the sorghum genome, indicating the high conservation of orthologs in the genic regions of the two genomes. The transcriptome dataset should contribute to improved sugarcane gene models and sugarcane protein predictions; and will serve as a reference database for analysis of transcript expression in sugarcane.

