September 22, 2019

RNA sequencing (RNA-Seq) reveals extremely low levels of reticulocyte-derived globin gene transcripts in peripheral blood from horses (Equus caballus) and cattle (Bos taurus).

RNA-seq has emerged as an important technology for measuring gene expression in peripheral blood samples collected from humans and other vertebrate species. In particular, transcriptomics analyses of whole blood can be used to study immunobiology and develop novel biomarkers of infectious disease. However, an obstacle to these methods in many mammalian species is the presence of reticulocyte-derived globin mRNAs in large quantities, which can complicate RNA-seq library sequencing and impede detection of other mRNA transcripts. A range of supplementary procedures for targeted depletion of globin transcripts have, therefore, been developed to alleviate this problem. Here, we use comparative analyses of RNA-seq data sets generated from human, porcine, equine, and bovine peripheral blood to systematically assess the impact of globin mRNA on routine transcriptome profiling of whole blood in cattle and horses. The results of these analyses demonstrate that total RNA isolated from equine and bovine peripheral blood contains very low levels of globin mRNA transcripts, thereby negating the need for globin depletion and greatly simplifying blood-based transcriptomic studies in these two domestic species.


September 22, 2019

The Epstein-Barr virus miR-BHRF1 microRNAs regulate viral gene expression in cis.

The Epstein-Barr virus (EBV) miR-BHRF1 microRNA (miRNA) cluster has been shown to facilitate B-cell transformation and promote the rapid growth of the resultant lymphoblastoid cell lines (LCLs). However, we find that expression of physiological levels of the miR-BHRF1 miRNAs in LCLs transformed with a miR-BHRF1 null mutant (Δ123) fails to increase their growth rate. We demonstrate that the pri-miR-BHRF1-2 and 1-3 stem-loops are present in the 3′ UTR of transcripts encoding EBNA-LP and that excision of pre-miR-BHRF1-2 and 1-3 by Drosha destabilizes these mRNAs and reduces expression of the encoded protein. Therefore, mutational inactivation of pri-miR-BHRF1-2 and 1-3 in the Δ123 mutant upregulates the expression of not only EBNA-LP but also EBNA-LP-regulated mRNAs and proteins, including LMP1. We hypothesize that this overexpression causes the reduced transformation capacity of the Δ123 EBV mutant. Thus, in addition to regulating cellular mRNAs in trans, miR-BHRF1-2 and 1-3 also regulate EBNA-LP mRNA expression in cis. Copyright © 2017 Elsevier Inc. All rights reserved.


September 22, 2019

SMRT-Cappable-seq reveals complex operon variants in bacteria.

Current methods for genome-wide analysis of gene expression require fragmentation of original transcripts into small fragments for short-read sequencing. In bacteria, the resulting fragmented information hides operon complexity. Additionally, in vivo processing of transcripts confounds the accurate identification of the 5′ and 3′ ends of operons. Here we develop a methodology called SMRT-Cappable-seq that combines the isolation of un-fragmented primary transcripts with single-molecule long read sequencing. Applied to E. coli, this technology results in an accurate definition of the transcriptome with 34% of known operons from RegulonDB being extended by at least one gene. Furthermore, 40% of transcription termination sites have read-through that alters the gene content of the operons. As a result, most of the bacterial genes are present in multiple operon variants reminiscent of eukaryotic splicing. By providing such granularity in the operon structure, this study represents an important resource for the study of prokaryotic gene network and regulation.


September 22, 2019

PRAPI: post-transcriptional regulation analysis pipeline for Iso-Seq.

Single-molecule real-time (SMRT) isoform sequencing (Iso-Seq) based on the Pacific Biosciences (PacBio) platform has received increasing attention for its ability to explore full-length isoforms. Thus, comprehensive tools for Iso-Seq bioinformatics analysis are extremely useful. Here, we present a one-stop solution for Iso-Seq analysis, called PRAPI, to comprehensively analyze alternative transcription initiation (ATI), alternative splicing (AS), alternative cleavage and polyadenylation (APA), natural antisense transcripts (NAT), and circular RNAs (circRNAs). PRAPI is capable of combining Iso-Seq full-length isoforms with short-read data, such as RNA-Seq or polyadenylation site sequencing (PAS-seq), for differential expression analysis of NAT, AS, APA and circRNAs. Furthermore, PRAPI can annotate new genes and correct mis-annotated genes when gene annotation is available. Finally, PRAPI generates high-quality vector graphics to visualize and highlight the Iso-Seq results. The Dockerfile of PRAPI is available at http://www.bioinfor.org/tool/PRAPI. Contact: lfgu@fafu.edu.cn.


September 22, 2019

Global dissection of alternative splicing uncovers transcriptional diversity in tissues and associates with the flavonoid pathway in tea plant (Camellia sinensis).

Alternative splicing (AS) regulates mRNA at the post-transcriptional level to change gene function in organisms. However, little is known about AS and its roles in the tea plant (Camellia sinensis), which is widely cultivated for making the popular beverage tea. In our study, the AS landscape and dynamics were characterized in eight tissues (bud, young leaf, summer mature leaf, winter old leaf, stem, root, flower, fruit) of the tea plant by Illumina RNA-Seq and confirmed by Iso-Seq. The most abundant AS type (~20%) was intron retention, which was involved in RNA processes. Some AS events were found to be tissue specific, for example in stem and root. Thirteen co-expressed modules of AS transcripts were identified, which revealed a similar pattern between the bud and young leaves as well as a distinct pattern between seasons. AS events of structural genes, including anthocyanidin reductase and MYB transcription factors, were involved in the biosynthesis of flavonoids, especially in vegetative tissues. The AS isoforms, rather than the full-length ones, were the major transcripts involved in the flavonoid synthesis pathway, and were positively correlated with the content of catechins, which confer the taste of tea. We propose that AS is an important functional mechanism in regulating flavonoid metabolites. Our study provides insight into the AS events underlying the tea plant's unique developmental processes and highlights the important contribution of alternative splicing regulation to the biosynthesis of flavonoids.


September 22, 2019

Uncovering full-length transcript isoforms of sugarcane cultivar Khon Kaen 3 using single-molecule long-read sequencing.

Sugarcane is an important global food crop and energy resource. To facilitate the sugarcane improvement program, genome and gene information are important for studying traits at the molecular level. Most currently available transcriptome data for sugarcane were generated using second-generation sequencing platforms, which provide short reads. The de novo assembled transcripts from these data are limited in length, and hence may be incomplete and inaccurate, especially for long RNAs. We generated a transcriptome dataset of leaf tissue from a commercial Thai sugarcane cultivar Khon Kaen 3 (KK3) using PacBio RS II single-molecule long-read sequencing by the Iso-Seq method. Short-read RNA-Seq data were generated from the same RNA sample using the Ion Proton platform for reducing base calling errors. A total of 119,339 error-corrected transcripts were generated with an N50 length of 3,611 bp, which is on average longer than any previously reported sugarcane transcriptome dataset. 110,253 sequences (92.4%) contain an open reading frame (ORF) at least 300 bp long, with an ORF N50 of 1,416 bp. The mean lengths of 5′ and 3′ untranslated regions in 73,795 sequences with complete ORFs are 1,249 and 1,187 bp, respectively. 4,774 transcripts are putatively novel full-length transcripts which do not match with a previous Iso-Seq study of sugarcane. We annotated the functions of 68,962 putative full-length transcripts with at least 90% coverage when compared with homologous protein coding sequences in other plants. The new catalog of transcripts will be useful for genome annotation, identification of splicing variants, SNP identification, and other research pertaining to the sugarcane improvement program. The putatively novel transcripts suggest unique features of KK3, although more data from different tissues and stages of development are needed to establish a reference transcriptome of this cultivar.
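
For readers unfamiliar with the N50 statistic quoted in this abstract, the following is a minimal sketch of how it is commonly computed: the length L such that sequences of length L or longer account for at least half of the total bases. The function name `n50` and the toy input are illustrative assumptions; this is not the authors' pipeline, and the study's own transcript lengths are not reproduced here.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths.

    Common definition (an assumption about what the abstract means):
    sort lengths from longest to shortest and return the length at which
    the running total first reaches half of the summed length.
    """
    if not lengths:
        raise ValueError("lengths must not be empty")
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        # Compare 2 * running with total to avoid fractional halves.
        if 2 * running >= total:
            return length


# Toy lengths for illustration only (not data from the study).
toy_lengths = [2, 2, 2, 3, 3, 4, 8, 8]
toy_n50 = n50(toy_lengths)
```

Because N50 weights long sequences heavily, a dataset with a few very long transcripts can report a high N50 even when most transcripts are short, so it is usually read alongside the mean or median length.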


September 22, 2019

Genome-wide transcriptome profiling of the medicinal plant Zanthoxylum planispinum using a single-molecule direct RNA sequencing approach.

High-throughput RNA sequencing has revolutionized transcriptome-based studies of candidate genes, key pathways and gene regulation in non-model organisms. We analyzed full-length cDNA sequences in Zanthoxylum planispinum (Z. planispinum), a medicinal herb found in major parts of East Asia. Full-length mRNAs derived from leaf, early fruit and maturing fruit tissues were sequenced using the PacBio RSII platform to identify the isoform transcriptome. We obtained 51,402 unigenes, with an average length of 1781 bp per gene and a total gene length of 82.473 Mb. Among the 51,402 unigenes, 3963 showed a variety of isoforms. By selecting one representative gene from each group of isoforms, we finalized a set of 46,306 unique genes for this herb. We identified 76 cytochrome P450 (CYP450) genes and related isoforms, which show wide diversity in molecular function and biological process. These transcriptome data of Z. planispinum will provide a good resource to study metabolic engineering for the production of valuable medicinal drugs and phytochemicals. Copyright © 2018. Published by Elsevier Inc.


September 22, 2019

Long-read sequencing of the coffee bean transcriptome reveals the diversity of full-length transcripts.

Polyploidization contributes to the complexity of gene expression, resulting in numerous related but different transcripts. This study explored the transcriptome diversity and complexity of the tetraploid Arabica coffee (Coffea arabica) bean. Long-read sequencing (LRS) by PacBio Isoform sequencing (Iso-seq) was used to obtain full-length transcripts without the difficulty and uncertainty of assembly required for reads from short-read technologies. The tetraploid transcriptome was annotated and compared with data from the sub-genome progenitors. Caffeine and sucrose genes were targeted for case analysis. An isoform-level tetraploid coffee bean reference transcriptome with 95 995 distinct transcripts (average 3236 bp) was obtained. A total of 88 715 sequences (92.42%) were annotated with BLASTx against NCBI non-redundant plant proteins, including 34 719 high-quality annotations. Further BLASTn analysis against NCBI non-redundant nucleotide sequences, Coffea canephora coding sequences with UTR, C. arabica ESTs, and Rfam resulted in 1213 sequences without hits, which are potential novel genes in coffee. Longer UTRs were captured, especially 5′ UTRs, facilitating the identification of upstream open reading frames. The LRS also revealed more and longer transcript variants in key caffeine and sucrose metabolism genes from this polyploid genome. Long sequences (>10 kilobases) were poorly annotated. LRS technology reveals the limitations of previous studies. It provides an important tool to produce a reference transcriptome including more of the diversity of full-length transcripts to help understand the biology and support the genetic improvement of polyploid species such as coffee. © The Authors 2017. Published by Oxford University Press.


September 22, 2019

Characterization of four C1q/TNF-related proteins (CTRPs) from red-lip mullet (Liza haematocheila) and their transcriptional modulation in response to bacterial and pathogen-associated molecular pattern stimuli.

The structural and evolutionary linkage between tumor necrosis factor (TNF) and the globular C1q (gC1q) domain defines the C1q and TNF-related proteins (CTRPs), which are involved in diverse functions such as immune defense, inflammation, apoptosis, autoimmunity, and cell differentiation. In this study, red-lip mullet (Liza haematocheila) CTRP4-like (MuCTRP4-like), CTRP5 (MuCTRP5), CTRP6 (MuCTRP6), and CTRP7 (MuCTRP7) were identified from the red-lip mullet transcriptome database and molecularly characterized. According to in silico analysis, coding sequences of MuCTRP4-like, MuCTRP5, MuCTRP6, and MuCTRP7 consisted of 1128, 753, 729, and 888 bp open reading frames (ORF), respectively and encoded 375, 250, 242, and 295 amino acids, respectively. All CTRPs possessed a putative C1q domain. Additionally, MuCTRP5, MuCTRP6, and MuCTRP7 consisted of a collagen region. Phylogenetic analysis exemplified that MuCTRPs were distinctly clustered with the respective CTRP orthologs. Tissue-specific expression analysis demonstrated that MuCTRP4-like was mostly expressed in the blood and intestine. Moreover, MuCTRP6 was highly expressed in the blood, whereas MuCTRP5 and MuCTRP7 were predominantly expressed in the muscle and stomach, respectively. According to the temporal expression in blood, all MuCTRPs exhibited significant modulations in response to polyinosinic:polycytidylic acid (poly I:C) and Lactococcus garvieae (L. garvieae). MuCTRP4-like, MuCTRP5, and MuCTRP6 showed significant upregulation in response to lipopolysaccharides (LPS). The results of this study suggest the potential involvement of Mullet CTRPs in post-immune responses. Copyright © 2018. Published by Elsevier Ltd.


September 22, 2019

Researches on transcriptome sequencing in the study of traditional Chinese medicine

Due to its distinct advantages, the application of transcriptome sequencing in the study of traditional Chinese medicine is attracting more and more attention from researchers, and has greatly promoted the development of traditional Chinese medicine. In this paper, the applications of transcriptome sequencing in traditional Chinese medicine are summarized by reviewing recent related papers.


September 22, 2019

CLK-dependent exon recognition and conjoined gene formation revealed with a novel small molecule inhibitor.

CDC-like kinase phosphorylation of serine/arginine-rich proteins is central to RNA splicing reactions. Yet, the genomic network of CDC-like kinase-dependent RNA processing events remains poorly defined. Here, we explore the connectivity of genomic CDC-like kinase splicing functions by applying graduated, short-exposure, pharmacological CDC-like kinase inhibition using a novel small molecule (T3) with very high potency, selectivity, and cell-based stability. Using RNA-Seq, we define CDC-like kinase-responsive alternative splicing events, the large majority of which monotonically increase or decrease with increasing CDC-like kinase inhibition. We show that distinct RNA-binding motifs are associated with T3 response in skipped exons. Unexpectedly, we observe dose-dependent conjoined gene transcription, which is associated with motif enrichment in the last and second exons of upstream and downstream partners, respectively. siRNA knockdown of CLK2-associated genes significantly increases conjoined gene formation. Collectively, our results reveal an unexpected role for CDC-like kinase in conjoined gene formation, via regulation of 3′-end processing and associated splicing factors. The phosphorylation of serine/arginine-rich proteins by CDC-like kinase is a central regulatory mechanism for RNA splicing reactions. Here, the authors synthesize a novel small molecule CLK inhibitor and map CLK-responsive alternative splicing events and discover an effect on conjoined gene transcription.


September 22, 2019

Separation and parallel sequencing of the genomes and transcriptomes of single cells using G&T-seq.

Parallel sequencing of a single cell’s genome and transcriptome provides a powerful tool for dissecting genetic variation and its relationship with gene expression. Here we present a detailed protocol for G&T-seq, a method for separation and parallel sequencing of genomic DNA and full-length polyA(+) mRNA from single cells. We provide step-by-step instructions for the isolation and lysis of single cells; the physical separation of polyA(+) mRNA from genomic DNA using a modified oligo-dT bead capture and the respective whole-transcriptome and whole-genome amplifications; and library preparation and sequence analyses of these amplification products. The method allows the detection of thousands of transcripts in parallel with the genetic variants captured by the DNA-seq data from the same single cell. G&T-seq differs from other currently available methods for parallel DNA and RNA sequencing from single cells, as it involves physical separation of the DNA and RNA and does not require bespoke microfluidics platforms. The process can be implemented manually or through automation. When performed manually, paired genome and transcriptome sequencing libraries from eight single cells can be produced in ~3 d by researchers experienced in molecular laboratory work. For users with experience in the programming and operation of liquid-handling robots, paired DNA and RNA libraries from 96 single cells can be produced in the same time frame. Sequence analysis and integration of single-cell G&T-seq DNA and RNA data requires a high level of bioinformatics expertise and familiarity with a wide range of informatics tools.


September 22, 2019

Bayesian nonparametric discovery of isoforms and individual specific quantification.

Most human protein-coding genes can be transcribed into multiple distinct mRNA isoforms. These alternative splicing patterns encourage molecular diversity, and dysregulation of isoform expression plays an important role in disease etiology. However, isoforms are difficult to characterize from short-read RNA-seq data because they share identical subsequences and occur in different frequencies across tissues and samples. Here, we develop BIISQ, a Bayesian nonparametric model for isoform discovery and individual specific quantification from short-read RNA-seq data. BIISQ does not require isoform reference sequences but instead estimates an isoform catalog shared across samples. We use stochastic variational inference for efficient posterior estimates and demonstrate superior precision and recall for simulations compared to state-of-the-art isoform reconstruction methods. BIISQ shows the most gains for low abundance isoforms, with 36% more isoforms correctly inferred at low coverage versus a multi-sample method and 170% more versus single-sample methods. We estimate isoforms in the GEUVADIS RNA-seq data and validate inferred isoforms by associating genetic variants with isoform ratios.


September 22, 2019

Recurrent structural variation, clustered sites of selection, and disease risk for the complement factor H (CFH) gene family.

Structural variation and single-nucleotide variation of the complement factor H (CFH) gene family underlie several complex genetic diseases, including age-related macular degeneration (AMD) and atypical hemolytic uremic syndrome (AHUS). To understand its diversity and evolution, we performed high-quality sequencing of this ~360-kbp locus in six primate lineages, including multiple human haplotypes. Comparative sequence analyses reveal two distinct periods of gene duplication leading to the emergence of four CFH-related (CFHR) gene paralogs (CFHR2 and CFHR4 ~25-35 Mya and CFHR1 and CFHR3 ~7-13 Mya). Remarkably, all evolutionary breakpoints share a common ~4.8-kbp segment corresponding to an ancestral CFHR gene promoter that has expanded independently throughout primate evolution. This segment is recurrently reused and juxtaposed with a donor duplication containing exons 8 and 9 from ancestral CFH, creating four CFHR fusion genes that include lineage-specific members of the gene family. Combined analysis of >5,000 AMD cases and controls identifies a significant burden of a rare missense mutation that clusters at the N terminus of CFH [P = 5.81 × 10⁻⁸, odds ratio (OR) = 9.8 (3.67-Infinity)]. A bipolar clustering pattern of rare nonsynonymous mutations in patients with AMD (P < 10⁻³) and AHUS (P = 0.0079) maps to functional domains that show evidence of positive selection during primate evolution. Our structural variation analysis in >2,400 individuals reveals five recurrent rearrangement breakpoints that show variable frequency among AMD cases and controls. These data suggest a dynamic and recurrent pattern of mutation critical to the emergence of new CFHR genes but also in the predisposition to complex human genetic disease phenotypes.


September 22, 2019

Differential expression analysis of olfactory genes based on a combination of sequencing platforms and behavioral investigations in Aphidius gifuensis.

Aphidius gifuensis Ashmead is a dominant endoparasitoid of aphids, such as Myzus persicae and Sitobion avenae, and plays an important role in controlling aphids in various habitats, including tobacco plants and wheat in China. A. gifuensis has been successfully applied for the biological control of aphids, especially M. persicae, in greenhouses and fields in China. The corresponding parasites, as well as its mate-searching behaviors, are subjects of considerable interest. Previous A. gifuensis transcriptome studies have relied on short-read next-generation sequencing (NGS), and the vast majority of the resulting isotigs do not represent full-length cDNA. Here, we employed a combination of NGS and single-molecule real-time (SMRT) sequencing of virgin females (VFs), mated females (MFs), virgin males (VMs), and mated males (MMs) to comprehensively study the A. gifuensis transcriptome. Behavioral responses to the aphid alarm pheromone (E-β-farnesene, EBF) as well as to A. gifuensis of the opposite sex were also studied. VMs were found to be attracted by female wasps and MFs were repelled by male wasps, whereas MMs and VFs did not respond to the opposite sex. In addition, VFs, MFs, and MMs were attracted by EBF, while VMs did not respond. According to these results, we performed a personalized differential gene expression analysis of olfactory gene sets (66 odorant receptors, 25 ionotropic receptors, 16 odorant-binding proteins, and 12 chemosensory proteins) in virgin and mated A. gifuensis of both sexes, and identified 13 candidate genes whose expression levels were highly consistent with behavioral test results, suggesting potential functions for these genes in pheromone perception.

