September 22, 2019

PRAPI: post-transcriptional regulation analysis pipeline for Iso-Seq.

Single-molecule real-time (SMRT) isoform sequencing (Iso-Seq) on the Pacific Biosciences (PacBio) platform has received increasing attention for its ability to explore full-length isoforms. Thus, comprehensive tools for Iso-Seq bioinformatics analysis are extremely useful. Here, we present a one-stop solution for Iso-Seq analysis, called PRAPI, to comprehensively analyze alternative transcription initiation (ATI), alternative splicing (AS), alternative cleavage and polyadenylation (APA), natural antisense transcripts (NAT), and circular RNAs (circRNAs). PRAPI is capable of combining Iso-Seq full-length isoforms with short-read data, such as RNA-Seq or polyadenylation site sequencing (PAS-seq), for differential expression analysis of NAT, AS, APA and circRNAs. Furthermore, PRAPI can annotate new genes and correct mis-annotated genes when gene annotation is available. Finally, PRAPI generates high-quality vector graphics to visualize and highlight the Iso-Seq results. The Dockerfile of PRAPI is available at http://www.bioinfor.org/tool/PRAPI. Contact: lfgu@fafu.edu.cn.


September 22, 2019

Metagenomic binning and association of plasmids with bacterial host genomes using DNA methylation.

Shotgun metagenomics methods enable characterization of microbial communities in human microbiome and environmental samples. Assembly of metagenome sequences does not output whole genomes, so computational binning methods have been developed to cluster sequences into genome ‘bins’. These methods exploit sequence composition, species abundance, or chromosome organization but cannot fully distinguish closely related species and strains. We present a binning method that incorporates bacterial DNA methylation signatures, which are detected using single-molecule real-time sequencing. Our method takes advantage of these endogenous epigenetic barcodes to resolve individual reads and assembled contigs into species- and strain-level bins. We validate our method using synthetic and real microbiome sequences. In addition to genome binning, we show that our method links plasmids and other mobile genetic elements to their host species in a real microbiome sample. Incorporation of DNA methylation information into shotgun metagenomics analyses will complement existing methods to enable more accurate sequence binning.
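
As a rough illustration of the idea (not the authors' implementation), contigs can be grouped by which sequence motifs appear methylated. In the sketch below, the function name, the motif names, the methylation fractions and the 0.5 threshold are all hypothetical; in practice, per-motif methylation scores would be derived from SMRT sequencing data.

```python
from collections import defaultdict


def bin_by_methylation(profiles, threshold=0.5):
    """Group contigs that share the same methylated/unmethylated motif pattern.

    profiles: dict mapping contig name -> {motif: methylated fraction in [0, 1]}.
    threshold: hypothetical cutoff above which a motif counts as methylated.
    Returns the sorted motif list and a dict mapping signature -> contig names.
    """
    # Collect every motif seen in any contig so all signatures have the same length.
    motifs = sorted({m for scores in profiles.values() for m in scores})
    bins = defaultdict(list)
    for contig, scores in profiles.items():
        # A signature is a tuple of booleans, one per motif (missing motif = unmethylated).
        signature = tuple(scores.get(m, 0.0) >= threshold for m in motifs)
        bins[signature].append(contig)
    return motifs, dict(bins)


# Made-up example: a plasmid shares its host's methylation pattern.
example = {
    "contig_1": {"GATC": 0.95, "CCWGG": 0.02},
    "contig_2": {"GATC": 0.91, "CCWGG": 0.05},
    "plasmid_A": {"GATC": 0.97, "CCWGG": 0.01},
    "contig_3": {"GATC": 0.03, "CCWGG": 0.88},
}
motifs, bins = bin_by_methylation(example)
for signature, members in bins.items():
    print(dict(zip(motifs, signature)), members)
```

In this toy input, the hypothetical plasmid lands in the same bin as the two contigs with a matching pattern, which mirrors how a shared methylation signature could link a mobile element to its host.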


September 22, 2019

Construction of Pará rubber tree genome and multi-transcriptome database accelerates rubber researches.

Natural rubber is an economically important material. Currently, the Pará rubber tree, Hevea brasiliensis, is the main commercial source. Little is known about rubber biosynthesis at the molecular level. Next-generation sequencing (NGS) technologies have yielded draft genomes of three rubber cultivars and a variety of RNA sequencing (RNA-seq) data. However, no current genome or transcriptome databases (DB) are organized by gene. A gene-oriented database is a valuable support for rubber research. Based on our original draft genome sequence of H. brasiliensis RRIM600, we constructed a rubber tree genome and transcriptome DB. Our DB provides genome information including gene functional annotations and multi-transcriptome data of RNA-seq, full-length cDNAs including PacBio Isoform sequencing (Iso-Seq), ESTs and genome-wide transcription start sites (TSSs) derived from CAGE technology. Using our original and publicly available RNA-seq data, we calculated co-expressed genes for identifying functionally related gene sets and/or genes regulated by the same transcription factor (TF). Users can access multi-transcriptome data through both a gene-oriented web page and a genome browser. For the gene searching system, we provide keyword search, sequence homology search and gene expression search; users can also select their expression threshold easily. The rubber genome and transcriptome DB provides rubber tree genome sequence and multi-transcriptomics data. This DB is useful for comprehensive understanding of the rubber transcriptome. This will assist both industrial and academic researchers working on rubber and economically important close relatives such as R. communis, M. esculenta and J. curcas. The Rubber Transcriptome DB release 2017.03 is accessible at http://matsui-lab.riken.jp/rubber/ .


September 22, 2019

Global dissection of alternative splicing uncovers transcriptional diversity in tissues and associates with the flavonoid pathway in tea plant (Camellia sinensis).

Alternative splicing (AS) regulates mRNA at the post-transcriptional level to change gene function in organisms. However, little is known about AS and its roles in tea plant (Camellia sinensis), which is widely cultivated for making the popular beverage tea. In our study, the AS landscape and dynamics were characterized in eight tissues (bud, young leaf, summer mature leaf, winter old leaf, stem, root, flower, fruit) of tea plant by Illumina RNA-Seq and confirmed by Iso-Seq. The most abundant AS type (~20%) was intron retention, and it was involved in RNA processes. Some AS events were found to be tissue specific, for example in stem and root. Thirteen co-expressed modules of AS transcripts were identified, which revealed a similar pattern between the bud and young leaves as well as a distinct pattern between seasons. AS events of structural genes including anthocyanidin reductase and MYB transcription factors were involved in biosynthesis of flavonoids, especially in vegetative tissues. The AS isoforms, rather than the full-length ones, were the major transcripts involved in the flavonoid synthesis pathway, and were positively correlated with the catechin content that confers the tea taste. We propose that AS is an important functional mechanism in regulating flavonoid metabolites. Our study provides insight into the AS events underlying tea plant’s distinctive developmental process and highlights the important contribution of alternative splicing regulation to the biosynthesis of flavonoids.


September 22, 2019

MetaSort untangles metagenome assembly by reducing microbial community complexity.

Most current approaches to analyse metagenomic data rely on reference genomes. Novel microbial communities extend far beyond the coverage of reference databases, and de novo metagenome assembly from complex microbial communities remains a great challenge. Here we present a novel experimental and bioinformatic framework, metaSort, for effective construction of bacterial genomes from metagenomic samples. MetaSort provides a sorted mini-metagenome approach based on flow cytometry and single-cell sequencing methodologies, and employs new computational algorithms to efficiently recover high-quality genomes from the sorted mini-metagenome by complementing it with the original metagenome. Through extensive evaluations, we demonstrated that metaSort has an excellent and unbiased performance on genome recovery and assembly. Furthermore, we applied metaSort to an unexplored microflora colonizing the surface of marine kelp and successfully recovered 75 high-quality genomes at one time. This approach will greatly improve access to microbial genomes from complex or novel communities.


September 22, 2019

Interactive analysis of Long-read RNA isoforms with Iso-Seq Browser

Background: Long-read RNA sequencing, such as the Pacific Biosciences Iso-Seq method, enables generation of sequencing reads that are 10 kilobases or even longer. These reads are ideal for discovering splice junctions and resolving full-length gene transcripts without time-consuming and error-prone techniques such as transcript assembly and junction inference. Results: Iso-Seq Browser is a Web-based visual analytics tool for long-read RNA sequencing data produced by Pacific Biosciences isoform sequencing (Iso-Seq) techniques. Key features of the Iso-Seq Browser are: 1) an exon-only web-based interface with zooming and exon highlighting for exploring reference gene transcripts and novel gene isoforms, 2) automated grouping of transcripts and isoforms by similarity, 3) many customization features for data exploration and creating publication-ready figures, and 4) exporting selected isoforms into FASTA files for further analysis. Iso-Seq Browser is written in Python using several scientific libraries. The application and analyses described in this paper are freely available to both academic and commercial users at https://github.com/goeckslab/isoseq-browser. Conclusions: Iso-Seq Browser enables interactive genome-wide visual analysis of long RNA sequence reads. Through visualization, highlighting, clustering, and filtering of gene isoforms, Iso-Seq Browser makes it simple to identify novel isoforms and novel isoform features such as exons, introns and untranslated regions.
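
To make the FASTA export feature concrete, here is a minimal sketch of how a set of selected isoforms could be serialized to FASTA text in Python. It is not taken from the Iso-Seq Browser code base: the function name, the isoform identifiers and the sequences are hypothetical, and the 60-character line width is only a common convention.

```python
def format_fasta(isoforms, width=60):
    """Return FASTA-formatted text for a mapping of isoform name -> sequence.

    width: number of bases per sequence line (an assumed, conventional default).
    """
    lines = []
    for name, sequence in isoforms.items():
        lines.append(f">{name}")  # header line for this isoform
        # Wrap the sequence into fixed-width lines.
        for start in range(0, len(sequence), width):
            lines.append(sequence[start:start + width])
    return "\n".join(lines) + "\n" if lines else ""


# Hypothetical selection of two isoforms.
selected = {
    "geneX_isoform_1": "ATGGCGTACGTTAGC",
    "geneX_isoform_2": "ATGGCGTAGC",
}
fasta_text = format_fasta(selected, width=10)
print(fasta_text, end="")
```

The resulting string could then be written to a file with an ordinary `open(path, "w")` call for use in downstream tools.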


September 22, 2019

Species-level bacterial community profiling of the healthy sinonasal microbiome using Pacific Biosciences sequencing of full-length 16S rRNA genes.

Pan-bacterial 16S rRNA microbiome surveys performed with massively parallel DNA sequencing technologies have transformed community microbiological studies. Current 16S profiling methods, however, fail to provide sufficient taxonomic resolution and accuracy to adequately perform species-level associative studies for specific conditions. This is due to the amplification and sequencing of only short 16S rRNA gene regions, typically providing for only family- or genus-level taxonomy. Moreover, sequencing errors often inflate the number of taxa present. Pacific Biosciences’ (PacBio’s) long-read technology in particular suffers from high error rates per base. Herein, we present a microbiome analysis pipeline that takes advantage of PacBio circular consensus sequencing (CCS) technology to sequence and error correct full-length bacterial 16S rRNA genes, which provides high-fidelity species-level microbiome data. Analysis of a mock community with 20 bacterial species demonstrated 100% specificity and sensitivity with regard to taxonomic classification. Examination of a 250-plus species mock community demonstrated correct species-level classification of >90% of taxa, and relative abundances were accurately captured. The majority of the remaining taxa were demonstrated to be multiply, incorrectly, or incompletely classified. Using this methodology, we examined the microgeographic variation present among the microbiomes of six sinonasal sites, by both swab and biopsy, from the anterior nasal cavity to the sphenoid sinus from 12 subjects undergoing trans-sphenoidal hypophysectomy. We found greater variation among subjects than among sites within a subject, although significant within-individual differences were also observed. Propionibacterium acnes (recently renamed Cutibacterium acnes) was the predominant species throughout, but was found at distinct relative abundances by site. Our microbial composition analysis pipeline for single-molecule real-time 16S rRNA gene sequencing (MCSMRT, https://github.com/jpearl01/mcsmrt ) overcomes deficits of standard marker gene-based microbiome analyses by using CCS of entire 16S rRNA genes to provide increased taxonomic and phylogenetic resolution. Extensions of this approach to other marker genes could help refine taxonomic assignments of microbial species and improve reference databases, as well as strengthen the specificity of associations between microbial communities and dysbiotic states.


September 22, 2019

Dynamic transcriptome profiling dataset of vaccinia virus obtained from long-read sequencing techniques.

Poxviruses are large DNA viruses that infect humans and animals. Vaccinia virus (VACV) has been applied as a live vaccine for immunization against smallpox, which was eradicated by 1980 as a result of worldwide vaccination. VACV is the prototype of poxviruses in the investigation of the molecular pathogenesis of the virus. Short-read sequencing methods have revolutionized transcriptomics; however, they are not efficient in distinguishing between RNA isoforms and transcript overlaps. Long-read sequencing (LRS) is much better suited to solve these problems and also allows direct RNA sequencing. Despite the scientific relevance of VACV, no LRS data have been generated for the viral transcriptome to date. For the deep characterization of the VACV RNA profile, various LRS platforms and library preparation approaches were applied. The raw reads were mapped to the VACV reference genome and also to the host (Chlorocebus sabaeus) genome. In this study, we applied the Pacific Biosciences RSII and Sequel platforms, which altogether resulted in 937,531 mapped reads of inserts (1.42 Gb), while we obtained 2,160,348 aligned reads (1.75 Gb) from the different library preparation methods using the MinION device from Oxford Nanopore Technologies. By applying cutting-edge technologies, we were able to generate a large dataset that can serve as a valuable resource for the investigation of the dynamic VACV transcriptome, the virus-host interactions, and RNA base modifications. These data can provide useful information for novel gene annotations in the VACV genome. Our dataset can also be used to analyze the currently available LRS platforms, library preparation methods, and bioinformatics pipelines.


September 22, 2019

Molecular genetic diversity and characterization of conjugation genes in the fish parasite Ichthyophthirius multifiliis.

Ichthyophthirius multifiliis is the etiologic agent of “white spot”, a commercially important disease of freshwater fish. As a parasitic ciliate, I. multifiliis infects numerous host species across a broad geographic range. Although Ichthyophthirius outbreaks are difficult to control, recent sequencing of the I. multifiliis genome has revealed a number of potential metabolic pathways for therapeutic intervention, along with likely vaccine targets for disease prevention. Nonetheless, major gaps exist in our understanding of both the life cycle and population structure of I. multifiliis in the wild. For example, conjugation has never been described in this species, and it is unclear whether I. multifiliis undergoes sexual reproduction, despite the presence of a germline micronucleus. In addition, no good methods exist to distinguish strains, leaving phylogenetic relationships between geographic isolates completely unresolved. Here, we compared nucleotide sequences of SSUrDNA, mitochondrial NADH dehydrogenase subunit I and cox-1 genes, and 14 somatic SNP sites from nine I. multifiliis isolates obtained from four different states in the US since 1995. The mitochondrial sequences effectively distinguished the isolates from one another and divided them into at least two genetically distinct groups. Furthermore, none of the nine isolates shared the same composition of the 14 somatic SNP sites, suggesting that I. multifiliis undergoes sexual reproduction at some point in its life cycle. Finally, compared to the well-studied free-living ciliates Tetrahymena thermophila and Paramecium tetraurelia, I. multifiliis has lost 38% and 29%, respectively, of 16 experimentally confirmed conjugation-related genes, indicating that mechanistic differences in sexual reproduction are likely to exist between I. multifiliis and other ciliate species. Copyright © 2015 Elsevier Inc. All rights reserved.


September 22, 2019

CLK-dependent exon recognition and conjoined gene formation revealed with a novel small molecule inhibitor.

CDC-like kinase phosphorylation of serine/arginine-rich proteins is central to RNA splicing reactions. Yet, the genomic network of CDC-like kinase-dependent RNA processing events remains poorly defined. Here, we explore the connectivity of genomic CDC-like kinase splicing functions by applying graduated, short-exposure, pharmacological CDC-like kinase inhibition using a novel small molecule (T3) with very high potency, selectivity, and cell-based stability. Using RNA-Seq, we define CDC-like kinase-responsive alternative splicing events, the large majority of which monotonically increase or decrease with increasing CDC-like kinase inhibition. We show that distinct RNA-binding motifs are associated with T3 response in skipped exons. Unexpectedly, we observe dose-dependent conjoined gene transcription, which is associated with motif enrichment in the last and second exons of upstream and downstream partners, respectively. siRNA knockdown of CLK2-associated genes significantly increases conjoined gene formation. Collectively, our results reveal an unexpected role for CDC-like kinase in conjoined gene formation, via regulation of 3′-end processing and associated splicing factors. The phosphorylation of serine/arginine-rich proteins by CDC-like kinase is a central regulatory mechanism for RNA splicing reactions. Here, the authors synthesize a novel small molecule CLK inhibitor, map CLK-responsive alternative splicing events, and discover an effect on conjoined gene transcription.


September 22, 2019

Androgen receptor variant AR-V9 is co-expressed with AR-V7 in prostate cancer metastases and predicts abiraterone resistance.

Purpose: Androgen receptor (AR) variant AR-V7 is a ligand-independent transcription factor that promotes prostate cancer resistance to AR-targeted therapies. Accordingly, efforts are underway to develop strategies for monitoring and inhibiting AR-V7 in castration-resistant prostate cancer (CRPC). The purpose of this study was to understand whether other AR variants may be co-expressed with AR-V7 and promote resistance to AR-targeted therapies. Experimental Design: We utilized complementary short- and long-read sequencing of intact AR mRNA isoforms to characterize AR expression in CRPC models. Co-expression of AR-V7 and AR-V9 mRNA in CRPC metastases and circulating tumor cells was assessed by RNA-seq and RT-PCR, respectively. Expression of AR-V9 protein in CRPC models was evaluated with polyclonal antisera. Multivariate analysis was performed to test whether AR variant mRNA expression in metastatic tissues was associated with a 12-week progression-free survival endpoint in a prospective clinical trial of 78 CRPC-stage patients initiating therapy with the androgen synthesis inhibitor, abiraterone acetate. Results: AR-V9 was frequently co-expressed with AR-V7. Both AR variant species were found to share a common 3′ terminal cryptic exon, which rendered AR-V9 susceptible to experimental manipulations that were previously thought to target AR-V7 uniquely. AR-V9 promoted ligand-independent growth of prostate cancer cells. High AR-V9 mRNA expression in CRPC metastases was predictive of primary resistance to abiraterone acetate (HR = 4.0, 95% CI = 1.31-12.2, P = 0.02). Conclusions: AR-V9 may be an important component of therapeutic resistance in CRPC. Copyright ©2017, American Association for Cancer Research.


September 22, 2019

Bayesian nonparametric discovery of isoforms and individual specific quantification.

Most human protein-coding genes can be transcribed into multiple distinct mRNA isoforms. These alternative splicing patterns encourage molecular diversity, and dysregulation of isoform expression plays an important role in disease etiology. However, isoforms are difficult to characterize from short-read RNA-seq data because they share identical subsequences and occur in different frequencies across tissues and samples. Here, we develop BIISQ, a Bayesian nonparametric model for isoform discovery and individual specific quantification from short-read RNA-seq data. BIISQ does not require isoform reference sequences but instead estimates an isoform catalog shared across samples. We use stochastic variational inference for efficient posterior estimates and demonstrate superior precision and recall for simulations compared to state-of-the-art isoform reconstruction methods. BIISQ shows the most gains for low abundance isoforms, with 36% more isoforms correctly inferred at low coverage versus a multi-sample method and 170% more versus single-sample methods. We estimate isoforms in the GEUVADIS RNA-seq data and validate inferred isoforms by associating genetic variants with isoform ratios.


September 22, 2019

Single-molecule long-read transcriptome profiling of Platysternon megacephalum mitochondrial genome with gene rearrangement and control region duplication.

Platysternon megacephalum is the sole living representative of the poorly studied turtle lineage Platysternidae. Its mitochondrial genome has been subject to gene rearrangement and control region duplication, resulting in a mitochondrial gene order that is unique among vertebrates. In this study, we sequenced the first full-length turtle (P. megacephalum) liver transcriptome using single-molecule real-time sequencing to study the transcriptional mechanisms of its mitochondrial genome. ND5 and ND6 anti-sense (ND6AS) form a single transcript with the same expression in the human mitochondrial genome, but here we demonstrated differential expression of the rearranged ND5 and ND6AS genes in P. megacephalum. Some polycistronic transcripts were also reported in this study. Notably, we detected some novel long non-coding RNAs with alternative polyadenylation from the duplicated control region, and a novel ND6AS transcript composed of a long non-coding sequence, ND6AS, and tRNA-GluAS. These results provide the first description of a mtDNA transcriptome with gene rearrangement and control region duplication. These findings further our understanding of the fundamental concepts of mitochondrial gene transcription and RNA processing, and provide new insight into the mechanism of transcription regulation of the mitochondrial genome.


September 22, 2019

wtf genes are prolific dual poison-antidote meiotic drivers.

Meiotic drivers are selfish genes that bias their transmission into gametes, defying Mendelian inheritance. Despite the significant impact of these genomic parasites on evolution and infertility, few meiotic drive loci have been identified or mechanistically characterized. Here, we demonstrate a complex landscape of meiotic drive genes on chromosome 3 of the fission yeasts Schizosaccharomyces kambucha and S. pombe. We identify S. kambucha wtf4 as one of these genes that acts to kill gametes (known as spores in yeast) that do not inherit the gene from heterozygotes. wtf4 utilizes dual, overlapping transcripts to encode both a gamete-killing poison and an antidote to the poison. To enact drive, all gametes are poisoned, whereas only those that inherit wtf4 are rescued by the antidote. Our work suggests that the wtf multigene family proliferated due to meiotic drive and highlights the power of selfish genes to shape genomes, even while imposing tremendous costs to fertility.
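
The poison-antidote logic can be made concrete with a small bookkeeping sketch for a wtf4/- heterozygote. This is a simplified illustration, not a model from the paper: the function name and the `poison_efficiency` parameter are hypothetical, and the sketch assumes that the antidote fully rescues every spore that inherits the driver.

```python
def drive_outcome(n_meioses=1000, poison_efficiency=1.0):
    """Spore viability and driver transmission for a driver/- heterozygote.

    Assumptions (hypothetical simplifications):
      * each meiosis yields 4 spores, 2 of which inherit the driver;
      * every spore is exposed to the poison;
      * spores inheriting the driver are always rescued by the antidote;
      * spores lacking the driver die with probability `poison_efficiency`.
    Returns (fraction of spores that are viable,
             fraction of viable spores carrying the driver).
    """
    spores = 4 * n_meioses
    carriers = spores // 2
    non_carriers = spores - carriers
    # Expected number of surviving spores in each class.
    surviving_carriers = float(carriers)
    surviving_non_carriers = non_carriers * (1.0 - poison_efficiency)
    viable = surviving_carriers + surviving_non_carriers
    return viable / spores, surviving_carriers / viable


viability, transmission = drive_outcome(poison_efficiency=1.0)
print(f"viable spores: {viability:.0%}, driver among viable spores: {transmission:.0%}")
```

Under these assumptions, a fully effective poison halves the number of viable spores while every surviving spore carries the driver, whereas setting `poison_efficiency=0.0` recovers full viability and Mendelian 50% transmission. This illustrates the trade-off between biased transmission and fertility cost described above.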


September 22, 2019

Revertant mosaicism repairs skin lesions in a patient with keratitis-ichthyosis-deafness syndrome by second-site mutations in connexin 26.

Revertant mosaicism (RM) is a naturally occurring phenomenon where the pathogenic effect of a germline mutation is corrected by a second somatic event. Development of healthy-looking skin due to RM has been observed in patients with various inherited skin disorders, but not in connexin-related disease. We aimed to clarify the underlying molecular mechanisms of suspected RM in the skin of a patient with keratitis-ichthyosis-deafness (KID) syndrome. The patient was diagnosed with KID syndrome due to characteristic skin lesions, hearing deficiency and keratitis. Investigation of GJB2 encoding connexin (Cx) 26 revealed heterozygosity for the recurrent de novo germline mutation, c.148G>A, p.Asp50Asn. At age 20, the patient developed spots of healthy-looking skin that grew in size and number within widespread erythrokeratodermic lesions. Ultra-deep sequencing of two healthy-looking skin biopsies identified five somatic nonsynonymous mutations, independently present in cis with the p.Asp50Asn mutation. Functional studies of Cx26 in HeLa cells revealed co-expression of Cx26-Asp50Asn and wild-type Cx26 in gap junction channel plaques. However, Cx26-Asp50Asn with the second-site mutations identified in the patient displayed no formation of gap junction channel plaques. We argue that the second-site mutations independently inhibit Cx26-Asp50Asn expression in gap junction channels, reverting the dominant negative effect of the p.Asp50Asn mutation. To our knowledge, this is the first time RM has been reported to result in the development of healthy-looking skin in a patient with KID syndrome. © The Author 2017. Published by Oxford University Press.

