September 22, 2019

Isoform sequencing and state-of-art applications for unravelling complexity of plant transcriptomes

Single-molecule real-time (SMRT) sequencing developed by PacBio, also called third-generation sequencing (TGS), offers longer reads than second-generation sequencing (SGS). Given its ability to obtain full-length transcripts without assembly, isoform sequencing (Iso-Seq) of transcriptomes by PacBio is advantageous for genome annotation, identification of novel genes and isoforms, and the discovery of long non-coding RNA (lncRNA). In addition, Iso-Seq allows the direct detection of alternative splicing, alternative polyadenylation (APA), gene fusion, and DNA modifications. Such applications of Iso-Seq facilitate the understanding of gene structure, post-transcriptional regulatory networks, and subsequently proteomic diversity. In this review, we summarize its applications in plant transcriptome studies, pointing out the challenges associated with each step of the experimental design and highlighting the development of bioinformatic pipelines. We aim to provide the community with an integrative overview of and comprehensive guidance on Iso-Seq, and thus to promote its application in plant research.



Characterization of the Rosellinia necatrix transcriptome and genes related to pathogenesis by single-molecule mRNA sequencing.

White root rot disease, caused by the pathogen Rosellinia necatrix, is one of the world’s most devastating plant fungal diseases and affects several commercially important species of fruit trees and crops. Recent global outbreaks of R. necatrix and advances in molecular techniques have both increased interest in this pathogen. However, the lack of information regarding the genomic structure and transcriptome of R. necatrix has been a barrier to the progress of functional genomic research and the control of this harmful pathogen. Here, we identified 10,616 novel full-length transcripts from the filamentous hyphal tissue of R. necatrix (KACC 40445 strain) using PacBio single-molecule sequencing technology. After annotation of the unigene sets, we selected 14 cell cycle-related genes, which are likely either positively or negatively involved in hyphal growth by cell cycle control. The expression of the selected genes was further compared between two strains that displayed different growth rates on nutritional media. Furthermore, we predicted pathogen-related effector genes and cell wall-degrading enzymes from the annotated gene sets. These results provide the most comprehensive transcriptomic resources for R. necatrix, and could facilitate functional genomics and further analyses of this important phytopathogen.



Metagenomic SMRT sequencing-based exploration of novel lignocellulose-degrading capability in wood detritus from Torreya nucifera in Bija forest on Jeju Island.

Lignocellulose, composed mostly of cellulose, hemicellulose and lignin and generated through the secondary growth of woody plants, is considered a promising resource for biofuel. To use lignocellulose as a biofuel, biodegradation has been applied alongside high-cost chemical treatments, but knowledge of how lignocellulose decomposes in natural environments has been insufficient. We analyzed the 16S rRNA gene and the metagenome to understand how lignocellulose is decomposed naturally in decayed Torreya nucifera (L) of Bija forest (Bijarim) in Gotjawal, an ecologically distinct environment. A total of 464,360 reads were obtained from 16S rRNA gene sequencing, representing diverse phyla: Proteobacteria (51%), Bacteroidetes (11%) and Actinobacteria (10%). Metagenome analysis using single-molecule real-time sequencing revealed that the assembled contigs originated from Proteobacteria (58%) and Actinobacteria (10.3%). Analyses based on Carbohydrate-Active enZYmes (CAZy) and Protein families (Pfam) showed that Proteobacteria were involved in degrading whole lignocellulose, whereas Actinobacteria played a role only in part of hemicellulose degradation. Taken together, these results suggest that Proteobacteria and Actinobacteria have selective biodegradation potential for different lignocellulose substrates. Thus, understanding the systemic microbial degradation pathways may be a useful strategy for recycling lignocellulosic biomass, and the microbial enzymes in Bija forest may be useful natural resources for industrial processes.



The industrial melanism mutation in British peppered moths is a transposable element.

Discovering the mutational events that fuel adaptation to environmental change remains an important challenge for evolutionary biology. The classroom example of a visible evolutionary response is industrial melanism in the peppered moth (Biston betularia): the replacement, during the Industrial Revolution, of the common pale typica form by a previously unknown black (carbonaria) form, driven by the interaction between bird predation and coal pollution. The carbonaria locus has been coarsely localized to a 200-kilobase region, but the specific identity and nature of the sequence difference controlling the carbonaria-typica polymorphism, and the gene it influences, are unknown. Here we show that the mutation event giving rise to industrial melanism in Britain was the insertion of a large, tandemly repeated, transposable element into the first intron of the gene cortex. Statistical inference based on the distribution of recombined carbonaria haplotypes indicates that this transposition event occurred around 1819, consistent with the historical record. We have begun to dissect the mode of action of the carbonaria transposable element by showing that it increases the abundance of a cortex transcript, the protein product of which plays an important role in cell-cycle regulation, during early wing disc development. Our findings fill a substantial knowledge gap in the iconic example of microevolutionary change, adding a further layer of insight into the mechanism of adaptation in response to natural selection. The discovery that the mutation itself is a transposable element will stimulate further debate about the importance of ‘jumping genes’ as a source of major phenotypic novelty.



Identification of microbial profile of Koji using Single Molecule, Real-Time Sequencing technology.

Koji is a traditional Japanese fermentation starter that has been used for centuries. Many fermented foods are made from koji, such as sake, miso, and soy sauce. This study used single molecule real-time (SMRT) sequencing technology to investigate the bacterial and fungal microbiota of 3 Japanese koji samples. SMRT analysis generated a total of 39,121 high-quality sequences, comprising 14,354 bacterial and 24,767 fungal sequence reads. The high-quality gene sequences were assigned to 5 bacterial and 2 fungal phyla, dominated by Proteobacteria and Ascomycota, respectively. At the genus level, Ochrobactrum and Wickerhamomyces were the most abundant bacterial and fungal genera, respectively. The predominant bacterial and fungal species were Ochrobactrum lupini and Wickerhamomyces anomalus, respectively. Our study profiled the microbiota composition of 3 Japanese koji samples with species-level precision. The results may be useful for further development of traditional fermented products, especially optimization of koji preparation. This study also demonstrates that SMRT is a robust tool for analyzing the microbial composition of food samples. © 2017 Institute of Food Technologists®.



Complex rearrangements and oncogene amplifications revealed by long-read DNA and RNA sequencing of a breast cancer cell line.

The SK-BR-3 cell line is one of the most important models for HER2+ breast cancers, which affect one in five breast cancer patients. SK-BR-3 is known to be highly rearranged, although much of the variation is in complex and repetitive regions that may be underreported. Addressing this, we sequenced SK-BR-3 using long-read single molecule sequencing from Pacific Biosciences and developed one of the most detailed maps of structural variations (SVs) in a cancer genome available, with nearly 20,000 variants present, most of which were missed by short-read sequencing. Surrounding the important ERBB2 oncogene (also known as HER2), we discovered a complex sequence of nested duplications and translocations, suggesting a punctuated progression. Full-length transcriptome sequencing further revealed several novel gene fusions within the nested genomic variants. Combining long-read genome and transcriptome sequencing enables an in-depth analysis of how SVs disrupt the genome and sheds new light on the complex mechanisms involved in cancer genome evolution. © 2018 Nattestad et al.; Published by Cold Spring Harbor Laboratory Press.



Dual platform long-read RNA-sequencing dataset of the human Cytomegalovirus Lytic transcriptome

RNA-sequencing has revolutionized transcriptomics and the way we measure gene expression (Wang et al., 2009). As of today, short-read RNA sequencing is more widely used, and due to its low price and high throughput, is the preferred tool for the quantitative analysis of gene expression. However, the annotation of transcript isoforms is rather difficult using only short-read sequencing data, because the reads are shorter than most transcripts (Steijger et al., 2013). Long-read sequencing, on the other hand, can provide full contig information about transcripts, including exon-connectivity, and its merits in transcriptome profiling are being increasingly acknowledged (Sharon et al., 2013; Abdel-Ghany et al., 2016; Wang et al., 2016; Kuo et al., 2017). Due to the relatively low throughput of current long-read sequencing technologies, they can only characterize smaller transcriptomes in high-depth (Weirather et al., 2017). The Human cytomegalovirus (HCMV) is a ubiquitous betaherpesvirus, which can cause mononucleosis-like symptoms in adults (Cohen and Corey, 1985), and severe life-threatening infections in newborns (Wen et al., 2002). Latent HCMV infection has recently been implicated to affect cancer formation (Dziurzynski et al., 2012; Jin et al., 2014). Examining the transcriptome of the virus can go a long way in helping understand its molecular biology. Short-read RNA sequencing studies have discovered splice junctions and non-coding transcripts (Gatherer et al., 2011) and have shown that the most abundant HCMV transcripts are similarly expressed in different cell types (Cheng et al., 2017). Our long-read RNA sequencing experiments using the Pacific Biosciences (PacBio) RSII platform revealed a great number of transcript isoforms, polycistronic RNAs and transcriptional overlaps (Balázs et al., 2017a).



Sequencing 16S rRNA gene fragments using the PacBio SMRT DNA sequencing system.

Over the past 10 years, microbial ecologists have largely abandoned sequencing 16S rRNA genes by the Sanger sequencing method and have instead adopted highly parallelized sequencing platforms. These new platforms, such as 454 and Illumina’s MiSeq, have allowed researchers to obtain millions of high quality but short sequences. The result of the added sequencing depth has been significant improvements in experimental design. The tradeoff has been the decline in the number of full-length reference sequences that are deposited into databases. To overcome this problem, we tested the ability of the PacBio Single Molecule, Real-Time (SMRT) DNA sequencing platform to generate sequence reads from the 16S rRNA gene. We generated sequencing data from the V4, V3-V5, V1-V3, V1-V5, V1-V6, and V1-V9 variable regions from within the 16S rRNA gene using DNA from a synthetic mock community and natural samples collected from human feces, mouse feces, and soil. The mock community allowed us to assess the actual sequencing error rate and how that error rate changed when different curation methods were applied. We developed a simple method based on sequence characteristics and quality scores to reduce the observed error rate for the V1-V9 region from 0.69 to 0.027%. This error rate is comparable to what has been observed for the shorter reads generated by 454 and Illumina’s MiSeq sequencing platforms. Although the per base sequencing cost is still significantly more than that of MiSeq, the prospect of supplementing reference databases with full-length sequences from organisms below the limit of detection from the Sanger approach is exciting.



Genome and evolution of the shade-requiring medicinal herb Panax ginseng.

Panax ginseng C. A. Meyer, reputed as the king of medicinal herbs, has slow growth, long generation time, low seed production and complicated genome structure that hamper its study. Here, we unveil the genomic architecture of tetraploid P. ginseng by de novo genome assembly, representing 2.98 Gbp with 59 352 annotated genes. Resequencing data indicated that diploid Panax species diverged in association with global warming in Southern Asia, and two North American species evolved via two intercontinental migrations. Two whole genome duplications (WGD) occurred in the family Araliaceae (including Panax) after divergence with the Apiaceae, the more recent one contributing to the ability of P. ginseng to overwinter, enabling it to spread broadly through the Northern Hemisphere. Functional and evolutionary analyses suggest that production of pharmacologically important dammarane-type ginsenosides originated in Panax and are produced largely in shoot tissues and transported to roots; that newly evolved P. ginseng fatty acid desaturases increase freezing tolerance; and that unprecedented retention of chlorophyll a/b binding protein genes enables efficient photosynthesis under low light. A genome-scale metabolic network provides a holistic view of Panax ginsenoside biosynthesis. This study provides valuable resources for improving medicinal values of ginseng either through genomics-assisted breeding or metabolic engineering. © 2018 The Authors. Plant Biotechnology Journal published by Society for Experimental Biology and The Association of Applied Biologists and John Wiley & Sons Ltd.



Long-read based assembly and annotation of a Drosophila simulans genome

Long-read sequencing technologies enable high-quality, contiguous genome assemblies. Here we used SMRT sequencing to assemble the genome of a Drosophila simulans strain originating from Madagascar, the ancestral range of the species. We generated 8 Gb of raw data (~50x coverage) with a mean read length of 6,410 bp, an NR50 of 9,125 bp and the longest subread at 49 kb. We benchmarked six different assemblers and merged the best two assemblies, from Canu and Falcon. Our final assembly was 127.41 Mb with an N50 of 5.38 Mb and 305 contigs. We anchored more than 4 Mb of novel sequence to the major chromosome arms, and significantly improved the assembly of peri-centromeric and telomeric regions. Finally, we performed full-length transcript sequencing and used these data in conjunction with short-read RNAseq data to annotate 13,422 genes in the genome, improving the annotation in regions with complex, nested gene structures.
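The N50 quoted above is a standard contiguity statistic: the length of the shortest contig in the smallest set of longest contigs that together cover at least half of the total assembly length. A minimal sketch of the calculation in Python follows; the function name `n50` and the example lengths are illustrative assumptions and are not taken from the study.

```python
def n50(lengths):
    """Return the N50 of a collection of contig (or read) lengths.

    Contigs are visited from longest to shortest; the N50 is the length
    of the contig at which the running total first reaches at least
    half of the summed length.
    """
    if not lengths:
        raise ValueError("lengths must not be empty")
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        # Compare doubled running total to avoid fractional halves.
        if 2 * running >= total:
            return length


if __name__ == "__main__":
    # Hypothetical contig lengths in bp, for illustration only.
    example = [2, 2, 2, 3, 3, 4, 8, 8]
    print(n50(example))
```

The same function applied to read lengths rather than contig lengths gives a read-length N50, which is one common way to summarize long-read datasets.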



Transcriptome-referenced association study of clove shape traits in garlic.

Genome-wide association studies are a powerful approach for identifying genes related to complex traits in organisms, but are limited by the requirement for a reference genome sequence of the species under study. To circumvent this problem, we propose a transcriptome-referenced association study (TRAS) that utilizes a transcriptome generated by single-molecule long-read sequencing as a reference sequence to score population variation at both transcript sequence and expression levels. Candidate transcripts are identified when both scores are associated with a trait, and their potential interactions are ascertained by expression quantitative trait loci analysis. Applying this method to characterize garlic clove shape traits in 102 landraces, we identified 22 candidate transcripts, most of which showed extensive interactions. Eight transcripts were long non-coding RNAs (lncRNAs), and the others encoded proteins involved mainly in carbohydrate metabolism and protein degradation, among other processes. TRAS, as an efficient tool for association study independent of a reference genome, extends the applicability of association studies to a broad range of species.



A survey of transcriptome complexity in Sus scrofa using single-molecule long-read sequencing.

Alternative splicing (AS) and fusion transcripts greatly expand the diversity of transcriptomes and proteomes. However, the reliability of these events and the extent of the epigenetic mechanisms involved have not been adequately addressed, owing to uncertainty about the complete structure of mRNA. Here we combined single-molecule real-time sequencing, Illumina RNA-seq and DNA methylation data to characterize the landscape of DNA methylation in relation to AS, fusion isoform formation and lncRNA features, and to unveil the transcriptome complexity of the pig. Our analysis identified an unprecedented number of high-quality full-length isoforms, with over 28,127 novel isoforms from 26,881 novel genes. More than 92,000 novel AS events were detected; intron retention was the predominant AS mode, followed by exon skipping. Interestingly, we found that DNA methylation played an important role in generating various AS isoforms by regulating splicing sites, promoter regions and first exons. Furthermore, we identified a large number of fusion transcripts and novel lncRNAs, and found that DNA methylation of the promoter and gene body could regulate lncRNA expression. Our results significantly improve existing gene models of the pig and reveal that pig AS and epigenetic modification are more complex than previously thought.



Transcriptomic study of Herpes simplex virus type-1 using full-length sequencing techniques

Herpes simplex virus type-1 (HSV-1) is a human pathogenic member of the Alphaherpesvirinae subfamily of herpesviruses. The HSV-1 genome is a large double-stranded DNA specifying about 85 protein coding genes. Recent surveys have demonstrated that the HSV-1 transcriptome is much more complex than previously thought. Here, we provide a long-read sequencing dataset, which was generated by using the RSII and Sequel systems from Pacific Biosciences (PacBio), as well as the MinION sequencing system from Oxford Nanopore Technologies (ONT). This dataset contains 39,096 reads of inserts (ROIs) mapped to the HSV-1 genome (X14112) in RSII sequencing, while Sequel sequencing yielded 77,851 ROIs. The MinION cDNA sequencing altogether resulted in 158,653 reads, while the direct RNA-seq produced 16,516 reads. This dataset can be utilized for the identification of novel HSV RNAs and transcript isoforms, as well as for the comparison of the quality and length of the sequencing reads derived from the currently available long-read sequencing platforms. The various library preparation approaches can also be compared with each other.



Anthropogenic N deposition alters the composition of expressed class II fungal peroxidases.

Here, we present evidence that ca. 20 years of experimental N deposition altered the composition of lignin-decaying class II peroxidases expressed by forest floor fungi, a response which has occurred concurrently with reductions in plant litter decomposition and a rapid accumulation of soil organic matter. This finding suggests that anthropogenic N deposition has induced changes in the biological mediation of lignin decay, the rate limiting step in plant litter decomposition. Thus, an altered composition of transcripts for a critical gene that is associated with terrestrial C cycling may explain the increased soil C storage under long-term increases in anthropogenic N deposition. IMPORTANCE: Fungal class II peroxidases are enzymes that mediate the rate-limiting step in the decomposition of plant material, which involves the oxidation of lignin and other polyphenols. In field experiments, anthropogenic N deposition has increased soil C storage in forests, a result which could potentially arise from anthropogenic N-induced changes in the composition of class II peroxidases expressed by the fungal community. In this study, we have gained unique insight into how anthropogenic N deposition, a widespread agent of global change, affects the expression of a functional gene encoding an enzyme that plays a critical role in a biologically mediated ecosystem process. Copyright © 2018 American Society for Microbiology.



Bypassing the Restriction System To Improve Transformation of Staphylococcus epidermidis.

Staphylococcus epidermidis is the leading cause of infections on indwelling medical devices worldwide. Intrinsic antibiotic resistance and vigorous biofilm production have rendered these infections difficult to treat and, in some cases, require the removal of the offending medical prosthesis. With the exception of two widely passaged isolates, RP62A and 1457, the pathogenesis of infections caused by clinical S. epidermidis strains is poorly understood due to the strong genetic barrier that precludes the efficient transformation of foreign DNA into clinical isolates. The difficulty in transforming clinical S. epidermidis isolates is primarily due to the type I and IV restriction-modification systems, which act as genetic barriers. Here, we show that efficient plasmid transformation of clinical S. epidermidis isolates from clonal complexes 2, 10, and 89 can be realized by employing a plasmid artificial modification (PAM) in Escherichia coli DC10B containing a Δdcm mutation. This transformative technique should facilitate our ability to genetically modify clinical isolates of S. epidermidis and hence improve our understanding of their pathogenesis in human infections. IMPORTANCE: Staphylococcus epidermidis is a source of considerable morbidity worldwide. The underlying mechanisms contributing to the commensal and pathogenic lifestyles of S. epidermidis are poorly understood. Genetic manipulations of clinically relevant strains of S. epidermidis are largely prohibited due to the presence of a strong restriction barrier. With the introduction of the tools presented here, genetic manipulation of clinically relevant S. epidermidis isolates has now become possible, thus improving our understanding of S. epidermidis as a pathogen. Copyright © 2017 American Society for Microbiology.

