September 22, 2019

Single-molecule long-read transcriptome profiling of Platysternon megacephalum mitochondrial genome with gene rearrangement and control region duplication.

Platysternon megacephalum is the sole living representative of the poorly studied turtle lineage Platysternidae. Its mitochondrial genome has been subject to gene rearrangement and control region duplication, resulting in a mitochondrial gene order that is unique among vertebrates. In this study, we sequenced the first full-length turtle (P. megacephalum) liver transcriptome using single-molecule real-time sequencing to study the transcriptional mechanisms of its mitochondrial genome. ND5 and ND6 anti-sense (ND6AS) form a single transcript with the same expression in the human mitochondrial genome, but here we demonstrated differential expression of the rearranged ND5 and ND6AS genes in P. megacephalum. Some polycistronic transcripts were also reported in this study. Notably, we detected some novel long non-coding RNAs with alternative polyadenylation from the duplicated control region, and a novel ND6AS transcript composed of a long non-coding sequence, ND6AS, and tRNA-GluAS. These results provide the first description of a mtDNA transcriptome with gene rearrangement and control region duplication. These findings further our understanding of the fundamental concepts of mitochondrial gene transcription and RNA processing, and provide new insight into the mechanism of transcription regulation of the mitochondrial genome.


September 22, 2019

Analysis of RNA base modification and structural rearrangement by single-molecule real-time detection of reverse transcription.

Zero-mode waveguides (ZMWs) are photonic nanostructures that create highly confined optical observation volumes, thereby allowing single-molecule-resolved biophysical studies at relatively high concentrations of fluorescent molecules. This principle has been successfully applied in single-molecule, real-time (SMRT®) DNA sequencing for the detection of DNA sequences and DNA base modifications. In contrast, RNA sequencing methods cannot provide sequence and RNA base modifications concurrently as they rely on complementary DNA (cDNA) synthesis by reverse transcription followed by sequencing of cDNA. Thus, information on RNA modifications is lost during the process of cDNA synthesis. Here we describe an application of SMRT technology to follow the activity of reverse transcriptase enzymes synthesizing cDNA on thousands of single RNA templates simultaneously in real time with single nucleotide turnover resolution using arrays of ZMWs. This method thereby obtains information from the RNA template directly. The analysis of the kinetics of the reverse transcriptase can be used to identify RNA base modifications, shown by example for N6-methyladenine (m6A) in oligonucleotides and in a specific mRNA extracted from total cellular mRNA. Furthermore, the real-time reverse transcriptase dynamics provide information about RNA secondary structure and its rearrangements, as demonstrated on a ribosomal RNA and an mRNA template. Our results highlight the feasibility of studying RNA modifications and RNA structural rearrangements in ZMWs in real time. In addition, they suggest that technology can be developed for direct RNA sequencing provided that the reverse transcriptase is optimized to resolve homonucleotide stretches in RNA.


September 22, 2019

Fine mapping and candidate gene identification of the genic male-sterile gene ms3 in cabbage 51S.

The ms3 gene responsible for a male-sterile phenotype in cabbage was mapped to a 187.4-kb genomic fragment. The gene BoTPD1, a homolog of Arabidopsis TPD1, was identified as a strong candidate gene. Cabbage 51S is a spontaneous male-sterile mutant. Phenotypic investigation revealed defects in anther cell differentiation, with failure to form the tapetum layer and complete abortion of microsporocytes before the tetrad stage. Genetic analysis indicated that this male sterility was controlled by a single recessive gene, ms3. Using an F2 population, we mapped ms3 to a 187.4-kb interval. BoTPD1 was identified as a candidate from this interval. Sequence analysis revealed an intronic 182-bp insertion in 51S that interrupted the conserved motif at the 5′ splicing site of the third intron, possibly resulting in a truncated transcript. Analyses of BoTPD1 homologous proteins revealed evolutionarily conserved roles in anther cell fate determination during reproductive development. RT-PCR showed that BoTPD1 was expressed in various tissues, excluding the root, and high expression levels were detected in anthers and buds. A BoTPD1-specific marker based on the 182-bp insertion cosegregated with male sterility and can be used for marker-assisted selection.


September 22, 2019

Using PacBio long-read high-throughput microbial gene amplicon sequencing to evaluate infant formula safety.

Infant formula (IF) requires a strict microbiological standard because of the high vulnerability of infants to foodborne diseases. The current study used the PacBio single molecule real-time (SMRT) sequencing platform to generate full-length 16S rRNA-based bacterial microbiota profiles of thirty Chinese domestic and imported IF samples. A total of 600 species were identified, dominated by Streptococcus thermophilus, Lactococcus lactis and Lactococcus piscium. Distinctive bacterial profiles were observed between the two sample groups, as confirmed with both principal coordinate analysis and multivariate analysis of variance. Moreover, the product whey protein nitrogen index (WPNI), representing the degree of preheating, negatively correlated with the relative abundances of the Bacillus genus. Our study has demonstrated the application of the PacBio SMRT sequencing platform in assessing the bacterial contamination of IF products, which is of interest to the dairy industry for effective monitoring of microbial quality and safety during production.


September 22, 2019

A workflow for studying specialized metabolism in nonmodel eukaryotic organisms

Eukaryotes contain a diverse tapestry of specialized metabolites, many of which are of significant pharmaceutical and industrial importance to humans. Nevertheless, exploration of specialized metabolic pathways underlying specific chemical traits in nonmodel eukaryotic organisms has been technically challenging and has historically lagged behind that of bacterial systems. Recent advances in genomics, metabolomics, phylogenomics, and synthetic biology now enable a new workflow for interrogating unknown specialized metabolic systems in nonmodel eukaryotic hosts with greater efficiency and mechanistic depth. This chapter delineates such a workflow by providing a collection of state-of-the-art approaches and tools, ranging from multiomics-guided candidate gene identification to in vitro and in vivo functional and structural characterization of specialized metabolic enzymes. As already demonstrated by several recent studies, this new workflow opens up a gateway into the largely untapped world of natural product biochemistry in eukaryotes. © 2016 Elsevier Inc. All rights reserved.


September 22, 2019

Extensive alternative splicing of KIR transcripts.

The killer-cell Ig-like receptors (KIR) form a multigene entity involved in modulating immune responses through interactions with MHC class I molecules. The complexity of the KIR cluster is reflected by, for instance, abundant levels of allelic polymorphism, gene copy number variation, and stochastic expression profiles. The current transcriptome study involving human and macaque families demonstrates that KIR family members are also subjected to differential levels of alternative splicing, and this seems to be gene dependent. Alternative splicing may result in the partial or complete skipping of exons, or the partial inclusion of introns, as documented at the transcription level. This post-transcriptional process can generate multiple isoforms from a single KIR gene, which diversifies the characteristics of the encoded proteins. For example, alternative splicing could modify ligand interactions, cellular localization, signaling properties, and the number of extracellular domains of the receptor. In humans, we observed abundant splicing for KIR2DL4, and to a lesser extent in the lineage III KIR genes. All experimentally documented splice events are substantiated by in silico splicing strength predictions. To a similar extent, alternative splicing is observed in rhesus macaques, a species that shares a close evolutionary relationship with humans. Splicing profiles of Mamu-KIR1D and Mamu-KIR2DL04 displayed a great diversity, whereas Mamu-KIR3DL20 (lineage V) is consistently spliced to generate a homolog of human KIR2DL5 (lineage I). The latter case represents an example of convergent evolution. Although just a single KIR splice event is shared between humans and macaques, the splicing mechanisms are similar, and the predicted consequences are comparable. In conclusion, alternative splicing adds an additional layer of complexity to the KIR gene system in primates, and results in a wide structural and functional variety of KIR receptors and their isoforms, which may play a role in health and disease.


September 22, 2019

Full-length transcriptome survey and expression analysis of Cassia obtusifolia to discover putative genes related to aurantio-obtusin biosynthesis, seed formation and development, and stress response.

The seed is the pharmaceutical and breeding organ of Cassia obtusifolia, a plant well known as a medicinal herb containing aurantio-obtusin (an anthraquinone), as a food, and as a landscape plant. To understand the molecular mechanisms of aurantio-obtusin biosynthesis, seed formation and development, and stress response in C. obtusifolia, genomic information is needed. Although the seed transcriptome of C. obtusifolia has previously been sequenced with short-read next-generation sequencing (NGS) technology, the vast majority of the resulting unigenes did not represent full-length cDNA sequences or supply enough gene expression profile information for the various organs or tissues. In this study, fifteen cDNA libraries, constructed from the seed, root, stem, leaf, and flower (three replicates for each organ) of C. obtusifolia, were sequenced using a hybrid approach combining single-molecule real-time (SMRT) and NGS platforms. More than 4,315,774 long reads with 9.66 Gb of sequencing data and 361,427,021 short reads with 108.13 Gb of sequencing data were generated by the SMRT and NGS platforms, respectively. A total of 67,222 consensus isoforms were clustered from the reads, 81.73% (61,016) of which were longer than 1000 bp. Furthermore, the 67,222 consensus isoforms represented 58,106 nonredundant transcripts, 98.25% (57,092) of which were annotated and 25,573 of which were assigned to specific metabolic pathways by KEGG. CoDXS and CoDXR genes were directly used for functional characterization to validate the accuracy of sequences obtained from the transcriptome. A total of 658 seed-specific transcripts indicated their special roles in physiological processes in the seed. Analysis of transcripts involved in the early stage of anthraquinone biosynthesis suggested that the aurantio-obtusin in C. obtusifolia was mainly generated from the isochorismate and mevalonate/methylerythritol phosphate (MVA/MEP) pathways, and that three reactions catalyzed by menaquinone-specific isochorismate synthase (ICS), 1-deoxy-d-xylulose-5-phosphate synthase (DXS) and isopentenyl diphosphate (IPPS) might be the rate-limiting steps. Several seed-specific CYPs, SAM-dependent methyltransferase, and UDP-glycosyltransferase (UDPG) supplied promising candidate genes in the late stage of anthraquinone biosynthesis. In addition, four seed-specific transcription factors (three MYB and one MADS-box transcription factors) and alternative splicing might be involved in seed formation and development. Meanwhile, most members of the Hsp20 genes showed high expression levels in seed and flower, seven of which might have chaperone activities under various abiotic stresses. Finally, the expression patterns of genes of particular interest showed similar trends in both the transcriptome assay and qRT-PCR. In conclusion, this is the first full-length transcriptome sequencing reported in the Caesalpiniaceae family, and it provides a more complete insight into aurantio-obtusin biosynthesis, seed formation and development, and stress response in C. obtusifolia.


September 22, 2019

Genomic imprinting mediates dosage compensation in a young plant XY system.

Sex chromosomes have repeatedly evolved from a pair of autosomes. Consequently, X and Y chromosomes initially have similar gene content, but ongoing Y degeneration leads to reduced expression and eventual loss of Y genes [1]. The resulting imbalance in gene expression between Y genes and the rest of the genome is expected to reduce male fitness, especially when protein networks have components from both autosomes and sex chromosomes. A diverse set of dosage compensating mechanisms that alleviates these negative effects has been described in animals [2-4]. However, the early steps in the evolution of dosage compensation remain unknown, and dosage compensation is poorly understood in plants [5]. Here, we describe a dosage compensation mechanism in the evolutionarily young XY sex determination system of the plant Silene latifolia. Genomic imprinting results in higher expression from the maternal X chromosome in both males and females. This compensates for reduced Y expression in males, but results in X overexpression in females and may be detrimental. It could represent a transient early stage in the evolution of dosage compensation. Our finding has striking resemblance to the first stage proposed by Ohno [6] for the evolution of X inactivation in mammals.


September 22, 2019

Isoform sequencing and state-of-art applications for unravelling complexity of plant transcriptomes

Single-molecule real-time (SMRT) sequencing developed by PacBio, also called third-generation sequencing (TGS), offers longer reads than second-generation sequencing (SGS). Given its ability to obtain full-length transcripts without assembly, isoform sequencing (Iso-Seq) of transcriptomes by PacBio is advantageous for genome annotation, identification of novel genes and isoforms, as well as the discovery of long non-coding RNA (lncRNA). In addition, Iso-Seq gives access to the direct detection of alternative splicing, alternative polyadenylation (APA), gene fusion, and DNA modifications. Such applications of Iso-Seq facilitate the understanding of gene structure, post-transcriptional regulatory networks, and subsequently proteomic diversity. In this review, we summarize its applications in plant transcriptome study, specifically pointing out challenges associated with each step in the experimental design and highlighting the development of bioinformatic pipelines. We aim to provide the community with an integrative overview and comprehensive guidance to Iso-Seq, and thus to promote its applications in plant research.


September 22, 2019

Cataloguing over-expressed genes in Epstein Barr Virus immortalized lymphoblastoid cell lines through consensus analysis of PacBio transcriptomes corroborates hypomethylation of chromosome 1

The ability of Epstein Barr Virus (EBV) to transform resting B-cells into immortalized lymphoblastoid cell lines (LCL) provides a continuous source of peripheral blood lymphocytes that are used to model conditions in which these lymphocytes play a key role. Here, the PacBio-generated transcriptomes of three LCLs from a parent-daughter trio (SRAid:SRP036136) provided by a previous study [1] were analyzed using a kmer-based version of YeATS (KEATS). The set of over-expressed genes in these cell lines was determined based on a comparison with the PacBio transcriptome of twenty tissues provided by another study (hOPTRS) [2]. MIR155 long non-coding RNA (MIR155HG), Fc fragment of IgE receptor II (FCER2), T-cell leukemia/lymphoma 1A (TCL1A), and germinal center associated signaling and motility (GCSAM) were the genes having the highest expression counts in the three LCLs with no expression in hOPTRS. Other over-expressed genes, having low expression in hOPTRS, were membrane spanning 4-domains A1 (MS4A1) and ribosomal protein S2 pseudogene 55 (RPS2P55). While some of these genes are known to be over-expressed in LCLs, this study provides a comprehensive cataloguing of such genes. A recent work described a patient with EBV-positive large B-cell lymphoma that was “unusually lacking various B-cell markers”, but over-expressing CD30 [3] – a gene ranked 79 among uniquely expressed genes here. Hypomethylation of chromosome 1 observed in EBV immortalized LCLs [4, 5] is also corroborated here by mapping the genes to chromosomes. Extending previous work identifying un-annotated genes [6], 80 genes were identified which are expressed in the three LCLs, not in hOPTRS, and missing in the GENCODE, RefSeq and RefSeqGene databases. KEATS introduces a method of determining expression counts based on a partitioning of the known annotated genes, has runtimes of a few hours on a personal workstation and provides detailed reports enabling proper debugging.


September 22, 2019

Comprehensive genomic analysis of malignant pleural mesothelioma identifies recurrent mutations, gene fusions and splicing alterations.

We analyzed transcriptomes (n = 211), whole exomes (n = 99) and targeted exomes (n = 103) from 216 malignant pleural mesothelioma (MPM) tumors. Using RNA-seq data, we identified four distinct molecular subtypes: sarcomatoid, epithelioid, biphasic-epithelioid (biphasic-E) and biphasic-sarcomatoid (biphasic-S). Through exome analysis, we found BAP1, NF2, TP53, SETD2, DDX3X, ULK2, RYR2, CFAP45, SETDB1 and DDX51 to be significantly mutated (q-score ≥ 0.8) in MPMs. We identified recurrent mutations in several genes, including SF3B1 (~2%; 4/216) and TRAF7 (~2%; 5/216). SF3B1-mutant samples showed a splicing profile distinct from that of wild-type tumors. TRAF7 alterations occurred primarily in the WD40 domain and were, except in one case, mutually exclusive with NF2 alterations. We found recurrent gene fusions and splice alterations to be frequent mechanisms for inactivation of NF2, BAP1 and SETD2. Through integrated analyses, we identified alterations in Hippo, mTOR, histone methylation, RNA helicase and p53 signaling pathways in MPMs.
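
The recurrence figures quoted in this abstract (4/216 and 5/216, both reported as roughly 2%) can be reproduced with a short calculation. The following is only an illustrative sketch: the function name `mutation_frequency` and the dictionary layout are assumptions made here, not part of the study's analysis pipeline.

```python
def mutation_frequency(mutated: int, total: int) -> float:
    """Return the percentage of samples that carry a mutation."""
    if total <= 0:
        raise ValueError("total must be positive")
    return 100.0 * mutated / total


# Counts taken from the abstract; the cohort size is the 216 tumors analyzed.
COHORT_SIZE = 216
recurrent_mutations = {"SF3B1": 4, "TRAF7": 5}

for gene, count in recurrent_mutations.items():
    pct = mutation_frequency(count, COHORT_SIZE)
    print(f"{gene}: {count}/{COHORT_SIZE} = {pct:.2f}%")
```

Both values round to about 2%, consistent with the approximate percentages given in the abstract.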


September 22, 2019

Transgenerational attenuation of opioid self-administration as a consequence of adolescent morphine exposure.

The United States is in the midst of an opiate epidemic, with abuse of prescription and illegal opioids increasing steadily over the past decade. While it is clear that there is a genetic component to opioid addiction, there is a significant portion of heritability that cannot be explained by genetics alone. The current study was designed to test the hypothesis that maternal exposure to opioids prior to pregnancy alters abuse liability in subsequent generations. Female adolescent Sprague Dawley rats were administered morphine at increasing doses (5-25 mg/kg, s.c.) or saline for 10 days (P30-39). During adulthood, animals were bred with drug-naïve colony males. Male and female adult offspring (F1 animals) were tested for morphine self-administration acquisition, progressive ratio, extinction, and reinstatement at three doses of morphine (0.25, 0.75, 1.25 mg/kg/infusion). Grand offspring (F2 animals, from the maternal line) were also examined. Additionally, gene expression changes within the nucleus accumbens were examined with RNA deep sequencing (PacBio) and qPCR. There were dose- and sex-dependent effects on all phases of the self-administration paradigm that indicate decreased morphine reinforcement and attenuated relapse-like behavior. Additionally, genes related to synaptic plasticity, as well as myelin basic protein (MBP), were dysregulated. Some, but not all, effects persisted into the subsequent (F2) generation. The results demonstrate that even limited opioid exposure during adolescence can have lasting effects across multiple generations, which has implications for mechanisms of the transmission of drug abuse liability in humans. Copyright © 2016 Elsevier Ltd. All rights reserved.


September 22, 2019

Introduction to isoform sequencing using Pacific Biosciences technology (Iso-Seq)

Alternative RNA splicing is a known phenomenon, but we still do not have a complete catalog of isoforms that explain variability in the human transcriptome. We have made significant progress in developing methods to study variability of the transcriptome, but we are far from having a complete picture of the transcriptome. The initial methods to study gene expression were based on cloning of cDNAs and Sanger sequencing. The strategy was labor-intensive and expensive. With the development of microarrays, different methods based on exon arrays and tiling arrays provided valuable information about RNA expression. However, microarrays presented significant limitations. Most of the limitations became apparent by 2005, but it was not until 2008 that an alternative method to study the transcriptome was developed. RNA sequencing using next-generation sequencing (RNA-Seq) quickly became the technology of choice for gene expression profiling. Recently, the precision and sensitivity of RNA-Seq have come into question, especially for transcriptome reconstruction. This chapter will describe a relatively new method, “Isoform Sequencing” (Iso-Seq). Iso-Seq was developed by Pacific Biosciences (PacBio), and it is capable of identifying new isoforms with extraordinary precision due to its long-read technology. The technique to create libraries is straightforward, and the PacBio RS II instrument generates the information in hours. The bioinformatics analysis is performed using the freely available SMRT® Portal software. The SMRT Portal is easy to use and capable of performing all the steps necessary to analyze the raw data and to generate high-quality full-length isoforms. For the universal acceptance of the Iso-Seq method, the capacity of the SMRT Cells needs to improve at least 10- to 100-fold to make the system affordable and attractive to users.


September 22, 2019

Next-generation sequencing for pathogen detection and identification

Over the past decade, the field of genomics has seen such drastic improvements in sequencing chemistries that high-throughput sequencing, or next-generation sequencing (NGS), is being applied to generate data across many disciplines. NGS instruments are becoming less expensive, faster, and smaller, and therefore are being adopted in an increasing number of laboratories, including clinical laboratories. Thus far, clinical use of NGS has been mostly focused on the human genome, for purposes such as characterizing the molecular basis of cancer or for diagnosing and understanding the basis of rare genetic disorders. There are, however, an increasing number of examples whereby NGS is employed to discover novel pathogens, and these cases provide precedent for the use of NGS in microbial diagnostics. NGS has many advantages over traditional microbial diagnostic methods, such as unbiased rather than pathogen-specific protocols, ability to detect fastidious or non-culturable organisms, and ability to detect co-infections. One of the most impressive advantages of NGS is that it requires little or no prior knowledge of the pathogen, unlike many other diagnostic assays; therefore for pathogen discovery, NGS is very valuable. However, despite these advantages, there are challenges involved in implementing NGS for routine clinical microbiological diagnosis. We discuss these advantages and challenges in the context of recently described research studies.


September 22, 2019

Single molecule real-time (SMRT) sequencing comes of age: applications and utilities for medical diagnostics.

Short read massive parallel sequencing has emerged as a standard diagnostic tool in the medical setting. However, short read technologies have inherent limitations such as GC bias, difficulties mapping to repetitive elements, trouble discriminating paralogous sequences, and difficulties in phasing alleles. Long read single molecule sequencers resolve these obstacles. Moreover, they offer higher consensus accuracies and can detect epigenetic modifications from native DNA. The first commercially available long read single molecule platform was the RS system based on PacBio’s single molecule real-time (SMRT) sequencing technology, which has since evolved into their RSII and Sequel systems. Here we capsulize how SMRT sequencing is revolutionizing constitutional, reproductive, cancer, microbial and viral genetic testing. © The Author(s) 2018. Published by Oxford University Press on behalf of Nucleic Acids Research.

