September 22, 2019

Isoform sequencing provides insight into natural genetic diversity in maize.

W64A, a member of the non-stiff-stalk maize group, has been used in plant breeding to develop current corn and has served as one of the most broadly used parent lines for commercial hybrid seed production (Huffman, 1984). The inbred is characterized by early flowering, average plant and ear height at maturity, very strong roots and good stalks (Runge, 2004). In addition, W64A serves as an invaluable germplasm for studying gene function, especially in the field of corn nutrition and endosperm texture, given its nearly complete vitreousness and hardness (Figure 1a). However, little is known about the genetic and genomic background of W64A. With the advent of PacBio long-read sequencing technology, we can simultaneously obtain a large number of full-length cDNAs of up to 20 kb (An et al., 2018). This article is protected by copyright. All rights reserved.


September 22, 2019

A genomic case study of mixed fibrolamellar hepatocellular carcinoma.

Mixed fibrolamellar hepatocellular carcinoma (mFL-HCC) is a rare liver tumor defined by the presence of both pure FL-HCC and conventional HCC components, represents up to 25% of cases of FL-HCC, and has been associated with worse prognosis. Recent genomic characterization of pure FL-HCC identified a highly recurrent transcript fusion (DNAJB1:PRKACA) not found in conventional HCC. We performed exome and transcriptome sequencing of a case of mFL-HCC. A novel BAC-capture approach was developed to identify a 400 kb deletion as the underlying genomic mechanism for a DNAJB1:PRKACA fusion in this case. A sensitive Nanostring Elements assay was used to screen for this transcript fusion in a second case of mFL-HCC, 112 additional HCC samples and 44 adjacent non-tumor liver samples. We report the first comprehensive genomic analysis of a case of mFL-HCC. No common HCC-associated mutations were identified. The very low mutation rate of this case, large number of mostly single-copy, long-range copy number variants, and high expression of ERBB2 were more consistent with previous reports of pure FL-HCC than conventional HCC. In particular, the DNAJB1:PRKACA fusion transcript specifically associated with pure FL-HCC was detected at very high expression levels. Subsequent analysis revealed the presence of this fusion in all primary and metastatic samples, including those with mixed or conventional HCC pathology. A second case of mFL-HCC confirmed our finding that the fusion was detectable in conventional components. An expanded screen identified a third case of fusion-positive HCC, which upon review, also had both conventional and fibrolamellar features. This screen confirmed the absence of the fusion in all conventional HCC and adjacent non-tumor liver samples. These results indicate that mFL-HCC is similar to pure FL-HCC at the genomic level and the DNAJB1:PRKACA fusion can be used as a diagnostic tool for both pure and mFL-HCC. © The Author 2016. Published by Oxford University Press on behalf of the European Society for Medical Oncology.


September 22, 2019

Recent developments in using advanced sequencing technologies for the genomic studies of lignin and cellulose degrading microorganisms.

Lignin is a complex polyphenyl aromatic compound that exists in tight association with cellulose and hemicellulose to form plant primary and secondary cell walls. Lignocellulose is an abundant renewable biomaterial on earth. It has gained much attention in the scientific community in recent years because of its potential applications in bio-based industries. Microbial degradation of lignocellulose polymers has been well studied in wood-decaying fungi. Based on the plant materials they degrade, these fungi are classified as white rot, brown rot and soft rot. However, some groups of bacteria belonging to the actinomycetes, α-proteobacteria and β-proteobacteria have also been found to degrade lignocellulosic biomass efficiently but, unlike the fungi, are not well understood. In this review we focus on recent advances in finding and understanding lignocellulose degradation by microorganisms. Conventional molecular methods such as sequencing of 16S rRNA and internal transcribed spacer (ITS) regions have been used for identification and classification of microbes. Recent progress in genomics, mainly next-generation sequencing technologies, has made whole genome sequencing of microbes possible with great ease. Whole genome sequence studies reveal high-quality information about the genes and canonical pathways involved in the degradation of lignin and other cell wall components.


September 22, 2019

Complete genome sequences of two genotype A2 small ruminant lentiviruses isolated from infected U.S. sheep.

Two distinct subgroups of genotype A2 small ruminant lentiviruses (SRLVs) have been identified in the United States that infect sheep with specific host transmembrane protein 154 (TMEM154) diplotypes. Here, we report the first two complete genome sequences of SRLV strains infecting U.S. sheep belonging to genotype A2, subgroups 1 and 2. Copyright © 2017 Workman et al.


September 22, 2019

Alternative isoform analysis of Ttc8 expression in the rat pineal gland using a multi-platform sequencing approach reveals neural regulation.

Alternative isoform regulation (AIR) vastly increases transcriptome diversity and plays an important role in numerous biological processes and pathologies. However, the detection and analysis of isoform-level differential regulation is difficult, particularly in the face of complex and incompletely-annotated transcriptomes. Here we have used Illumina short-read/high-throughput RNA-Seq to identify 55 genes that exhibit neurally-regulated AIR in the pineal gland, and then used two other complementary experimental platforms to further study and characterize the Ttc8 gene, which is involved in Bardet-Biedl syndrome and non-syndromic retinitis pigmentosa. Use of the JunctionSeq analysis tool led to the detection of several novel exons and splice junctions in this gene, including two novel alternative transcription start sites which were found to display disproportionately strong neurally-regulated differential expression in several independent experiments. These high-throughput sequencing results were validated and augmented via targeted qPCR and long-read Pacific Biosciences SMRT sequencing. We confirmed the existence of numerous novel splice junctions and the selective upregulation of the two novel start sites. In addition, we identified more than 20 novel isoforms of the Ttc8 gene that are co-expressed in this tissue. By using information from multiple independent platforms we not only greatly reduce the risk of errors, biases, and artifacts influencing our results, we also are able to characterize the regulation and splicing of the Ttc8 gene more deeply and more precisely than would be possible via any single platform. The hybrid method outlined here represents a powerful strategy in the study of the transcriptome.


September 22, 2019

Single-cell isoform RNA sequencing characterizes isoforms in thousands of cerebellar cells.

Full-length RNA sequencing (RNA-Seq) has been applied to bulk tissue, cell lines and sorted cells to characterize transcriptomes, but applying this technology to single cells has proven to be difficult, with fewer than ten single-cell transcriptomes having been analyzed thus far. Although single splicing events have been described for ≤200 single cells with statistical confidence, full-length mRNA analyses for hundreds of cells have not been reported. Single-cell short-read 3′ sequencing enables the identification of cellular subtypes, but full-length mRNA isoforms for these cell types cannot be profiled. We developed a method that starts with bulk tissue and identifies single-cell types and their full-length RNA isoforms without fluorescence-activated cell sorting. Using single-cell isoform RNA-Seq (ScISOr-Seq), we identified RNA isoforms in neurons, astrocytes, microglia, and cell subtypes such as Purkinje and Granule cells, and cell-type-specific combination patterns of distant splice sites. We used ScISOr-Seq to improve genome annotation in mouse Gencode version 10 by determining the cell-type-specific expression of 18,173 known and 16,872 novel isoforms.


September 22, 2019

Cartography of neurexin alternative splicing mapped by single-molecule long-read mRNA sequencing.

Neurexins are evolutionarily conserved presynaptic cell-adhesion molecules that are essential for normal synapse formation and synaptic transmission. Indirect evidence has indicated that extensive alternative splicing of neurexin mRNAs may produce hundreds if not thousands of neurexin isoforms, but no direct evidence for such diversity has been available. Here we use unbiased long-read sequencing of full-length neurexin (Nrxn)1α, Nrxn1β, Nrxn2β, Nrxn3α, and Nrxn3β mRNAs to systematically assess how many sites of alternative splicing are used in neurexins with a significant frequency, and whether alternative splicing events at these sites are independent of each other. In sequencing more than 25,000 full-length mRNAs, we identified a novel, abundantly used alternatively spliced exon of Nrxn1α and Nrxn3α (referred to as alternatively spliced sequence 6) that encodes a 9-residue insertion in the flexible hinge region between the fifth LNS (laminin-α, neurexin, sex hormone-binding globulin) domain and the third EGF-like sequence. In addition, we observed several larger-scale events of alternative splicing that deleted multiple domains and were much less frequent than the canonical six sites of alternative splicing in neurexins. All of the six canonical events of alternative splicing appear to be independent of each other, suggesting that neurexins may exhibit an even larger isoform diversity than previously envisioned and comprise thousands of variants. Our data are consistent with the notion that α-neurexins represent extracellular protein-interaction scaffolds in which different LNS and EGF domains mediate distinct interactions that affect diverse functions and are independently regulated by independent events of alternative splicing.


September 22, 2019

Current developments in molecular monitoring in chronic myeloid leukemia.

Molecular monitoring plays an essential role in the clinical management of chronic myeloid leukemia (CML) patients, and now guides clinical decision making. Quantitative reverse-transcriptase-polymerase-chain-reaction (qRT-PCR) assessment of BCR-ABL1 transcript levels has become the standard of care protocol in CML. However, further developments are required to assess leukemic burden more efficiently, monitor minimal residual disease (MRD), detect mutations that drive resistance to tyrosine kinase inhibitor (TKI) therapy and identify predictors of response to TKI therapy. Cartridge-based BCR-ABL1 quantitation, digital PCR and next generation sequencing are examples of technologies which are currently being explored, evaluated and translated into the clinic. Here we review the emerging molecular methods/technologies currently being developed to advance molecular monitoring in CML.


September 22, 2019

Shannon: an information-optimal de novo RNA-Seq assembler

De novo assembly of short RNA-Seq reads into transcripts is challenging due to sequence similarities in transcriptomes arising from gene duplications and alternative splicing of transcripts. We present Shannon, an RNA-Seq assembler with an optimality guarantee derived from principles of information theory: Shannon reconstructs nearly all information-theoretically reconstructable transcripts. Shannon is based on a theory we develop for de novo RNA-Seq assembly that reveals differing abundances among transcripts to be the key, rather than the barrier, to effective assembly. The assembly problem is formulated as a sparsest-flow problem on a transcript graph, and the heart of Shannon is a novel iterative flow-decomposition algorithm. This algorithm provably solves the information-theoretically reconstructable instances in linear time even though the general sparsest-flow problem is NP-hard. Shannon also incorporates several additional algorithmic advances: a new error-correction algorithm based on successive cancelation, a multi-bridging algorithm that carefully utilizes read information in the k-mer de Bruijn graph, and an approximate graph partitioning algorithm to split the transcriptome de Bruijn graph into smaller components. In tests on large RNA-Seq datasets, Shannon obtains significant increases in sensitivity along with improvements in specificity in comparison to state-of-the-art assemblers.


September 22, 2019

High-quality reference transcript datasets hold the key to transcript-specific RNA-sequencing analysis in plants.

Re-programming of the transcriptome involves both transcription and alternative splicing (AS). Some genes are regulated only at the AS level with no change in expression at the gene level. AS data must be incorporated as an essential aspect of the regulation of gene expression. RNA-sequencing (RNA-seq) can deliver both transcriptional and AS information, but accurate methods to analyse the added complexity in RNA-seq data are needed. The construction of a comprehensive reference transcript dataset (RTD) for a specific plant species, variety or accession, from all available sequence data, will immediately allow more robust analysis of RNA-seq data. RTDs will continually evolve and improve, a process that will be more efficient if resources across a community are shared and pooled. © 2016 The Authors. New Phytologist © 2016 New Phytologist Trust.


September 22, 2019

Improving eukaryotic genome annotation using single molecule mRNA sequencing.

The advantages of Pacific Biosciences (PacBio) single-molecule real-time (SMRT) technology include long reads, low systematic bias, and high consensus read accuracy. Here we use these attributes to improve on the genome annotation of the parasitic hookworm Ancylostoma ceylanicum using PacBio RNA-Seq. We sequenced 192,888 circular consensus sequences (CCS) derived from cDNAs generated using the Clontech SMARTer system. These SMARTer-SMRT libraries were normalized and size-selected, providing a robust population of expressed structural genes for subsequent genome annotation. We demonstrate that annotation based on PacBio mRNA sequences improves on annotation using conventional sequencing-by-synthesis alone, identifying 1609 (9.2%) new genes, extending the length of 3965 (26.7%) genes and increasing the total genomic exon length by 1.9 Mb (12.4%). Non-coding sequence representation (primarily from UTRs based on dT reverse transcription priming) was particularly improved, increasing in total length by fifteen-fold through increases in both the length and number of UTR exons. In addition, the UTR data provided by these CCS allowed for the identification of a novel SL2 splice leader sequence for A. ceylanicum and an increase in the number and proportion of functionally annotated genes. RNA-seq data also confirmed some of the newly annotated genes and gene features. Overall, PacBio data have supported a significant improvement in gene annotation in this genome, and are an appealing alternative or complementary technique for genome annotation to the other transcript sequencing technologies.


September 22, 2019

Multi-platform assessment of transcriptome profiling using RNA-seq in the ABRF next-generation sequencing study.

High-throughput RNA sequencing (RNA-seq) greatly expands the potential for genomics discoveries, but the wide variety of platforms, protocols and performance capabilities has created the need for comprehensive reference data. Here we describe the Association of Biomolecular Resource Facilities next-generation sequencing (ABRF-NGS) study on RNA-seq. We carried out replicate experiments across 15 laboratory sites using reference RNA standards to test four protocols (poly-A-selected, ribo-depleted, size-selected and degraded) on five sequencing platforms (Illumina HiSeq, Life Technologies PGM and Proton, Pacific Biosciences RS and Roche 454). The results show high intra-platform (Spearman rank R > 0.86) and inter-platform (R > 0.83) concordance for expression measures across the deep-count platforms, but highly variable efficiency and cost for splice junction and variant detection between all platforms. For intact RNA, gene expression profiles from rRNA-depletion and poly-A enrichment are similar. In addition, rRNA depletion enables effective analysis of degraded RNA samples. This study provides a broad foundation for cross-platform standardization, evaluation and improvement of RNA-seq.


September 22, 2019

A comprehensive approach to expression of L1 loci.

L1 elements represent the only currently active, autonomous retrotransposon in the human genome, and they make major contributions to human genetic instability. The vast majority of the 500 000 L1 elements in the genome are defective, and only relatively few can contribute to the retrotransposition process. However, there is currently no comprehensive approach to identify the specific loci that are actively transcribed separate from the excess of L1-related sequences that are co-transcribed within genes. We have developed RNA-Seq procedures, as well as a 1200 bp 5′ RACE product coupled with PacBio sequencing, that can identify the specific L1 loci that contribute most of the L1-related RNA reads. At least 99% of L1-related sequences found in RNA do not arise from the L1 promoter, instead representing pieces of L1 incorporated in other cellular RNAs. In any given cell type relatively few active L1 loci contribute to the ‘authentic’ L1 transcripts that arise from the L1 promoter, with significantly different loci seen expressed in different tissues. © The Author(s) 2016. Published by Oxford University Press on behalf of Nucleic Acids Research.


September 22, 2019

Transcriptome-wide investigation of circular RNAs in rice.

Various stable circular RNAs (circRNAs) have recently been identified as an abundant class of noncoding RNAs in Archaea, Caenorhabditis elegans, mice, and humans through high-throughput deep sequencing coupled with analysis of massive transcriptional data. CircRNAs play important roles in miRNA function and transcriptional control by acting as competing endogenous RNAs or positive regulators of their parent coding genes. However, little is known regarding circRNAs in plants. Here, we report 2354 rice circRNAs that were identified through deep sequencing and computational analysis of ssRNA-seq data. Among them, 1356 are exonic circRNAs. Some circRNAs exhibit tissue-specific expression. Rice circRNAs have a considerable number of isoforms, including alternative backsplicing and alternative splicing circularization patterns. Parental genes with multiple exons are preferentially circularized. Only 484 circRNAs have backsplices derived from known splice sites. In addition, only 92 circRNAs were found to be enriched for miniature inverted-repeat transposable elements (MITEs) in flanking sequences or to be complementary to at least 18-bp flanking intronic sequences, indicating that there are some other production mechanisms in addition to direct backsplicing in rice. Rice circRNAs have no significant enrichment for miRNA target sites. A transgenic study showed that overexpression of a circRNA construct could reduce the expression level of its parental gene in transgenic plants compared with empty-vector control plants. This suggested that circRNA and its linear form might act as a negative regulator of its parental gene. Overall, these analyses reveal the prevalence of circRNAs in rice and provide new biological insights into rice circRNAs. © 2015 Lu et al.; Published by Cold Spring Harbor Laboratory Press for the RNA Society.


September 22, 2019

The expressed portion of the barley genome

In this chapter, we refer to the expressed portion of the barley genome as the relatively small fraction of the total cellular DNA that either contains the genes that ultimately produce proteins, or that directly/indirectly controls the level, location and/or timing of when these genes are expressed and proteins are produced. We start by describing the dynamics of tissue and time-dependent gene expression and how common patterns across multiple samples can provide clues about gene networks involved in common biological processes. We then describe some of the complexities of how a single mRNA template can be differentially processed by alternative splicing to generate multiple different proteins or provide a mechanism to regulate the amount of functional gene product in a cell at a given point in time. We extend our analysis, using a number of biological examples, to address how diverse families of small non-coding microRNAs specifically regulate gene expression, and complete our appraisal by looking at the physical/molecular environment around genes that can result in either the promotion or repression of gene expression. We conclude by assessing some of the issues that remain around our ability to fully exploit the depth and power of current approaches for analysing gene expression and propose improvements that could be made using new but available sequencing and bioinformatics technologies.

