September 22, 2019  |  

Single Molecule Sequencing: new outlooks for solving genome assembly and transcripts identification challenges

In this review, we introduce a novel sequencing technology named Single Molecule Real Time sequencing. Also called Single Molecule Sequencing because it does not require any amplification, this new technology is able to produce much longer reads than previous NGS technologies such as Illumina. This improvement in read size, which can reach 150-fold, will solve many challenges posed by current NGS technologies. Short NGS reads, which reach a maximum size of 300 bp, make it hard to reconstruct a whole genome and always lead to fragmented genome assemblies. It is also difficult to correctly infer transcript quantification and identification when isoform diversity is high. Despite their higher error rate, long reads have shown very promising results on these current issues. We show that longer reads can produce less fragmented, higher-quality assemblies and can also sequence mRNA from start to end, making it much easier to infer correct transcript quantification and even allowing the discovery of new intron structures and therefore new isoforms.


September 22, 2019  |  

Introduction to isoform sequencing using Pacific Biosciences technology (Iso-Seq)

Alternative RNA splicing is a known phenomenon, but we still do not have a complete catalog of isoforms that explain variability in the human transcriptome. We have made significant progress in developing methods to study variability of the transcriptome, but we are far from having a complete picture of the transcriptome. The initial methods to study gene expression were based on cloning of cDNAs and Sanger sequencing. The strategy was labor-intensive and expensive. With the development of microarrays, different methods based on exon arrays and tiling arrays provided valuable information about RNA expression. However, microarrays presented significant limitations. Most of the limitations became apparent by 2005, but it was not until 2008 that an alternative method to study the transcriptome was developed. RNA Sequencing using next-generation sequencing (RNA-Seq) quickly became the technology of choice for gene expression profiling. Recently, the precision and sensitivity of RNA-Seq have come into question, especially for transcriptome reconstruction. This chapter will describe a relatively new method, “Isoform Sequencing” (Iso-Seq). Iso-Seq was developed by Pacific Biosciences (PacBio), and it is capable of identifying new isoforms with extraordinary precision due to its long-read technology. The technique to create libraries is straightforward, and the PacBio RS II instrument generates the information in hours. The bioinformatics analysis is performed using the freely available SMRT® Portal software. The SMRT Portal is easy to use and capable of performing all the steps necessary to analyze the raw data and to generate high-quality full-length isoforms. For the universal acceptance of the Iso-Seq method, the capacity of the SMRT Cells needs to improve at least 10- to 100-fold to make the system affordable and attractive to users.


September 22, 2019  |  

Single molecule real-time (SMRT) sequencing comes of age: applications and utilities for medical diagnostics.

Short read massive parallel sequencing has emerged as a standard diagnostic tool in the medical setting. However, short read technologies have inherent limitations such as GC bias, difficulties mapping to repetitive elements, trouble discriminating paralogous sequences, and difficulties in phasing alleles. Long read single molecule sequencers resolve these obstacles. Moreover, they offer higher consensus accuracies and can detect epigenetic modifications from native DNA. The first commercially available long read single molecule platform was the RS system based on PacBio’s single molecule real-time (SMRT) sequencing technology, which has since evolved into their RSII and Sequel systems. Here we capsulize how SMRT sequencing is revolutionizing constitutional, reproductive, cancer, microbial and viral genetic testing. © The Author(s) 2018. Published by Oxford University Press on behalf of Nucleic Acids Research.


September 22, 2019  |  

Next generation sequencing technology: Advances and applications.

Impressive progress has been made in the field of Next Generation Sequencing (NGS). Through advancements in the fields of molecular biology and technical engineering, parallelization of the sequencing reaction has profoundly increased the total number of produced sequence reads per run. Current sequencing platforms allow for an unprecedented view into complex mixtures of RNA and DNA samples. NGS is currently evolving into a molecular microscope finding its way into virtually every field of biomedical research. In this chapter we review the technical background of the different commercially available NGS platforms with respect to template generation and the sequencing reaction and take a small step towards what the upcoming NGS technologies will bring. We close with an overview of different implementations of NGS into biomedical research. This article is part of a Special Issue entitled: From Genome to Function. Copyright © 2014 Elsevier B.V. All rights reserved.


September 22, 2019  |  

Single-cell mRNA isoform diversity in the mouse brain.

Alternative mRNA isoform usage is an important source of protein diversity in mammalian cells. This phenomenon has been extensively studied in bulk tissues; however, it remains unclear how this diversity is reflected in single cells. Here we use long-read sequencing technology combined with unique molecular identifiers (UMIs) to reveal patterns of alternative full-length isoform expression in single cells from the mouse brain. We found a surprising amount of isoform diversity, even after applying a conservative definition of what constitutes an isoform. Genes tend to have one or a few isoforms highly expressed and a larger number of isoforms expressed at a low level. However, for many genes, nearly every sequenced mRNA molecule was unique, and many events affected coding regions, suggesting previously unknown protein diversity in single cells. Exon junctions in coding regions were less prone to splicing errors than those in non-coding regions, indicating purifying selection on splice donor and acceptor efficiency. Our findings indicate that mRNA isoform diversity is an important source of biological variability also in single cells.


September 22, 2019  |  

A quantitative SMRT cell sequencing method for ribosomal amplicons.

Advances in sequencing technologies continue to provide unprecedented opportunities to characterize microbial communities. For example, the Pacific Biosciences Single Molecule Real-Time (SMRT) platform has emerged as a unique approach harnessing DNA polymerase activity to sequence template molecules, enabling long reads at low costs. With the aim to simultaneously classify and enumerate in situ microbial populations, we developed a quantitative SMRT (qSMRT) approach that involves the addition of exogenous standards to quantify ribosomal amplicons derived from environmental samples. The V7-9 regions of 18S SSU rDNA were targeted and quantified from protistan community samples collected in the Ross Sea during the Austral summer of 2011. We used three standards of different length and optimized conditions to obtain accurate quantitative retrieval across the range of expected amplicon sizes, a necessary criterion for analyzing taxonomically diverse 18S rDNA molecules from natural environments. The ability to concurrently identify and quantify microorganisms in their natural environment makes qSMRT a powerful, rapid and cost-effective approach for defining ecosystem diversity and function. Copyright © 2017 Elsevier B.V. All rights reserved.


September 22, 2019  |  

PacBio sequencing and its applications.

Single-molecule, real-time sequencing developed by Pacific Biosciences offers longer read lengths than the second-generation sequencing (SGS) technologies, making it well-suited for unsolved problems in genome, transcriptome, and epigenetics research. The highly contiguous de novo assemblies using PacBio sequencing can close gaps in current reference assemblies and characterize structural variation (SV) in personal genomes. With longer reads, we can sequence through extended repetitive regions and detect mutations, many of which are associated with diseases. Moreover, PacBio transcriptome sequencing is advantageous for the identification of gene isoforms and facilitates reliable discoveries of novel genes and novel isoforms of annotated genes, due to its ability to sequence full-length transcripts or fragments with significant lengths. Additionally, PacBio’s sequencing technique provides information that is useful for the direct detection of base modifications, such as methylation. In addition to using PacBio sequencing alone, many hybrid sequencing strategies have been developed to make use of more accurate short reads in conjunction with PacBio long reads. In general, hybrid sequencing strategies are more affordable and scalable, especially for small laboratories, than using PacBio sequencing alone. The advent of PacBio sequencing has made available much information that could not be obtained via SGS alone. Copyright © 2015 The Authors. Production and hosting by Elsevier Ltd. All rights reserved.


September 22, 2019  |  

Analyses of alternative polyadenylation: from old school biochemistry to high-throughput technologies.

Alterations in the usage of polyadenylation sites during transcription termination yield transcript isoforms from a gene. Recent findings of transcriptome-wide alternative polyadenylation (APA) as a molecular response to changes in biology position APA not only as a molecular event of early transcriptional termination but also as a cellular regulatory step affecting various biological pathways. With the development of high-throughput profiling technologies at a single nucleotide level and their applications targeted to the 3′-end of mRNAs, dynamics in the landscape of mRNA 3′-ends are measurable at a global scale. In this review, methods and technologies that have been adopted to study APA events are discussed. In addition, various bioinformatics algorithms for APA isoform analysis using publicly available RNA-seq datasets are introduced. [BMB Reports 2017; 50(4): 201-207].


September 22, 2019  |  

Long reads: their purpose and place.

In recent years long-read technologies have moved from being a niche and specialist field to a point of relative maturity likely to feature frequently in the genomic landscape. Analogous to next generation sequencing, the cost of sequencing using long-read technologies has materially dropped whilst the instrument throughput continues to increase. Together these changes present the prospect of sequencing large numbers of individuals with the aim of fully characterizing genomes at high resolution. In this article, we will endeavour to present an introduction to long-read technologies showing: what long reads are; how they are distinct from short reads; why long reads are useful and how they are being used. We will highlight the recent developments in this field, and the applications and potential of these technologies in medical research, and clinical diagnostics and therapeutics.


September 22, 2019  |  

Fluorescently-tagged human eIF3 for single-molecule spectroscopy.

Human translation initiation relies on the combined activities of numerous ribosome-associated eukaryotic initiation factors (eIFs). The largest factor, eIF3, is an ~800 kDa multiprotein complex that orchestrates a network of interactions with the small 40S ribosomal subunit, other eIFs, and mRNA, while participating in nearly every step of initiation. How these interactions take place during the time course of translation initiation remains unclear. Here, we describe a method for the expression and affinity purification of a fluorescently-tagged eIF3 from human cells. The tagged eIF3 dodecamer is structurally intact, functions in cell-based assays, and interacts with the HCV IRES mRNA and the 40S-IRES complex in vitro. By tracking the binding of single eIF3 molecules to the HCV IRES RNA with a zero-mode waveguides-based instrument, we show that eIF3 samples both wild-type IRES and an IRES that lacks the eIF3-binding region, and that the high-affinity eIF3-IRES interaction is largely determined by slow dissociation kinetics. The application of single-molecule methods to more complex systems involving eIF3 may unveil dynamics underlying mRNA selection and ribosome loading during human translation initiation. © The Author(s) 2017. Published by Oxford University Press on behalf of Nucleic Acids Research.


September 22, 2019  |  

2′-O-methylation in mRNA disrupts tRNA decoding during translation elongation.

Chemical modifications of mRNA may regulate many aspects of mRNA processing and protein synthesis. Recently, 2′-O-methylation of nucleotides was identified as a frequent modification in translated regions of human mRNA, showing enrichment in codons for certain amino acids. Here, using single-molecule, bulk kinetics and structural methods, we show that 2′-O-methylation within coding regions of mRNA disrupts key steps in codon reading during cognate tRNA selection. Our results suggest that 2′-O-methylation sterically perturbs interactions of ribosomal-monitoring bases (G530, A1492 and A1493) with cognate codon-anticodon helices, thereby inhibiting downstream GTP hydrolysis by elongation factor Tu (EF-Tu) and A-site tRNA accommodation, leading to excessive rejection of cognate aminoacylated tRNAs in initial selection and proofreading. Our current and prior findings highlight how chemical modifications of mRNA tune the dynamics of protein synthesis at different steps of translation elongation.


September 22, 2019  |  

Dimer arrangement and monomer flattening determine actin filament formation

Actin filament dynamics underlie key cellular processes, such as cell motility. Although actin filament elongation has been extensively studied over the past decades, the mechanism of filament nucleation remains unclear. Here, we immobilized gelsolin, a pointed-end nucleator, at the bottom of zero-mode waveguides to directly monitor the early steps of filament assembly. Our data revealed extensive dynamics and that only one of two populations elongates. Analysis of the kinetics revealed a more stable trimer but a less stable tetramer in the elongating population compared to the non-elongating one. Furthermore, blocking flattening, the conformational change associated with filament formation, prevented the formation of both types of assemblies. Thus, flattening and the initial monomer arrangement determine gelsolin-mediated filament initiation.


September 22, 2019  |  

DNA N6-adenine methylation in Arabidopsis thaliana.

DNA methylation on N6-adenine (6mA) has recently been found to be a potential epigenetic mark in several unicellular and multicellular eukaryotes. However, its distribution patterns and potential functions in land plants, which are primary producers for most ecosystems, remain largely unknown. Here we report global profiling of 6mA sites at single-nucleotide resolution in the genome of Arabidopsis thaliana at different developmental stages using single-molecule real-time sequencing. 6mA sites are widely distributed across the Arabidopsis genome and enriched over the pericentromeric heterochromatin regions. 6mA occurs more frequently in gene bodies than in intergenic regions. Analysis of 6mA methylomes and RNA sequencing data demonstrates that 6mA frequency positively correlates with the gene expression level and the transition from vegetative to reproductive growth in Arabidopsis. Our results uncover 6mA as a DNA mark associated with actively expressed genes in Arabidopsis, suggesting that 6mA serves as a hitherto unknown epigenetic mark in land plants. Copyright © 2018 Elsevier Inc. All rights reserved.


September 22, 2019  |  

Genomic analysis for heavy metal resistance in S. maltophilia

Stenotrophomonas maltophilia is highly resistant to heavy metals, but the genetic basis of metal resistance in S. maltophilia is poorly understood. In this study, the genome of S. maltophilia Pho, isolated from contaminated soil near a metalwork factory, was sequenced using PacBio RS II. Its genome is composed of a single chromosome with a GC content of 66.4% and 4434 protein-encoding genes. Comparative analysis revealed high synteny between S. maltophilia Pho and the model strain, S. maltophilia K279a. First, the type and number of mechanisms of heavy metal uptake were analyzed. Results showed 7 unspecific ion transporter genes and 13 specific ion transporter genes, most of which were involved in iron transport. However, the sulfate permeases belonging to the SulT/CysP family, which can take up chromate, and the high-affinity ZnuABC/SitABCD were absent. Second, the putative genes controlling metal efflux were analyzed. Results showed that this bacterium encoded 5 CDFs, 1 copper-exporting ATPase and 4 RND systems, including 2 CzcABC efflux pumps. Moreover, putative metal transformation genes, including arsenate and mercury detoxification genes, were also identified. This study may provide useful information on the metal resistance mechanisms of S. maltophilia.
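
For readers unfamiliar with the GC content figure quoted in the abstract above, the following is a minimal sketch of how such a percentage is commonly computed from a nucleotide sequence. The function name `gc_content` and the toy sequence are illustrative assumptions, not taken from the study, and the authors' actual tooling is not stated.

```python
def gc_content(seq):
    """Return the percentage of G and C bases among unambiguous A/C/G/T bases."""
    seq = seq.upper()
    # Ignore ambiguity codes such as N so they do not skew the percentage.
    counted = [base for base in seq if base in "ACGT"]
    if not counted:
        raise ValueError("sequence contains no A/C/G/T bases")
    gc = sum(1 for base in counted if base in "GC")
    return 100.0 * gc / len(counted)


# Hypothetical toy sequence, not real S. maltophilia data.
toy_sequence = "GGCCAT"
print(f"GC content: {gc_content(toy_sequence):.1f}%")
```

Excluding ambiguous bases from the denominator is one possible convention; other tools may count them differently.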


September 22, 2019  |  

Draft genome assembly of the invasive cane toad, Rhinella marina.

The cane toad (Rhinella marina, formerly Bufo marinus) is a species native to Central and South America that has spread across many regions of the globe. Cane toads are known for their rapid adaptation and deleterious impacts on native fauna in invaded regions. However, despite an iconic status, there are major gaps in our understanding of cane toad genetics. The availability of a genome would help to close these gaps and accelerate cane toad research. We report a draft genome assembly for R. marina, the first of its kind for the Bufonidae family. We used a combination of long-read Pacific Biosciences RS II and short-read Illumina HiSeq X sequencing to generate 359.5 Gb of raw sequence data. The final hybrid assembly of 31,392 scaffolds was 2.55 Gb in length with a scaffold N50 of 168 kb. BUSCO analysis revealed that the assembly included full-length or partial fragments of 90.6% of tetrapod universal single-copy orthologs (n = 3950), illustrating that the gene-containing regions have been well assembled. Annotation predicted 25,846 protein-coding genes with similarity to known proteins in Swiss-Prot. Repeat sequences were estimated to account for 63.9% of the assembly. The R. marina draft genome assembly will be an invaluable resource that can be used to further probe the biology of this invasive species. Future analysis of the genome will provide insights into cane toad evolution and enrich our understanding of their interplay with the ecosystem at large.
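
The scaffold N50 reported in the abstract above is a standard contiguity metric: the length L such that scaffolds of length L or longer together contain at least half of the total assembly length. The sketch below illustrates that definition under stated assumptions; the function name `n50` and the toy lengths are hypothetical and are not the cane toad data or the authors' pipeline.

```python
def n50(lengths):
    """Return the N50 of a collection of scaffold (or contig) lengths."""
    if not lengths:
        raise ValueError("need at least one length")
    # Walk from the longest scaffold down, accumulating length until
    # at least half of the total assembly size has been covered.
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


# Hypothetical toy assembly: total length 300, so half is 150.
toy_scaffolds = [100, 80, 60, 40, 20]
print(n50(toy_scaffolds))  # 100 + 80 = 180 >= 150, so N50 is 80
```

A higher N50 for the same total length generally indicates a less fragmented assembly, which is why the metric is reported alongside scaffold count and assembly size.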

