New Iso-Seq Chicken Analysis Finds Missing Genes and Transcript Complexity
Tuesday, May 2, 2017
A new publication in BMC Genomics explores the use of RNA normalization and 5’ cap selection to enhance results from Iso-Seq studies using SMRT Sequencing. Scientists from the University of Edinburgh report that these modifications significantly boosted transcriptome coverage in a study of chicken.
“Normalized long read RNA sequencing in chicken reveals transcriptome complexity similar to human” comes from lead author Richard Kuo, senior author David Burt, and collaborators. The team chose this project because existing chicken annotation resources contain far fewer genes than expected and show very little evidence of alternative splicing. This gap is believed to stem from the limitations of earlier technologies.
In earlier studies, “researchers had to choose between low-throughput, costly methods to generate accurate full-length transcript models, such as cDNA cloning or high-throughput, cheaper methods to generate imprecise transcript models, such as short read RNA sequencing,” Kuo et al. write. “The current status of chicken annotation represents a prime example of this trade off.” The annotation has just over 17,000 genes and fewer than 18,000 transcripts, far fewer on both counts than in other vertebrates.
RNA sequencing based on short-read data is particularly challenged in identifying essential transcription characteristics, the authors note: transcript start and termination sites, transcriptional noise, and exon chaining. These problems “are practically eliminated with long read sequencing where the full-length of a transcript may be sequenced in a single read,” they add.
For this project, scientists deployed SMRT Sequencing, tweaking the Iso-Seq protocol to incorporate RNA normalization as well as 5’ cap selection. They analyzed RNA from chicken brain and embryonic tissues, normalizing both libraries but using 5’ cap selection only for embryo samples. They also collected short-read data to compare results.
This approach yielded some 60,000 transcripts and 29,000 genes, including more than 20,000 novel lncRNA transcripts. The team also found nearly 15,000 unmapped reads from both libraries, likely representing “a significant number of genes that are not currently represented in the Chicken annotations due to gaps in the genome assembly,” they report.
They compared their findings to results from Thomas et al., an earlier publication that used SMRT Sequencing for chicken transcriptome analysis without these modifications. Kuo et al. estimate that their normalization protocol “appears to have provided a transcriptome coverage efficiency of more than 5 times that of the previous study.” They explain: “This means that for every SMRT cell used with the normalization method, 5 SMRT cells would be required without normalization to achieve the same amount of transcriptome coverage.”
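For readers who want to see the arithmetic behind these figures, here is a minimal Python sketch. It is not code from the paper: the function names are hypothetical, the counts are the rounded values quoted in this post, and the factor of 5 is treated as a lower bound because the authors report "more than 5 times."

```python
import math


def cells_without_normalization(cells_with_normalization, efficiency_factor=5.0):
    """Estimate SMRT cells needed without normalization for equal coverage.

    efficiency_factor is an assumption taken from the quoted "more than
    5 times" estimate, so the result is a lower bound.
    """
    return math.ceil(cells_with_normalization * efficiency_factor)


def transcripts_per_gene(transcripts, genes):
    """Average number of transcripts per gene, a rough proxy for complexity."""
    return transcripts / genes


# Rounded figures quoted in this post (approximate, not exact counts).
new_ratio = transcripts_per_gene(60_000, 29_000)
# The existing annotation has fewer than 18,000 transcripts and just over
# 17,000 genes, so this value is an upper bound on its ratio.
old_ratio_upper_bound = transcripts_per_gene(18_000, 17_000)

print(f"New transcriptome: about {new_ratio:.2f} transcripts per gene")
print(f"Existing annotation: below {old_ratio_upper_bound:.2f} transcripts per gene")
print(f"1 normalized SMRT cell ~ at least {cells_without_normalization(1)} without")
```

On these rounded numbers, the new transcriptome averages roughly two transcripts per gene, compared with barely more than one in the existing annotation, which is consistent with the authors' point about alternative splicing being underrepresented.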
The team’s new PacBio-based transcriptome “suggests a level of transcriptional complexity that is more consistent with expectations based on the well-characterised human genome,” the scientists conclude. “Using PacBio sequencing to create a high quality transcriptome annotation can correct [underrepresentation] issues that are common in many of the public annotations.”
Richard Kuo will be speaking about this research at the SMRT Leiden conference taking place this week in the Netherlands. Follow along at #SMRTLeiden!