In this study, we identified two 3′-coterminal RNA molecules in the pseudorabies virus. The highly abundant short transcript (CTO-S) is encoded between the ul21 and ul22 genes, in close proximity to the replication origin (OriL) of the virus. The less abundant long RNA molecule (CTO-L) is a transcriptional readthrough product of the ul21 gene and overlaps OriL. These polyadenylated RNAs were characterized by determining their nucleotide sequences with the Illumina HiScanSQ and Pacific Biosciences Real-Time (PacBio RSII) sequencing platforms, and by analyzing their transcription kinetics using multi-time-point Real-Time RT-PCR and the PacBio RSII system. Transcription of the CTOs proved to be fully dependent on the viral transactivator protein IE180, and CTO-S was found not to be a microRNA precursor. We propose an interaction between the transcription and replication machineries at this genomic location, which might play an important role in the regulation of DNA synthesis.
Transcriptome-wide survey of pseudorabies virus using next- and third-generation sequencing platforms.
Pseudorabies virus (PRV) is an alphaherpesvirus of swine. PRV has a large double-stranded DNA genome and, as recent investigations have revealed, a very complex transcriptome. Here, we present a large RNA-Seq dataset derived from both short- and long-read sequencing. The dataset contains 1.3 million 100 bp paired-end reads obtained from the Illumina random-primed libraries, as well as 10 million 50 bp single-end reads generated by Illumina polyA-seq. The Pacific Biosciences RSII non-amplified method yielded 57,021 reads of inserts (ROIs) aligned to the viral genome, the amplified method yielded 158,396 PRV-specific ROIs, and the Sequel platform yielded 12,555 ROIs. The Oxford Nanopore MinION device generated 44,006 reads using the regular cDNA-sequencing method, whereas 29,832 and 120,394 reads were produced using the direct RNA-sequencing and the Cap-selection protocols, respectively. The raw reads were aligned to the PRV reference genome (KJ717942.1). The dataset can be used to compare different sequencing approaches and library preparation methods, as well as to validate and test bioinformatic pipelines.