Long-read assembly of the Aedes aegypti Aag2 cell line genome resolves ancient endogenous viral elements
Transmission of arboviruses such as dengue virus by Aedes aegypti causes debilitating disease across the globe. Disease in humans can include severe acute symptoms such as hemorrhagic fever and organ failure, whereas mosquitoes tolerate high titers of virus in a persistent infection. The mechanisms responsible for this viral tolerance are unclear. Recent publications have highlighted the integration of genetic material from non-retroviral RNA viruses into the host genome during infection, a process that relies on endogenous reverse transcriptase activity from transposons. These endogenous viral elements (EVEs) are predicted to be ancient, and at least some are under purifying selection, suggesting that they are beneficial to the host. To characterize EVE biogenesis in a tractable system, we sequenced the Ae. aegypti cell line Aag2 to 58-fold coverage and present a de novo assembly of its genome. The assembly contains 1.7 Gb of genomic sequence and 255 Mb of alternative haplotype-specific sequence, in contigs with an N50 of 1.4 Mb, which is one to three orders of magnitude longer than that of other assemblies from the Aedes genus. The Aag2 genome is highly repetitive (70%), with most repeats classified as transposable elements (60%). We identify EVEs homologous to a range of extant viruses, many of which cluster in these regions of repetitive DNA. The contiguous assembly allows more comprehensive identification of the transposable elements and EVEs that are most likely to be lost in assemblies lacking the read length of SMRT Sequencing.
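For readers unfamiliar with the contiguity metric quoted above, the following is a minimal sketch of how a contig N50 is conventionally computed: the length of the contig at which the running total, taken from the longest contig downward, first reaches half of the total assembled length. The function name `n50` and the toy contig lengths are illustrative assumptions and are not taken from the Aag2 assembly.

```python
def n50(lengths):
    """Return the N50 of a collection of contig lengths (standard definition)."""
    if not lengths:
        raise ValueError("no contig lengths given")
    # Walk contigs from longest to shortest, accumulating assembled bases.
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        # The first contig that brings the running sum to half the total is the N50 contig.
        if running >= half_total:
            return length


# Hypothetical toy contig lengths (arbitrary units), for illustration only.
toy_contigs = [80, 70, 50, 40, 30, 20, 10]
print(n50(toy_contigs))  # prints 70: 80 + 70 = 150, half of the 300 total
```

Because N50 weights long contigs heavily, an assembly built from longer reads can show a much larger N50 even when the total assembled length is similar.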