A high-quality genome assembly of SMRT sequences reveals long-range haplotype structure in the diploid mosquito Aedes aegypti
Aedes aegypti is a tropical and subtropical mosquito vector for Zika, yellow fever, dengue fever, chikungunya, and other diseases. The outbreak in the Americas of Zika, which can cause microcephaly in the fetuses of infected women, adds urgency to the need for a high-quality reference genome to better understand the organism’s biology and its role in transmitting human disease. We describe the first diploid assembly of an insect genome, using SMRT sequencing and the open-source assembler FALCON-Unzip. This assembly has high contiguity (contig N50 1.3 Mb), is more complete than previous assemblies (length 1.45 Gb, with 87% of BUSCO genes complete), and is of high quality (mean base quality >QV30). Long-range haplotype structure, in some cases encompassing more than 4 Mb of extremely divergent homologous sequence, is resolved using a combination of the FALCON-Unzip assembler, genome annotation, coverage depth, and pairwise nucleotide alignments.
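For readers less familiar with the summary statistics quoted above, the following Python sketch shows how contig N50 and a Phred-scaled quality value (QV) are conventionally computed. It is an illustration under stated assumptions only: the function names and the toy contig lengths are hypothetical, and nothing here is taken from the assembly or from the tools used in this work.

```python
def contig_n50(lengths):
    """Return the N50 of a set of contig lengths.

    N50 is the length L of the contig at which, walking from the longest
    contig downwards, the running total first reaches half of the summed
    length of all contigs.
    """
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length
    raise ValueError("no contig lengths given")


def phred_to_error_rate(qv):
    """Convert a Phred-scaled quality value to an error probability.

    By the standard Phred definition, QV = -10 * log10(p_error).
    """
    return 10 ** (-qv / 10)


# Hypothetical toy contig lengths (arbitrary units), not real assembly data.
toy_lengths = [5, 4, 3, 2, 1]
toy_n50 = contig_n50(toy_lengths)

# Error probability implied by a quality value of 30.
qv30_error = phred_to_error_rate(30)
```

Under the Phred definition, a mean base quality above QV30 corresponds to fewer than about one expected error per 1,000 bases. For the toy lengths above the N50 is 4, because the two longest contigs (5 + 4 = 9) are the first to cover half of the total of 15.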