June 1, 2021

A high-quality de novo genome assembly from a single mosquito using PacBio sequencing

Author(s): Baybayan, Primo and Heaton, Haynes and Cudini, Juliana and Holroyd, Nancy and Tracey, Alan and Lambert, Christine C. and Kingan, Sarah and Galvin, Brendan and Korlach, Jonas and Berriman, Matthew and Lawniczak, Mara K. N.

A high-quality reference genome is an essential tool for studies of plant and animal genomics. PacBio Single Molecule, Real-Time (SMRT) Sequencing generates long reads with uniform coverage and high consensus accuracy, making it a powerful technology for de novo genome assembly. While PacBio is the core technology for many large genome initiatives, its relatively high DNA input requirement (3 µg for the standard library protocol) has placed it out of reach for many projects on small, non-inbred organisms that may have lower DNA content. Here we present high-quality de novo genome assemblies from single invertebrate individuals of two species: the mosquito Anopheles coluzzii and the parasitic flatworm Schistosoma mansoni. A modified SMRTbell library construction protocol, without DNA shearing or size selection, was used to generate each SMRTbell library from just 150 ng of starting genomic DNA. The libraries were run on the Sequel System with chemistry v3.0 and software v6.0, generating 21-32 Gb of sequence per SMRT Cell with 20-hour movies (10-12 Gb with 10-hour movies), and the reads were then assembled with the diploid de novo assembler FALCON-Unzip. The resulting assemblies had high contiguity (contig N50 above 3 Mb for both species) and high completeness (as determined by analysis of conserved BUSCO genes). We were also able to resolve maternal and paternal haplotypes for one-third of the genome in both cases. Because the sequenced material comes from a single diploid individual, only two haplotypes are present, which simplifies assembly compared with samples pooled from multiple individuals. This new low-input approach puts PacBio-based assemblies within reach for small, highly heterozygous organisms that comprise much of the diversity of life. The method presented here can be applied to samples with about 150 ng of starting DNA for genomes of 250 Mb to 600 Mb.
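For readers unfamiliar with the contiguity metric quoted above, the following is a minimal sketch of how a contig N50 is commonly computed from a list of contig lengths. The function name and the toy lengths are illustrative assumptions; they are not taken from the poster or from the FALCON-Unzip output.

```python
def contig_n50(lengths):
    """Return the contig N50 for a list of contig lengths (in bases).

    N50 is the length L of the contig at which, walking from the longest
    contig downward, the running total first reaches half the assembly size.
    """
    if not lengths:
        raise ValueError("need at least one contig length")
    ordered = sorted(lengths, reverse=True)  # longest contigs first
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


# Hypothetical toy assembly, not data from the poster.
toy_lengths = [5_000_000, 3_500_000, 2_000_000, 1_000_000, 500_000]
print(contig_n50(toy_lengths))
```

In this toy case the assembly totals 12 Mb, and the two longest contigs together pass the 6 Mb halfway point, so the N50 is the length of the second contig.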

Organization: PacBio
Year: 2019
