Single Molecule Real-Time (SMRT) Sequencing was used to generate long reads for whole genome shotgun sequencing of the ‘alala (Hawaiian crow). The ‘alala is endemic to Hawaii and is the only surviving lineage of the crow family, Corvidae, in the Hawaiian Islands. The population declined to fewer than 20 individuals in the 1990s, and today this charismatic species is extinct in the wild. The ‘alala currently exists in only two captive breeding facilities, and its reintroduction is scheduled to begin in the fall of 2016. Reintroduction efforts will be assisted by information from the ‘alala genome, generated and assembled with SMRT technology, which will allow detailed analysis of genes associated with immunity, behavior, and learning. Here we present best practices for achieving long reads with SMRT Sequencing for whole genome shotgun sequencing of complex plant and animal genomes such as the ‘alala genome. With recent advances in SMRTbell library preparation, P6-C4 chemistry, and 6-hour movies, the number of usable bases now exceeds 1 Gb per SMRT Cell. Read lengths averaging 10–15 kb can be routinely achieved, with the longest reads approaching 70 kb. Furthermore, more than 25% of usable bases are in reads longer than 30 kb, which is advantageous for generating contiguous draft assemblies with contig N50 values of up to 5 Mb. De novo assemblies of large genomes are now more tractable using SMRT Sequencing as a standalone technology. We also present guidelines for planning projects for the de novo assembly of large genomes.
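The contiguity and read-length figures quoted above (contig N50, share of bases in reads longer than 30 kb) can be computed from a list of sequence lengths. The following is a minimal sketch, not the pipeline used for the ‘alala assembly; the function names and the toy lengths are hypothetical and chosen only for illustration.

```python
def n50(lengths):
    """Return the N50: the length L such that sequences of length >= L
    together contain at least half of all bases."""
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length
    raise ValueError("lengths must not be empty")


def fraction_of_bases_above(lengths, threshold):
    """Return the fraction of all bases found in sequences longer than threshold."""
    total = sum(lengths)
    if total == 0:
        raise ValueError("lengths must contain at least one base")
    return sum(length for length in lengths if length > threshold) / total


# Toy data (hypothetical), in arbitrary length units.
toy_lengths = [2, 3, 4, 5, 6]
toy_n50 = n50(toy_lengths)                             # 6 + 5 = 11 >= 10, so N50 is 5
toy_fraction = fraction_of_bases_above(toy_lengths, 4)  # (5 + 6) / 20
```

Applied to read lengths with a threshold of 30,000, the second function gives the proportion of bases in reads longer than 30 kb; applied to contig lengths, the first gives the contig N50.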
Whole genome sequencing of plants and animals has proven challenging, particularly because of genome size, a high density of repetitive elements, and heterozygosity. The Sequel System delivers long reads, high consensus accuracy, and uniform coverage, enabling more complete, accurate, and contiguous assemblies of these large, complex genomes. The latest Sequel chemistry increases yield to as much as 8 Gb per SMRT Cell for long-insert libraries >20 kb and as much as 10 Gb per SMRT Cell for libraries >40 kb. In addition, the recently released SMRTbell Express Template Prep Kit reduces the preparation time (~3 hours) and DNA input (~3 µg), making the workflow easy to use for multi-SMRT Cell projects. Here, we recommend best practices for whole genome sequencing and de novo assembly of complex plant and animal genomes. We present guidelines for constructing large-insert SMRTbell libraries (>30 kb) to generate optimal read lengths and yields using the latest Sequel chemistry. We also describe ways to maximize library yield per preparation from as little as 3 µg of sheared genomic DNA. The combination of these advances makes plant and animal whole genome sequencing a practical application of the Sequel System.
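For project planning, the per-cell yields quoted above translate into a rough SMRT Cell count once a genome size and target coverage are chosen. The sketch below is a back-of-the-envelope estimate under stated assumptions: the 1 Gb genome size and 50-fold coverage are hypothetical placeholders, not recommendations from this work, and the 8 Gb and 10 Gb figures are the "up to" yields, so real projects may need more cells.

```python
import math


def smrt_cells_needed(genome_size_bp, target_coverage, yield_per_cell_bp):
    """Estimate the number of SMRT Cells needed to reach a target fold coverage.

    Total bases required = genome size x coverage; divide by the per-cell
    yield and round up, since partial cells cannot be run.
    """
    if yield_per_cell_bp <= 0:
        raise ValueError("yield_per_cell_bp must be positive")
    total_bases = genome_size_bp * target_coverage
    return math.ceil(total_bases / yield_per_cell_bp)


# Hypothetical project: 1 Gb genome at 50-fold coverage.
GENOME_SIZE_BP = 1_000_000_000   # assumption for illustration
TARGET_COVERAGE = 50             # assumption for illustration

cells_at_8gb = smrt_cells_needed(GENOME_SIZE_BP, TARGET_COVERAGE, 8_000_000_000)
cells_at_10gb = smrt_cells_needed(GENOME_SIZE_BP, TARGET_COVERAGE, 10_000_000_000)
```

Under these assumptions the estimate is 7 cells at 8 Gb per cell and 5 cells at 10 Gb per cell. Substituting a measured average yield for the best-case figure gives a more conservative plan.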