Breaking New Frontiers in Grass Genomics to Understand Drought Tolerance with the 2014 SMRT Grant Program Winner: Oropetium thomaeum
Friday, January 23, 2015
Emerging from a myriad of interesting genome nominations, from the American cranberry to South American prawns and African guava, Oropetium thomaeum, submitted by Todd Mockler at the Donald Danforth Plant Science Center, was selected as the first winner of the “Most Interesting Genome in the World” SMRT® grant program in 2014. Also affectionately known as Oro, this grass species can be revived with water after prolonged exposure to drought. At 250 Mb, the genome is also the smallest among grasses due to compaction of complex repeat and gene structures, including previously identified expansions in osmoprotectant biosynthesis pathways.
To kick off the second annual launch of this program, NSF postdoctoral fellow Robert VanBuren of Mockler’s group presented initial results of the Oro genome assembly and analysis at the recent Plant and Animal Genome (PAG) XXIII international conference in San Diego. From 18 Gb of sequencing data at 65x genome coverage, with a read length N50 of 16,485 bp, HGAP produced a genome assembly of 625 contigs with a contig N50 of 2.39 Mb. The maximum contig length was 7.98 Mb, even though repeats make up roughly 50% of the genome, which was more than expected. The impressive assembly totals 244.46 Mb and covers 98.3% of the expected genome size. This achievement is largely attributed to high-quality, high-molecular-weight genomic DNA, from which reads longer than 20 kb alone provided 10x coverage of the genome.
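For readers less familiar with the statistics quoted above, here is a minimal Python sketch of how a contig (or read length) N50 and a mean coverage depth are commonly computed. The function names and the toy lengths are illustrative assumptions; this is not the HGAP pipeline or the group's actual analysis code.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths (in bp).

    N50 is the length L such that sequences of length >= L together
    contain at least half of the total bases.
    """
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length
    raise ValueError("lengths must be non-empty")


def mean_depth(total_sequenced_bp, genome_size_bp):
    """Mean coverage depth: sequenced bases divided by genome size."""
    return total_sequenced_bp / genome_size_bp


# Made-up contig lengths for illustration only (not Oro data).
toy_contigs = [8, 7, 5, 4, 3, 2, 1]
print(n50(toy_contigs))       # longest-first, 8 + 7 reaches half of 30
print(mean_depth(1000, 100))  # toy numbers: 1000 bp over a 100 bp genome
```

The same `n50` function applies to read lengths as well as contig lengths; only the input list changes.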
Both Mockler and VanBuren were blown away by this new record-breaking plant genome assembly. “PacBio is a game changer for plant genomics,” says VanBuren, noting that they were also able to identify all the telomeres in the genome based on their tandem repeat signatures. The compact genome serves as an excellent resource for comparative work among grass genomes to understand large-scale structural variation, genome structure reorganization, metabolic networks, stress pathways, and other secondary analyses. View the recording of the preliminary analysis presented for the Oro genome at the PAG XXIII conference.
Following this initial success, the 2015 SMRT Grant program is co-sponsored by Sage Science, Computomics, and the Arizona Genomics Institute. The latest P6-C4 chemistry will be used for the winning proposal. This release has been shown to deliver average sequencing read lengths of 10-15 kb or more, with the longest reads in the distribution exceeding 60 kb, on the PacBio® RS II system for complex genome projects. Average throughput per SMRT Cell has ranged from 500 Mb to 1 Gb, depending on the application. These features have also further accelerated PacBio’s Iso-Seq™ application, which delivers whole-transcriptome sequencing of full-length cDNA transcripts to distinguish between isoforms for genome annotation as well as gene discovery.
Other submissions received in 2014 include a critically endangered Hawaiian crow (Corvus hawaiiensis), a famine-causing ascomycete fungal pathogen (Cercospora zeina), and a hermaphroditic fish (Kryptolebias marmoratus). We look forward to reading (and learning!) about all the exciting work that drives the passion of scientists through the submitted proposals for 2015!
Details for the 2015 “Most Interesting Genome in the World” SMRT Grant program can be found at www.pacb.com/smrtgrant/.