Kim Worley from Baylor’s Human Genome Sequencing Center describes the improvement of the sooty mangabey primate genome. Sooty mangabey is a model organism for HIV research, since this particular primate can be infected with the immunodeficiency virus and never develop any symptoms. Worley and her team used PacBio long reads in conjunction with their own assembly tool, PBJelly, closing 64% and improving another 19% of the gaps.
Judson Ward, principal scientist at Driscoll’s Strawberries in California, introduces a genome assembly for Potentilla micrantha, which is closely related to strawberry but lacks fleshy ‘fruits’ or berries. Comparative genomics between P. micrantha and strawberry will yield significant information regarding the genetic mechanisms controlling fruit development. Using SMRT Sequencing, Driscoll’s sequenced the 240 Mb P. micrantha genome and produced a draft genome assembly spanning the majority of the predicted sequence length. A comparison of sequence data produced using the Illumina HiSeq2000 and the PacBio RS platform demonstrated that PacBio sequencing produced a significantly longer N50 contig size and permitted a more complete genome…
Simon Chan of UC Davis explains how PacBio long-read sequencing revealed higher-order repeats in the centromeres of switchgrass, repeats that would have remained hidden with the much shorter Sanger reads.
From USDA’s Agricultural Research Service, molecular biologist Sean Gordon discusses the need for long-read sequencing to map an organism’s transcriptome. His team analyzed the wood-decaying fungus Plicaturopsis crispa first with short reads and found that they were missing exons and other important information. They switched to SMRT Sequencing so they could observe, rather than infer, full-length transcripts.
Chongyuan Luo from the Salk Institute for Biological Studies describes sequencing three strains of Arabidopsis thaliana using PacBio technology. The goal: uncover structural variants that have been missed by short-read and other sequencers. Luo notes that PacBio sequencing provides highly accurate SNP detection and also extends the mappability of reads beyond what is possible with short-read data, producing better and more accurate assemblies.
Allen Van Deynze from UC Davis presents the genome sequencing and assembly project for spinach, an organism with a genome of 980 Mb. Results indicate a high-accuracy assembly with significantly higher N50 contig length than a previous short-read assembly. The PacBio assembly has also made it possible to fill gaps in the prior assembly.
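Several of these talks compare assemblies by N50 contig length. As a rough illustration of what that statistic measures, here is a minimal Python sketch; the function name `n50` and the toy contig lengths are assumptions made for this example and do not come from any of the presentations.

```python
def n50(contig_lengths):
    """Return the N50 of an assembly.

    N50 is the length L such that contigs of length >= L
    together contain at least half of all assembled bases.
    """
    lengths = sorted(contig_lengths, reverse=True)
    half_total = sum(lengths) / 2
    running = 0
    for length in lengths:
        running += length
        if running >= half_total:
            return length
    raise ValueError("contig_lengths must not be empty")


# Hypothetical toy assemblies of the same 1,000 bases (not real data).
fragmented = [100] * 10             # many short contigs
contiguous = [400, 300, 200, 100]   # fewer, longer contigs

print(n50(fragmented))   # N50 of the fragmented assembly
print(n50(contiguous))   # N50 of the more contiguous assembly
```

Both toy assemblies contain the same number of bases, but the one built from longer contigs has the higher N50, which is why the statistic is commonly used as a shorthand for contiguity.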
Shane Brubaker from renewable oil manufacturer Solazyme reports using the PacBio system to sequence the genome of a GC-rich strain of algae that couldn’t be fully assembled with short-read sequence data. He notes that CCS reads exceed Sanger quality at significantly lower cost.
Michiel van Eijk of KeyGene shared a de novo PacBio assembly of tetraploid cotton. The genome assembly was further enhanced and annotated using Iso-Seq data collected from cotton root, leaf, and stem tissues. The data, full-length cDNA transcripts, captured alternative splicing diversity across these tissue types, allowing for isoform differentiation.
Susan Strickler of the Boyce Thompson Institute presented strategies for assembling the genome of Arabica coffee, an allotetraploid with a genome size of approximately 1.3 Gb. A de novo PacBio assembly was constructed and presented. The new high-quality reference will be used to guide assemblies of the diploid ancestors of Arabica coffee and re-sequencing data for a set of C. arabica accessions to more fully characterize the genetic diversity of this crop species that is highly susceptible to climate change.
Tim Smith of the USDA presents his work to establish a high-quality reference genome of the San Clemente goat. After generating 70-fold PacBio sequence data, his team found the PacBio assembly to be far more complete than the existing draft reference genome, with contigs 100 times longer on average.
Robert VanBuren of the Danforth Plant Science Center and winner of the 2014 SMRT Grant Program presents a de novo assembly of the Oro grass genome (Oropetium thomaeum). The reference genome will aid scientists studying drought tolerance in common crop species, especially cereals, through comparative genomics to understand potential key genetic underpinnings for this “resurrection” trait. Initial comparative results to Brachypodium and maize are presented, as well as secondary analysis to identify key metabolic traits.
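Figures such as "70-fold" express sequencing depth: the total number of sequenced bases divided by the genome size. The sketch below shows that arithmetic in Python; the function names and the 1 Gb genome size are placeholders assumed for illustration, not values reported in the talk.

```python
def fold_coverage(total_sequenced_bases, genome_size_bases):
    """Average depth: sequenced bases divided by genome size."""
    return total_sequenced_bases / genome_size_bases


def bases_needed(target_fold, genome_size_bases):
    """Total bases to sequence to reach a target average depth."""
    return target_fold * genome_size_bases


# Hypothetical genome size, chosen only to keep the numbers round.
PLACEHOLDER_GENOME_SIZE = 1_000_000_000  # 1 Gb (assumption)

required = bases_needed(70, PLACEHOLDER_GENOME_SIZE)
print(required)                                          # bases for 70-fold depth
print(fold_coverage(required, PLACEHOLDER_GENOME_SIZE))  # recovers the depth
```

Because depth scales linearly with genome size, the same fold coverage requires proportionally more raw sequence for a larger genome.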
Doreen Ware introduces her team’s new assembly of maize, built with PacBio long-read sequencing and genome maps from BioNano Genomics. With a contig N50 of nearly 10 Mb and more complete information than any previous assembly, Ware says, “This is just an amazing time to be a plant scientist.” Her presentation includes a number of highlights from the new assembly, which may help crop improvement efforts for maize.
Jason Chin, senior director of bioinformatics at PacBio, talks about using long-read sequence data to generate diploid genome assemblies to produce comprehensive haplotype sequence reconstructions. In the presentation, Chin describes the FALCON Unzip process that combines SNP phasing with the assembly process and allows for determination of the haplotype sequences and identification of structural variants. He presents an example of diploid assembly from inbred Arabidopsis strains.
David Kudrna, Rod Wing, and the Arizona Genomics Institute (AGI) plan to fully sequence and annotate the genomes and transcriptomes of 3-4 accessions from each of the estimated 9-15 subpopulations of rice. These subpopulation-specific references will be used to map resequencing data of 3,000 individuals for variation discovery, GWAS, and genomic selection studies to address important traits such as biotic and abiotic stress tolerances, yield, and grain quality. Here Dr. Kudrna presents the first high-quality genome sequence of the rice variety Nagina22. AGI produced and assembled 65-fold coverage of SMRT Sequencing data, resulting in an assembly of 373 Mb with…
Alan Archibald compares two new de novo PacBio pig genome assemblies to a previously released draft genome, finding significant improvement that could be important for breeding programs. In one example, he shows chromosome 1, which was split into more than 9,000 contigs in the draft genome, is now represented in just 10 contigs.