Scientists from WashU, Macrogen, and Mount Sinai are using long-read sequencing with single-molecule, next-generation genome mapping to create gold-quality de novo assemblies of human genomes. Unbiased de novo assembled genomes also highlight the substantial amount of structural variation unique to individuals and populations, which cannot be accessed by short-read technologies that use a reference-based re-sequencing approach.
At the University of California, Davis, Dario Cantu is applying long-read PacBio sequencing to the heterozygous genome of the Cabernet Sauvignon grape. Now, his team has access to whole genome data that could help guard against the effects of climate change and disease.
At DuPont Pioneer, DNA sequencing is paramount for R&D to reveal the genetic basis for traits of interest in commercial crops such as maize, soybean, sorghum, sunflower, alfalfa, canola, wheat, rice, and others. The company cannot afford to wait the years it has historically taken for high-quality reference genomes to be produced. Nor can it rely on a single reference to represent the genetic diversity in its germplasm.
The PacBio Platform includes an extensive software portfolio that employs key advantages of SMRT (Single Molecule, Real-Time) Sequencing technology: extraordinarily long reads, highest consensus accuracy, uniform coverage and simultaneous epigenetic characterization. Core elements of our analytical portfolio include SMRT Analysis software, DevNet and SMRT Compatible products.
High oil and protein content make tetraploid peanut a leading oil and food legume. Here we report a high-quality peanut genome sequence, comprising 2.54 Gb with 20 pseudomolecules and 83,709 protein-coding gene models. We characterize gene functional groups implicated in seed size evolution, seed oil content, disease resistance and symbiotic nitrogen fixation. The peanut B subgenome has more genes and general expression dominance, temporally associated with long-terminal-repeat expansion in the A subgenome that also raises questions about the A-genome progenitor. The polyploid genome provided insights into the evolution of Arachis hypogaea and other legume chromosomes. Resequencing of 52 accessions suggests that…
The ultimate goal for diploid genome determination is to completely decode homologous chromosomes independently, and several programs that phase haplotypes from consensus sequences have been developed. These methods work well for genomes with low heterozygosity, but many species are highly heterozygous. Additionally, there are highly divergent regions (HDRs), where the haplotype sequences differ considerably. Because HDRs are likely to direct various interesting biological phenomena, many genomic analysis targets fall within these regions. However, they cannot be accessed by existing phasing methods, and we have to adopt costly traditional methods. Here, we develop a de novo haplotype assembler, Platanus-allee (http://platanus.bio.titech.ac.jp/platanus2), which…
Songbirds communicate through learned vocalizations, using a forebrain circuit with convergent similarity to vocal-control circuitry in humans. This circuit is incomplete in female zebra finches, hence only males sing. We show that the UTS2B gene, encoding Urotensin-Related Peptide (URP), is uniquely expressed in a key pre-motor vocal nucleus (HVC), and specifically marks the neurons that form a male-specific projection that encodes timing features of learned song. UTS2B-expressing cells appear early in males, prior to projection formation, but are not observed in the female nucleus. We find no expression evidence for canonical receptors within the vocal circuit, suggesting either signalling to…
In this PacBio User Group Meeting presentation, Zev Kronenberg of PacBio presents on using the combination of PacBio and Phase Genomics data and analysis tools to create highly contiguous genome assemblies.
Jonas Korlach, Chief Scientific Officer at PacBio, discussed the technology waves that have followed the initial human genome sequencing project, where we are today, and where we are going. Today, we are in what Korlach calls the 4th wave, where more comprehensive whole-genome re-sequencing is occurring, and we are nearing the 5th, when we will actually be able to free ourselves from reference genomes and sequence everything de novo.
During this presentation from ASHG 2015, Maria Nattestad of Cold Spring Harbor Laboratory described the study of a Her2-amplified breast cancer cell line using long-read sequencing from PacBio. With reads as long as 71 kb, she was able to characterize extensive and complex rearrangements and found more than 11,000 structural variants. She also used the Iso-Seq method to find gene fusions, including some novel ones.
Yunfei Guo, from the University of Southern California, presents his ASHG 2015 poster on a de novo assembly of a diploid Asian genome. The uniform coverage of long-read sequencing helped access regions previously unresolvable due to high GC bias or long repeats. The assembly allowed scientists to fill some 400 gaps in the latest human reference genome, including some as long as 50 kb.
Yunfei Guo, a grad student at the University of Southern California, discusses the benefits of SMRT Sequencing: very long reads that make it possible to resolve long repetitive regions and discover structural variants, and a random error mode that allows for extremely high accuracy.
Jeong-Sun Seo of Macrogen and Seoul National University College of Medicine reports on sequencing many Asian genomes to better understand genetic variation in that population. He shows that identifying certain structural variants may explain diseases that disproportionately affect Asian people.
In his talk from the AGBT 2015 PacBio workshop, Craig Venter detailed plans to sequence 1 million genomes and gather extensive phenotypic data to make sense of them. Included: generating 30 reference genomes to represent ethnogeographic diversity; the need for long-range continuity in sequencing; and truly predictive genomics.
Yuta Suzuki from the University of Tokyo presents his AGBT poster on heterozygotic DNA methylation patterns. He used kinetic data from SMRT Sequencing to generate epigenetic information on samples ranging from human to medaka fish and was able to analyze haplotype-specific methylation data. He also shows that long reads are better able to capture data about CpG islands than short-read sequences.