As part of our effort to support the National Institutes of Health and the Genome Reference Consortium (GRC) in creating platinum genomes for the research community and improving the reference genome, in 2014 we generated 54X SMRT® Sequencing coverage of the CHM1 cell line, which is derived from a human haploid hydatidiform mole. We sequenced it with our P5-C3 chemistry and made the data publicly available through the SRA database at NCBI. The CHM1 dataset was quickly taken up by researchers eager to use long, unbiased reads to identify regions of the genome prone to structural variation and to fill in sequence gaps in the GRC-maintained…
Last month we hosted a SMRT® Informatics Developers Conference, bringing together 150 developers with a passion for improving tools and resources. Our team came back brimming with enthusiasm for tools that will be released in the coming months, and humbled by the commitment we saw from the bioinformatics community to help scientists make SMRT Sequencing data increasingly useful. Thanks to the National Institute of Standards and Technology for hosting our meeting on their campus right before the Genome in a Bottle workshop. The big news we shared with attendees is that the PacBio® System will now output industry-standard BAM files…
In a new mBio publication, scientists from Wageningen University and KeyGene in The Netherlands report results from several strategies used to assemble the genome of a filamentous fungus, and describe the specific pipeline they recommend for sequencing and assembling eukaryotic genomes. “Single-Molecule Real-Time Sequencing Combined with Optical Mapping Yields Completely Finished Fungal Genome” comes from lead authors Luigi Faino and Michael Seidl, senior author Bart Thomma, and collaborators. Using Verticillium dahliae, a plant pathogen responsible for the damaging verticillium wilt disease in many crop species, as a model, they compared…
At the inaugural Festival of Genomics event in Boston, more than 1,500 people turned out to see what was billed as a conference unlike any other. The meeting was indeed unique, featuring a play (starring well-known scientists), a giant chess board, and a Genome Dome, in addition to the more familiar lineup of excellent speakers and workshops. To help kick off the festival, genomic luminaries Craig Venter and James Lupski presented plenary talks on day 1 and set the stage for some exciting science to follow. Lupski’s talk was particularly impactful, as he described how his team at Baylor recently…
A new publication in Nature Biotechnology reports the development of a lightning-fast genome assembly pipeline optimized for long reads. Scientists from the University of Maryland and the National Biodefense Analysis and Countermeasures Center created the MinHash Alignment Process, known as MHAP, to dramatically reduce assembly time and improve assembly quality. Their results are worth celebrating: assembly was 600-fold faster than with existing methods. “Using MHAP and the Celera Assembler, single-molecule sequencing can produce de novo near-complete eukaryotic assemblies that are 99.99% accurate when compared with available reference genomes,” the authors write. In the best cases, entire chromosome arms assembled…
Scientists from Argentina and Brazil published the results of a study comparing long-read approaches to characterize the genome structure of a highly complex region of the Y chromosome in Drosophila melanogaster. They found that Single Molecule, Real-Time (SMRT®) Sequencing outperformed synthetic long reads in accurately representing tandem repeats. The study aimed to resolve the structure of the autosomal gene Mst77F, which had previously been found to have multiple tandem copies; the region, however, was known to be grossly misassembled in the reference. The scientists, from Centro Internacional Franco Argentino de Ciencias de la Información y de Sistemas and Universidade Federal…