The confocal detection principle is extended to a highly parallel optical system that continuously analyzes thousands of concurrent sample locations. This is achieved through the use of a holographic laser illumination multiplexer combined with a confocal pinhole array before a prism dispersive element used to provide spectroscopic information from each confocal volume. The system is demonstrated to detect and identify single fluorescent molecules from each of several thousand independent confocal volumes in real time.
Clostridium autoethanogenum strain JA1-1 (DSM 10061) is an acetogen capable of fermenting CO, CO2 and H2 (e.g. from syngas or waste gases) into biofuel ethanol and commodity chemicals such as 2,3-butanediol. A draft genome sequence consisting of 100 contigs has been published.A closed, high-quality genome sequence for C. autoethanogenum DSM10061 was generated using only the latest single-molecule DNA sequencing technology and without the need for manual finishing. It is assigned to the most complex genome classification based upon genome features such as repeats, prophage, nine copies of the rRNA gene operons. It has a low G + C content of…
With the price of next generation sequencing steadily decreasing, bacterial genome assembly is now accessible to a wide range of researchers. It is therefore necessary to understand the best methods for generating a genome assembly, specifically, which combination of sequencing and bioinformatics strategies result in the most accurate assemblies. Here, we sequence three E. coli strains on the Illumina MiSeq, Life Technologies Ion Torrent PGM, and Pacific Biosciences RS. We then perform genome assemblies on all three datasets alone or in combination to determine the best methods for the assembly of bacterial genomes.Three E. coli strains – BL21(DE3), Bal225, and…
First- and second-generation sequencing technologies have led the way in revolutionizing the field of genomics and beyond, motivating an astonishing number of scientific advances, including enabling a more complete understanding of whole genome sequences and the information encoded therein, a more complete characterization of the methylome and transcriptome and a better understanding of interactions between proteins and DNA. Nevertheless, there are sequencing applications and aspects of genome biology that are presently beyond the reach of current sequencing technologies, leaving fertile ground for additional innovation in this space. In this review, we describe a new generation of single-molecule sequencing technologies (third-generation…
Pacific Biosciences has developed a method for real-time sequencing of single DNA molecules (Eid et al., 2009), with intrinsic sequencing rates of several bases per second and read lengths into the kilobase range. Conceptually, this sequencing approach is based on eavesdropping on the activity of DNA polymerase carrying out template-directed DNA polymerization. Performed in a highly parallel operational mode, sequential base additions catalyzed by each polymerase are detected with terminal phosphate-linked, fluorescence-labeled nucleotides. This chapter will first outline the principle of this single-molecule, real-time (SMRT) DNA sequencing method, followed by descriptions of its underlying components and typical sequencing run conditions.…
Third generation single molecule sequencing technology is poised to revolutionize genomics by en- abling the sequencing of long, individual molecules of DNA and RNA. These technologies now routinely produce reads exceeding 5,000 basepairs, and can achieve reads as long as 50,000 basepairs. Here we evaluate the limits of single molecule sequencing by assessing the impact of long read sequencing in the assembly of the human genome and 25 other important genomes across the tree of life. From this, we develop a new data-driven model using support vector regression that can accurately predict assembly performance. We also present a novel hybrid…
The availability of diverse second- and third-generation sequencing technologies enables the rapid determination of the sequences of bacterial genomes. However, identifying the sequencing technology most suitable for producing a finished genome with multiple chromosomes remains a challenge. We evaluated the abilities of the following three second-generation sequencers: Roche 454 GS Junior (GS Jr), Life Technologies Ion PGM (Ion PGM), and Illumina MiSeq (MiSeq) and a third-generation sequencer, the Pacific Biosciences RS sequencer (PacBio), by sequencing and assembling the genome of Vibrio parahaemolyticus, which consists of a 5-Mb genome comprising two circular chromosomes. We sequenced the genome of V. parahaemolyticus with…
Microsatellite sequences are important markers for population genetics studies. In the past, the development of adequate microsatellite primers has been cumbersome. However with the advent of next-generation sequencing technologies, marker identification in genomes of non-model species has been greatly simplified. Here we describe microsatellite discovery on a Pacific Biosciences single molecule real-time sequencer. For the Greater White-fronted Goose (Anser albifrons), we identified 316 microsatellite loci in a single genome shotgun sequencing experiment. We found that the capability of handling large insert sizes and high quality circular consensus sequences provides an advantage over short read technologies for primer design. Combined with…
Abstract Genomic data have become commonplace in most branches of the biological sciences and have fundamentally altered the way research is conducted. However, the predominance of short-read sequence data from second-generation sequencing technologies has commonly resulted in fragmented and partial genomic data characteristics. In this opinion, I will highlight how long, unbiased reads from single molecule, real-time (SMRT) sequencing now allow for a return to more contiguous and comprehensive views of genomes.
Programmed DNA rearrangements in the single-celled eukaryote Oxytricha trifallax completely rewire its germline into a somatic nucleus during development. This elaborate, RNA-mediated pathway eliminates noncoding DNA sequences that interrupt gene loci and reorganizes the remaining fragments by inversions and permutations to produce functional genes. Here, we report the Oxytricha germline genome and compare it to the somatic genome to present a global view of its massive scale of genome rearrangements. The remarkably encrypted genome architecture contains >3,500 scrambled genes, as well as >800 predicted germline-limited genes expressed, and some posttranslationally modified, during genome rearrangements. Gene segments for different somatic loci…
More than 20% of the world’s population is at risk for infection by filarial nematodes and >180 million people worldwide are already infected. Along with infection comes significant morbidity that has a socioeconomic impact. The eight filarial nematodes that infect humans are Wuchereria bancrofti, Brugia malayi, Brugia timori, Onchocerca volvulus, Loa loa, Mansonella perstans, Mansonella streptocerca, and Mansonella ozzardi, of which three have published draft genome sequences. Since all have humans as the definitive host, standard avenues of research that rely on culturing and genetics have often not been possible. Therefore, genome sequencing provides an important window into understanding the…
DNA modifications, such as methylation guide numerous critical biological processes, yet epigenetic information has not routinely been collected as part of DNA sequence analyses. Recently, the development of single molecule real time (SMRT) DNA sequencing has enabled detection of modified nucleotides (e.g. 6mA, 4mC, 5mC) in parallel with acquisition of primary sequence data, based on analysis of the kinetics of DNA synthesis reactions. In bacteria, genome-wide mapping of methylated and unmethylated loci is now feasible. This technological advance sets the stage for comprehensive, mechanistic assessment of the effects of bacterial DNA methyltransferases (MTases)-which are ubiquitous, extremely diverse, and largely uncharacterized-on…
Targeted genome editing with engineered nucleases has transformed the ability to introduce precise sequence modifications at almost any site within the genome. A major obstacle to probing the efficiency and consequences of genome editing is that no existing method enables the frequency of different editing events to be simultaneously measured across a cell population at any endogenous genomic locus. We have developed a novel method for quantifying individual genome editing outcomes at any site of interest using single molecule real time (SMRT) DNA sequencing. We show that this approach can be applied at various loci, using multiple engineered nuclease platforms…
Whole-genome sequencing is an important method to understand the genetic information, gene function, biological characteristics and survival mechanisms of organisms. Sequencing large genomes is very simple at present. However, we encountered a hard-to-sequence genome of Pseudomonas aeruginosa phage PaP1. Shotgun sequencing method failed to complete the sequence of this genome.After persevering for 10 years and going over three generations of sequencing techniques, we successfully completed the sequence of the PaP1 genome with a length of 91,715 bp. Single-molecule real-time sequencing results revealed that this genome contains 51?N-6-methyladenines and 152?N-4-methylcytosines. Three significant modified sequence motifs were predicted, but not all of the sites…
The short reads output by first- and second-generation DNA sequencing instruments cannot completely reconstruct microbial chromosomes. Therefore, most genomes have been left unfinished due to the significant resources required to manually close gaps in draft assemblies. Third-generation, single-molecule sequencing addresses this problem by greatly increasing sequencing read length, which simplifies the assembly problem.To measure the benefit of single-molecule sequencing on microbial genome assembly, we sequenced and assembled the genomes of six bacteria and analyzed the repeat complexity of 2,267 complete bacteria and archaea. Our results indicate that the majority of known bacterial and archaeal genomes can be assembled without gaps,…