Next-generation sequencing has become a useful tool for studying transcriptomes. However, these methods typically rely on sequencing short fragments of cDNA, then attempting to assemble the pieces into full-length transcripts. Here, we describe a method that uses PacBio long reads to sequence full-length cDNAs from individual transcriptomes and metatranscriptome samples. We have adapted the PacBio Iso-Seq protocol for use with prokaryotic samples by incorporating RNA polyadenylation and rRNA-depletion steps. In conjunction with SMRT Sequencing, which has average read lengths of 10-15 kb, we are able to sequence entire transcripts, including polycistronic RNAs, in a single read. Here, we show full-length bacterial…
SMRT-Cappable-seq combines the isolation of full-length prokaryotic primary transcripts with long-read sequencing technology. It is the first experimental methodology to sequence entire prokaryotic transcripts. It identifies the transcription start site and termination site, thereby directly defining the operon structures genome-wide in prokaryotes. Applied to E. coli, SMRT-Cappable-seq identifies a total of ~2300 operons, among which ~900 are novel. Importantly, our results reveal pervasive read-through of previously experimentally validated transcription termination sites. Termination read-through represents a powerful strategy to control gene expression. Taken together, this data provides a first glance at the complexity of the ‘operome’ in bacteria and presents…
Wolbachia, an alpha-proteobacterium closely related to Rickettsia, is a maternally transmitted, intracellular symbiont of arthropods and nematodes. Aedes albopictus mosquitoes are naturally infected with Wolbachia strains wAlbA and wAlbB. Cell line Aa23 established from Ae. albopictus embryos retains only wAlbB and is a key model to study host-endosymbiont interactions. We have assembled the complete circular genome of wAlbB from the Aa23 cell line using long-read PacBio sequencing at 500× median coverage. The assembled circular chromosome is 1.48 megabases in size, an increase of more than 300 kb over the published draft wAlbB genome. The annotation of the genome identified 1,205…
RAre DAmage and Repair sequencing (RADAR-seq) is a highly adaptable sequencing method that enables the identification and detection of rare DNA damage events for a wide variety of DNA lesions at single-molecule resolution on a genome-wide scale. In RADAR-seq, DNA lesions are replaced with a patch of modified bases that can be directly detected by Pacific Biosciences Single Molecule Real-Time (SMRT) sequencing. RADAR-seq enables dynamic detection over a wide range of DNA damage frequencies, including low physiological levels. Furthermore, without the need for DNA amplification and enrichment steps, RADAR-seq provides sequencing coverage of damaged and undamaged DNA across an entire…
The initiating nucleotide found at the 5′ end of primary transcripts has a distinctive triphosphorylated end that distinguishes these transcripts from all other RNA species. Recognizing this distinction is key to deconvoluting the primary transcriptome from the plethora of processed transcripts that confound analysis of the transcriptome. The currently available methods do not use targeted enrichment for the 5′ end of primary transcripts, but rather attempt to deplete non-targeted RNA. We developed a method, Cappable-seq, for directly enriching for the 5′ end of primary transcripts and enabling determination of transcription start sites at single base resolution. This is achieved by enzymatically modifying…
We describe the cloning, expression and characterization of the first truly non-specific adenine DNA methyltransferase, M.EcoGII. It is encoded in the genome of the pathogenic strain Escherichia coli O104:H4 C227-11, where it appears to reside on a cryptic prophage, but is not expressed. However, when the gene encoding M.EcoGII is expressed in vivo, using a high-copy pRRS plasmid vector and a methylation-deficient E. coli host, extensive in vivo adenine methylation activity is revealed. M.EcoGII methylates adenine residues in any DNA sequence context and this activity extends to dA and rA bases in either strand of a DNA:RNA-hybrid oligonucleotide duplex…
DNA ligases are key enzymes in molecular and synthetic biology that catalyze the joining of breaks in duplex DNA and the end-joining of DNA fragments. Ligation fidelity (discrimination against the ligation of substrates containing mismatched base pairs) and bias (preferential ligation of particular sequences over others) have been well-studied in the context of nick ligation. However, almost no data exist for fidelity and bias in end-joining ligation contexts. In this study, we applied Pacific Biosciences Single-Molecule Real-Time sequencing technology to directly sequence the products of a highly multiplexed ligation reaction. This method has been used to profile the ligation of…
Synthetic biology relies on the manufacture of large and complex DNA constructs from libraries of genetic parts. Golden Gate and other Type IIS restriction enzyme-dependent DNA assembly methods enable rapid construction of genes and operons through one-pot, multifragment assembly, with the ordering of parts determined by the ligation of Watson-Crick base-paired overhangs. However, ligation of mismatched overhangs leads to erroneous assembly, and low-efficiency Watson-Crick pairings can lead to truncated assemblies. Using sets of empirically vetted, high-accuracy junction pairs avoids this issue but limits the number of parts that can be joined in a single reaction. Here, we report the…
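As an illustration of why overhang choice matters in the assembly abstract above, the following is a minimal sketch of a basic sanity check on a candidate overhang set. It flags only the cases that follow directly from perfect Watson-Crick pairing: palindromic overhangs, duplicates, and pairs that are reverse complements of each other. It does not model mismatch ligation or use any empirical fidelity data, and it is not the authors' method; the function names and the example 4-nt overhangs are hypothetical.

```python
from itertools import combinations

# Complement table for unambiguous DNA bases.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def check_overhang_set(overhangs):
    """Return a list of (overhang_a, overhang_b, reason) conflicts.

    Hypothetical helper: one overhang is listed per junction, and its
    partner end is assumed to carry the reverse complement.
    """
    overhangs = [o.upper() for o in overhangs]
    problems = []
    # A palindromic overhang can pair with a copy of itself.
    for o in overhangs:
        if o == reverse_complement(o):
            problems.append((o, o, "palindromic"))
    # Two junctions collide if they share an overhang or are
    # reverse complements of each other.
    for a, b in combinations(overhangs, 2):
        if a == b:
            problems.append((a, b, "duplicate"))
        elif a == reverse_complement(b):
            problems.append((a, b, "reverse complements"))
    return problems


if __name__ == "__main__":
    for conflict in check_overhang_set(["AATG", "CATT", "ACGT", "GGTA"]):
        print(conflict)
```

A set that passes this check can still misassemble through mismatched ligation, which is the problem the abstract addresses with measured ligation data.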
Staphylococcus aureus displays a clonal population structure in which horizontal gene transfer between different lineages is extremely rare. This is due, in part, to the presence of a Type I DNA restriction–modification (RM) system given the generic name of Sau1, which maintains different patterns of methylation on specific target sequences on the genomes of different lineages. We have determined the target sequences recognized by the Sau1 Type I RM systems present in a wide range of the most prevalent S. aureus lineages and assigned the sequences recognized to particular target recognition domains within the RM enzymes. We used a range…
In this report, we announce the availability of a complete closed genome sequence and methylome analysis of Beggiatoa leptomitiformis neotype strain D-402(T) (DSM 14946, UNIQEM U 779). Copyright © 2015 Fomenkov et al.
Acinetobacter calcoaceticus 65 is the original source strain for the restriction enzyme Acc65I. Its complete sequence and full methylome were determined using single-molecule real-time (SMRT) sequencing. Copyright © 2017 Fomenkov et al.
We report the genome sequence of the dairy yeast Kluyveromyces lactis strain GG799 obtained using the Pacific Biosciences RS II platform. K. lactis strain GG799 is a common host for the expression of proteins at both laboratory and industrial scales. Copyright © 2017 Chuzel et al.
We report the complete, closed genome sequence and complete methylome of Azospirillum thiophilum strain BV-S(T). Copyright © 2016 Fomenkov et al.
In this report, we announce the complete genome sequence of Aeromonas hydrophila strain YL17. Single-molecule real-time (SMRT) DNA sequencing was used to generate the complete genome sequence and the genome-wide DNA methylation profile of this environmental isolate. A total of five unique DNA methyltransferase recognition motifs are reported here. Copyright © 2016 Lim et al.
The creation of restriction enzymes with programmable DNA-binding and -cleavage specificities has long been a goal of modern biology. The recently discovered Type IIL MmeI family of restriction-and-modification (RM) enzymes that possess a shared target recognition domain provides a framework for engineering such new specificities. However, a lack of structural information on Type IIL enzymes has limited the repertoire that can be rationally engineered. We report here a crystal structure of MmeI in complex with its DNA substrate and an S-adenosylmethionine analog (Sinefungin). The structure uncovers for the first time the interactions that underlie MmeI-DNA recognition and methylation (5′-TCCRAC-3′; R…
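To make the recognition-sequence notation in the MmeI abstract concrete, here is a minimal sketch that locates 5′-TCCRAC-3′ sites in a DNA string, reading R as the standard IUPAC code for a purine (A or G). Scanning both strands, the function names and the test sequence are assumptions made for illustration; this is not taken from the paper.

```python
import re

# Regex fragments for the IUPAC codes used here (R = purine, Y = pyrimidine).
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "N": "[ACGT]",
}
# Complement table that also maps the ambiguity codes R <-> Y.
_COMPLEMENT = str.maketrans("ACGTRYN", "TGCAYRN")


def reverse_complement(seq):
    """Reverse complement of a sequence over A, C, G, T, R, Y, N."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def motif_to_regex(motif):
    """Compile an IUPAC motif to a regex; the lookahead allows overlapping hits."""
    body = "".join(IUPAC[base] for base in motif.upper())
    return re.compile("(?=(" + body + "))")


def find_sites(seq, motif="TCCRAC"):
    """Return sorted (start, strand, matched_text) tuples, 0-based, top-strand coordinates."""
    seq = seq.upper()
    hits = []
    for strand, pattern in (("+", motif), ("-", reverse_complement(motif))):
        for match in motif_to_regex(pattern).finditer(seq):
            hits.append((match.start(), strand, match.group(1)))
    return sorted(hits)


if __name__ == "__main__":
    # Hypothetical test sequence with one site on each strand.
    for hit in find_sites("AATCCGACTTGTTGGAAA"):
        print(hit)
```

Minus-strand hits are reported as the top-strand text that matches the reverse-complemented motif (GTYGGA), so coordinates stay comparable across strands.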