New England Biolabs Archives

June 1, 2021

SMRT-Cappable-seq reveals the complex operome of bacteria

SMRT-Cappable-seq combines the isolation of full-length prokaryotic primary transcripts with long read sequencing technology. It is the first experimental methodology to sequence entire prokaryotic transcripts. It identifies the transcription start site and termination site, thereby directly defines the operon structures genome-wide in prokaryotes. Applied to E.coli, SMRT-Cappable-seq identifies a total of ~2300 operons, among which ~900 are novel. Importantly, our result reveals a pervasive read-through of previous experimentally validated transcription termination sites. Termination read-through represents a powerful strategy to control gene expression. Taken together this data provides a first glance at the complexity of the ‘operome’ in bacteria and presents an invaluable resource for understanding gene regulation and function in bacteria.

June 1, 2021

Full-length cDNA sequencing of prokaryotic transcriptome and metatranscriptome samples

Next-generation sequencing has become a useful tool for studying transcriptomes. However, these methods typically rely on sequencing short fragments of cDNA, then attempting to assemble the pieces into full-length transcripts. Here, we describe a method that uses PacBio long reads to sequence full-length cDNAs from individual transcriptomes and metatranscriptome samples. We have adapted the PacBio Iso-Seq protocol for use with prokaryotic samples by incorporating RNA polyadenylation and rRNA-depletion steps. In conjunction with SMRT Sequencing, which has average readlengths of 10-15 kb, we are able to sequence entire transcripts, including polycistronic RNAs, in a single read. Here, we show full-length bacterial transcriptomes with the ability to visualize transcription of operons. In the area of metatranscriptomics, long reads reveal unambiguous gene sequences without the need for post-sequencing transcript assembly. We also show full-length bacterial transcripts sequenced after being treated with NEB’s Cappable-Seq, which is an alternative method for depleting rRNA and enriching for full-length transcripts with intact 5’ ends. Combining Cappable-Seq with PacBio long reads allows for the detection of transcription start sites, with the additional benefit of sequencing entire transcripts.

April 21, 2020

Complete Genome Sequence of the Wolbachia wAlbB Endosymbiont of Aedes albopictus.

Wolbachia, an alpha-proteobacterium closely related to Rickettsia, is a maternally transmitted, intracellular symbiont of arthropods and nematodes. Aedes albopictus mosquitoes are naturally infected with Wolbachia strains wAlbA and wAlbB. Cell line Aa23 established from Ae. albopictus embryos retains only wAlbB and is a key model to study host-endosymbiont interactions. We have assembled the complete circular genome of wAlbB from the Aa23 cell line using long-read PacBio sequencing at 500× median coverage. The assembled circular chromosome is 1.48 megabases in size, an increase of more than 300 kb over the published draft wAlbB genome. The annotation of the genome identified 1,205 protein coding genes, 34 tRNA, 3 rRNA, 1 tmRNA, and 3 other ncRNA loci. The long reads enabled sequencing over complex repeat regions which are difficult to resolve with short-read sequencing. Thirteen percent of the genome comprised insertion sequence elements distributed throughout the genome, some of which cause pseudogenization. Prophage WO genes encoding some essential components of phage particle assembly are missing, while the remainder are found in five prophage regions/WO-like islands or scattered around the genome. Orthology analysis identified a core proteome of 535 orthogroups across all completed Wolbachia genomes. The majority of proteins could be annotated using Pfam and eggNOG analyses, including ankyrins and components of the Type IV secretion system. KEGG analysis revealed the absence of five genes in wAlbB which are present in other Wolbachia. The availability of a complete circular chromosome from wAlbB will enable further biochemical, molecular, and genetic analyses on this strain and related Wolbachia. © The Author(s) 2019. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.

April 21, 2020

RADAR-seq: A RAre DAmage and Repair sequencing method for detecting DNA damage on a genome-wide scale.

RAre DAmage and Repair sequencing (RADAR-seq) is a highly adaptable sequencing method that enables the identification and detection of rare DNA damage events for a wide variety of DNA lesions at single-molecule resolution on a genome-wide scale. In RADAR-seq, DNA lesions are replaced with a patch of modified bases that can be directly detected by Pacific Biosciences Single Molecule Real-Time (SMRT) sequencing. RADAR-seq enables dynamic detection over a wide range of DNA damage frequencies, including low physiological levels. Furthermore, without the need for DNA amplification and enrichment steps, RADAR-seq provides sequencing coverage of damaged and undamaged DNA across an entire genome. Here, we use RADAR-seq to measure the frequency and map the location of ribonucleotides in wild-type and RNaseH2-deficient E. coli and Thermococcus kodakarensis strains. Additionally, by tracking ribonucleotides incorporated during in vivo lagging strand DNA synthesis, we determined the replication initiation point in E. coli, and its relation to the origin of replication (oriC). RADAR-seq was also used to map cyclobutane pyrimidine dimers (CPDs) in Escherichia coli (E. coli) genomic DNA exposed to UV-radiation. On a broader scale, RADAR-seq can be applied to understand formation and repair of DNA damage, the correlation between DNA damage and disease initiation and progression, and complex biological pathways, including DNA replication.Copyright © 2019 The Authors. Published by Elsevier B.V. All rights reserved.

September 22, 2019

A novel enrichment strategy reveals unprecedented number of novel transcription start sites at single base resolution in a model prokaryote and the gut microbiome.

The initiating nucleotide found at the 5′ end of primary transcripts has a distinctive triphosphorylated end that distinguishes these transcripts from all other RNA species. Recognizing this distinction is key to deconvoluting the primary transcriptome from the plethora of processed transcripts that confound analysis of the transcriptome. The currently available methods do not use targeted enrichment for the 5’end of primary transcripts, but rather attempt to deplete non-targeted RNA.We developed a method, Cappable-seq, for directly enriching for the 5′ end of primary transcripts and enabling determination of transcription start sites at single base resolution. This is achieved by enzymatically modifying the 5′ triphosphorylated end of RNA with a selectable tag. We first applied Cappable-seq to E. coli, achieving up to 50 fold enrichment of primary transcripts and identifying an unprecedented 16539 transcription start sites (TSS) genome-wide at single base resolution. We also applied Cappable-seq to a mouse cecum sample and identified TSS in a microbiome.Cappable-seq allows for the first time the capture of the 5′ end of primary transcripts. This enables a unique robust TSS determination in bacteria and microbiomes. In addition to and beyond TSS determination, Cappable-seq depletes ribosomal RNA and reduces the complexity of the transcriptome to a single quantifiable tag per transcript enabling digital profiling of gene expression in any microbiome.

September 22, 2019

The non-specific adenine DNA methyltransferase M.EcoGII.

We describe the cloning, expression and characterization of the first truly non-specific adenine DNA methyltransferase, M.EcoGII. It is encoded in the genome of the pathogenic strain Escherichia coli O104:H4 C227-11, where it appears to reside on a cryptic prophage, but is not expressed. However, when the gene encoding M.EcoGII is expressed in vivo – using a high copy pRRS plasmid vector and a methylation-deficient E. coli host-extensive in vivo adenine methylation activity is revealed. M.EcoGII methylates adenine residues in any DNA sequence context and this activity extends to dA and rA bases in either strand of a DNA:RNA-hybrid oligonucleotide duplex and to rA bases in RNAs prepared by in vitro transcription. Using oligonucleotide and bacteriophage M13mp18 virion DNA substrates, we find that M.EcoGII also methylates single-stranded DNA in vitro and that this activity is only slightly less robust than that observed using equivalent double-stranded DNAs. In vitro assays, using purified recombinant M.EcoGII enzyme, demonstrate that up to 99% of dA bases in duplex DNA substrates can be methylated thereby rendering them insensitive to cleavage by multiple restriction endonucleases. These properties suggest that the enzyme could also be used for high resolution mapping of protein binding sites in DNA and RNA substrates.© The Author(s) 2017. Published by Oxford University Press on behalf of Nucleic Acids Research.

September 22, 2019

A single-molecule sequencing assay for the comprehensive profiling of T4 DNA ligase fidelity and bias during DNA end-joining.

DNA ligases are key enzymes in molecular and synthetic biology that catalyze the joining of breaks in duplex DNA and the end-joining of DNA fragments. Ligation fidelity (discrimination against the ligation of substrates containing mismatched base pairs) and bias (preferential ligation of particular sequences over others) have been well-studied in the context of nick ligation. However, almost no data exist for fidelity and bias in end-joining ligation contexts. In this study, we applied Pacific Biosciences Single-Molecule Real-Time sequencing technology to directly sequence the products of a highly multiplexed ligation reaction. This method has been used to profile the ligation of all three-base 5′-overhangs by T4 DNA ligase under typical ligation conditions in a single experiment. We report the relative frequency of all ligation products with or without mismatches, the position-dependent frequency of each mismatch, and the surprising observation that 5′-TNA overhangs ligate extremely inefficiently compared to all other Watson-Crick pairings. The method can easily be extended to profile other ligases, end-types (e.g. blunt ends and overhangs of different lengths), and the effect of adjacent sequence on the ligation results. Further, the method has the potential to provide new insights into the thermodynamics of annealing and the kinetics of end-joining reactions.

September 22, 2019

Comprehensive profiling of four base overhang ligation fidelity by T4 DNA Ligase and application to DNA assembly.

Synthetic biology relies on the manufacture of large and complex DNA constructs from libraries of genetic parts. Golden Gate and other Type IIS restriction enzyme-dependent DNA assembly methods enable rapid construction of genes and operons through one-pot, multifragment assembly, with the ordering of parts determined by the ligation of Watson-Crick base-paired overhangs. However, ligation of mismatched overhangs leads to erroneous assembly, and low-efficiency Watson Crick pairings can lead to truncated assemblies. Using sets of empirically vetted, high-accuracy junction pairs avoids this issue but limits the number of parts that can be joined in a single reaction. Here, we report the use of comprehensive end-joining ligation fidelity and bias data to predict high accuracy junction sets for Golden Gate assembly. The ligation profile accurately predicted junction fidelity in ten-fragment Golden Gate assembly reactions and enabled accurate and efficient assembly of a lac cassette from up to 24-fragments in a single reaction.

July 19, 2019

DNA target recognition domains in the Type I restriction and modification systems of Staphylococcus aureus.

Staphylococcus aureus displays a clonal population structure in which horizontal gene transfer between different lineages is extremely rare. This is due, in part, to the presence of a Type I DNA restriction–modification (RM) system given the generic name of Sau1, which maintains different patterns of methylation on specific target sequences on the genomes of different lineages. We have determined the target sequences recognized by the Sau1 Type I RM systems present in a wide range of the most prevalent S. aureus lineages and assigned the sequences recognized to particular target recognition domains within the RM enzymes. We used a range of biochemical assays on purified enzymes and single molecule real-time sequencing on genomic DNA to determine these target sequences and their patterns of methylation. Knowledge of the main target sequences for Sau1 will facilitate the synthesis of new vectors for transformation of the most prevalent lineages of this ‘untransformable’ bacterium.

July 7, 2019

Complete genome sequence of the freshwater colorless sulfur bacterium Beggiatoa leptomitiformis neotype strain D-402T.

In this report, we announce the availability of a complete closed genome sequence and methylome analysis of Beggiatoa leptomitiformis neotype strain D-402(T) (DSM 14946, UNIQEM U 779). Copyright © 2015 Fomenkov et al.

July 7, 2019

Complete genome sequence and methylome analysis of Acinetobacter calcoaceticus 65.

Acinetobacter calcoaceticus 65 is the original source strain for the restriction enzyme Acc65I. Its complete sequence and full methylome were determined using single-molecule real-time (SMRT) sequencing. Copyright © 2017 Fomenkov et al.

July 7, 2019

Complete genome sequence of Kluyveromyces lactis strain GG799, a common yeast host for heterologous protein expression.

We report the genome sequence of the dairy yeast Kluyveromyces lactis strain GG799 obtained using the Pacific Biosciences RS II platform. K. lactis strain GG799 is a common host for the expression of proteins at both laboratory and industrial scales. Copyright © 2017 Chuzel et al.

July 7, 2019

Complete genome sequence of a strain of Azospirillum thiophilum isolated from a sulfide spring.

July 7, 2019

Complete genome sequence and methylome analysis of Aeromonas hydrophila strain YL17, isolated from a compost pile.

In this report, we announce the complete genome sequence of Aeromonas hydrophila strain YL17. Single-molecule real-time (SMRT) DNA sequencing was used to generate the complete genome sequence and the genome-wide DNA methylation profile of this environmental isolate. A total of five unique DNA methyltransferase recognition motifs were reported here. Copyright © 2016 Lim et al.

July 7, 2019

Structure of Type IIL restriction-modification enzyme MmeI in complex with DNA has implications for engineering new specificities.

The creation of restriction enzymes with programmable DNA-binding and -cleavage specificities has long been a goal of modern biology. The recently discovered Type IIL MmeI family of restriction-and-modification (RM) enzymes that possess a shared target recognition domain provides a framework for engineering such new specificities. However, a lack of structural information on Type IIL enzymes has limited the repertoire that can be rationally engineered. We report here a crystal structure of MmeI in complex with its DNA substrate and an S-adenosylmethionine analog (Sinefungin). The structure uncovers for the first time the interactions that underlie MmeI-DNA recognition and methylation (5′-TCCRAC-3′; R = purine) and provides a molecular basis for changing specificity at four of the six base pairs of the recognition sequence (5′-TCCRAC-3′). Surprisingly, the enzyme is resilient to specificity changes at the first position of the recognition sequence (5′-TCCRAC-3′). Collectively, the structure provides a basis for engineering further derivatives of MmeI and delineates which base pairs of the recognition sequence are more amenable to alterations than others.

Asset Tag: New England Biolabs

SMRT-Cappable-seq reveals the complex operome of bacteria

Full-length cDNA sequencing of prokaryotic transcriptome and metatranscriptome samples

Complete Genome Sequence of the Wolbachia wAlbB Endosymbiont of Aedes albopictus.

RADAR-seq: A RAre DAmage and Repair sequencing method for detecting DNA damage on a genome-wide scale.

A novel enrichment strategy reveals unprecedented number of novel transcription start sites at single base resolution in a model prokaryote and the gut microbiome.

The non-specific adenine DNA methyltransferase M.EcoGII.

A single-molecule sequencing assay for the comprehensive profiling of T4 DNA ligase fidelity and bias during DNA end-joining.

Comprehensive profiling of four base overhang ligation fidelity by T4 DNA Ligase and application to DNA assembly.

DNA target recognition domains in the Type I restriction and modification systems of Staphylococcus aureus.

Complete genome sequence of the freshwater colorless sulfur bacterium Beggiatoa leptomitiformis neotype strain D-402T.

Complete genome sequence and methylome analysis of Acinetobacter calcoaceticus 65.

Complete genome sequence of Kluyveromyces lactis strain GG799, a common yeast host for heterologous protein expression.

Complete genome sequence of a strain of Azospirillum thiophilum isolated from a sulfide spring.

Complete genome sequence and methylome analysis of Aeromonas hydrophila strain YL17, isolated from a compost pile.

Structure of Type IIL restriction-modification enzyme MmeI in complex with DNA has implications for engineering new specificities.

Subscribe for blog updates:

Filter by topic

Talk with an expert

Antimicrobial resistance research

Subscribe for blog updates:

Filter by topic

Talk with an expert