The long read lengths of PacBio’s SMRT Sequencing enable detection of linked mutations across multiple kilobases of sequence. This feature is particularly useful in the context of protein engineering, where large numbers of similar constructs are generated routinely to explore the effects of mutations on function and stability. We have developed a PCR-based barcoded sequencing method to generate high quality, full-length sequence data for batches of constructs generated in a common backbone. Individual barcodes are coupled to primers targeting a common region of the vector of interest. The amplified products are pooled into a single DNA library, and sequencing data are clustered by barcode to generate multi-molecule consensus sequences for each construct present in the pool. As a proof-of-concept dataset, we have generated a library of 384 randomly mutated variants of the Phi29 DNA polymerase, a 575 amino acid protein encoded by a 1.7 kb gene. These variants were amplified with a set of barcoded primers, and the resulting library was sequenced on a single SMRT Cell. The data produced sequences that were completely concordant with independent Sanger sequencing, for a 100% accurate reconstruction of the set of clones.
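The barcode clustering and consensus step described above can be illustrated with a minimal sketch. It assumes exact-match barcode prefixes and equal-length, indel-free reads, and it uses a simple per-column majority vote. The barcode names, barcode sequences, and reads are hypothetical, and a real SMRT Sequencing workflow would rely on alignment-based consensus calling rather than this toy vote.

```python
from collections import Counter, defaultdict


def demultiplex(reads, barcodes):
    """Group reads by an exact-match barcode prefix and trim the barcode off."""
    groups = defaultdict(list)
    for read in reads:
        for name, bc in barcodes.items():
            if read.startswith(bc):
                groups[name].append(read[len(bc):])
                break  # assume each read carries exactly one barcode
    return dict(groups)


def majority_consensus(seqs):
    """Per-column majority vote over equal-length sequences (no indel handling)."""
    if len({len(s) for s in seqs}) != 1:
        raise ValueError("this toy consensus needs equal-length reads")
    return "".join(Counter(col).most_common(1)[0][0] for col in zip(*seqs))


# Hypothetical barcodes and reads, for illustration only.
barcodes = {"bc01": "ACGT", "bc02": "TGCA"}
reads = [
    "ACGTGATTACA", "ACGTGATTACA", "ACGTGATAACA",  # one read has a T>A error
    "TGCACCGGTTA", "TGCACCGGTTA", "TGCACCGCTTA",  # one read has a G>C error
]

consensus = {
    name: majority_consensus(seqs)
    for name, seqs in demultiplex(reads, barcodes).items()
}
print(consensus)
```

Pooling several reads per barcode is what lets an isolated single-read error be outvoted, which is the idea behind a multi-molecule consensus for each construct.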
We have developed barcoding reagents and workflows for multiplexing amplicons or fragmented native genomic DNA prior to Single Molecule, Real-Time (SMRT) Sequencing. The long reads of PacBio’s SMRT Sequencing enable detection of linked mutations across multiple kilobases (kb) of sequence. This feature is particularly useful in the context of mutational analysis or SNP confirmation, where a large number of samples are generated routinely. To validate this workflow, a set of 384 1.7-kb amplicons, each derived from variants of the Phi29 DNA polymerase gene, was barcoded during amplification, pooled, and sequenced on a single SMRT Cell. To demonstrate the applicability of the method to longer inserts, a library of 96 5-kb clones derived from the E. coli genome was sequenced.
Kaposi Sarcoma-Associated Herpesvirus Glycoprotein H Is Indispensable for Infection of Epithelial, Endothelial, and Fibroblast Cell Types.
Kaposi sarcoma-associated herpesvirus (KSHV) is an emerging pathogen and is the causative infectious agent of Kaposi sarcoma and two malignancies of B cell origin. To date, there is no licensed KSHV vaccine. Development of an effective vaccine against KSHV continues to be limited by a poor understanding of how the virus initiates acute primary infection in vivo in diverse human cell types. The role of glycoprotein H (gH) in herpesvirus entry mechanisms remains largely unresolved. To characterize the requirement for KSHV gH in the viral life cycle and in determination of cell tropism, we generated and characterized a mutant KSHV in which expression of gH was abrogated. Using a bacterial artificial chromosome containing a complete recombinant KSHV genome and recombinant DNA technology, we inserted stop codons into the gH coding region. We used electron microscopy to reveal that the gH-null mutant virus assembled and exited from cells normally, compared to wild-type virus. Using purified virions, we assessed infectivity of the gH-null mutant in diverse mammalian cell types in vitro. Unlike wild-type virus or a gH-containing revertant, the gH-null mutant was unable to infect any of the epithelial, endothelial, or fibroblast cell types tested. However, its ability to infect B cells was equivocal and remains to be investigated in vivo due to generally poor infectivity in vitro. Together, these results suggest that gH is critical for KSHV infection of highly permissive cell types, including epithelial, endothelial, and fibroblast cells. IMPORTANCE: All homologues of herpesvirus gH studied to date have been implicated in playing an essential role in viral infection of diverse permissive cell types. However, the role of gH in the mechanism of KSHV infection remains largely unresolved. In this study, we generated a gH-null mutant KSHV and provided evidence that deficiency of gH expression did not affect viral particle assembly or egress. Using the gH-null mutant, we showed that gH was indispensable for KSHV infection of epithelial, endothelial, and fibroblast cells in vitro. This suggests that gH is an important target for the development of a KSHV prophylactic vaccine to prevent initial viral infection. Copyright © 2019 American Society for Microbiology.
Toward achieving rapid and large-scale genome modification directly in a target organism, we have developed a new genome engineering strategy that uses a combination of bioinformatics-aided design, large synthetic DNA and site-specific recombinases. Using Cre recombinase we swapped a target 126-kb segment of the Escherichia coli genome with a 72-kb synthetic DNA cassette, thereby effectively eliminating over 54 kb of genomic DNA from three non-contiguous regions in a single recombination event. We observed complete replacement of the native sequence with the modified synthetic sequence through the action of the Cre recombinase and no competition from homologous recombination. Because of the versatility and high efficiency of the Cre-lox system, this method can be used in any organism where this system is functional, and it can be adapted for use with other highly precise genome engineering systems. Compared to present-day iterative approaches in genome engineering, we anticipate this method will greatly speed up the creation of reduced, modularized and optimized genomes through the integration of deletion analysis data, transcriptomics, synthetic biology and site-specific recombination. © The Author(s) 2014. Published by Oxford University Press on behalf of Nucleic Acids Research.
Vector design Tour de Force: integrating combinatorial and rational approaches to derive novel adeno-associated virus variants.
Methodologies to improve existing adeno-associated virus (AAV) vectors for gene therapy include either rational approaches or directed evolution to derive capsid variants characterized by superior transduction efficiencies in targeted tissues. Here, we integrated both approaches in one unified design strategy of “virtual family shuffling” to derive a combinatorial capsid library whereby only variable regions on the surface of the capsid are modified. Individual sublibraries were first assembled in order to preselect compatible amino acid residues within restricted surface-exposed regions to minimize the generation of dead-end variants. Subsequently, the successful families were interbred to derive a combined library of ~8 × 10^5 complexity. Next-generation sequencing of the packaged viral DNA revealed capsid surface areas susceptible to directed evolution, thus providing guidance for future designs. We demonstrated the utility of the library by deriving an AAV2-based vector characterized by a 20-fold higher transduction efficiency in murine liver, now equivalent to that of AAV8.
Successful antibody development requires both functional binding and desirable biophysical characteristics. In the current study, we analyze the causes of one hurdle to clinical development, off-target reactivity, or nonspecificity. We used a high-throughput nonspecificity assay to isolate panels of nonspecific antibodies from two synthetic single-chain variable fragment libraries expressed on the surface of yeast, identifying both individual amino acids and motifs within the complementarity-determining regions which contribute to the phenotype. We find enrichment of glycine, valine, and arginine as both individual amino acids and as a part of motifs, and additionally enrichment of motifs containing tryptophan. Insertion of any of these motifs into the complementarity-determining region H3 of a “clean” antibody increased its nonspecificity, with greatest increases in antibodies containing Trp or Val motifs. We next applied these rules to the creation of a synthetic diversity library based on natural frameworks with significantly decreased incorporation of such motifs and demonstrated its ability to isolate binders to a wide panel of antigens. This work both provides a greater understanding of the drivers of nonspecificity and provides design rules to increase efficiency in the isolation of antibodies with drug-like properties. Copyright © 2017 Elsevier Ltd. All rights reserved.
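One simple way to look for residues over-represented in a nonspecific panel is to compare per-residue frequencies against a reference panel. The sketch below is only an illustration of that idea: the CDR sequences are invented, the pseudocount and the ratio score are assumptions, and it is not the statistical analysis used in the study.

```python
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def residue_frequencies(cdrs, pseudocount=1.0):
    """Pseudocount-smoothed frequency of each amino acid across a set of CDRs."""
    counts = Counter("".join(cdrs))
    total = sum(counts[a] for a in AMINO_ACIDS) + pseudocount * len(AMINO_ACIDS)
    return {a: (counts[a] + pseudocount) / total for a in AMINO_ACIDS}


def enrichment(selected, reference):
    """Ratio of residue frequencies: selected panel over reference panel."""
    f_sel = residue_frequencies(selected)
    f_ref = residue_frequencies(reference)
    return {a: f_sel[a] / f_ref[a] for a in AMINO_ACIDS}


# Hypothetical CDR H3 fragments, for illustration only.
nonspecific = ["ARGGVRW", "GRVGWGR"]
reference = ["ASDYSTE", "AKDSYTN"]

scores = enrichment(nonspecific, reference)
top = sorted(scores, key=scores.get, reverse=True)[:4]
print(top)
```

A ratio above 1 marks a residue that is more frequent in the nonspecific panel. Extending the same counting to short k-mers would be one way to look at motifs rather than single residues.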
Pol V-mediated translesion synthesis elicits localized untargeted mutagenesis during post-replicative gap repair.
In vivo, replication forks proceed beyond replication-blocking lesions by way of downstream repriming, generating daughter strand gaps that are subsequently processed by post-replicative repair pathways such as homologous recombination and translesion synthesis (TLS). The way these gaps are filled during TLS is presently unknown. The structure of gap repair synthesis was assessed by sequencing large collections of single DNA molecules that underwent specific TLS events in vivo. The higher error frequency of specialized relative to replicative polymerases allowed us to visualize gap-filling events at high resolution. Unexpectedly, the data reveal that a specialized polymerase, Pol V, synthesizes stretches of DNA both upstream and downstream of a site-specific DNA lesion. Pol V-mediated untargeted mutations are thus spread over several hundred nucleotides, strongly eliciting genetic instability on either side of a given lesion. Consequently, post-replicative gap repair may be a source of untargeted mutations critical for gene diversification in adaptation and evolution. Copyright © 2018 The Authors. Published by Elsevier Inc. All rights reserved.
Design and large-scale synthesis of DNA has been applied to the functional study of viral and microbial genomes. New and expanded technology development is required to unlock the transformative potential of such bottom-up approaches to the study of larger, mammalian genomes. Two major challenges include assembling and delivering long DNA sequences. Here we describe a pipeline for de novo DNA assembly and delivery that enables functional evaluation of mammalian genes on the length scale of 100 kb. The DNA assembly step is supported by an integrated robotic workcell. We assemble the 101 kb human HPRT1 gene in yeast, deliver it to mouse cells, and show expression of the human protein from its full-length gene. This pipeline provides a framework for producing systematic, designer variants of any mammalian gene locus for functional evaluation in cells.
SMRT Gate: A method for validation of synthetic constructs on Pacific Biosciences sequencing platforms.
Current DNA assembly methods are prone to sequence errors, requiring rigorous quality control (QC) to identify incorrect assemblies or synthesized constructs. Such errors can lead to misinterpretation of phenotypes. Because of this intrinsic problem, routine QC analysis is generally performed on three or more clones using a combination of restriction endonuclease assays, colony PCR, and Sanger sequencing. However, as new automation methods emerge that enable high-throughput assembly, QC using these techniques has become a major bottleneck. Here, we describe a quick and affordable methodology for the QC of synthetic constructs. Our method involves a one-pot digestion-ligation DNA assembly reaction, based on the Golden Gate assembly methodology, that is coupled with Pacific Biosciences’ Single Molecule, Real-Time (PacBio SMRT) sequencing technology.
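The final QC comparison, checking each sequenced clone against its intended design, can be sketched as below. This is a simplified illustration under stated assumptions: each clone is represented by a single consensus string of the same length as the design, so only substitutions are detected. The function names and sequences are hypothetical and do not come from the SMRT Gate software.

```python
def find_mismatches(designed, observed):
    """Return (1-based position, expected base, found base) for each substitution."""
    if len(designed) != len(observed):
        raise ValueError("length differs: an indel-aware alignment would be needed")
    return [
        (pos, exp, obs)
        for pos, (exp, obs) in enumerate(zip(designed, observed), start=1)
        if exp != obs
    ]


def qc_report(designed, clones):
    """Map each clone name to its list of mismatches against the design."""
    return {name: find_mismatches(designed, seq) for name, seq in clones.items()}


# Hypothetical design and clone consensus sequences, for illustration only.
designed = "ATGGCCAAGTAA"
clones = {
    "clone_1": "ATGGCCAAGTAA",  # matches the design
    "clone_2": "ATGGCTAAGTAA",  # C>T substitution at position 6
}

report = qc_report(designed, clones)
passing = [name for name, mismatches in report.items() if not mismatches]
print(report)
print(passing)
```

Reporting the position and identity of each discrepancy, rather than a bare pass or fail, makes it easier to tell a recurring assembly error from a one-off synthesis error.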
The Carbohydrate Active Enzyme (CAZy) database indicates that glycoside hydrolase family 55 (GH55) contains both endo- and exo-β-1,3-glucanases. The founding structure in the GH55 family is PcLam55A from the white rot fungus Phanerochaete chrysosporium (Ishida, T., Fushinobu, S., Kawai, R., Kitaoka, M., Igarashi, K., and Samejima, M. (2009) Crystal structure of glycoside hydrolase family 55 β-1,3-glucanase from the basidiomycete Phanerochaete chrysosporium. J. Biol. Chem. 284, 10100-10109). Here, we present high resolution crystal structures of bacterial SacteLam55A from the highly cellulolytic Streptomyces sp. SirexAA-E with bound substrates and product. These structures, along with mutagenesis and kinetic studies, implicate Glu-502 as the catalytic acid (as proposed earlier for Glu-663 in PcLam55A) and a proton relay network of four residues in activating water as the nucleophile. Further, a set of conserved aromatic residues that define the active site apparently enforce an exo-glucanase reactivity as demonstrated by exhaustive hydrolysis reactions with purified laminarioligosaccharides. Two additional aromatic residues that line the substrate-binding channel show substrate-dependent conformational flexibility that may promote processive reactivity of the bound oligosaccharide in the bacterial enzymes. Gene synthesis carried out on ~30% of the GH55 family gave 34 active enzymes (19% functional coverage of the nonredundant members of GH55). These active enzymes reacted with only laminarin from a panel of 10 different soluble and insoluble polysaccharides and displayed a broad range of specific activities and optima for pH and temperature. Application of this experimental method provides a new, systematic way to annotate glycoside hydrolase phylogenetic space for functional properties. © 2015 by The American Society for Biochemistry and Molecular Biology, Inc.
Gene regulatory networks (GRNs) comprising interactions between transcription factors (TFs) and regulatory loci control development and physiology. Numerous disease-associated mutations have been identified, the vast majority residing in non-coding regions of the genome. As current GRN mapping methods test one TF at a time and require the use of cells harboring the mutation(s) of interest, they are not suitable to identify TFs that bind to wild-type and mutant loci. Here, we use gene-centered yeast one-hybrid (eY1H) assays to interrogate binding of 1,086 human TFs to 246 enhancers, as well as to 109 non-coding disease mutations. We detect both loss and gain of TF interactions with mutant loci that are concordant with target gene expression changes. This work establishes eY1H assays as a powerful addition to the toolkit of mapping human GRNs and for the high-throughput characterization of genomic variants that are rapidly being identified by genome-wide association studies. Copyright © 2015 Elsevier Inc. All rights reserved.
Here we report recombinant expression and activity of several type I fatty acid synthases that can function in parallel with the native Escherichia coli fatty acid synthase. Corynebacterium glutamicum FAS1A was the most active in E. coli and this fatty acid synthase was leveraged to produce oleochemicals including fatty alcohols and methyl ketones. Coexpression of FAS1A with the ACP/CoA-reductase Maqu2220 from Marinobacter aquaeolei shifted the chain length distribution of fatty alcohols produced. Coexpression of FAS1A with FadM, FadB, and an acyl-CoA-oxidase from Micrococcus luteus resulted in the production of methyl ketones, although at a lower level than cells using the native FAS. This work, to our knowledge, is the first example of in vivo function of a heterologous fatty acid synthase in E. coli. Using FAS1 enzymes for oleochemical production has several potential advantages, and further optimization of this system could lead to strains with more efficient conversion to desired products. Finally, functional expression of these large enzyme complexes in E. coli will enable their study without culturing the native organisms. Published by Elsevier Inc.
Integration of a transfected gene into the genome of Babesia bovis occurs by legitimate homologous recombination mechanisms.
This study examines the patterns of gene integration of gfp-bsd upon stable transfection into the T3Bo strain of Babesia bovis using a plasmid designed to integrate homologous sequences of the parasite’s two identical ef-1α A and B genes. While the transfected BboTf-149-6 cell line displayed two distinct patterns of gene integration, clonal lines derived from this strain by cell sorting contained only single gfp-bsd insertions. Whole genome sequencing of two selected clonal lines, E9 and C6, indicated two distinct patterns of gfp-bsd insertion occurring by legitimate homologous recombination mechanisms: one into the expected ef-1α orf B, and another into the ef-1α B promoter. The data suggest that expression of the ef-1α orf B is not required for development of B. bovis in cultured erythrocyte stages. Use of legitimate homologous recombination mechanisms in transfected B. bovis supports the future use of transfection methods for developing efficient gene function assignment experiments using gene knockout techniques. Published by Elsevier B.V.
Phylogenomically guided identification of industrially relevant GH1 β-glucosidases through DNA synthesis and nanostructure-initiator mass spectrometry.
Harnessing the biotechnological potential of the large number of proteins available in sequence databases requires scalable methods for functional characterization. Here we propose a workflow to address this challenge by combining phylogenomically guided DNA synthesis with high-throughput mass spectrometry and apply it to the systematic characterization of GH1 β-glucosidases, a family of enzymes necessary for biomass hydrolysis, an important step in the conversion of lignocellulosic feedstocks to fuels and chemicals. We synthesized and expressed 175 GH1s, selected from over 2000 candidate sequences to cover maximum sequence diversity. These enzymes were functionally characterized over a range of temperatures and pHs using nanostructure-initiator mass spectrometry (NIMS), generating over 10,000 data points. When combined with HPLC-based sugar profiling, we observed GH1 enzymes active over a broad temperature range and toward many different β-linked disaccharides. For some GH1s we also observed activity toward laminarin, a more complex oligosaccharide present as a major component of macroalgae. An area of particular interest was the identification of GH1 enzymes compatible with the ionic liquid 1-ethyl-3-methylimidazolium acetate ([C2mim][OAc]), a next-generation biomass pretreatment technology. We thus searched for GH1 enzymes active at 70 °C and 20% (v/v) [C2mim][OAc] over the course of a 24-h saccharification reaction. Using our unbiased approach, we identified multiple enzymes of different phylogenetic origin with such activities. Our approach of characterizing sequence diversity through targeted gene synthesis coupled to high-throughput screening technologies is a broadly applicable paradigm for a wide range of biological problems.
Human to yeast pathway transplantation: cross-species dissection of the adenine de novo pathway regulatory node
Pathway transplantation from one organism to another represents a means to a more complete understanding of a biochemical or regulatory process. The purine biosynthesis pathway, a core metabolic function, was transplanted from human to yeast. We replaced the entire Saccharomyces cerevisiae adenine de novo pathway with the cognate human pathway components. A yeast strain was humanized for the full pathway by deleting all relevant yeast genes completely and then providing the human pathway in trans using a neochromosome expressing the human protein coding regions under the transcriptional control of their cognate yeast promoters and terminators. The humanized yeast strain grows in the absence of adenine, indicating complementation of the yeast pathway by the full set of human proteins. While the strain with the neochromosome is indeed prototrophic, it grows slowly in the absence of adenine. Dissection of the phenotype revealed that the human ortholog of ADE4, PPAT, shows only partial complementation. We have used several strategies to understand this phenotype, all of which point to PPAT/ADE4 as the central regulatory node. In these yeast cells, pathway metabolites regulate PPAT's protein abundance through transcription and proteolysis, as well as its enzymatic activity through allosteric regulation. Extensive phylogenetic analysis of PPATs from diverse organisms hints at adaptations of the enzyme-level regulation to the metabolite levels in the organism. Finally, we isolated specific mutations in PPAT as well as in other genes involved in the purine metabolic network that alleviate incomplete complementation by PPAT and provide further insight into the complex regulation of this critical metabolic pathway.