The long read lengths of PacBio’s SMRT Sequencing enable detection of linked mutations across multiple kilobases of sequence. This feature is particularly useful in the context of protein engineering, where large numbers of similar constructs are generated routinely to explore the effects of mutations on function and stability. We have developed a PCR-based barcoded sequencing method to generate high-quality, full-length sequence data for batches of constructs generated in a common backbone. Individual barcodes are coupled to primers targeting a common region of the vector of interest. The amplified products are pooled into a single DNA library, and sequencing data…
We have developed barcoding reagents and workflows for multiplexing amplicons or fragmented native genomic DNA prior to Single Molecule, Real-Time (SMRT) Sequencing. The long reads of PacBio’s SMRT Sequencing enable detection of linked mutations across multiple kilobases (kb) of sequence. This feature is particularly useful in the context of mutational analysis or SNP confirmation, where a large number of samples are generated routinely. To validate this workflow, a set of 384 1.7-kb amplicons, each derived from variants of the Phi29 DNA polymerase gene, was barcoded during amplification, pooled, and sequenced on a single SMRT Cell. To demonstrate the applicability of…
Kaposi sarcoma-associated herpesvirus (KSHV) is an emerging pathogen and is the causative infectious agent of Kaposi sarcoma and two malignancies of B cell origin. To date, there is no licensed KSHV vaccine. Development of an effective vaccine against KSHV continues to be limited by a poor understanding of how the virus initiates acute primary infection in vivo in diverse human cell types. The role of glycoprotein H (gH) in herpesvirus entry mechanisms remains largely unresolved. To characterize the requirement for KSHV gH in the viral life cycle and in determination of cell tropism, we generated and characterized a mutant KSHV…
Toward achieving rapid and large-scale genome modification directly in a target organism, we have developed a new genome engineering strategy that uses a combination of bioinformatics-aided design, large synthetic DNA and site-specific recombinases. Using Cre recombinase we swapped a target 126-kb segment of the Escherichia coli genome with a 72-kb synthetic DNA cassette, thereby effectively eliminating over 54 kb of genomic DNA from three non-contiguous regions in a single recombination event. We observed complete replacement of the native sequence with the modified synthetic sequence through the action of the Cre recombinase and no competition from homologous recombination. Because…
Methodologies to improve existing adeno-associated virus (AAV) vectors for gene therapy include either rational approaches or directed evolution to derive capsid variants characterized by superior transduction efficiencies in targeted tissues. Here, we integrated both approaches in one unified design strategy of “virtual family shuffling” to derive a combinatorial capsid library whereby only variable regions on the surface of the capsid are modified. Individual sublibraries were first assembled in order to preselect compatible amino acid residues within restricted surface-exposed regions to minimize the generation of dead-end variants. Subsequently, the successful families were interbred to derive a combined library of ~8 × 10^5 complexity.…
Successful antibody development requires both functional binding and desirable biophysical characteristics. In the current study, we analyze the causes of one hurdle to clinical development, off-target reactivity, or nonspecificity. We used a high-throughput nonspecificity assay to isolate panels of nonspecific antibodies from two synthetic single-chain variable fragment libraries expressed on the surface of yeast, identifying both individual amino acids and motifs within the complementarity-determining regions which contribute to the phenotype. We find enrichment of glycine, valine, and arginine as both individual amino acids and as a part of motifs, and additionally enrichment of motifs containing tryptophan. Insertion of any of…
In vivo, replication forks proceed beyond replication-blocking lesions by way of downstream repriming, generating daughter strand gaps that are subsequently processed by post-replicative repair pathways such as homologous recombination and translesion synthesis (TLS). The way these gaps are filled during TLS is presently unknown. The structure of gap repair synthesis was assessed by sequencing large collections of single DNA molecules that underwent specific TLS events in vivo. The higher error frequency of specialized relative to replicative polymerases allowed us to visualize gap-filling events at high resolution. Unexpectedly, the data reveal that a specialized polymerase, Pol V, synthesizes stretches of DNA both upstream and…
Design and large-scale synthesis of DNA has been applied to the functional study of viral and microbial genomes. New and expanded technology development is required to unlock the transformative potential of such bottom-up approaches to the study of larger, mammalian genomes. Two major challenges include assembling and delivering long DNA sequences. Here we describe a pipeline for de novo DNA assembly and delivery that enables functional evaluation of mammalian genes on the length scale of 100 kb. The DNA assembly step is supported by an integrated robotic workcell. We assemble the 101 kb human HPRT1 gene in yeast, deliver it…
Current DNA assembly methods are prone to sequence errors, requiring rigorous quality control (QC) to identify incorrect assemblies or synthesized constructs. Such errors can lead to misinterpretation of phenotypes. Because of this intrinsic problem, routine QC analysis is generally performed on three or more clones using a combination of restriction endonuclease assays, colony PCR, and Sanger sequencing. However, as new automation methods emerge that enable high-throughput assembly, QC using these techniques has become a major bottleneck. Here, we describe a quick and affordable methodology for the QC of synthetic constructs. Our method involves a one-pot digestion-ligation DNA assembly reaction, based…
The Carbohydrate Active Enzyme (CAZy) database indicates that glycoside hydrolase family 55 (GH55) contains both endo- and exo-β-1,3-glucanases. The founding structure in the GH55 is PcLam55A from the white rot fungus Phanerochaete chrysosporium (Ishida, T., Fushinobu, S., Kawai, R., Kitaoka, M., Igarashi, K., and Samejima, M. (2009) Crystal structure of glycoside hydrolase family 55 β-1,3-glucanase from the basidiomycete Phanerochaete chrysosporium. J. Biol. Chem. 284, 10100-10109). Here, we present high resolution crystal structures of bacterial SacteLam55A from the highly cellulolytic Streptomyces sp. SirexAA-E with bound substrates and product. These structures, along with mutagenesis and kinetic studies, implicate Glu-502 as the catalytic…
Gene regulatory networks (GRNs) comprising interactions between transcription factors (TFs) and regulatory loci control development and physiology. Numerous disease-associated mutations have been identified, the vast majority residing in non-coding regions of the genome. As current GRN mapping methods test one TF at a time and require the use of cells harboring the mutation(s) of interest, they are not suitable to identify TFs that bind to wild-type and mutant loci. Here, we use gene-centered yeast one-hybrid (eY1H) assays to interrogate binding of 1,086 human TFs to 246 enhancers, as well as to 109 non-coding disease mutations. We detect both loss and…
Here we report recombinant expression and activity of several type I fatty acid synthases that can function in parallel with the native Escherichia coli fatty acid synthase. Corynebacterium glutamicum FAS1A was the most active in E. coli and this fatty acid synthase was leveraged to produce oleochemicals including fatty alcohols and methyl ketones. Coexpression of FAS1A with the ACP/CoA-reductase Maqu2220 from Marinobacter aquaeolei shifted the chain length distribution of fatty alcohols produced. Coexpression of FAS1A with FadM, FadB, and an acyl-CoA-oxidase from Micrococcus luteus resulted in the production of methyl ketones, although at a lower level than cells using the…
This study examines the patterns of gene integration of gfp-bsd upon stable transfection into the T3Bo strain of Babesia bovis using a plasmid designed to integrate homologous sequences of the parasite’s two identical ef-1α A and B genes. While the transfected BboTf-149-6 cell line displayed two distinct patterns of gene integration, clonal lines derived from this strain by cell sorting contained only single gfp-bsd insertions. Whole genome sequencing of two selected clonal lines, E9 and C6, indicated two distinct patterns of gfp-bsd insertion occurring by legitimate homologous recombination mechanisms: one into the expected ef-1α orf B, and another into the…
Harnessing the biotechnological potential of the large number of proteins available in sequence databases requires scalable methods for functional characterization. Here we propose a workflow to address this challenge by combining phylogenomic guided DNA synthesis with high-throughput mass spectrometry and apply it to the systematic characterization of GH1 β-glucosidases, a family of enzymes necessary for biomass hydrolysis, an important step in the conversion of lignocellulosic feedstocks to fuels and chemicals. We synthesized and expressed 175 GH1s, selected from over 2000 candidate sequences to cover maximum sequence diversity. These enzymes were functionally characterized over a range of temperatures and pHs using…
Pathway transplantation from one organism to another represents a means to a more complete understanding of a biochemical or regulatory process. The purine biosynthesis pathway, a core metabolic function, was transplanted from human to yeast. We replaced the entire Saccharomyces cerevisiae adenine de novo pathway with the cognate human pathway components. A yeast strain was humanized for the full pathway by deleting all relevant yeast genes completely and then providing the human pathway in trans using a neochromosome expressing the human protein coding regions under the transcriptional control of their cognate yeast promoters and terminators. The humanized yeast strain grows…