Directed evolution represents an attractive approach to derive AAV capsid variants capable of selectively infecting specific tissue or cell targets. It involves the generation of an initial library of high complexity followed by cycles of selection during which the library is progressively enriched for target-specific variants. Each selection cycle consists of the following: reconstitution of complete AAV genomes within plasmid molecules; production of virions for which each particular capsid variant is matched with the particular capsid gene encoding it; recovery of capsid gene sequences from target tissue after systemic administration. Prevalent variants are then analyzed and evaluated.
Long-read sequencing is emerging as a promising sequencing technology because it can overcome the short read-length limitation of second-generation sequencing, which has dominated the sequencing market in past years. However, it has substantially higher error rates compared to short-read sequencing (e.g., 13% vs. 0.1%), and its sequencing cost per base is typically higher than that of short-read sequencing. To address these limitations, we present a distributed hybrid error correction framework, called ParLECH, that is scalable and cost-efficient for PacBio long reads. For correcting the errors in the long reads, ParLECH utilizes Illumina short reads, which offer a low error rate and high coverage at low cost. To efficiently analyze the high-throughput Illumina short reads, ParLECH is equipped with Hadoop and a distributed NoSQL system. To further improve the accuracy, ParLECH utilizes the k-mer coverage information of the Illumina short reads. Specifically, we develop a distributed version of the widest path algorithm, which maximizes the minimum k-mer coverage in a path of the de Bruijn graph constructed from the Illumina short reads. We replace an error region in a long read with its corresponding widest path. Our experimental results show that ParLECH can handle large-scale real-world datasets in a scalable and accurate manner. Using ParLECH, we can process a 312 GB human genome PacBio dataset, with a 452 GB Illumina dataset, on 128 nodes in less than 29 hours.
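The widest path idea can be illustrated with a minimal single-machine sketch. It is not the distributed ParLECH implementation: the function name `widest_path`, the toy 3-mer graph and the coverage values are all assumptions made for illustration. The sketch uses a Dijkstra-style search that, instead of minimizing summed edge weights, maximizes the smallest k-mer coverage seen along the path.

```python
import heapq


def widest_path(graph, coverage, source, target):
    """Return (width, path), where width is the largest achievable
    minimum k-mer coverage over all paths from source to target.

    graph    : dict mapping a k-mer to its successor k-mers
    coverage : dict mapping a k-mer to its short-read coverage
    """
    best = {source: coverage[source]}   # best bottleneck found per node
    prev = {}                           # back-pointers for path recovery
    heap = [(-coverage[source], source)]  # max-heap via negated widths

    while heap:
        neg_width, node = heapq.heappop(heap)
        width = -neg_width
        if width < best.get(node, 0):
            continue  # stale heap entry
        if node == target:
            break
        for nxt in graph.get(node, ()):
            # The bottleneck of the extended path is the smaller value.
            candidate = min(width, coverage[nxt])
            if candidate > best.get(nxt, 0):
                best[nxt] = candidate
                prev[nxt] = node
                heapq.heappush(heap, (-candidate, nxt))

    if target not in best:
        return 0, []

    path = [target]
    while path[-1] != source:
        path.append(prev[path[-1]])
    path.reverse()
    return best[target], path


# Hypothetical toy de Bruijn graph (k = 3) with two branches that rejoin.
graph = {
    "ACG": ["CGT", "CGA"],
    "CGT": ["GTC"],
    "GTC": ["TCA"],
    "CGA": ["GAT"],
    "GAT": ["ATC"],
    "ATC": ["TCA"],
}
coverage = {"ACG": 30, "CGT": 2, "GTC": 3, "TCA": 28,
            "CGA": 25, "GAT": 27, "ATC": 26}

width, path = widest_path(graph, coverage, "ACG", "TCA")
print(width, path)
```

In this toy graph the shorter branch passes through k-mers with coverage 2 and 3, which are likely sequencing errors, so the search prefers the longer branch whose weakest k-mer still has coverage 25.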
Toward achieving rapid and large-scale genome modification directly in a target organism, we have developed a new genome engineering strategy that uses a combination of bioinformatics-aided design, large synthetic DNA and site-specific recombinases. Using Cre recombinase we swapped a target 126-kb segment of the Escherichia coli genome with a 72-kb synthetic DNA cassette, thereby effectively eliminating over 54 kb of genomic DNA from three non-contiguous regions in a single recombination event. We observed complete replacement of the native sequence with the modified synthetic sequence through the action of the Cre recombinase and no competition from homologous recombination. Because of the versatility and high efficiency of the Cre-lox system, this method can be used in any organism where this system is functional and can be adapted for use with other highly precise genome engineering systems. Compared to present-day iterative approaches in genome engineering, we anticipate this method will greatly speed up the creation of reduced, modularized and optimized genomes through the integration of deletion analysis data, transcriptomics, synthetic biology and site-specific recombination. © The Author(s) 2014. Published by Oxford University Press on behalf of Nucleic Acids Research.
Gene targeting by the TAL effector PthXo2 reveals cryptic resistance gene for bacterial blight of rice.
Bacterial blight of rice is caused by the γ-proteobacterium Xanthomonas oryzae pv. oryzae, which utilizes a group of type III TAL (transcription activator-like) effectors to induce host gene expression and condition host susceptibility. Five SWEET genes are functionally redundant to support bacterial disease, but only two were experimentally proven targets of natural TAL effectors. Here, we report the identification of the sucrose transporter gene OsSWEET13 as the disease-susceptibility gene for PthXo2 and the existence of cryptic recessive resistance to PthXo2-dependent X. oryzae pv. oryzae due to promoter variations of OsSWEET13 in japonica rice. PthXo2-containing strains induce OsSWEET13 in indica rice IR24 due to the presence of an unpredicted and undescribed effector binding site not present in the alleles in japonica rice Nipponbare and Kitaake. The specificity of effector-associated gene induction and disease susceptibility is attributable to a single nucleotide polymorphism (SNP), which is also found in a polymorphic allele of OsSWEET13 known as the recessive resistance gene xa25 from the rice cultivar Minghui 63. The mutation of OsSWEET13 with CRISPR/Cas9 technology further corroborates the requirement of OsSWEET13 expression for the state of PthXo2-dependent disease susceptibility to X. oryzae pv. oryzae. Gene profiling of a collection of 104 strains revealed OsSWEET13 induction by 42 isolates of X. oryzae pv. oryzae. Heterologous expression of OsSWEET13 in Nicotiana benthamiana leaf cells elevates sucrose concentrations in the apoplasm. The results corroborate a model whereby X. oryzae pv. oryzae enhances the release of sucrose from host cells in order to exploit the host resources. © 2015 The Authors The Plant Journal © 2015 John Wiley & Sons Ltd.
Controlled delivery of β-globin-targeting TALENs and CRISPR/Cas9 into mammalian cells for genome editing using microinjection.
TAL effector nucleases (TALENs) and clustered regularly interspaced short palindromic repeats (CRISPR) with CRISPR-associated (Cas) proteins are genome editing tools with unprecedented potential. However, the ability to deliver optimal amounts of these nucleases into mammalian cells with minimal toxicity poses a major challenge. Common delivery approaches are transfection- and viral-based methods, each associated with significant drawbacks. An alternative method for directly delivering genome-editing reagents into single living cells with high efficiency and controlled volume is microinjection. Here, we characterize a glass microcapillary-based injection system and demonstrate controlled co-injection of TALENs or CRISPR/Cas9 together with donor template into single K562 cells for targeting the human β-globin gene. We quantified nuclease-induced insertions and deletions (indels) and found that, with β-globin-targeting TALENs, similar levels of on- and off-target activity in cells could be achieved by microinjection compared with nucleofection. Furthermore, we observed 11% and 2% homology-directed repair in single K562 cells co-injected with a donor template along with CRISPR/Cas9 and TALENs, respectively. These results demonstrate that a high level of targeted gene modification can be achieved in human cells using glass-needle microinjection of genome editing reagents.
Galactofuranose in Mycoplasma mycoides is important for membrane integrity and conceals adhesins but does not contribute to serum resistance.
Mycoplasma mycoides subsp. capri (Mmc) and subsp. mycoides (Mmm) are important ruminant pathogens worldwide causing diseases such as pleuropneumonia, mastitis and septicaemia. They express galactofuranose residues on their surface, but their role in pathogenesis has not yet been determined. The M. mycoides genomes contain up to several copies of the glf gene, which encodes an enzyme catalysing the last step in the synthesis of galactofuranose. We generated a deletion of the glf gene in a strain of Mmc using genome transplantation and tandem repeat endonuclease coupled cleavage (TREC) with yeast as an intermediary host for the genome editing. As expected, the resulting YCp1.1-Δglf strain did not produce the galactofuranose-containing glycans as shown by immunoblots and immuno-electron microscopy employing a galactofuranose-specific monoclonal antibody. The mutant lacking galactofuranose exhibited a decreased growth rate and a significantly enhanced adhesion to small ruminant cells. The mutant was also ‘leaking’ as revealed by a β-galactosidase-based assay employing a membrane-impermeable substrate. These findings indicate that galactofuranose-containing polysaccharides conceal adhesins and are important for membrane integrity. Unexpectedly, the mutant strain showed increased serum resistance. © 2015 The Authors. Molecular Microbiology published by John Wiley & Sons Ltd.
Well-developed genetic tools for thermophilic microorganisms are scarce, despite their industrial and scientific relevance. Whereas highly efficient CRISPR/Cas9-based genome editing is on the rise in prokaryotes, it has never been employed in a thermophile. Here, we apply Streptococcus pyogenes Cas9 (spCas9)-based genome editing to a moderate thermophile, i.e., Bacillus smithii, including a gene deletion, gene knockout via insertion of premature stop codons, and gene insertion. We show that spCas9 is inactive in vivo above 42 °C, and we employ the wide temperature growth range of B. smithii as an induction system for spCas9 expression. Homologous recombination with plasmid-borne editing templates is performed at 45-55 °C, when spCas9 is inactive. Subsequent transfer to 37 °C allows for counterselection through production of active spCas9, which introduces lethal double-stranded DNA breaks to the nonedited cells. The developed method takes 4 days with 90, 100, and 20% efficiencies for gene deletion, knockout, and insertion, respectively. The major advantage of our system is the limited requirement for genetic parts: only one plasmid, one selectable marker, and a promoter are needed, and the promoter does not need to be inducible or well-characterized. Hence, it can be easily applied for genome editing purposes in both mesophilic and thermophilic nonmodel organisms with a limited genetic toolbox and ability to grow at, or tolerate, temperatures of 37 and at or above 42 °C.
CRISPR/Cas9-mediated scanning for regulatory elements required for HPRT1 expression via thousands of large, programmed genomic deletions.
The extent to which non-coding mutations contribute to Mendelian disease is a major unknown in human genetics. Relatedly, the vast majority of candidate regulatory elements have yet to be functionally validated. Here, we describe a CRISPR-based system that uses pairs of guide RNAs (gRNAs) to program thousands of kilobase-scale deletions that deeply scan across a targeted region in a tiling fashion (“ScanDel”). We applied ScanDel to HPRT1, the housekeeping gene underlying Lesch-Nyhan syndrome, an X-linked recessive disorder. Altogether, we programmed 4,342 overlapping 1 and 2 kb deletions that tiled 206 kb centered on HPRT1 (including 87 kb upstream and 79 kb downstream) with median 27-fold redundancy per base. We functionally assayed programmed deletions in parallel by selecting for loss of HPRT function with 6-thioguanine. As expected, sequencing gRNA pairs before and after selection confirmed that all HPRT1 exons are needed. However, HPRT1 function was robust to deletion of any intergenic or deeply intronic non-coding region, indicating that proximal regulatory sequences are sufficient for HPRT1 expression. Although our screen did identify the disruption of exon-proximal non-coding sequences (e.g., the promoter) as functionally consequential, long-read sequencing revealed that this signal was driven by rare, imprecise deletions that extended into exons. Our results suggest that no singular distal regulatory element is required for HPRT1 expression and that distal mutations are unlikely to contribute substantially to Lesch-Nyhan syndrome burden. Further application of ScanDel could shed light on the role of regulatory mutations in disease at other loci while also facilitating a deeper understanding of endogenous gene regulation. Copyright © 2017 American Society of Human Genetics. All rights reserved.
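The tiling scheme can be illustrated with a simplified sketch. It assumes deletions placed at a uniform step, whereas in the actual screen the deletion endpoints depend on where usable gRNA sites exist; the function names, the step size and the toy coordinates are hypothetical. The sketch enumerates overlapping deletions of two sizes and computes the median number of deletions covering each base.

```python
from statistics import median


def tile_deletions(region_start, region_end, sizes=(1000, 2000), step=100):
    """Enumerate half-open deletion intervals [start, end) of the given
    sizes, placed every `step` bases across the region (assumed uniform)."""
    deletions = []
    for size in sizes:
        for start in range(region_start, region_end - size + 1, step):
            deletions.append((start, start + size))
    return deletions


def median_redundancy(deletions, region_start, region_end):
    """Median number of programmed deletions covering each base."""
    # Difference array: +1 where a deletion starts, -1 where it ends.
    diff = [0] * (region_end - region_start + 1)
    for start, end in deletions:
        diff[start - region_start] += 1
        diff[end - region_start] -= 1
    depth, running = [], 0
    for delta in diff[:-1]:
        running += delta
        depth.append(running)
    return median(depth)


# Toy region of 10 bases, deletions of 2 and 4 bases, step of 1 base.
toy = tile_deletions(0, 10, sizes=(2, 4), step=1)
print(len(toy), median_redundancy(toy, 0, 10))
```

Bases near the region edges are covered by fewer deletions than interior bases, which is one reason to report a median rather than a mean redundancy.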
Rapid CRISPR/Cas9-mediated cloning of full-length Epstein-Barr virus genomes from latently infected cells.
Herpesviruses have relatively large DNA genomes of more than 150 kb that are difficult to clone and sequence. Bacterial artificial chromosome (BAC) cloning of herpesvirus genomes is a powerful technique that greatly facilitates whole viral genome sequencing as well as functional characterization of reconstituted viruses. We describe recently invented technologies for rapid BAC cloning of herpesvirus genomes using CRISPR/Cas9-mediated homology-directed repair. We focus on recent BAC cloning techniques for Epstein-Barr virus (EBV) genomes and discuss the possible advantages of a CRISPR/Cas9-mediated strategy compared with previous EBV-BAC cloning strategies. We also describe the design decisions of this technology as well as possible pitfalls and points to be improved in the future. The obtained EBV-BAC clones are subjected to long-read sequencing analysis to determine the complete EBV genome sequence, including repetitive regions. Rapid cloning and sequence determination of various EBV strains will greatly contribute to the understanding of their global geographical distribution. This technology can also be used to clone disease-associated EBV strains and test the hypothesis that they have special features that distinguish them from strains that infect asymptomatically.
The CRISPR-associated protein Cas9 is widely used for genome editing because it cleaves target DNA through the assistance of a single-guide RNA (sgRNA). Structural studies have revealed the multi-domain architecture of Cas9 and suggested sequential domain movements of Cas9 upon binding to the sgRNA and the target DNA. These studies also hinted at the flexibility between domains; however, it remains unclear whether these flexible movements occur in solution. Here, we directly observed dynamic fluctuations of multiple Cas9 domains, using single-molecule FRET. We found that the flexible domain movements allow Cas9 to adopt transient conformations beyond those captured in the crystal structures. Importantly, the HNH nuclease domain only accessed the DNA cleavage position during such flexible movements, suggesting the importance of this flexibility in the DNA cleavage process. Our FRET data also revealed the conformational flexibility of apo-Cas9, which may play a role in the assembly with the sgRNA. Collectively, our results highlight the potential role of domain fluctuations in driving Cas9-catalyzed DNA cleavage. © 2018 The Authors. Published under the terms of the CC BY NC ND 4.0 license.
Probiotics have many significant applications, which has created a need for genome analysis of probiotic strains. Various omics methods and systems biology approaches enable us to understand and optimize metabolic processes. These techniques have drawn researchers' attention to the gut microbiome and provided a new source for the discovery of uncharacterized biosynthetic pathways, which in turn enables novel metabolic engineering approaches. In recent years, the broad and quantitative analysis of modified strains has relied on systems biology tools such as in silico design, which are commonly used to improve strain performance. The genetic manipulation of probiotic microorganisms is crucial for defining their role in the intestinal microbiota and exploring their beneficial properties. This review provides an overview of gene editing and systems biology approaches, highlighting the advent of omics methods that open new routes for studying probiotic bacteria. We also summarize gene editing tools such as TALENs, ZFNs and CRISPR-Cas, which edit or cleave specific target DNA. Furthermore, we propose a design for advanced customized probiotics aimed at improving probiotic performance.
It has recently become possible to rapidly and accurately detect epigenetic signatures in bacterial genomes using third generation sequencing data. Monitoring the speed at which a single polymerase inserts a base in the read strand enables one to infer whether a modification is present at that specific site on the template strand. These sites can be challenging to detect in the absence of high coverage and reliable reference genomes. Here we provide a new method for detecting epigenetic motifs in bacteria on datasets with low coverage, with incomplete references, and with mixed samples (i.e. metagenomic data). Our approach treats motif inference as a kmer comparison problem. First, genomes (or contigs) are deconstructed into kmers. Then, native genome-wide distributions of interpulse durations (IPDs) for kmers are compared with corresponding whole genome amplified (WGA, modification free) IPD distributions using log likelihood ratios. Finally, kmers are ranked and greedily selected by iteratively correcting for sequences within a particular kmer’s neighborhood. Our method can detect multiple types of modifications, even at very low coverage and in the presence of mixed genomes. Additionally, we are able to predict modified motifs when genomes with “neighbor” modified motifs exist within the sample. Lastly, we show that these motifs can provide an alternative source of information by which to cluster metagenomics contigs and that iterative refinement on these clustered contigs can further improve both sensitivity and specificity of motif detection. Availability: https://github.com/alibashir/EMMCKmer.
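The kmer comparison step can be sketched as follows. This is an illustrative simplification, not the code in the EMMCKmer repository: it assumes log-transformed IPDs are roughly Gaussian, and the function names, the `min_sigma` floor and the toy IPD values are hypothetical. The greedy neighborhood correction described above is not implemented here; only the scoring and ranking are shown.

```python
import math
from statistics import mean, pstdev


def gaussian_loglik(xs, mu, sigma):
    """Log-likelihood of observations xs under a normal(mu, sigma) model."""
    return sum(
        -0.5 * math.log(2 * math.pi * sigma ** 2)
        - (x - mu) ** 2 / (2 * sigma ** 2)
        for x in xs
    )


def kmer_llr(native_ipds, wga_ipds, min_sigma=1e-3):
    """Log likelihood ratio for one kmer: how much better the native IPDs
    are explained by their own fit than by the WGA (unmodified) fit."""
    native = [math.log(x) for x in native_ipds]
    wga = [math.log(x) for x in wga_ipds]
    mu_n, sd_n = mean(native), max(pstdev(native), min_sigma)
    mu_w, sd_w = mean(wga), max(pstdev(wga), min_sigma)
    return gaussian_loglik(native, mu_n, sd_n) - gaussian_loglik(native, mu_w, sd_w)


def rank_kmers(native, wga):
    """Rank kmers present in both datasets by decreasing LLR."""
    scores = {k: kmer_llr(native[k], wga[k]) for k in native if k in wga}
    return sorted(scores, key=scores.get, reverse=True)


# Hypothetical IPD values: GATC is slowed in the native sample, ACGT is not.
native = {"GATC": [2.0, 2.2, 1.9, 2.1], "ACGT": [0.5, 0.6, 0.55, 0.45]}
wga = {"GATC": [0.5, 0.6, 0.55, 0.45], "ACGT": [0.5, 0.6, 0.55, 0.45]}
print(rank_kmers(native, wga))
```

Because the native fit is the maximum-likelihood fit to the native data, this score is never negative, and kmers whose native and WGA distributions coincide score close to zero.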
Cultivated bacteria such as actinomycetes are a highly useful source of biomedically important natural products. However, such ‘talented’ producers represent only a minute fraction of the entire, mostly uncultivated, prokaryotic diversity. The uncultured majority is generally perceived as a large, untapped resource of new drug candidates, but so far it is unknown whether taxa containing talented bacteria indeed exist. Here we report the single-cell- and metagenomics-based discovery of such producers. Two phylotypes of the candidate genus ‘Entotheonella’ with genomes of greater than 9 megabases and multiple, distinct biosynthetic gene clusters co-inhabit the chemically and microbially rich marine sponge Theonella swinhoei. Almost all bioactive polyketides and peptides known from this animal were attributed to a single phylotype. ‘Entotheonella’ spp. are widely distributed in sponges and belong to an environmental taxon proposed here as candidate phylum ‘Tectomicrobia’. The pronounced bioactivities and chemical uniqueness of ‘Entotheonella’ compounds provide significant opportunities for ecological studies and drug discovery.
The recent development of third generation sequencing (TGS) generates much longer reads than second generation sequencing (SGS) and thus provides a chance to solve problems that are difficult to study through SGS alone. However, higher raw read error rates are an intrinsic drawback in most TGS technologies. Here we present a computational method, LSC, to perform error correction of TGS long reads (LR) by SGS short reads (SR). Aiming to reduce the error rate in homopolymer runs in the main TGS platform, the PacBio® RS, LSC applies a homopolymer compression (HC) transformation strategy to increase the sensitivity of SR-LR alignment without sacrificing alignment accuracy. We applied LSC to 100,000 PacBio long reads from human brain cerebellum RNA-seq data and 64 million single-end 75 bp reads from human brain RNA-seq data. The results show LSC can correct PacBio long reads to reduce the error rate by more than threefold. The improved accuracy greatly benefits many downstream analyses, such as directional gene isoform detection in RNA-seq study. Compared with another hybrid correction tool, LSC can achieve over double the sensitivity and similar specificity.
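A homopolymer compression transformation can be sketched in a few lines. This is a generic illustration of the idea rather than the LSC code itself; the function names `hc_compress` and `hc_expand` and the example reads are hypothetical. Each run of identical bases is collapsed to a single base, and the run lengths are stored so that the original sequence can be restored after alignment.

```python
from itertools import groupby


def hc_compress(seq):
    """Collapse each homopolymer run to one base.

    Returns the compressed sequence and the list of original run lengths.
    """
    bases, runs = [], []
    for base, group in groupby(seq):
        bases.append(base)
        runs.append(len(list(group)))
    return "".join(bases), runs


def hc_expand(compressed, runs):
    """Inverse of hc_compress: restore the original sequence."""
    return "".join(base * n for base, n in zip(compressed, runs))


# Two reads that differ only in homopolymer lengths compress identically,
# so an aligner working in compressed space is not penalized for such errors.
long_read = "AAACCGTTTT"
short_read = "AACGTT"
print(hc_compress(long_read))
print(hc_compress(short_read)[0] == hc_compress(long_read)[0])
```

Keeping the run lengths alongside the compressed sequence is what allows alignment coordinates in compressed space to be mapped back to positions in the original read.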
A novel lactobacilli-based teat disinfectant for improving bacterial communities in the milks of cow teats with subclinical mastitis.
Teat disinfection pre- and post-milking is important for the overall health and hygiene of dairy cows. The objective of this study was to evaluate the efficacy of a novel probiotic lactobacilli-based teat disinfectant based on changes in somatic cell count (SCC) and profiling of the bacterial community. A total of 69 raw milk samples were obtained from eleven Holstein-Friesian dairy cows over 12 days of teat dipping in China. Single molecule, real-time sequencing technology (SMRT) was employed to profile changes in the bacterial community during the cleaning protocol and to compare the efficacy of probiotic lactic acid bacteria (LAB) and commercial teat disinfectants. The SCC gradually decreased following the cleaning protocol and the SCC of the LAB group was slightly lower than that of the commercial disinfectant (CD) group. Our SMRT sequencing results indicate that raw milk from both the LAB and CD groups contained diverse microbial populations that changed over the course of the cleaning protocol. The relative abundances of some species were significantly changed during the cleaning process, which may explain the observed bacterial community differences. Collectively, these results suggest that the LAB disinfectant could reduce mastitis-associated bacteria and improve the microbial environment of the cow teat. It could be used as an alternative to chemical pre- and post-milking teat disinfectants to maintain healthy teats and udders. In addition, the Pacific Biosciences SMRT sequencing with the full-length 16S ribosomal RNA gene was shown to be a powerful tool for monitoring changes in the bacterial population during the cleaning protocol.