Toward achieving rapid and large-scale genome modification directly in a target organism, we have developed a new genome engineering strategy that uses a combination of bioinformatics-aided design, large synthetic DNA and site-specific recombinases. Using Cre recombinase we swapped a target 126-kb segment of the Escherichia coli genome with a 72-kb synthetic DNA cassette, thereby effectively eliminating over 54 kb of genomic DNA from three non-contiguous regions in a single recombination event. We observed complete replacement of the native sequence with the modified synthetic sequence through the action of the Cre recombinase and no competition from homologous recombination. Because of the versatility and high efficiency of the Cre-lox system, this method can be used in any organism where this system is functional and can be adapted for use with other highly precise genome engineering systems. Compared to present-day iterative approaches in genome engineering, we anticipate this method will greatly speed up the creation of reduced, modularized and optimized genomes through the integration of deletion analysis data, transcriptomics, synthetic biology and site-specific recombination. © The Author(s) 2014. Published by Oxford University Press on behalf of Nucleic Acids Research.
The maize W22 inbred has served as a platform for maize genetics since the mid twentieth century. To streamline maize genome analyses, we have sequenced and de novo assembled a W22 reference genome using short-read sequencing technologies. We show that significant structural heterogeneity exists in comparison to the B73 reference genome at multiple scales, from transposon composition and copy number variation to single-nucleotide polymorphisms. The generation of this reference genome enables accurate placement of thousands of Mutator (Mu) and Dissociation (Ds) transposable element insertions for reverse and forward genetics studies. Annotation of the genome has been achieved using RNA-seq analysis, differential nuclease sensitivity profiling and bisulfite sequencing to map open reading frames, open chromatin sites and DNA methylation profiles, respectively. Collectively, the resources developed here integrate W22 as a community reference genome for functional genomics and provide a foundation for the maize pan-genome.
A report on the International Plant and Animal Genomes (PAG) conference held in San Diego, USA, 13-17 January 2018.
Propionibacterium acnes and Staphylococcus epidermidis live in close proximity on human skin, and both bacterial species can be isolated from normal and acne vulgaris-affected skin sites. The antagonistic interactions between the two species are poorly understood, as is the potential significance of bacterial interference for the skin microbiota. Here, we performed simultaneous antagonism assays to detect inhibitory activities between multiple isolates of the two species. Selected strains were sequenced to identify the genomic basis of their antimicrobial phenotypes. First, we screened 77 P. acnes strains isolated from healthy and acne-affected skin, and representing all known phylogenetic clades (I, II, and III), for their antimicrobial activities against 12 S. epidermidis isolates. One particular phylogroup (I-2) exhibited a higher antimicrobial activity than other P. acnes phylogroups. All genomes of type I-2 strains carry an island encoding the biosynthesis of a thiopeptide with possible antimicrobial activity against S. epidermidis. Second, 20 S. epidermidis isolates were examined for inhibitory activity against 25 P. acnes strains. The majority of S. epidermidis strains were able to inhibit P. acnes. Genomes of S. epidermidis strains with strong, medium and no inhibitory activities against P. acnes were sequenced. Genome comparison underlined the diversity of S. epidermidis and detected multiple clade- or strain-specific mobile genetic elements encoding a variety of functions important in antibiotic and stress resistance, biofilm formation and interbacterial competition, including bacteriocins such as epidermin. One isolate with an extraordinary antimicrobial activity against P. acnes harbors a functional ESAT-6 secretion system that might be involved in the antimicrobial activity against P. acnes via the secretion of polymorphic toxins. Taken together, our study suggests that interspecies interactions could potentially jeopardize balances in the skin microbiota. In particular, S. epidermidis strains possess an arsenal of different mechanisms to inhibit P. acnes. However, whether such interactions are relevant in skin disorders such as acne vulgaris remains questionable, since no difference in the antimicrobial activity against, or the sensitivity towards, S. epidermidis could be detected between health- and acne-associated strains of P. acnes.
Identifying and characterizing alternative splicing (AS) enables our understanding of the biological role of transcript isoform diversity. This study describes the use of publicly available RNA-Seq data to identify and characterize the global diversity of AS isoforms in maize using the inbred lines B73 and Mo17, and a related species, sorghum. Identification and characterization of AS within maize tissues revealed that genes expressed in seed exhibit the largest differential AS relative to other tissues examined. Additionally, differences in AS between the two genotypes B73 and Mo17 are greatest within genes expressed in seed. We demonstrate that changes in the level of alternatively spliced transcripts (intron retention and exon skipping) do not solely reflect differences in total transcript abundance, and we present evidence that intron retention may act to fine-tune gene expression across seed development stages. Furthermore, we have identified temperature-sensitive AS in maize and demonstrate that drought-induced changes in AS involve distinct sets of genes in reproductive and vegetative tissues. Examining our identified AS isoforms within B73 × Mo17 recombinant inbred lines (RILs) identified splicing QTL (sQTL). Of the cis-sQTL-regulated junctions, 43.3% are identified as alternatively spliced junctions in our analysis, while 10 Mb windows on each side of 48.2% of trans-sQTLs overlap with splicing-related genes. Using sorghum as an out-group enabled direct examination of loss or conservation of AS between homeologous genes representing the two subgenomes of maize. We identify several instances where AS isoforms that are conserved between one maize homeolog and its sorghum ortholog are absent from the second maize homeolog, suggesting that these AS isoforms may have been lost after the maize whole genome duplication event. This comprehensive analysis provides new insights into the complexity of AS in maize.
Staphylococcus epidermidis is the leading cause of infections on indwelling medical devices worldwide. Intrinsic antibiotic resistance and vigorous biofilm production have rendered these infections difficult to treat and, in some cases, require the removal of the offending medical prosthesis. With the exception of two widely passaged isolates, RP62A and 1457, the pathogenesis of infections caused by clinical S. epidermidis strains is poorly understood due to the strong genetic barrier that precludes the efficient transformation of foreign DNA into clinical isolates. The difficulty in transforming clinical S. epidermidis isolates is primarily due to the type I and IV restriction-modification systems, which act as genetic barriers. Here, we show that efficient plasmid transformation of clinical S. epidermidis isolates from clonal complexes 2, 10, and 89 can be realized by employing a plasmid artificial modification (PAM) in Escherichia coli DC10B containing a Δdcm mutation. This transformative technique should facilitate our ability to genetically modify clinical isolates of S. epidermidis and hence improve our understanding of their pathogenesis in human infections. IMPORTANCE Staphylococcus epidermidis is a source of considerable morbidity worldwide. The underlying mechanisms contributing to the commensal and pathogenic lifestyles of S. epidermidis are poorly understood. Genetic manipulations of clinically relevant strains of S. epidermidis are largely prohibited due to the presence of a strong restriction barrier. With the introduction of the tools presented here, genetic manipulation of clinically relevant S. epidermidis isolates has now become possible, thus improving our understanding of S. epidermidis as a pathogen. Copyright © 2017 American Society for Microbiology.
A comparative transcriptional landscape of maize and sorghum obtained by single-molecule sequencing.
Maize and sorghum are both important crops with similar overall plant architectures, but they have key differences, especially in regard to their inflorescences. To better understand these two organisms at the molecular level, we compared expression profiles of both protein-coding and noncoding transcripts in 11 matched tissues using single-molecule, long-read, deep RNA sequencing. This comparative analysis revealed large numbers of novel isoforms in both species. Evolutionarily young genes were likely to be generated in reproductive tissues and usually had fewer isoforms than old genes. We also observed similarities and differences in alternative splicing patterns and activities, both among tissues and between species. The maize subgenomes exhibited no bias in isoform generation; however, genes in the B genome were more highly expressed in pollen tissue, whereas genes in the A genome were more highly expressed in endosperm. We also identified a number of splicing events conserved between maize and sorghum. In addition, we generated comprehensive and high-resolution maps of poly(A) sites, revealing similarities and differences in mRNA cleavage between the two species. Overall, our results reveal considerable splicing and expression diversity between sorghum and maize, well beyond what was reported in previous studies, likely reflecting the differences in architecture between these two species. © 2018 Wang et al.; Published by Cold Spring Harbor Laboratory Press.
Genome mining has become an increasingly powerful, scalable, and economically accessible tool for the study of natural product biosynthesis and drug discovery. However, there remain important biological and practical problems that can complicate or obscure biosynthetic analysis in genomic and metagenomic sequencing projects. Here, we focus on limitations of available technology as well as computational and experimental strategies to overcome them. We review the unique challenges and approaches in the study of symbiotic and uncultured systems, as well as those associated with biosynthetic gene cluster (BGC) assembly and product prediction. Finally, to explore sequencing parameters that affect the recovery and contiguity of large and repetitive BGCs assembled de novo, we simulate Illumina and PacBio sequencing of the Salinispora tropica genome focusing on assembly of the salinilactam (slm) BGC.
The human microbiome plays an important and increasingly recognized role in human health. Studies of the microbiome typically use targeted sequencing of the 16S rRNA gene, whole metagenome shotgun sequencing, or other meta-omic technologies to characterize the microbiome’s composition, activity, and dynamics. Processing, analyzing, and interpreting these data involve numerous computational tools that aim to filter, cluster, annotate, and quantify the obtained data and ultimately provide an accurate and interpretable profile of the microbiome’s taxonomy, functional capacity, and behavior. These tools, however, are often limited in resolution and accuracy and may fail to capture many biologically and clinically relevant microbiome features, such as strain-level variation or nuanced functional response to perturbation. Over the past few years, extensive efforts have been invested toward addressing these challenges and developing novel computational methods for accurate and high-resolution characterization of microbiome data. These methods aim to quantify strain-level composition and variation, detect and characterize rare microbiome species, link specific genes to individual taxa, and more accurately characterize the functional capacity and dynamics of the microbiome. These methods and the ability to produce detailed and precise microbiome information are clearly essential for informing microbiome-based personalized therapies. In this review, we survey these methods, highlighting the challenges each method sets out to address and briefly describing methodological approaches. Copyright © 2016 Elsevier Inc. All rights reserved.
Gilliamella apicola and Snodgrassella alvi are dominant members of the honey bee (Apis spp.) and bumble bee (Bombus spp.) gut microbiota. We generated complete genomes of the type strains G. apicola wkB1(T) and S. alvi wkB2(T) (isolated from Apis), as well as draft genomes for four other strains from Bombus. G. apicola and S. alvi were found to occupy very different metabolic niches: The former is a saccharolytic fermenter, whereas the latter is an oxidizer of carboxylic acids. Together, they may form a syntrophic network for partitioning of metabolic resources. Both species possessed numerous genes [type 6 secretion systems, repeats in toxin (RTX) toxins, RHS proteins, adhesins, and type IV pili] that likely mediate cell-cell interactions and gut colonization. Variation in these genes could account for the host fidelity of strains observed in previous phylogenetic studies. Here, we also show the first experimental evidence, to our knowledge, for this specificity in vivo: Strains of S. alvi were able to colonize their native bee host but not bees of another genus. Consistent with specific, long-term host association, comparative genomic analysis revealed a deep divergence and little or no gene flow between Apis and Bombus gut symbionts. However, within a host type (Apis or Bombus), we detected signs of horizontal gene transfer between G. apicola and S. alvi, demonstrating the importance of the broader gut community in shaping the evolution of any one member. Our results show that host specificity is likely driven by multiple factors, including direct host-microbe interactions, microbe-microbe interactions, and social transmission.
Short-read sequencing has enabled the de novo assembly of several individual human genomes, but with inherent limitations in characterizing repeat elements. Here we sequence a Chinese individual HX1 by single-molecule real-time (SMRT) long-read sequencing, construct a physical map by NanoChannel arrays and generate a de novo assembly of 2.93 Gb (contig N50: 8.3 Mb, scaffold N50: 22.0 Mb, including 39.3 Mb N-bases), together with 206 Mb of alternative haplotypes. The assembly fully or partially fills 274 (28.4%) N-gaps in the reference genome GRCh38. Comparison to GRCh38 reveals 12.8 Mb of HX1-specific sequences, including 4.1 Mb that are not present in previously reported Asian genomes. Furthermore, long-read sequencing of the transcriptome reveals novel spliced genes that are not annotated in GENCODE and are missed by short-read RNA-Seq. Our results imply that improved characterization of genome functional variation may require the use of a range of genomic technologies on diverse human populations.
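The contig and scaffold N50 values quoted in this abstract are summary statistics of an assembly's length distribution. The following is a minimal sketch, assuming the common definition of N50 (the largest length L such that sequences of length at least L together cover at least half of the assembled bases). The function name `n50` and the toy lengths are hypothetical illustrations and are not taken from the HX1 assembly.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths.

    Assumes the common definition: walk the lengths from longest to
    shortest and return the length at which the running total first
    reaches half of the summed length.
    """
    lengths = list(lengths)
    if not lengths:
        raise ValueError("n50() needs at least one sequence length")
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        # Compare 2 * running with total to avoid fractional halves.
        if 2 * running >= total:
            return length


# Toy contig lengths (hypothetical, arbitrary units).
toy_contigs = [2, 3, 4, 5, 6, 7, 8]
toy_n50 = n50(toy_contigs)
```

Applied separately to contig lengths and to scaffold lengths, the same function would give the two kinds of N50 reported for an assembly.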
A significant portion of genes in vertebrate genomes belongs to multigene families, with each family containing several gene copies whose presence/absence, as well as isoform structure, can be highly variable across individuals. Existing de novo techniques for assaying the sequences of such highly-similar gene families fall short of reconstructing end-to-end transcripts with nucleotide-level precision or assigning alternatively spliced transcripts to their respective gene copies. We present IsoCon, a high-precision method using long PacBio Iso-Seq reads to tackle this challenge. We apply IsoCon to nine Y chromosome ampliconic gene families and show that it outperforms existing methods on both experimental and simulated data. IsoCon has allowed us to detect an unprecedented number of novel isoforms and has opened the door for unraveling the structure of many multigene families and gaining a deeper understanding of genome evolution and human diseases.
Nearly finished genomes produced using gel microdroplet culturing reveal substantial intraspecies genomic diversity within the human microbiome.
The majority of microbial genomic diversity remains unexplored. This is largely due to our inability to culture most microorganisms in isolation, which is a prerequisite for traditional genome sequencing. Single-cell sequencing has allowed researchers to circumvent this limitation. DNA is amplified directly from a single cell using the whole-genome amplification technique of multiple displacement amplification (MDA). However, MDA from a single chromosome copy suffers from amplification bias and a large loss of specificity from even very small amounts of DNA contamination, which makes assembling a genome difficult and completely finishing a genome impossible except in extraordinary circumstances. Gel microdrop cultivation allows culturing of a diverse microbial community and provides hundreds to thousands of genetically identical cells as input for an MDA reaction. We demonstrate the utility of this approach by comparing sequencing results of gel microdroplets and single cells following MDA. Bias is reduced in the MDA reaction and genome sequencing, and assembly is greatly improved when using gel microdroplets. We acquired multiple near-complete genomes for two bacterial species from human oral and stool microbiome samples. A significant amount of genome diversity, including single nucleotide polymorphisms and genome recombination, is discovered. Gel microdroplets offer a powerful and high-throughput technology for assembling whole genomes from complex samples and for probing the pan-genome of naturally occurring populations.
Comparative genomic analysis of Sulfurospirillum cavolei MES reconstructed from the metagenome of an electrosynthetic microbiome.
Sulfurospirillum spp. play an important role in sulfur and nitrogen cycling, and contain metabolic versatility that enables reduction of a wide range of electron acceptors, including thiosulfate, tetrathionate, polysulfide, nitrate, and nitrite. Here we describe the assembly of a Sulfurospirillum genome obtained from the metagenome of an electrosynthetic microbiome. The ubiquity and persistence of this organism in microbial electrosynthesis systems suggest it plays an important role in reactor stability and performance. Understanding why this organism is present and elucidating its genetic repertoire provide a genomic and ecological foundation for future studies where Sulfurospirillum are found, especially in electrode-associated communities. Metabolic comparisons and in-depth analysis of unique genes revealed potential ecological niche-specific capabilities within the Sulfurospirillum genus. The functional similarities common to all genomes, i.e., core genome, and unique gene clusters found only in a single genome were identified. Based upon 16S rRNA gene phylogenetic analysis and average nucleotide identity, the Sulfurospirillum draft genome was found to be most closely related to Sulfurospirillum cavolei. Characterization of the draft genome described herein provides pathway-specific details of the metabolic significance of the newly described Sulfurospirillum cavolei MES and, importantly, yields insight into the ecology of the genus as a whole. Comparison of eleven sequenced Sulfurospirillum genomes revealed a total of 6246 gene clusters in the pan-genome. Of the total gene clusters, 18.5% were shared among all eleven genomes and 50% were unique to a single genome. While most Sulfurospirillum spp. reduce nitrate to ammonium, five of the eleven Sulfurospirillum strains encode a nitrous oxide reductase (nos) cluster with an atypical nitrous-oxide reductase, suggesting a utility for this genus in reduction of nitrous oxide, and as a potential sink for this potent greenhouse gas.
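The core and unique fractions reported for the Sulfurospirillum pan-genome follow from counting, for each gene cluster, how many genomes contain it. The sketch below shows that bookkeeping under stated assumptions: the function `pangenome_summary`, the cluster names and the genome labels are all hypothetical, the toy data are not from the study, and the abstract does not say which clustering tool produced the original gene clusters.

```python
def pangenome_summary(cluster_to_genomes, n_genomes):
    """Summarize a pan-genome from a cluster -> genomes mapping.

    A cluster counts as 'core' when it is present in all n_genomes
    genomes and as 'unique' when it is present in exactly one.
    """
    total = len(cluster_to_genomes)
    if total == 0:
        raise ValueError("no gene clusters supplied")
    core = 0
    unique = 0
    for genomes in cluster_to_genomes.values():
        present_in = len(set(genomes))  # ignore duplicate genome labels
        if present_in == n_genomes:
            core += 1
        elif present_in == 1:
            unique += 1
    return {
        "total_clusters": total,
        "core_fraction": core / total,
        "unique_fraction": unique / total,
    }


# Toy example with three hypothetical genomes A, B and C.
toy_clusters = {
    "cluster_1": {"A", "B", "C"},  # core
    "cluster_2": {"A"},            # unique to A
    "cluster_3": {"B"},            # unique to B
    "cluster_4": {"A", "B"},       # shared by some but not all genomes
}
toy_summary = pangenome_summary(toy_clusters, n_genomes=3)
```

With eleven genomes, `n_genomes=11` would give the kind of core and unique fractions quoted in the abstract, provided the cluster membership table were available.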
Soybean was domesticated in China and has become one of the most important oilseed crops. Due to bottlenecks in their introduction and dissemination, soybeans from different geographic areas exhibit extensive genetic diversity. Asia is the largest soybean market; therefore, a high-quality soybean reference genome from this area is critical for soybean research and breeding. Here, we report the de novo assembly and sequence analysis of a Chinese soybean genome for “Zhonghuang 13” by a combination of SMRT, Hi-C and optical mapping data. The assembled genome size is 1.025 Gb with a contig N50 of 3.46 Mb and a scaffold N50 of 51.87 Mb. Comparisons between this genome and the previously reported reference genome (cv. Williams 82) uncovered more than 250,000 structural variations. A total of 52,051 protein-coding genes and 36,429 transposable elements were annotated for this genome, and a gene co-expression network including 39,967 genes was also established. This high-quality Chinese soybean genome and its sequence analysis will provide valuable information for soybean improvement in the future.