June 1, 2021  |  

Comparative genomics of Shiga toxin-producing Escherichia coli O145:H28 strains associated with the 2007 Belgium and 2010 US outbreaks.

Shiga toxin-producing Escherichia coli (STEC) is an emerging pathogen. Recently there has been a global increase in the number of outbreaks caused by non-O157 STECs, typically involving six serogroups: O26, O45, O103, O111, O121, and O145. STEC O145:H28 has been associated with severe human disease including hemolytic-uremic syndrome (HUS), as demonstrated by the 2007 Belgian ice-cream-associated outbreak and the 2010 US lettuce-associated outbreak, with over 10% of patients developing HUS in each. The goal of this work was to perform comparative genomics of clinical and environmental strains to investigate the genome diversity and virulence evolution of this important foodborne pathogen.


June 1, 2021  |  

Automated, non-hybrid de novo genome assemblies and epigenomes of bacterial pathogens

Understanding the genetic basis of infectious diseases is critical to enacting effective treatments, and several large-scale sequencing initiatives are underway to collect this information. Sequencing bacterial samples is typically performed by mapping sequence reads against genomes of known reference strains. While such resequencing informs on the spectrum of single nucleotide differences relative to the chosen reference, it can miss numerous other forms of variation known to influence pathogenicity: structural variations (duplications, inversions), acquisition of mobile elements (phages, plasmids), homonucleotide length variation causing phase variation, and epigenetic marks (methylation, phosphorothioation) that influence gene expression to switch bacteria from non-pathogenic to pathogenic states. Therefore, sequencing methods which provide complete, de novo genome assemblies and epigenomes are necessary to fully characterize infectious disease agents in an unbiased, hypothesis-free manner. Hybrid assembly methods have been described that combine long sequence reads from SMRT DNA sequencing with short, high-accuracy reads (SMRT circular consensus sequencing (CCS) or second-generation reads) to generate long, highly accurate reads that are then used for assembly. We have developed a new paradigm for microbial de novo assemblies in which long SMRT sequencing reads (average read lengths >5,000 bases) are used exclusively to close the genome through a hierarchical genome assembly process, thereby obviating the need for a second sample preparation, sequencing run and data set. We have applied this method to numerous disease outbreak samples, including E. coli, Salmonella, Campylobacter, Listeria, Neisseria, and H. pylori, achieving closed de novo genomes with accuracies exceeding QV50 (>99.999%). The kinetic information from the same SMRT sequencing reads is utilized to determine epigenomes. Approximately 70% of all methyltransferase specificities we have determined to date represent previously unknown bacterial epigenetic signatures. The process has been automated and requires less than 1 day from an unknown DNA sample to its complete de novo genome and epigenome.
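
The QV50 figure quoted in the abstract follows the standard Phred convention, where a quality value Q corresponds to an error probability of 10^(-Q/10). The short sketch below shows that conversion; the function names and the 5 Mb genome length in the usage note are illustrative assumptions, not part of the published method.

```python
import math


def phred_to_accuracy(qv: float) -> float:
    """Convert a Phred-scaled quality value to per-base accuracy."""
    return 1.0 - 10.0 ** (-qv / 10.0)


def accuracy_to_phred(accuracy: float) -> float:
    """Convert per-base accuracy (strictly below 1.0) to a Phred-scaled quality value."""
    return -10.0 * math.log10(1.0 - accuracy)


def expected_errors(qv: float, genome_length: int) -> float:
    """Expected number of erroneous bases in an assembly of the given length."""
    return genome_length * 10.0 ** (-qv / 10.0)
```

Under this convention, `phred_to_accuracy(50)` is 99.999%, and a hypothetical 5 Mb bacterial genome at QV50 would be expected to contain about 50 erroneous bases.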


April 21, 2020  |  

SyRI: identification of syntenic and rearranged regions from whole-genome assemblies

We present SyRI, an efficient tool for genome-wide identification of structural rearrangements (SR) from genome graphs, which are built up from pair-wise whole-genome alignments. Instead of searching for differences, SyRI starts by finding all co-linear regions between the genomes. As all remaining regions are SRs by definition, they can be classified as inversions, translocations, or duplications based on their positions in convoluted networks of repetitive alignments. Finally, SyRI reports local variations like SNPs and indels within syntenic and rearranged regions. We show SyRI's broad applicability to multiple species and genetically validate the presence of ~100 translocations identified in Arabidopsis.
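
The "find co-linear regions first, treat the rest as rearranged" idea can be illustrated with a toy example. The sketch below is not SyRI's implementation or API: it assumes forward-strand alignments only, uses a hypothetical `Aln` record, and picks the highest-scoring co-linear chain with a simple quadratic dynamic programme rather than the genome-graph approach the abstract describes.

```python
from typing import List, NamedTuple, Tuple


class Aln(NamedTuple):
    """Hypothetical forward-strand alignment block (half-open coordinates)."""
    ref_start: int
    ref_end: int
    qry_start: int
    qry_end: int


def split_syntenic(alns: List[Aln]) -> Tuple[List[Aln], List[Aln]]:
    """Return (co-linear chain, remaining blocks) for a list of alignments.

    The chain maximises total aligned reference length among sets of blocks
    that appear in the same order in both genomes without overlapping.
    """
    if not alns:
        return [], []
    # Process blocks in reference order so every valid predecessor comes first.
    order = sorted(range(len(alns)), key=lambda k: (alns[k].ref_start, alns[k].ref_end))
    lengths = [alns[k].ref_end - alns[k].ref_start for k in order]
    score = lengths[:]          # best chain score ending at each block
    prev = [-1] * len(order)    # back-pointer to the previous block in that chain
    for j in range(len(order)):
        a = alns[order[j]]
        for i in range(j):
            b = alns[order[i]]
            colinear = b.ref_end <= a.ref_start and b.qry_end <= a.qry_start
            if colinear and score[i] + lengths[j] > score[j]:
                score[j] = score[i] + lengths[j]
                prev[j] = i
    # Trace back from the best-scoring block to recover the chain.
    end = max(range(len(order)), key=score.__getitem__)
    chain = set()
    while end != -1:
        chain.add(order[end])
        end = prev[end]
    syntenic = [a for k, a in enumerate(alns) if k in chain]
    rearranged = [a for k, a in enumerate(alns) if k not in chain]
    return syntenic, rearranged


example = [
    Aln(0, 100, 0, 100),
    Aln(100, 200, 500, 600),   # out of order in the query: candidate translocation
    Aln(200, 300, 100, 200),
    Aln(300, 400, 200, 300),
]
syntenic, rearranged = split_syntenic(example)
```

In this example the second block is the only one left outside the chain. Classifying such leftovers as inversions, translocations, or duplications would need strand and copy-number information that this sketch does not model.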


April 21, 2020  |  

Giant tortoise genomes provide insights into longevity and age-related disease.

Giant tortoises are among the longest-lived vertebrate animals and, as such, provide an excellent model to study traits like longevity and age-related diseases. However, genomic and molecular evolutionary information on giant tortoises is scarce. Here, we describe a global analysis of the genomes of Lonesome George, the iconic last member of Chelonoidis abingdonii, and the Aldabra giant tortoise (Aldabrachelys gigantea). Comparison of these genomes with those of related species, using both unsupervised and supervised analyses, led us to detect lineage-specific variants affecting DNA repair genes, inflammatory mediators and genes related to cancer development. Our study also hints at specific evolutionary strategies linked to increased lifespan, and expands our understanding of the genomic determinants of ageing. These new genome sequences also provide important resources to help the efforts for restoration of giant tortoise populations.


April 21, 2020  |  

Diversity of phytobeneficial traits revealed by whole-genome analysis of worldwide-isolated phenazine-producing Pseudomonas spp.

Plant-beneficial Pseudomonas spp. competitively colonize the rhizosphere and display plant-growth promotion and/or disease-suppression activities. Some strains within the P. fluorescens species complex produce phenazine derivatives, such as phenazine-1-carboxylic acid. These antimicrobial compounds are broadly inhibitory to numerous soil-dwelling plant pathogens and play a role in the ecological competence of phenazine-producing Pseudomonas spp. We assembled a collection encompassing 63 strains representative of the worldwide diversity of plant-beneficial phenazine-producing Pseudomonas spp. In this study, we report the sequencing of 58 complete genomes using PacBio RS II sequencing technology. Distributed among four subgroups within the P. fluorescens species complex, the diversity of our collection is reflected by the large pangenome which accounts for 25 413 protein-coding genes. We identified genes and clusters encoding for numerous phytobeneficial traits, including antibiotics, siderophores and cyclic lipopeptides biosynthesis, some of which were previously unknown in these microorganisms. Finally, we gained insight into the evolutionary history of the phenazine biosynthetic operon. Given its diverse genomic context, it is likely that this operon was relocated several times during Pseudomonas evolution. Our findings acknowledge the tremendous diversity of plant-beneficial phenazine-producing Pseudomonas spp., paving the way for comparative analyses to identify new genetic determinants involved in biocontrol, plant-growth promotion and rhizosphere competence. © 2018 Society for Applied Microbiology and John Wiley & Sons Ltd.


April 21, 2020  |  

Whole-Genome Alignment and Comparative Annotation.

Rapidly improving sequencing technology coupled with computational developments in sequence assembly are making reference-quality genome assembly economical. Hundreds of vertebrate genome assemblies are now publicly available, and projects are being proposed to sequence thousands of additional species in the next few years. Such dense sampling of the tree of life should give an unprecedented new understanding of evolution and allow a detailed determination of the events that led to the wealth of biodiversity around us. To gain this knowledge, these new genomes must be compared through genome alignment (at the sequence level) and comparative annotation (at the gene level). However, different alignment and annotation methods have different characteristics; before starting a comparative genomics analysis, it is important to understand the nature of, and biases and limitations inherent in, the chosen methods. This review is intended to act as a technical but high-level overview of the field that should provide this understanding. We briefly survey the state of the genome alignment and comparative annotation fields and potential future directions for these fields in a new, large-scale era of comparative genomics.


April 21, 2020  |  

Dynamic virulence-related regions of the plant pathogenic fungus Verticillium dahliae display enhanced sequence conservation.

Plant pathogens continuously evolve to evade host immune responses. During host colonization, many fungal pathogens secrete effectors to perturb such responses, but these in turn may become recognized by host immune receptors. To facilitate the evolution of effector repertoires, such as the elimination of recognized effectors, effector genes often reside in genomic regions that display increased plasticity, a phenomenon that is captured in the two-speed genome hypothesis. The genome of the vascular wilt fungus Verticillium dahliae displays regions with extensive presence/absence polymorphisms, so-called lineage-specific regions, that are enriched in in planta-induced putative effector genes. As expected, comparative genomics reveals differential degrees of sequence divergence between lineage-specific regions and the core genome. Unexpectedly, lineage-specific regions display markedly higher sequence conservation in coding as well as noncoding regions than the core genome. We provide evidence that disqualifies horizontal transfer to explain the observed sequence conservation and conclude that sequence divergence occurs at a slower pace in lineage-specific regions of the V. dahliae genome. We hypothesize that differences in chromatin organisation may explain lower nucleotide substitution rates in the plastic, lineage-specific regions of V. dahliae. © 2019 The Authors. Molecular Ecology Published by John Wiley & Sons Ltd.


April 21, 2020  |  

Development and Genome Sequencing of a Laboratory-Inbred Miniature Pig Facilitates Study of Human Diabetic Disease.

The pig has proved to be a valuable large animal model for research on diabetic disease. However, its translational value is limited given its distinct anatomy and physiology. For the last 30 years, we have been developing a laboratory Asian miniature pig inbred line (Bama miniature pig [BM]) from the primitive Bama xiang pig via long-term selective inbreeding. Here, we assembled a BM reference genome at full chromosome-scale resolution with a total length of 2.49 Gb. Comparative and evolutionary genomic analyses identified numerous variations between the BM and commercial pig (Duroc), particularly those in the genetic loci associated with the features advantageous to diabetes studies. Resequencing analyses revealed many differentiated gene loci associated with inbreeding and other selective forces. These together with transcriptome analyses of diabetic pig models provide a comprehensive genetic basis for resistance to diabetogenic environment, especially related to energy metabolism. Copyright © 2019 The Author(s). Published by Elsevier Inc. All rights reserved.


September 22, 2019  |  

The standard operating procedure of the DOE-JGI Microbial Genome Annotation Pipeline (MGAP v.4).

The DOE-JGI Microbial Genome Annotation Pipeline performs structural and functional annotation of microbial genomes that are further included into the Integrated Microbial Genome comparative analysis system. MGAP is applied to assembled nucleotide sequence datasets that are provided via the IMG submission site. Dataset submission for annotation first requires project and associated metadata description in GOLD. The MGAP sequence data processing consists of feature prediction including identification of protein-coding genes, non-coding RNAs and regulatory RNA features, as well as CRISPR elements. Structural annotation is followed by assignment of protein product names and functions.


September 22, 2019  |  

Complete genome sequence of Paenibacillus polymyxa YC0136, a plant growth–promoting rhizobacterium isolated from tobacco rhizosphere.

Paenibacillus polymyxa strain YC0136 is a plant growth-promoting rhizobacterium with antimicrobial activity, which was isolated from tobacco rhizosphere. Here, we report the complete genome sequence of P. polymyxa YC0136. Several genes with antifungal and antibacterial activity were discovered. Copyright © 2017 Liu et al.


September 22, 2019  |  

Genomic and metabolic diversity of Marine Group I Thaumarchaeota in the mesopelagic of two subtropical gyres.

Marine Group I (MGI) Thaumarchaeota are one of the most abundant and cosmopolitan chemoautotrophs within the global dark ocean. To date, no representatives of this archaeal group retrieved from the dark ocean have been successfully cultured. We used single cell genomics to investigate the genomic and metabolic diversity of thaumarchaea within the mesopelagic of the subtropical North Pacific and South Atlantic Ocean. Phylogenetic and metagenomic recruitment analysis revealed that MGI single amplified genomes (SAGs) are genetically and biogeographically distinct from existing thaumarchaea cultures obtained from surface waters. Confirming prior studies, we found genes encoding proteins for aerobic ammonia oxidation and the hydrolysis of urea, which may be used for energy production, as well as genes involved in 3-hydroxypropionate/4-hydroxybutyrate and oxidative tricarboxylic acid pathways. A large proportion of protein sequences identified in MGI SAGs were absent in the marine cultures Cenarchaeum symbiosum and Nitrosopumilus maritimus, thus expanding the predicted protein space for this archaeal group. Identifiable genes located on genomic islands with low metagenome recruitment capacity were enriched in cellular defense functions, likely in response to viral infections or grazing. We show that MGI Thaumarchaeota in the dark ocean may have more flexibility in potential energy sources and adaptations to biotic interactions than the existing, surface-ocean cultures.


September 22, 2019  |  

Indoleacrylic acid produced by commensal Peptostreptococcus species suppresses inflammation.

Host factors in the intestine help select for bacteria that promote health. Certain commensals can utilize mucins as an energy source, thus promoting their colonization. However, health conditions such as inflammatory bowel disease (IBD) are associated with a reduced mucus layer, potentially leading to dysbiosis associated with this disease. We characterize the capability of commensal species to cleave and transport mucin-associated monosaccharides and identify several Clostridiales members that utilize intestinal mucins. One such mucin utilizer, Peptostreptococcus russellii, reduces susceptibility to epithelial injury in mice. Several Peptostreptococcus species contain a gene cluster enabling production of the tryptophan metabolite indoleacrylic acid (IA), which promotes intestinal epithelial barrier function and mitigates inflammatory responses. Furthermore, metagenomic analysis of human stool samples reveals that the genetic capability of microbes to utilize mucins and metabolize tryptophan is diminished in IBD patients. Our data suggest that stimulating IA production could promote anti-inflammatory responses and have therapeutic benefits. Copyright © 2017 Elsevier Inc. All rights reserved.


September 22, 2019  |  

Comparative Annotation Toolkit (CAT)-simultaneous clade and personal genome annotation.

The recent introductions of low-cost, long-read, and read-cloud sequencing technologies coupled with intense efforts to develop efficient algorithms have made affordable, high-quality de novo sequence assembly a realistic proposition. The result is an explosion of new, ultracontiguous genome assemblies. To compare these genomes, we need robust methods for genome annotation. We describe the fully open source Comparative Annotation Toolkit (CAT), which provides a flexible way to simultaneously annotate entire clades and identify orthology relationships. We show that CAT can be used to improve annotations on the rat genome, annotate the great apes, annotate a diverse set of mammals, and annotate personal, diploid human genomes. We demonstrate the resulting discovery of novel genes, isoforms, and structural variants, even in genomes as well studied as rat and the great apes, and how these annotations improve cross-species RNA expression experiments. © 2018 Fiddes et al.; Published by Cold Spring Harbor Laboratory Press.


September 22, 2019  |  

Comparative genome and transcriptome analysis reveals distinctive surface characteristics and unique physiological potentials of Pseudomonas aeruginosa ATCC 27853.

Pseudomonas aeruginosa ATCC 27853 was isolated from a hospital blood specimen in 1971 and has been widely used as a model strain to survey antibiotic susceptibilities, biofilm development, and metabolic activities of Pseudomonas spp. Although four draft genomes of P. aeruginosa ATCC 27853 have been sequenced, the complete genome of this strain is still lacking, hindering a comprehensive understanding of its physiology and functional genome. Here we sequenced and assembled the complete genome of P. aeruginosa ATCC 27853 using the Pacific Biosciences SMRT (PacBio) technology and Illumina sequencing platform. We found that accessory genes of ATCC 27853 including prophages and genomic islands (GIs) mainly contribute to the difference between P. aeruginosa ATCC 27853 and other P. aeruginosa strains. Seven prophages were identified within the genome of P. aeruginosa ATCC 27853. Of the predicted 25 GIs, three contain genes that encode monooxygenases, dioxygenases and hydrolases that could be involved in the metabolism of aromatic compounds. Surveying virulence-related genes revealed that a series of genes that encode the B-band O-antigen of LPS are lacking in ATCC 27853. Distinctive SNPs in genes of cellular adhesion proteins such as type IV pili and flagella biosynthesis were also observed in this strain. Colony morphology analysis confirmed an enhanced biofilm formation capability of ATCC 27853 on solid agar surface compared to Pseudomonas aeruginosa PAO1. We then performed transcriptome analysis of ATCC 27853 and PAO1 using RNA-seq and compared the expression of orthologous genes to understand the functional genome and the genomic details underlying the distinctive colony morphogenesis. These analyses revealed an increased expression of genes involved in cellular adhesion and biofilm maturation such as type IV pili, exopolysaccharide and electron transport chain components in ATCC 27853 compared with PAO1. In addition, distinctive expression profiles of the virulence genes lecA, lasB, quorum sensing regulators LasI/R, and the type I, III and VI secretion systems were observed in the two strains. The complete genome sequence of P. aeruginosa ATCC 27853 reveals the comprehensive genetic background of the strain, and provides a genetic basis for several interesting findings about the functions of surface associated proteins, prophages, and genomic islands. Comparative transcriptome analysis of P. aeruginosa ATCC 27853 and PAO1 revealed several classes of differentially expressed genes in the two strains, underlying the genetic and molecular details of several known and yet to be explored morphological and physiological potentials of P. aeruginosa ATCC 27853.

