September 22, 2019

Long-read whole genome sequencing and comparative analysis of six strains of the human pathogen Orientia tsutsugamushi.

Orientia tsutsugamushi is a clinically important but neglected obligate intracellular bacterial pathogen of the Rickettsiaceae family that causes the potentially life-threatening human disease scrub typhus. In contrast to the genome reduction seen in many obligate intracellular bacteria, early genetic studies of Orientia have revealed one of the most repetitive bacterial genomes sequenced to date. The dramatic expansion of mobile elements has hampered efforts to generate complete genome sequences using short read sequencing methodologies, and consequently there have been few studies of the comparative genomics of this neglected species. We report new high-quality genomes of O. tsutsugamushi, generated using PacBio single molecule long read sequencing, for six strains: Karp, Kato, Gilliam, TA686, UT76 and UT176. In comparative genomics analyses of these strains together with existing reference genomes from Ikeda and Boryong strains, we identify a relatively small core genome of 657 genes, grouped into core gene islands and separated by repeat regions, and use the core genes to infer the first whole-genome phylogeny of Orientia. Complete assemblies of multiple Orientia genomes verify initial suggestions that these are remarkable organisms. They have larger genomes compared with most other Rickettsiaceae, with widespread amplification of repeat elements and massive chromosomal rearrangements between strains. At the gene level, Orientia has a relatively small set of universally conserved genes, similar to other obligate intracellular bacteria, and the relative expansion in genome size can be accounted for by gene duplication and repeat amplification. Our study demonstrates the utility of long read sequencing to investigate complex bacterial genomes and characterise genomic variation.


September 22, 2019

Comparative analysis reveals unexpected genome features of newly isolated Thraustochytrids strains: on ecological function and PUFAs biosynthesis.

Thraustochytrids are unicellular fungal-like marine protists that are ubiquitous in marine environments. They are well known for their ability to produce high-value omega-3 polyunsaturated fatty acids (ω-3-PUFAs) (e.g., docosahexaenoic acid (DHA)) and hydrolytic enzymes. Thraustochytrid biomass has been estimated to surpass that of bacterioplankton in both coastal and oceanic waters, indicating they have an important role in the microbial food web. Nevertheless, the molecular pathway and regulatory network for PUFAs production and the molecular mechanisms underlying ecological functions of thraustochytrids remain largely unknown. The genomes of two thraustochytrid strains (Mn4 and SW8) with the ability to produce DHA were sequenced and assembled with a hybrid sequencing approach utilizing Illumina short paired-end reads and Pacific Biosciences long reads to generate a highly accurate genome assembly. Phylogenomic and comparative genomic analyses found that DHA-producing thraustochytrid strains were highly similar and possessed similar gene content. Analysis of the conventional fatty acid synthesis (FAS) and the polyketide synthase (PKS) systems for PUFAs production only detected incomplete and fragmentary pathways in the genome of these two strains. Surprisingly, secreted carbohydrate active enzymes (CAZymes) were found to be significantly depleted in the genomes of these two strains as compared to other sequenced relatives. Furthermore, these two strains possess an expanded gene repertoire for signal transduction and self-propelled movement, which could be important for their adaptations to dynamic marine environments. Our results demonstrate the possibility of a third PUFAs synthesis pathway besides previously described FAS and PKS pathways encoded in the genome of these two thraustochytrid strains. 
Moreover, the lack of a complete set of hydrolytic enzymatic machinery for degrading plant-derived organic materials suggests that these two DHA-producing strains play an important role as a nutritional source rather than a nutrient-producer in the marine microbial food web. Results of this study suggest the existence of two types of saprobic thraustochytrids in the world’s oceans. The first group, which does not produce cellulosic enzymes and lives as a ‘left-over’ scavenger of bacterioplankton, serves as a dietary source for the plankton of higher trophic levels, and the other possesses the capacity to live on detrital organic matter in marine ecosystems.


September 22, 2019

Genomic analysis of the Phalaenopsis pathogen Dickeya sp. PA1, representing the emerging species Dickeya fangzhongdai.

Dickeya sp. strain PA1 is the causal agent of bacterial soft rot in Phalaenopsis, an important indoor orchid in China. PA1 and a few other strains were grouped into a novel species, Dickeya fangzhongdai, and only the orchid-associated strains have been shown to cause soft rot symptoms. We constructed the complete PA1 genome sequence and used comparative genomics to explore the differences in genomic features between D. fangzhongdai and other Dickeya species. PA1 has a 4,979,223-bp circular genome with 4269 predicted protein-coding genes. D. fangzhongdai was phylogenetically similar to Dickeya solani and Dickeya dadantii. The type I to type VI secretion systems (T1SS-T6SS), except for the stt-type T2SS, were identified in D. fangzhongdai. The three phylogenetically similar species varied significantly in terms of their T5SSs and T6SSs, as did the different D. fangzhongdai strains. Genomic island (GI) prediction and synteny analysis (compared to D. fangzhongdai strains) of PA1 also indicated the presence of T5SSs and T6SSs in strain-specific regions. Two typical CRISPR arrays were identified in D. fangzhongdai and in most other Dickeya species, except for D. solani. CRISPR-1 was present in all of these Dickeya species, while the presence of CRISPR-2 varied due to species differentiation. A large polyketide/nonribosomal peptide (PK/NRP) cluster, similar to the zeamine biosynthetic gene cluster in Dickeya zeae rice strains, was discovered in D. fangzhongdai and D. solani. The D. fangzhongdai and D. solani strains might recently have acquired this gene cluster by horizontal gene transfer (HGT). Orchid-associated strains are the typical members of D. fangzhongdai. Genomic analysis of PA1 suggested that this strain presents the genomic characteristics of this novel species. Considering the absence of the stt-type T2SS, the presence of CRISPR loci and the zeamine biosynthetic gene cluster, D. fangzhongdai is likely a transitional form between D. dadantii and D. solani. 
This is supported by the later acquisition of the zeamine cluster and the loss of CRISPR arrays by D. solani. Comparisons of phylogenetic positions and virulence determinants could be helpful for the effective quarantine and control of this emerging species.


September 22, 2019

Thermosipho spp. immune system differences affect variation in genome size and geographical distributions.

Thermosipho species inhabit thermal environments such as marine hydrothermal vents, petroleum reservoirs, and terrestrial hot springs. A 16S rRNA phylogeny of available Thermosipho spp. sequences suggested habitat specialists adapted to living in hydrothermal vents only, and habitat generalists inhabiting oil reservoirs, hydrothermal vents, and hot springs. Comparative genomics of 15 Thermosipho genomes separated them into three distinct species with different habitat distributions: the widely distributed T. africanus and the more specialized T. melanesiensis and T. affectus. Moreover, the species can be differentiated on the basis of genome size (GS), genome content, and immune system composition. For instance, the T. africanus genomes are the largest and contain the most carbohydrate metabolism genes, which could explain why these isolates were obtained from ecologically more divergent habitats. Nonetheless, all the Thermosipho genomes, like other Thermotogae genomes, show evidence of genome streamlining. GS differences between the species could further be correlated to differences in defense capacities against foreign DNA, which influence recombination via HGT. The smallest genomes are found in T. affectus, which contains both CRISPR-cas Type I and III systems, but no RM system genes. We suggest that this has caused these genomes to be almost devoid of mobile elements, in contrast to the genomes of the two other species, which contain a higher abundance of mobile elements combined with different immune system configurations. Taken together, the comparative genomic analyses of Thermosipho spp. revealed genetic variation allowing habitat differentiation within the genus as well as differentiation with respect to invading mobile DNA.


September 22, 2019

Comparative genomic analysis revealed rapid differentiation in the pathogenicity-related gene repertoires between Pyricularia oryzae and Pyricularia penniseti isolated from a Pennisetum grass.

A number of Pyricularia species are known to infect different grass species. In the case of Pyricularia oryzae (syn. Magnaporthe oryzae), distinct populations are known to be adapted to a wide variety of grass hosts, including rice, wheat and many other grasses. The genome sizes of Pyricularia species are typical for filamentous ascomycete fungi [~40 Mbp for P. oryzae, and ~45 Mbp for P. grisea]. Genome plasticity, mediated in part by deletions promoted by recombination between repetitive elements [Genome Res 26:1091-1100, 2016, Nat Rev Microbiol 10:417-430, 2012] and transposable elements [Annu Rev Phytopathol 55:483-503, 2017] contributes to host adaptation. Therefore, comparisons of genome structure of individual species will provide insight into the evolution of host specificity. However, except for the P. oryzae subgroup, little is known about the gene content or genome organization of other Pyricularia species, such as those infecting Pennisetum grasses. Here, we report the genome sequence of P. penniseti strain P1609 isolated from a Pennisetum grass (JUJUNCAO) using PacBio SMRT sequencing technology. Phylogenomic analysis of 28 Magnaporthales species and 5 non-Magnaporthales species indicated that P1609 belongs to a Pyricularia subclade, which is genetically distant from P. oryzae. Comparative genomic analysis revealed that the pathogenicity-related gene repertoires had diverged between P1609 and the P. oryzae strain 70-15, including the known avirulence genes, other putative secreted proteins, as well as some other predicted Pathogen-Host Interaction (PHI) genes. Genomic sequence comparison also identified many genomic rearrangements relative to P. oryzae. Our results suggested that the genomic sequence of the P. penniseti P1609 could be a useful resource for the genetic study of the Pennisetum-infecting Pyricularia species and provide new insight into evolution of pathogen genomes during host adaptation.


September 21, 2019

Comparative genomics of enterohemorrhagic Escherichia coli O145:H28 demonstrates a common evolutionary lineage with Escherichia coli O157:H7.

Although serotype O157:H7 is the predominant enterohemorrhagic Escherichia coli (EHEC), outbreaks of non-O157 EHEC that cause severe foodborne illness, including hemolytic uremic syndrome, have increased worldwide. In fact, non-O157 serotypes are now estimated to cause over half of all the Shiga toxin-producing Escherichia coli (STEC) cases, and outbreaks of non-O157 EHEC infections are frequently associated with serotypes O26, O45, O103, O111, O121, and O145. Currently, there are no complete genomes for O145 in public databases. We determined the complete genome sequences of two O145 strains (EcO145), one linked to a US lettuce-associated outbreak (RM13514) and one to a Belgium ice-cream-associated outbreak (RM13516). Both strains contain one chromosome and two large plasmids, with genome sizes of 5,737,294 bp for RM13514 and 5,559,008 bp for RM13516. Comparative analysis of the two EcO145 genomes revealed a large core (5,173 genes) and a considerable number of strain-specific genes. Additionally, the two EcO145 genomes display distinct chromosomal architecture, virulence gene profile, phylogenetic origin of Stx2a prophage, and methylation profile (methylome). Comparative analysis of EcO145 genomes to other completely sequenced STEC and other E. coli and Shigella genomes revealed that, unlike any other known non-O157 EHEC strain, EcO145 ascended from a common lineage with EcO157/EcO55. This evolutionary relationship was further supported by the pangenome analysis of the 10 EHEC strains. Of the 4,192 EHEC core genes, EcO145 shares more genes with EcO157 than with any other non-O157 EHEC strains. Our data provide evidence that EcO145 and EcO157 evolved from a common lineage, but ultimately each serotype evolves via a lineage-independent nature to EHEC by acquisition of the core set of EHEC virulence factors, including the genes encoding Shiga toxin and the large virulence plasmid. 
The large variation between the two EcO145 genomes suggests a distinctive evolutionary path between the two outbreak strains. The distinct methylome between the two EcO145 strains is likely due to the presence of a BsuBI/PstI methyltransferase gene cassette in the Stx2a prophage of the strain RM13514, suggesting a role of horizontal gene transfer-mediated epigenetic alteration in the evolution of individual EHEC strains.


July 19, 2019

Comparative analysis of tandem repeats from hundreds of species reveals unique insights into centromere evolution.

Centromeres are essential for chromosome segregation, yet their DNA sequences evolve rapidly. In most animals and plants that have been studied, centromeres contain megabase-scale arrays of tandem repeats. Despite their importance, very little is known about the degree to which centromere tandem repeats share common properties between different species across different phyla. We used bioinformatic methods to identify high-copy tandem repeats from 282 species using publicly available genomic sequence and our own data. Our methods are compatible with all current sequencing technologies. Long Pacific Biosciences sequence reads allowed us to find tandem repeat monomers up to 1,419 bp. We assumed that the most abundant tandem repeat is the centromere DNA, which was true for most species whose centromeres have been previously characterized, suggesting this is a general property of genomes. High-copy centromere tandem repeats were found in almost all animal and plant genomes, but repeat monomers were highly variable in sequence composition and length. Furthermore, phylogenetic analysis of sequence homology showed little evidence of sequence conservation beyond approximately 50 million years of divergence. We find that despite an overall lack of sequence conservation, centromere tandem repeats from diverse species showed similar modes of evolution. While centromere position in most eukaryotes is epigenetically determined, our results indicate that tandem repeats are highly prevalent at centromeres of both animal and plant genomes. This suggests a functional role for such repeats, perhaps in promoting concerted evolution of centromere DNA across chromosomes.


July 19, 2019

Differing patterns of selection and geospatial genetic diversity within two leading Plasmodium vivax candidate vaccine antigens.

Although Plasmodium vivax is a leading cause of malaria around the world, only a handful of vivax antigens are being studied for vaccine development. Here, we investigated genetic signatures of selection and geospatial genetic diversity of two leading vivax vaccine antigens: Plasmodium vivax merozoite surface protein 1 (pvmsp-1) and Plasmodium vivax circumsporozoite protein (pvcsp). Using scalable next-generation sequencing, we deep-sequenced amplicons of the 42 kDa region of pvmsp-1 (n = 44) and the complete gene of pvcsp (n = 47) from Cambodian isolates. These sequences were then compared with global parasite populations obtained from GenBank. Using a combination of statistical and phylogenetic methods to assess for selection and population structure, we found strong evidence of balancing selection in the 42 kDa region of pvmsp-1, which varied significantly over the length of the gene, consistent with immune-mediated selection. In pvcsp, the highly variable central repeat region also showed patterns consistent with immune selection, which were lacking outside the repeat. The patterns of selection seen in both genes differed from their P. falciparum orthologs. In addition, we found that, similar to merozoite antigens from P. falciparum malaria, genetic diversity of pvmsp-1 sequences showed no geographic clustering, while the non-merozoite antigen, pvcsp, showed strong geographic clustering. These findings suggest that while immune selection may act on both vivax vaccine candidate antigens, the geographic distribution of genetic variability differs greatly between these two genes. The selective forces driving this diversification could lead to antigen escape and vaccine failure. Better understanding the geographic distribution of genetic variability in vaccine candidate antigens will be key to designing and implementing efficacious vaccines.


July 19, 2019

New insights into dissemination and variation of the health care-associated pathogen Acinetobacter baumannii from genomic analysis.

Acinetobacter baumannii is a globally important nosocomial pathogen characterized by an increasing incidence of multidrug resistance. Routes of dissemination and gene flow among health care facilities are poorly resolved and are important for understanding the epidemiology of A. baumannii, minimizing disease transmission, and improving patient outcomes. We used whole-genome sequencing to assess diversity and genome dynamics in 49 isolates from one United States hospital system during one year from 2007 to 2008. Core single-nucleotide-variant-based phylogenetic analysis revealed multiple founder strains and multiple independent strains recovered from the same patient yet was insufficient to fully resolve strain relationships, where gene content and insertion sequence patterns added additional discriminatory power. Gene content comparisons illustrated extensive and redundant antibiotic resistance gene carriage and direct evidence of gene transfer, recombination, gene loss, and mutation. Evidence of barriers to gene flow among hospital components was not found, suggesting complex mixing of strains and a large reservoir of A. baumannii strains capable of colonizing patients. Genome sequencing was used to characterize multidrug-resistant Acinetobacter baumannii strains from one United States hospital system during a 1-year period to better understand how A. baumannii strains that cause infection are related to one another. Extensive variation in gene content was found, even among strains that were very closely related phylogenetically and epidemiologically. Several mechanisms contributed to this diversity, including transfer of mobile genetic elements, mobilization of insertion sequences, insertion sequence-mediated deletions, and genome-wide homologous recombination. Variation in gene content, however, lacked clear spatial or temporal patterns, suggesting a diverse pool of circulating strains with considerable interaction between strains and hospital locations. 
Widespread genetic variation among strains from the same hospital and even the same patient, particularly involving antibiotic resistance genes, reinforces the need for molecular diagnostic testing and genomic analysis to determine resistance profiles, rather than a reliance primarily on strain typing and antimicrobial resistance phenotypes for epidemiological studies.


July 19, 2019

Identification of restriction-modification systems of Bifidobacterium animalis subsp. lactis CNCM I-2494 by SMRT Sequencing and associated methylome analysis.

Bifidobacterium animalis subsp. lactis CNCM I-2494 is a component of a commercialized fermented dairy product whose beneficial effects on health have been studied in clinical and preclinical trials. To date, little is known about the molecular mechanisms that could explain the beneficial effects that bifidobacteria impart to the host. Restriction-modification (R-M) systems have been identified as key obstacles in the genetic accessibility of bifidobacteria, and circumventing these is a prerequisite to attaining a fundamental understanding of bifidobacterial attributes, including the genes that are responsible for health-promoting properties of this clinically and industrially important group of bacteria. The complete genome sequence of B. animalis subsp. lactis CNCM I-2494 is predicted to harbour the genetic determinants for two type II R-M systems, designated BanLI and BanLII. In order to investigate the functionality and specificity of these two putative R-M systems in B. animalis subsp. lactis CNCM I-2494, we employed PacBio SMRT sequencing with associated methylome analysis. In addition, the contribution of the identified R-M systems to the genetic accessibility of this strain was assessed.


July 19, 2019

The architecture of a scrambled genome reveals massive levels of genomic rearrangement during development.

Programmed DNA rearrangements in the single-celled eukaryote Oxytricha trifallax completely rewire its germline into a somatic nucleus during development. This elaborate, RNA-mediated pathway eliminates noncoding DNA sequences that interrupt gene loci and reorganizes the remaining fragments by inversions and permutations to produce functional genes. Here, we report the Oxytricha germline genome and compare it to the somatic genome to present a global view of its massive scale of genome rearrangements. The remarkably encrypted genome architecture contains >3,500 scrambled genes, as well as >800 predicted germline-limited genes expressed, and some posttranslationally modified, during genome rearrangements. Gene segments for different somatic loci often interweave with each other. Single gene segments can contribute to multiple, distinct somatic loci. Terminal precursor segments from neighboring somatic loci map extremely close to each other, often overlapping. This genome assembly provides a draft of a scrambled genome and a powerful model for studies of genome rearrangement. Copyright © 2014 Elsevier Inc. All rights reserved.


July 19, 2019

Exploring the roles of DNA methylation in the metal-reducing bacterium Shewanella oneidensis MR-1.

We performed whole-genome analyses of DNA methylation in Shewanella oneidensis MR-1 to examine its possible role in regulating gene expression and other cellular processes. Single-molecule real-time (SMRT) sequencing revealed extensive methylation of adenine (N6mA) throughout the genome. These methylated bases were located in five sequence motifs, including three novel targets for type I restriction/modification enzymes. The sequence motifs targeted by putative methyltransferases were determined via SMRT sequencing of gene knockout mutants. In addition, we found that S. oneidensis MR-1 cultures grown under various culture conditions displayed different DNA methylation patterns. However, the small number of differentially methylated sites could not be directly linked to the much larger number of differentially expressed genes under these conditions, suggesting that DNA methylation is not a major regulator of gene expression in S. oneidensis MR-1. The enrichment of methylated GATC motifs in the origin of replication indicates that DNA methylation may regulate genome replication in a manner similar to that seen in Escherichia coli. Furthermore, comparative analyses suggest that many Gammaproteobacteria, including all members of the Shewanellaceae family, may also utilize DNA methylation to regulate genome replication.


July 19, 2019

The extant World War 1 dysentery bacillus NCTC1: a genomic analysis.

Shigellosis (previously bacillary dysentery) was the primary diarrhoeal disease of World War 1, but outbreaks still occur in military operations, and shigellosis causes hundreds of thousands of deaths per year in developing nations. We aimed to generate a high-quality reference genome of the historical Shigella flexneri isolate NCTC1 and to examine the isolate for resistance to antimicrobials. In this genomic analysis, we sequenced the oldest extant Shigella flexneri serotype 2a isolate using single-molecule real-time (SMRT) sequencing technology. Isolated from a soldier with dysentery from the British forces fighting on the Western Front in World War 1, this bacterium, NCTC1, was the first isolate accessioned into the National Collection of Type Cultures. We created a reference sequence for NCTC1, investigated the isolate for antimicrobial resistance, and undertook comparative genetics with S flexneri reference strains isolated during the 100 years since World War 1. We discovered that NCTC1 belonged to a 2a lineage of S flexneri, with which it shares common characteristics and a large core genome. NCTC1 was resistant to penicillin and erythromycin, and contained a complement of chromosomal antimicrobial resistance genes similar to that of more recent isolates. Genomic islands gained in the S flexneri 2a lineage over time were predominately associated with additional antimicrobial resistances, virulence, and serotype conversion. This S flexneri 2a lineage is a well adapted pathogen that has continued to respond to selective pressures. We have created a valuable historical benchmark for shigellae in the form of a high-quality reference sequence for a publicly available isolate. The Wellcome Trust. Copyright © 2014 Baker et al. Open Access article distributed under the terms of CC BY. Published by Elsevier Ltd. All rights reserved.


July 19, 2019

Comparative genome analysis of Wolbachia strain wAu

BACKGROUND: Wolbachia intracellular bacteria can manipulate the reproduction of their arthropod hosts, including inducing sterility between populations known as cytoplasmic incompatibility (CI). Certain strains have been identified that are unable to induce or rescue CI, including wAu from Drosophila. Genome sequencing and comparison with CI-inducing related strain wMel was undertaken in order to better understand the molecular basis of the phenotype. RESULTS: Although the genomes were broadly similar, several rearrangements were identified, particularly in the prophage regions. Many orthologous genes contained single nucleotide polymorphisms (SNPs) between the two strains, but a subset containing major differences that would likely cause inactivation in wAu were identified, including the absence of the wMel ortholog of a gene recently identified as a CI candidate in a proteomic study. The comparative analyses also focused on a family of transcriptional regulator genes implicated in CI in previous work, and revealed numerous differences between the strains, including those that would have major effects on predicted function. CONCLUSIONS: The study provides support for existing candidates and novel genes that may be involved in CI, and provides a basis for further functional studies to examine the molecular basis of the phenotype.


July 19, 2019

Genome sequencing and comparative genomics provides insights on the evolutionary dynamics and pathogenic potential of different H-serotypes of Shiga toxin-producing Escherichia coli O104.

Various H-serotypes of the Shiga toxin-producing Escherichia coli (STEC) O104, including H4, H7, H21, and H¯, have been associated with sporadic cases of illness and have caused food-borne outbreaks globally. In the U.S., STEC O104:H21 caused an outbreak associated with milk in 1994. However, little is known about the evolutionary origins of STEC O104 strains, or about how genotypic diversity contributes to the pathogenic potential of various O104 H-antigen serotypes isolated from different ecological niches and/or geographical regions. Two STEC strains, O104:H21 (milk outbreak strain) and O104:H7 (cattle isolate), were shotgun sequenced, and the genomes were closed. The intimin (eae) gene, involved in the attaching-effacing phenotype of diarrheagenic E. coli, was not found in either strain. Examining various O104 genome sequences, we found that two “complete” left and right end portions of the locus of enterocyte effacement (LEE) pathogenicity island were present in 13 O104 strains; however, the central portion of LEE, where the eae gene is located, was missing. In O104:H4 strains, the missing central portion of the LEE locus was replaced by a pathogenicity island carrying the aidA (adhesin involved in diffuse adherence) gene and antibiotic resistance genes commonly carried on plasmids. Enteroaggregative E. coli-specific virulence genes and European outbreak O104:H4-specific stx2-encoding Escherichia P13374 or Escherichia TL-2011c bacteriophages were missing in some of the O104:H4 genome sequences available from public databases. Most of the genomic variations in the strains examined were due to the presence of different mobile genetic elements, including prophages and genomic island regions. 
The presence of plasmids carrying virulence-associated genes may play a role in the pathogenic potential of O104 strains. The two strains sequenced in this study (O104:H21 and O104:H7) are genetically more similar to each other than to the O104:H4 strains that caused an outbreak in Germany in 2011 and strains found in Central Africa. A hypothesis on strain evolution and pathogenic potential of various H-serotypes of E. coli O104 strains is proposed.

