September 22, 2019  |  

Detecting epigenetic motifs in low coverage and metagenomics settings.

It has recently become possible to rapidly and accurately detect epigenetic signatures in bacterial genomes using third generation sequencing data. Monitoring the speed at which a single polymerase inserts a base in the read strand enables one to infer whether a modification is present at that specific site on the template strand. These sites can be challenging to detect in the absence of high coverage and reliable reference genomes. Here we provide a new method for detecting epigenetic motifs in bacteria on datasets with low coverage, with incomplete references, and with mixed samples (i.e. metagenomic data). Our approach treats motif inference as a kmer comparison problem. First, genomes (or contigs) are deconstructed into kmers. Then, native genome-wide distributions of interpulse durations (IPDs) for kmers are compared with corresponding whole genome amplified (WGA, modification free) IPD distributions using log likelihood ratios. Finally, kmers are ranked and greedily selected by iteratively correcting for sequences within a particular kmer’s neighborhood. Our method can detect multiple types of modifications, even at very low coverage and in the presence of mixed genomes. Additionally, we are able to predict modified motifs when genomes with “neighbor” modified motifs exist within the sample. Lastly, we show that these motifs can provide an alternative source of information by which to cluster metagenomics contigs and that iterative refinement on these clustered contigs can further improve both sensitivity and specificity of motif detection. https://github.com/alibashir/EMMCKmer.
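The kmer comparison step described above can be illustrated with a small sketch. It is not the EMMCKmer implementation: it assumes, purely for illustration, that log-transformed IPDs for each kmer follow a Gaussian distribution, and it scores each kmer by how much better the native IPDs are explained by a native-fitted model than by a WGA-fitted one. The function names (`fit_log_gaussian`, `kmer_llr`, `rank_kmers`), the variance floor, and the toy IPD values are all hypothetical, and the greedy neighborhood-correction step from the abstract is not shown.

```python
import math
from statistics import fmean, pvariance

MIN_VAR = 1e-6  # assumed floor to avoid division by zero on tiny samples


def fit_log_gaussian(ipds):
    """Fit mean and variance of log(IPD); the Gaussian model is an assumption."""
    logs = [math.log(v) for v in ipds]
    return fmean(logs), max(pvariance(logs), MIN_VAR)


def log_likelihood(ipds, mu, var):
    """Gaussian log likelihood of log(IPD) values under (mu, var)."""
    return sum(
        -0.5 * math.log(2 * math.pi * var) - (math.log(v) - mu) ** 2 / (2 * var)
        for v in ipds
    )


def kmer_llr(native_ipds, wga_ipds):
    """Log likelihood ratio: native-fitted model versus WGA-fitted model."""
    mu_n, var_n = fit_log_gaussian(native_ipds)
    mu_c, var_c = fit_log_gaussian(wga_ipds)
    return log_likelihood(native_ipds, mu_n, var_n) - log_likelihood(
        native_ipds, mu_c, var_c
    )


def rank_kmers(native_by_kmer, wga_by_kmer):
    """Score every kmer seen in both samples and sort by descending LLR."""
    scores = {
        kmer: kmer_llr(ipds, wga_by_kmer[kmer])
        for kmer, ipds in native_by_kmer.items()
        if kmer in wga_by_kmer
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Toy, made-up IPD values: GATC is slowed in the native sample, ACGT is not.
native = {"GATC": [2.0, 2.5, 3.0, 2.2], "ACGT": [0.5, 0.6, 0.45, 0.5]}
wga = {"GATC": [0.5, 0.6, 0.4, 0.55], "ACGT": [0.5, 0.55, 0.45, 0.6]}
ranked = rank_kmers(native, wga)
```

In this toy setup the kmer whose native IPDs depart most from the WGA control ranks first. A real analysis would also need to handle strand, the position of the modified base within the kmer, and overlapping kmers, which is where the iterative correction described in the abstract comes in.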


September 22, 2019  |  

SMRT-Cappable-seq reveals complex operon variants in bacteria.

Current methods for genome-wide analysis of gene expression require fragmentation of original transcripts into small fragments for short-read sequencing. In bacteria, the resulting fragmented information hides operon complexity. Additionally, in vivo processing of transcripts confounds the accurate identification of the 5′ and 3′ ends of operons. Here we develop a methodology called SMRT-Cappable-seq that combines the isolation of un-fragmented primary transcripts with single-molecule long read sequencing. Applied to E. coli, this technology results in an accurate definition of the transcriptome with 34% of known operons from RegulonDB being extended by at least one gene. Furthermore, 40% of transcription termination sites have read-through that alters the gene content of the operons. As a result, most of the bacterial genes are present in multiple operon variants reminiscent of eukaryotic splicing. By providing such granularity in the operon structure, this study represents an important resource for the study of prokaryotic gene network and regulation.


September 22, 2019  |  

The methylome of the gut microbiome: disparate Dam methylation patterns in intestinal Bacteroides dorei

Despite the large interest in the human microbiome in recent years, there are no reports of bacterial DNA methylation in the microbiome. Here metagenomic sequencing using the Pacific Biosciences platform allowed for rapid identification of the GATC methylation status of a bacterial species in human stool samples. For this work, two stool samples were chosen that were dominated by a single species, Bacteroides dorei. Based on 16S rRNA analysis, this species represented over 45% of the bacteria present in these two samples. The B. dorei genome sequence from these samples was determined and the GATC methylation sites mapped. The Bacteroides dorei genome from one subject lacked any GATC methylation and lacked the DNA adenine methyltransferase genes. In contrast, B. dorei from another subject contained 20,551 methylated GATC sites. Of the 4970 open reading frames identified in the GATC methylated B. dorei genome, 3184 genes were methylated, and a further 1735 GATC methylations were found in intergenic regions. These results suggest that DNA methylation patterns are important to consider in multi-omic analyses of microbiome samples seeking to discover the diversity of bacterial functions and may differ between disease states.


September 22, 2019  |  

Molecular genetic diversity and characterization of conjugation genes in the fish parasite Ichthyophthirius multifiliis.

Ichthyophthirius multifiliis is the etiologic agent of “white spot”, a commercially important disease of freshwater fish. As a parasitic ciliate, I. multifiliis infects numerous host species across a broad geographic range. Although Ichthyophthirius outbreaks are difficult to control, recent sequencing of the I. multifiliis genome has revealed a number of potential metabolic pathways for therapeutic intervention, along with likely vaccine targets for disease prevention. Nonetheless, major gaps exist in our understanding of both the life cycle and population structure of I. multifiliis in the wild. For example, conjugation has never been described in this species, and it is unclear whether I. multifiliis undergoes sexual reproduction, despite the presence of a germline micronucleus. In addition, no good methods exist to distinguish strains, leaving phylogenetic relationships between geographic isolates completely unresolved. Here, we compared nucleotide sequences of SSUrDNA, mitochondrial NADH dehydrogenase subunit I and cox-1 genes, and 14 somatic SNP sites from nine I. multifiliis isolates obtained from four different states in the US since 1995. The mitochondrial sequences effectively distinguished the isolates from one another and divided them into at least two genetically distinct groups. Furthermore, none of the nine isolates shared the same composition of the 14 somatic SNP sites, suggesting that I. multifiliis undergoes sexual reproduction at some point in its life cycle. Finally, compared to the well-studied free-living ciliates Tetrahymena thermophila and Paramecium tetraurelia, I. multifiliis has lost 38% and 29%, respectively, of 16 experimentally confirmed conjugation-related genes, indicating that mechanistic differences in sexual reproduction are likely to exist between I. multifiliis and other ciliate species. Copyright © 2015 Elsevier Inc. All rights reserved.


September 22, 2019  |  

Isoform sequencing and state-of-art applications for unravelling complexity of plant transcriptomes

Single-molecule real-time (SMRT) sequencing developed by PacBio, also called third-generation sequencing (TGS), offers longer reads than second-generation sequencing (SGS). Given its ability to obtain full-length transcripts without assembly, isoform sequencing (Iso-Seq) of transcriptomes by PacBio is advantageous for genome annotation, identification of novel genes and isoforms, as well as the discovery of long non-coding RNA (lncRNA). In addition, Iso-Seq gives access to the direct detection of alternative splicing, alternative polyadenylation (APA), gene fusion, and DNA modifications. Such applications of Iso-Seq facilitate the understanding of gene structure, post-transcriptional regulatory networks, and subsequently proteomic diversity. In this review, we summarize its applications in plant transcriptome study, specifically pointing out challenges associated with each step in the experimental design and highlighting the development of bioinformatic pipelines. We aim to provide the community with an integrative overview and comprehensive guidance to Iso-Seq, and thus to promote its applications in plant research.


September 22, 2019  |  

Next generation sequencing technology: Advances and applications.

Impressive progress has been made in the field of Next Generation Sequencing (NGS). Through advancements in the fields of molecular biology and technical engineering, parallelization of the sequencing reaction has profoundly increased the total number of produced sequence reads per run. Current sequencing platforms allow for an unprecedented view into complex mixtures of RNA and DNA samples. NGS is currently evolving into a molecular microscope finding its way into virtually every field of biomedical research. In this chapter we review the technical background of the different commercially available NGS platforms with respect to template generation and the sequencing reaction and take a small step towards what the upcoming NGS technologies will bring. We close with an overview of different implementations of NGS into biomedical research. This article is part of a Special Issue entitled: From Genome to Function. Copyright © 2014 Elsevier B.V. All rights reserved.


September 22, 2019  |  

Bacteroides dorei dominates gut microbiome prior to autoimmunity in Finnish children at high risk for type 1 diabetes.

The incidence of the autoimmune disease, type 1 diabetes (T1D), has increased dramatically over the last half century in many developed countries and is particularly high in Finland and other Nordic countries. Along with genetic predisposition, environmental factors are thought to play a critical role in this increase. As with other autoimmune diseases, the gut microbiome is thought to play a potential role in controlling progression to T1D in children with high genetic risk, but we know little about how the gut microbiome develops in children with high genetic risk for T1D. In this study, the early development of the gut microbiomes of 76 children at high genetic risk for T1D was determined using high-throughput 16S rRNA gene sequencing. Stool samples from children born in the same hospital in Turku, Finland were collected at monthly intervals beginning at 4-6 months after birth until 2.2 years of age. Of those 76 children, 29 seroconverted to T1D-related autoimmunity (cases), including 22 who later developed T1D; the remaining 47 subjects remained healthy (controls). While several significant compositional differences in low-abundance species prior to seroconversion were found, one highly abundant group composed of two closely related species, Bacteroides dorei and Bacteroides vulgatus, was significantly higher in cases compared to controls prior to seroconversion. Metagenomic sequencing of samples high in the abundance of the B. dorei/vulgatus group before seroconversion, as well as longer 16S rRNA sequencing, identified this group as Bacteroides dorei. The abundance of B. dorei peaked at 7.6 months in cases, over 8 months prior to the appearance of the first islet autoantibody, suggesting that early changes in the microbiome may be useful for predicting T1D autoimmunity in genetically susceptible infants. The cause of increased B. dorei abundance in cases is not known but its timing appears to coincide with the introduction of solid food.


September 22, 2019  |  

Identification by high-throughput imaging of the histone methyltransferase EHMT2 as an epigenetic regulator of VEGFA alternative splicing.

Recent evidence points to a role of chromatin in regulation of alternative pre-mRNA splicing (AS). In order to identify novel chromatin regulators of AS, we screened an RNAi library of chromatin proteins using a cell-based high-throughput in vivo assay. We identified a set of chromatin proteins that regulate AS. Using simultaneous genome-wide expression and AS analysis, we demonstrate distinct and non-overlapping functions of these chromatin modifiers on transcription and AS. Detailed mechanistic characterization of one dual function chromatin modifier, the H3K9 methyltransferase EHMT2 (G9a), identified VEGFA as a major chromatin-mediated AS target. Silencing of EHMT2, or its heterodimer partner EHMT1, affects AS by promoting exclusion of VEGFA exon 6a, but does not alter total VEGFA mRNA levels. The epigenetic regulatory mechanism of AS by EHMT2 involves an adaptor system consisting of the chromatin modulator HP1γ, which binds methylated H3K9 and recruits splicing regulator SRSF1. The epigenetic regulation of VEGFA is physiologically relevant since EHMT2 is transcriptionally induced in response to hypoxia and triggers concomitant changes in AS of VEGFA. These results characterize a novel epigenetic regulatory mechanism of AS and they demonstrate separate roles of epigenetic modifiers in transcription and alternative splicing. Published by Oxford University Press on behalf of Nucleic Acids Research 2014. This work is written by US Government employees and is in the public domain in the US.


September 22, 2019  |  

PacBio sequencing and its applications.

Single-molecule, real-time sequencing developed by Pacific BioSciences offers longer read lengths than the second-generation sequencing (SGS) technologies, making it well-suited for unsolved problems in genome, transcriptome, and epigenetics research. The highly-contiguous de novo assemblies using PacBio sequencing can close gaps in current reference assemblies and characterize structural variation (SV) in personal genomes. With longer reads, we can sequence through extended repetitive regions and detect mutations, many of which are associated with diseases. Moreover, PacBio transcriptome sequencing is advantageous for the identification of gene isoforms and facilitates reliable discoveries of novel genes and novel isoforms of annotated genes, due to its ability to sequence full-length transcripts or fragments with significant lengths. Additionally, PacBio’s sequencing technique provides information that is useful for the direct detection of base modifications, such as methylation. In addition to using PacBio sequencing alone, many hybrid sequencing strategies have been developed to make use of more accurate short reads in conjunction with PacBio long reads. In general, hybrid sequencing strategies are more affordable and scalable, especially for small laboratories, than using PacBio sequencing alone. The advent of PacBio sequencing has made available much information that could not be obtained via SGS alone. Copyright © 2015 The Authors. Production and hosting by Elsevier Ltd. All rights reserved.


September 22, 2019  |  

Long reads: their purpose and place.

In recent years long-read technologies have moved from being a niche and specialist field to a point of relative maturity likely to feature frequently in the genomic landscape. Analogous to next generation sequencing, the cost of sequencing using long-read technologies has materially dropped whilst the instrument throughput continues to increase. Together these changes present the prospect of sequencing large numbers of individuals with the aim of fully characterizing genomes at high resolution. In this article, we will endeavour to present an introduction to long-read technologies showing: what long reads are; how they are distinct from short reads; why long reads are useful and how they are being used. We will highlight the recent developments in this field, and the applications and potential of these technologies in medical research, and clinical diagnostics and therapeutics.


September 22, 2019  |  

Prey range and genome evolution of Halobacteriovorax marinus predatory bacteria from an estuary

Halobacteriovorax strains are saltwater-adapted predatory bacteria that attack Gram-negative bacteria and may play an important role in shaping microbial communities. To understand how Halobacteriovorax strains impact ecosystems and develop them as biocontrol agents, it is important to characterize variation in predation phenotypes and investigate Halobacteriovorax genome evolution. We isolated Halobacteriovorax marinus BE01 from an estuary in Rhode Island using Vibrio from the same site as prey. Small, fast-moving, attack-phase BE01 cells attach to and invade prey cells, consistent with the intraperiplasmic predation strategy of the H. marinus type strain, SJ. BE01 is a prey generalist, forming plaques on Vibrio strains from the estuary, Pseudomonas from soil, and Escherichia coli. Genome analysis revealed extremely high conservation of gene order and amino acid sequences between BE01 and SJ, suggesting strong selective pressure to maintain the genome in this H. marinus lineage. Despite this, we identified two regions of gene content difference that likely resulted from horizontal gene transfer. Analysis of modal codon usage frequencies supports the hypothesis that these regions were acquired from bacteria with different codon usage biases than H. marinus. In one of these regions, BE01 and SJ carry different genes associated with mobile genetic elements. Acquired functions in BE01 include the dnd operon, which encodes a pathway for DNA modification, and a suite of genes involved in membrane synthesis and regulation of gene expression that was likely acquired from another Halobacteriovorax lineage. This analysis provides further evidence that horizontal gene transfer plays an important role in genome evolution in predatory bacteria. IMPORTANCE Predatory bacteria attack and digest other bacteria and therefore may play a role in shaping microbial communities. 
To investigate phenotypic and genotypic variation in saltwater-adapted predatory bacteria, we isolated Halobacteriovorax marinus BE01 from an estuary in Rhode Island, assayed whether it could attack different prey bacteria, and sequenced and analyzed its genome. We found that BE01 is a prey generalist, attacking bacteria from different phylogenetic groups and environments. Gene order and amino acid sequences are highly conserved between BE01 and the H. marinus type strain, SJ. By comparative genomics, we detected two regions of gene content difference that likely occurred via horizontal gene transfer events. Acquired genes encode functions such as modification of DNA, membrane synthesis and regulation of gene expression. Understanding genome evolution and variation in predation phenotypes among predatory bacteria will inform their development as biocontrol agents and clarify how they impact microbial communities.


September 22, 2019  |  

Packaging of Dinoroseobacter shibae DNA into gene transfer agent particles is not random.

Gene transfer agents (GTAs) are phage-like particles which contain a fragment of genomic DNA of the bacterial or archaeal producer and deliver this to a recipient cell. GTA gene clusters are present in the genomes of almost all marine Rhodobacteraceae (Roseobacters) and might be important contributors to horizontal gene transfer in the world’s oceans. For all organisms studied so far, no obvious evidence of sequence specificity or other nonrandom process responsible for packaging genomic DNA into GTAs has been found. Here, we show that knock-out of an autoinducer synthase gene of Dinoroseobacter shibae resulted in overproduction and release of functional GTA particles (DsGTA). Next-generation sequencing of the 4.2-kb DNA fragments isolated from DsGTAs revealed that packaging was not random. DNA from low-GC conjugative plasmids but not from high-GC chromids was excluded from packaging. Seven chromosomal regions were strongly overrepresented in DNA isolated from DsGTA. These packaging peaks lacked identifiable conserved sequence motifs that might represent recognition sites for the GTA terminase complex. Low-GC regions of the chromosome, including the origin and terminus of replication, were underrepresented in DNA isolated from DsGTAs. DNA methylation reduced packaging frequency while the level of gene expression had no influence. Chromosomal regions found to be over- and underrepresented in DsGTA-DNA were regularly spaced. We propose that a “headful” type of packaging is initiated at the sites of coverage peaks and, after linearization of the chromosomal DNA, proceeds in both directions from the initiation site. GC-content, DNA-modifications, and chromatin structure might influence at which sites GTA packaging can be initiated. © The Author(s) 2018. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.


September 22, 2019  |  

The novel phages phiCD5763 and phiCD2955 represent two groups of big plasmidial Siphoviridae phages of Clostridium difficile.

Until recently, Clostridium difficile phages were limited to Myoviruses and Siphoviruses of medium genome length (32–57 kb). Here we report the finding of phiCD5763, a Siphovirus with a large extrachromosomal circular genome (132.5 kb, 172 ORFs) and a large capsid (205.6 ± 25.6 nm in diameter) infecting MLST Clade 1 strains of C. difficile. Two subgroups of big phage genomes similar to phiCD5763 were identified in 32 NAPCR1/RT012/ST-54 C. difficile isolates from Costa Rica and in whole genome sequences (WGS) of 41 C. difficile isolates of Clades 1, 2, 3, and 4 from Canada, USA, UK, Belgium, Iraq, and China. Through comparative genomics we discovered another putative big phage genome in a non-NAPCR1 isolate from Costa Rica, phiCD2955, which represents other big phage genomes found in 130 WGS of MLST Clade 1 and 2 isolates from Canada, USA, Hungary, France, Austria, and UK. phiCD2955 (131.6 kb, 172 ORFs) is related to a previously reported C. difficile phage genome, phiCD211/phiCDIF1296T. Detailed genome analyses of phiCD5763, phiCD2955, phiCD211/phiCDIF1296T, and seven other putative C. difficile big phage genome sequences of 131–136 kb reconstructed from publicly available WGS revealed a modular gene organization and high levels of sequence heterogeneity at several hotspots, suggesting that these genomes correspond to biological entities undergoing recombination. Compared to other C. difficile phages, these big phages have unique predicted terminase, capsid, portal, neck and tail proteins, receptor binding proteins (RBPs), recombinases, resolvases, primases, helicases, ligases, and hypothetical proteins. Moreover, their predicted gene load suggests a complex regulation of both phage and host functions. Overall, our results indicate that the prevalence of C. difficile big bacteriophages is more widespread than realized and open new avenues of research aiming to decipher how these viral elements influence the biology of this emerging pathogen.


September 22, 2019  |  

The DNA methylome of the hyperthermoacidophilic crenarchaeon Sulfolobus acidocaldarius.

DNA methylation is the most common epigenetic modification observed in the genomic DNA (gDNA) of prokaryotes and eukaryotes. Methylated nucleobases, N6-methyl-adenine (m6A), N4-methyl-cytosine (m4C), and 5-methyl-cytosine (m5C), detected on gDNA represent the discrimination mark between self and non-self DNA when they are part of restriction-modification systems in prokaryotes (Bacteria and Archaea). In addition, m5C in Eukaryotes and m6A in Bacteria play an important role in the regulation of key cellular processes. Although archaeal genomes present modified bases as in the two other domains of life, the significance of DNA methylations as regulatory mechanisms remains largely uncharacterized in Archaea. Here, we began by investigating the DNA methylome of Sulfolobus acidocaldarius. The strategy behind this initial study entailed the use of combined digestion assays, dot blots, and genome resequencing, which utilize specific restriction enzymes, antibodies specifically raised against m6A and m5C, and single-molecule real-time (SMRT) sequencing, respectively, to identify DNA methylations occurring in exponentially growing cells. The previously identified restriction-modification system, specific to S. acidocaldarius, was confirmed by digestion assay and SMRT sequencing, while the presence of m6A was revealed by dot blot and identified on the characteristic Dam motif by SMRT sequencing. No m5C was detected by dot blot under the conditions tested. Furthermore, by comparing the distribution of both detected methylations along the genome and by analyzing DNA methylation profiles in synchronized cells, we investigated in which cellular pathways, in particular the cell cycle, this m6A methylation could be a key player. The analysis of sequencing data rejected a role for m6A methylation in another defense system and also raised new questions about a potential involvement of this modification in the regulation of other biological functions in S. acidocaldarius.


September 22, 2019  |  

Characterizing the DNA methyltransferases of Haloferax volcanii via bioinformatics, gene deletion, and SMRT Sequencing.

DNA methyltransferases (MTases), which catalyze the methylation of adenine and cytosine bases in DNA, can occur in bacteria and archaea alongside cognate restriction endonucleases (REases) in restriction-modification (RM) systems or independently as orphan MTases. Although DNA methylation and MTases have been well-characterized in bacteria, research into archaeal MTases has been limited. A previous study examined the genomic DNA methylation patterns (methylome) of the halophilic archaeon Haloferax volcanii, a model archaeal system which can be easily manipulated in laboratory settings, via single-molecule real-time (SMRT) sequencing and deletion of a putative MTase gene (HVO_A0006). In this follow-up study, we deleted other putative MTase genes in H. volcanii and sequenced the methylomes of the resulting deletion mutants via SMRT sequencing to characterize the genes responsible for DNA methylation. The results indicate that deletion of putative RM genes HVO_0794, HVO_A0006, and HVO_A0237 in a single strain abolished methylation of the sole cytosine motif in the genome (Cm4TAG). Amino acid alignments demonstrated that HVO_0794 shares homology with characterized cytosine CTAG MTases in other organisms, indicating that this MTase is responsible for Cm4TAG methylation in H. volcanii. The CTAG motif has high density at only one of the origins of replication, and there is no relative increase in CTAG motif frequency in the genome of H. volcanii, indicating that CTAG methylation might not have effectively taken over the role of regulating DNA replication and mismatch repair in the organism as previously predicted. Deletion of the putative Type I RM operon rmeRMS (HVO_2269-2271) resulted in abolished methylation of the adenine motif in the genome (GCAm6BN6VTGC). Alignments of the MTase (HVO_2270) and site specificity subunit (HVO_2271) demonstrate homology with other characterized Type I MTases and site specificity subunits, indicating that the rmeRMS operon is responsible for adenine methylation in H. volcanii. Together with HVO_0794, these genes appear to be responsible for all detected methylation in H. volcanii, even though other putative MTases (HVO_C0040, HVO_A0079) share homology with characterized MTases in other organisms. We also report the construction of a multi-RM deletion mutant (ΔRM), with multiple RM genes deleted and with no methylation detected via SMRT sequencing, which we anticipate will be useful for future studies on DNA methylation in H. volcanii.
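The two motifs named above use IUPAC degenerate base codes, and a short sketch can show how such motifs might be located in a sequence. This is an illustration only, not the authors' pipeline: it assumes that GCAm6BN6VTGC expands to the 14-base pattern GCABNNNNNNVTGC (with the methyl mark dropped and N6 read as six N positions), it scans the forward strand only, and the helper names `motif_regex` and `find_sites` as well as the test sequences are hypothetical.

```python
import re

# Standard IUPAC nucleotide codes mapped to regex character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]",
    "B": "[CGT]", "D": "[AGT]", "H": "[ACT]", "V": "[ACG]",
    "N": "[ACGT]",
}


def motif_regex(motif):
    """Compile a degenerate motif into a regex; lookahead allows overlaps."""
    body = "".join(IUPAC[base] for base in motif.upper())
    return re.compile("(?=(" + body + "))")


def find_sites(sequence, motif):
    """Return 0-based start positions of the motif on the forward strand."""
    return [m.start() for m in motif_regex(motif).finditer(sequence.upper())]


# Assumed expansion of the adenine motif reported in the abstract.
ADENINE_MOTIF = "GCABNNNNNNVTGC"
CYTOSINE_MOTIF = "CTAG"

# Made-up sequences for illustration only.
adenine_hits = find_sites("TTGCACAAAAAAATGCTT", ADENINE_MOTIF)
cytosine_hits = find_sites("ACTAGCTAG", CYTOSINE_MOTIF)
```

CTAG is its own reverse complement, so a forward-strand scan already covers both strands for that motif. The adenine motif is not palindromic, so a fuller version would also scan the reverse complement.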

