July 19, 2019

Identification and analysis of adenine N6-methylation sites in the rice genome.

DNA N6-methyladenine (6mA) is a non-canonical DNA modification that is present at low levels in different eukaryotes [1-8], but its prevalence and genomic function in higher plants are unclear. Using mass spectrometry, immunoprecipitation and validation with analysis of single-molecule real-time sequencing, we observed that about 0.2% of all adenines are 6mA methylated in the rice genome. 6mA occurs most frequently at GAGG motifs and is mapped to about 20% of genes and 14% of transposable elements. In promoters, 6mA marks silent genes, but in gene bodies it correlates with gene activity. 6mA overlaps with 5-methylcytosine (5mC) at CG sites in gene bodies and is complementary to 5mC at CHH sites in transposable elements. We show that OsALKBH1 may be involved in 6mA demethylation in rice. The results suggest that 6mA is complementary to 5mC as an epigenomic mark in rice and reinforce a distinct role for 6mA as a gene expression-associated epigenomic mark in eukaryotes.
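
As a rough illustration of the figures quoted in this abstract, the sketch below counts GAGG occurrences on both strands of a sequence and estimates how many adenines would carry 6mA if about 0.2% of all adenines were methylated. It is not the authors' pipeline: the function names and the toy inputs are hypothetical, and real 6mA calling relies on sequencing kinetics rather than on sequence content alone.

```python
# Hypothetical helper names; a toy illustration, not the published pipeline.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an upper- or lower-case DNA string."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def count_motif_both_strands(seq: str, motif: str = "GAGG") -> int:
    """Count (possibly overlapping) motif occurrences on both strands."""
    def count(strand: str) -> int:
        return sum(
            1
            for i in range(len(strand) - len(motif) + 1)
            if strand.startswith(motif, i)
        )

    forward = seq.upper()
    return count(forward) + count(reverse_complement(forward))


def expected_6ma_sites(seq: str, fraction: float = 0.002) -> float:
    """Expected 6mA count if `fraction` of adenines on both strands is methylated."""
    forward = seq.upper()
    # Each T on the forward strand pairs with an A on the reverse strand.
    adenines = forward.count("A") + forward.count("T")
    return adenines * fraction
```

For example, `count_motif_both_strands("GAGGTTCCTC")` finds one site on each strand, and `expected_6ma_sites` applied to a sequence with 1,000 A/T positions returns roughly 2.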


July 19, 2019

Deep genome annotation of the opportunistic human pathogen Streptococcus pneumoniae D39.

A precise understanding of the genomic organization into transcriptional units and their regulation is essential for our comprehension of opportunistic human pathogens and how they cause disease. Using single-molecule real-time (PacBio) sequencing we unambiguously determined the genome sequence of Streptococcus pneumoniae strain D39 and revealed several inversions previously undetected by short-read sequencing. Significantly, a chromosomal inversion results in antigenic variation of PhtD, an important surface-exposed virulence factor. We generated a new genome annotation using automated tools, followed by manual curation, reflecting the current knowledge in the field. By combining sequence-driven terminator prediction, deep paired-end transcriptome sequencing and enrichment of primary transcripts by Cappable-Seq, we mapped 1015 transcriptional start sites and 748 termination sites. We show that the pneumococcal transcriptional landscape is complex and includes many secondary, antisense and internal promoters. Using this new genomic map, we identified several new small RNAs (sRNAs), RNA switches (including sixteen previously misidentified as sRNAs), and antisense RNAs. In total, we annotated 89 new protein-encoding genes, 34 sRNAs and 165 pseudogenes, bringing the S. pneumoniae D39 repertoire to 2146 genetic elements. We report operon structures and observed that 9% of operons are leaderless. The genome data are accessible in an online resource called PneumoBrowse (https://veeninglab.com/pneumobrowse) providing one of the most complete inventories of a bacterial genome to date. PneumoBrowse will accelerate pneumococcal research and the development of new prevention and treatment strategies.


July 19, 2019

De novo assembly of two Swedish genomes reveals missing segments from the human GRCh38 reference and improves variant calling of population-scale sequencing data.

The current human reference sequence (GRCh38) is a foundation for large-scale sequencing projects. However, recent studies have suggested that GRCh38 may be incomplete and give a suboptimal representation of specific population groups. Here, we performed a de novo assembly of two Swedish genomes that revealed over 10 Mb of sequences absent from the human GRCh38 reference in each individual. Around 6 Mb of these novel sequences (NS) are shared with a Chinese personal genome. The NS are highly repetitive, have an elevated GC-content, and are primarily located in centromeric or telomeric regions. Up to 1 Mb of NS can be assigned to chromosome Y, and large segments are also missing from GRCh38 at chromosomes 14, 17, and 21. Inclusion of NS into the GRCh38 reference radically improves the alignment and variant calling from short-read whole-genome sequencing data at several genomic loci. A re-analysis of a Swedish population-scale sequencing project yields > 75,000 putative novel single nucleotide variants (SNVs) and removes > 10,000 false positive SNV calls per individual, some of which are located in protein coding regions. Our results highlight that the GRCh38 reference is not yet complete and demonstrate that personal genome assemblies from local populations can improve the analysis of short-read whole-genome sequencing data.


July 7, 2019

Defining the sequence requirements for the positioning of base J in DNA using SMRT sequencing.

Base J (β-D-glucosyl-hydroxymethyluracil) replaces 1% of T in the Leishmania genome and is only found in telomeric repeats (99%) and in regions where transcription starts and stops. This highly restricted distribution must be co-determined by the thymidine hydroxylases (JBP1 and JBP2) that catalyze the initial step in J synthesis. To determine the DNA sequences recognized by JBP1/2, we used SMRT sequencing of DNA segments inserted into plasmids grown in Leishmania tarentolae. We show that SMRT sequencing recognizes base J in DNA. Leishmania DNA segments that normally contain J also picked up J when present in the plasmid, whereas control sequences did not. Even a segment of only 10 telomeric (GGGTTA) repeats was modified in the plasmid. We show that J modification usually occurs at pairs of Ts on opposite DNA strands, separated by 12 nucleotides. Modifications occur near G-rich sequences capable of forming G-quadruplexes and JBP2 is needed, as it does not occur in JBP2-null cells. We propose a model whereby de novo J insertion is mediated by JBP2. JBP1 then binds to J and hydroxylates another T 13 bp downstream (but not upstream) on the complementary strand, allowing JBP1 to maintain existing J following DNA replication. © The Author(s) 2015. Published by Oxford University Press on behalf of Nucleic Acids Research.


July 7, 2019

Short communication: Single molecule, real-time sequencing technology revealed species- and strain-specific methylation patterns of 2 Lactobacillus strains.

Pacific Biosciences’ (Menlo Park, CA) single molecule, real-time sequencing technology was reported to have some advantages in generating finished genomes and characterizing the epigenome of bacteria. In the present study, this technology was used to sequence 2 Lactobacillus strains, Lactobacillus casei Zhang and Lactobacillus plantarum P-8. Previously, the former bacterium was sequenced by an Applied Biosystems 3730 DNA analyzer (Grand Island, NY), whereas the latter one was analyzed with Roche 454 (Indianapolis, IN) and Illumina sequencing technologies (San Diego, CA). The results showed that single molecule, real-time sequencing resulted in high-quality, finished genomes for both strains. Interestingly, epigenome analysis indicates the presence of 1 active N(6)-methyladenine methyltransferase in L. casei Zhang, but none in L. plantarum P-8. Our study revealed for the first time a completely different methylation pattern in 2 Lactobacillus strains. Copyright © 2015 American Dairy Science Association. Published by Elsevier Inc. All rights reserved.


July 7, 2019

Biochemical characterization of a Naegleria TET-like oxygenase and its application in single molecule sequencing of 5-methylcytosine.

Modified DNA bases in mammalian genomes, such as 5-methylcytosine ((5m)C) and its oxidized forms, are implicated in important epigenetic regulation processes. In human or mouse, successive enzymatic conversion of (5m)C to its oxidized forms is carried out by the ten-eleven translocation (TET) proteins. Previously we reported the structure of a TET-like (5m)C oxygenase (NgTET1) from Naegleria gruberi, a single-celled protist evolutionarily distant from vertebrates. Here we show that NgTET1 is a 5-methylpyrimidine oxygenase, with activity on both (5m)C (major activity) and thymidine (T) (minor activity) in all DNA forms tested, and provide unprecedented evidence for the formation of 5-formyluridine ((5f)U) and 5-carboxyuridine ((5ca)U) in vitro. Mutagenesis studies reveal a delicate balance between choice of (5m)C or T as the preferred substrate. Furthermore, our results suggest substrate preference by NgTET1 to (5m)CpG and TpG dinucleotide sites in DNA. Intriguingly, NgTET1 displays higher T-oxidation activity in vitro than mammalian TET1, supporting a closer evolutionary relationship between NgTET1 and the base J-binding proteins from trypanosomes. Finally, we demonstrate that NgTET1 can be readily used as a tool in (5m)C sequencing technologies such as single molecule, real-time sequencing to map (5m)C in bacterial genomes at base resolution.


July 7, 2019

Complete genome sequence analysis of Bacillus subtilis T30.

The complete genome sequence of Bacillus subtilis T30 was determined by SMRT sequencing. The entire genome contains 4,138 predicted genes. The genome carries one intact prophage sequence (37.4 kb) similar to Bacillus phage SPBc2 and one incomplete prophage genome of 39.9 kb similar to Bacillus phage phi105. Copyright © 2015 Xu et al.


July 7, 2019

Genome sequence of Xanthomonas sacchari R1, a biocontrol bacterium isolated from the rice seed.

Xanthomonas sacchari was first identified as a pathogenic bacterium isolated from diseased sugarcane in Guadeloupe. Strain R1 was first isolated from rice seed samples from the Philippines in 2002. Its antagonistic activity against several rice pathogens drew our attention. The genomic features of this strain are described in this paper. The total genome size of X. sacchari R1 is 5,000,479 bp, with 4315 coding sequences (CDS), 59 tRNAs, 2 rRNAs and one plasmid. Copyright © 2015. Published by Elsevier B.V.


July 7, 2019

Genome sequence of Pseudomonas parafulva CRS01-1, an antagonistic bacterium isolated from rice field.

Pseudomonas parafulva (formerly known as Pseudomonas fulva) is an antagonistic bacterium against several rice bacterial and fungal diseases. The total genome size of P. parafulva CRS01-1 is 5,087,619 bp, with 4389 coding sequences (CDSs), 77 tRNAs, and 7 rRNAs. The annotated full genome sequence of the P. parafulva CRS01-1 strain might shed light on its role as an antagonistic bacterium. Copyright © 2015. Published by Elsevier B.V.


July 7, 2019

Complete genome sequence of ER2796, a DNA methyltransferase-deficient strain of Escherichia coli K-12.

We report the complete sequence of ER2796, a laboratory strain of Escherichia coli K-12 that is completely defective in DNA methylation. Because of its lack of any native methylation, it is extremely useful as a host into which heterologous DNA methyltransferase genes can be cloned and the recognition sequences of their products deduced by Pacific Biosciences Single-Molecule Real Time (SMRT) sequencing. The genome was itself sequenced from a long-insert library using the SMRT platform, resulting in a single closed contig devoid of methylated bases. Comparison with K-12 MG1655, the first E. coli K-12 strain to be sequenced, shows an essentially co-linear relationship with no major rearrangements despite many generations of laboratory manipulation. The comparison revealed a total of 41 insertions and deletions, and 228 single base pair substitutions. In addition, the long-read approach facilitated the surprising discovery of four gene conversion events, three involving rRNA operons and one between two cryptic prophages. Such events thus contribute both to genomic homogenization and to bacteriophage diversification. As one of relatively few laboratory strains of E. coli to be sequenced, the genome also reveals the sequence changes underlying a number of classical mutant alleles including those affecting the various native DNA methylation systems.


July 7, 2019

Full genome sequence of Brevibacillus laterosporus strain B9, a biological control strain isolated from Zhejiang, China.

Brevibacillus laterosporus, reclassified from Bacillus laterosporus, has potential as a biological control agent in crop fields. B. laterosporus strain B9 is an aerobic, motile, Gram-positive, spore-forming rod that was isolated from a field of Oryza sativa in Zhejiang, China in 2011. This bacterium has been confirmed to be a strong antagonist against bacterial brown stripe of rice caused by Acidovorax avenae subsp. avenae. Here we describe the features of B. laterosporus strain B9, together with the complete genome sequence and its annotation. The 5,272,435 bp genome contains 4804 protein-coding genes and 227 RNA-only encoding genes with 2 plasmids. Copyright © 2015. Published by Elsevier B.V.


July 7, 2019

Complete genome sequence of Salmonella enterica subsp. enterica serovar Agona 460004 2-1, associated with a multistate outbreak in the United States.

Within the last several years, Salmonella enterica subsp. enterica serovar Agona has been among the 20 most frequently isolated serovars in clinical cases of salmonellosis. In this report, the complete genome of S. Agona strain 460004 2-1, isolated from unsweetened puffed-rice cereal during a multistate outbreak in 2008, was sequenced using single-molecule real-time DNA sequencing. Copyright © 2015 Hoffmann et al.


July 7, 2019

Whole genome sequence of Pseudomonas aeruginosa F9676, an antagonistic bacterium isolated from rice seed.

Pseudomonas aeruginosa is a bacterial species that can be isolated from diverse ecological niches. P. aeruginosa strain F9676 was first isolated from a rice seed sample in 2003. It showed strong antagonism against several plant pathogens. In this study, whole genome sequencing was carried out. The total genome size of F9676 is 6,368,008 bp, with 5586 coding genes (CDS), 67 tRNAs and 3 rRNAs. The genome sequence of F9676 may shed light on the antagonism of P. aeruginosa. Copyright © 2015 Elsevier B.V. All rights reserved.


July 7, 2019

Mutation assay using single-molecule real-time (SMRT) sequencing technology

Introduction: We present here a simple, phenotype-independent mutation assay using a PacBio RSII DNA sequencer employing single-molecule real-time (SMRT) sequencing technology. Salmonella typhimurium YG7108 was treated with the alkylating agent N-ethyl-N-nitrosourea (ENU) and grown through several generations to fix the induced mutations. The DNA was then extracted and the mutations were analyzed using the SMRT DNA sequencer. Results: The ENU-induced base-substitution frequency was 15.4 per megabase pair, which is highly consistent with our previous results based on colony isolation and next-generation sequencing. The induced mutation spectrum (95% G:C→A:T, 5% A:T→G:C) is also consistent with the known ENU signature. The base-substitution frequency of the control was calculated to be less than 0.12 per megabase pair. A current limitation of the approach is the high frequency of artifactual insertion and deletion mutations it detects. Conclusions: Ultra-low frequency base-substitution mutations can be detected directly by using the SMRT DNA sequencer, and this technology provides a phenotype-independent mutation assay.
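
The per-megabase frequency and the mutation spectrum reported above are simple ratios. The sketch below shows one way such figures could be computed. The function names are hypothetical and the input counts are made up for illustration; the abstract gives only the resulting frequencies, not the underlying counts.

```python
def substitutions_per_megabase(n_substitutions: int, bases_analyzed: int) -> float:
    """Base-substitution frequency expressed per 1,000,000 bases analyzed."""
    if bases_analyzed <= 0:
        raise ValueError("bases_analyzed must be positive")
    return n_substitutions * 1_000_000 / bases_analyzed


def mutation_spectrum(counts: dict[str, int]) -> dict[str, float]:
    """Fraction of observed substitutions falling in each class."""
    total = sum(counts.values())
    if total == 0:
        return {}
    return {kind: n / total for kind, n in counts.items()}


# Made-up counts, chosen only so the ratios match the reported values.
freq = substitutions_per_megabase(154, 10_000_000)           # about 15.4 per Mb
spectrum = mutation_spectrum({"G:C>A:T": 95, "A:T>G:C": 5})  # 0.95 and 0.05
```

Normalizing by the number of bases analyzed, rather than by the number of reads, keeps the figure comparable across runs with different read lengths and coverage.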

