XL chemistry Archives

July 19, 2019 |

Performance comparison of second- and third-generation sequencers using a bacterial genome with two chromosomes.

The availability of diverse second- and third-generation sequencing technologies enables the rapid determination of the sequences of bacterial genomes. However, identifying the sequencing technology most suitable for producing a finished genome with multiple chromosomes remains a challenge. We evaluated the abilities of the following three second-generation sequencers: Roche 454 GS Junior (GS Jr), Life Technologies Ion PGM (Ion PGM), and Illumina MiSeq (MiSeq) and a third-generation sequencer, the Pacific Biosciences RS sequencer (PacBio), by sequencing and assembling the genome of Vibrio parahaemolyticus, which consists of a 5-Mb genome comprising two circular chromosomes. We sequenced the genome of V. parahaemolyticus with GS Jr, Ion PGM, MiSeq, and PacBio and performed de novo assembly with several genome assemblers. Although GS Jr generated the longest mean read length of 418 bp among the second-generation sequencers, the maximum contig length of the best assembly from GS Jr was 165 kbp, and the number of contigs was 309. Single runs of Ion PGM and MiSeq produced data of considerably greater sequencing coverage, 279× and 1,927×, respectively. The optimized result for Ion PGM contained 61 contigs assembled from reads of 77× coverage, and the longest contig was 895 kbp in size. Those for MiSeq were 34 contigs, 58× coverage, and 733 kbp, respectively. These results suggest that higher coverage depth is unnecessary for a better assembly result. We observed that multiple rRNA coding regions were fragmented in the assemblies from the second-generation sequencers, whereas PacBio generated two exceptionally long contigs of 3,288,561 and 1,875,537 bp, each of which was from a single chromosome, with 73× coverage and mean read length 3,119 bp, allowing us to determine the absolute positions of all rRNA operons. 
PacBio outperformed the other sequencers in terms of the length of contigs and reconstructed the greatest portion of the genome, achieving a genome assembly of “finished grade” because of its long reads. It showed the potential to assemble more complex genomes with multiple chromosomes containing more repetitive sequences.
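
The PacBio figures quoted in this abstract can be related with simple coverage arithmetic. The following is a minimal sketch, not taken from the paper: it assumes the two contigs together approximate the genome size and uses the standard relation coverage = (number of reads × mean read length) / genome size. The function names are hypothetical, and the read count it yields is only a back-of-the-envelope estimate derived from the rounded values in the abstract.

```python
# Contig lengths (bp) reported in the abstract for the PacBio assembly.
PACBIO_CONTIGS_BP = [3_288_561, 1_875_537]


def assembly_size(contigs_bp):
    """Total assembled length in bp (sum of all contig lengths)."""
    return sum(contigs_bp)


def estimated_read_count(coverage, genome_bp, mean_read_bp):
    """Solve coverage = reads * mean_read_bp / genome_bp for the read count.

    Assumption: coverage is average depth over the whole genome, and the
    assembly size is a fair stand-in for the true genome size.
    """
    return coverage * genome_bp / mean_read_bp


total = assembly_size(PACBIO_CONTIGS_BP)
reads = estimated_read_count(coverage=73, genome_bp=total, mean_read_bp=3_119)

print(f"assembled length: {total:,} bp")
print(f"approximate reads implied by 73x coverage: {reads:,.0f}")
```

The assembled length comes out slightly above the "5-Mb" genome size given in the abstract, which is consistent with the two contigs representing the two chromosomes.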

July 19, 2019 |

Reducing assembly complexity of microbial genomes with single-molecule sequencing.

The short reads output by first- and second-generation DNA sequencing instruments cannot completely reconstruct microbial chromosomes. Therefore, most genomes have been left unfinished due to the significant resources required to manually close gaps in draft assemblies. Third-generation, single-molecule sequencing addresses this problem by greatly increasing sequencing read length, which simplifies the assembly problem. To measure the benefit of single-molecule sequencing on microbial genome assembly, we sequenced and assembled the genomes of six bacteria and analyzed the repeat complexity of 2,267 complete bacteria and archaea. Our results indicate that the majority of known bacterial and archaeal genomes can be assembled without gaps, at finished-grade quality, using a single PacBio RS sequencing library. These single-library assemblies are also more accurate than typical short-read assemblies and hybrid assemblies of short and long reads. Automated assembly of long, single-molecule sequencing data reduces the cost of microbial finishing to $1,000 for most genomes, and future advances in this technology are expected to drive the cost lower. This is expected to increase the number of completed genomes, improve the quality of microbial genome databases, and enable high-fidelity, population-scale studies of pan-genomes and chromosomal organization.

July 7, 2019 |

Genome sequence of a urease-positive Campylobacter lari strain.

Campylobacter lari is frequently isolated from shore birds and can cause illness in humans. Here, we report the draft whole-genome sequence of a urease-positive strain of C. lari that was isolated in estuarial water on the coast of Delaware, USA. Copyright © 2015 Meinersmann et al.

July 7, 2019 |

Complete genome sequence of Microbacterium sp. CGR1, bacterium tolerant to wide abiotic conditions isolated from the Atacama Desert.

Microbacterium sp. CGR1 (RGM2230) is an isolate from the Atacama Desert that displays a wide pH, salinity and temperature tolerance. This strain exhibits riboflavin overproducer features and traits for developing an environmental arsenic biosensor. Here, we report the complete genome sequence of this strain, which represents the first genome of the genus Microbacterium sequenced and assembled in a single contig. The genome contains 3,634,864 bp, 3,299 protein-coding genes, 45 tRNAs, six copies of 5S-16S-23S rRNA and a high genome average GC-content of 68.04%. Copyright © 2015 Elsevier B.V. All rights reserved.

July 7, 2019 |

SSPACE-LongRead: scaffolding bacterial draft genomes using long read sequence information.

The recent introduction of the Pacific Biosciences RS single molecule sequencing technology has opened new doors to scaffolding genome assemblies in a cost-effective manner. The long read sequence information promises to enhance the quality of incomplete and inaccurate draft assemblies constructed from Next Generation Sequencing (NGS) data. Here we propose a novel hybrid assembly methodology that aims to scaffold pre-assembled contigs in an iterative manner using PacBio RS long read information as a backbone. On a test set comprising six bacterial draft genomes, assembled using either a single Illumina MiSeq or Roche 454 library, we show that even a 50× coverage of uncorrected PacBio RS long reads is sufficient to drastically reduce the number of contigs. Comparisons to the AHA scaffolder indicate our strategy is better capable of producing (nearly) complete bacterial genomes. The current work describes our SSPACE-LongRead software, which is designed to upgrade incomplete draft genomes using single molecule sequences. We conclude that the recent advances of the PacBio sequencing technology and chemistry, in combination with the limited computational resources required to run our program, allow genomes to be scaffolded in a fast and reliable manner.

July 7, 2019 |

The genomic landscape of the verrucomicrobial methanotroph Methylacidiphilum fumariolicum SolV.

Aerobic methanotrophs can grow in hostile volcanic environments and use methane as their sole source of energy. The discovery of three verrucomicrobial Methylacidiphilum strains has revealed diverse metabolic pathways used by these methanotrophs, including mechanisms through which methane is oxidized. The basis of a complete understanding of these processes and of how these bacteria evolved and are able to thrive in such extreme environments partially resides in the complete characterization of their genome and its architecture. In this study, we present the complete genome sequence of Methylacidiphilum fumariolicum SolV, obtained using Pacific Biosciences single-molecule real-time (SMRT) sequencing technology. The genome assembles to a single 2.5 Mbp chromosome with an average GC content of 41.5%. The genome contains 2,741 annotated genes and 314 functional subsystems including all key metabolic pathways that are associated with Methylacidiphilum strains, including the CBB pathway for CO2 fixation. However, it does not encode the serine cycle and ribulose monophosphate pathways for carbon fixation. Phylogenetic analysis of the particulate methane mono-oxygenase operon separates the Methylacidiphilum strains from other verrucomicrobial methanotrophs. RNA-Seq analysis of cell cultures growing in three different conditions revealed the deregulation of two out of three pmoCAB operons. In addition, genes involved in nitrogen fixation were upregulated in cell cultures growing in nitrogen fixing conditions, indicating the presence of active nitrogenase. Characterization of the global methylation state of M. fumariolicum SolV revealed methylation of adenines and cytosines mainly in the coding regions of the genome. 
Methylation of adenines was predominantly associated with 5′-m6ACN4GT-3′ and 5′-CCm6AN5CTC-3′ methyltransferase recognition motifs whereas methylated cytosines were not associated with any specific motif. Our findings provide novel insights into the global methylation state of verrucomicrobial methanotroph M. fumariolicum SolV. However, partial conservation of methyltransferases between M. fumariolicum SolV and M. infernorum V4 indicates potential differences in the global methylation state of Methylacidiphilum strains. Unravelling the M. fumariolicum SolV genome and its epigenetic regulation allows for robust characterization of biological processes that are involved in oxidizing methane. In turn, this offers a better understanding of the evolution and the underlying physiological and ecological properties of SolV and other Methylacidiphilum strains.

July 7, 2019 |

Genomic mapping of phosphorothioates reveals partial modification of short consensus sequences.

Bacterial phosphorothioate (PT) DNA modifications are incorporated by Dnd proteins A-E and often function with DndF-H as a restriction-modification (R-M) system, as in Escherichia coli B7A. However, bacteria such as Vibrio cyclitrophicus FF75 lack dndF-H, which points to other PT functions. Here we report two novel, orthogonal technologies to map PTs across the genomes of B7A and FF75 with >90% agreement: single molecule, real-time sequencing and deep sequencing of iodine-induced cleavage at PT (ICDS). In B7A, we detect PT on both strands of GpsAAC/GpsTTC motifs, but with only 12% of 40,701 possible sites modified. In contrast, PT in FF75 occurs as a single-strand modification at CpsCA, again with only 14% of 160,541 sites modified. Single-molecule analysis indicates that modification could be partial at any particular genomic site even with active restriction by DndF-H, with direct interaction of modification proteins with GAAC/GTTC sites demonstrated with oligonucleotides. These results point to highly unusual target selection by PT-modification proteins and rule out known R-M mechanisms.

July 7, 2019 |

Twenty-one novel microsatellite loci for the endangered Florida salt marsh vole (Microtus pennsylvanicus dukecampbelli)

We present 21 microsatellite loci developed for Florida salt marsh voles (Microtus pennsylvanicus dukecampbelli). Microsatellites were identified from single molecule real time sequencing (Pacific Biosciences). We screened 30 loci and identified 21 loci as suitable for genotyping. We screened 17 individuals from Long Cabbage Key, and 3 individuals from an unnamed island. There was no significant departure from Hardy–Weinberg equilibrium or linkage equilibrium. Fifteen of the 21 loci were variable, with overall observed heterozygosity averaging 0.39, and a mean number of alleles of 3.14. Linkage disequilibrium estimate of Ne was 10.7 (95% CI 6.1–20.1). These markers will be useful for conservation genetics studies of this endangered species.

July 7, 2019 |

Comparative genomics of early-diverging mushroom-forming fungi provides insights into the origins of lignocellulose decay capabilities.

Evolution of lignocellulose decomposition was one of the most ecologically important innovations in fungi. White-rot fungi in the Agaricomycetes (mushrooms and relatives) are the most effective microorganisms in degrading both cellulose and lignin components of woody plant cell walls (PCW). However, the precise evolutionary origins of lignocellulose decomposition are poorly understood, largely because certain early-diverging clades of Agaricomycetes and its sister group, the Dacrymycetes, have yet to be sampled, or have been undersampled, in comparative genomic studies. Here, we present new genome sequences of ten saprotrophic fungi, including members of the Dacrymycetes and early-diverging clades of Agaricomycetes (Cantharellales, Sebacinales, Auriculariales, and Trechisporales), which we use to refine the origins and evolutionary history of the enzymatic toolkit of lignocellulose decomposition. We reconstructed the origin of ligninolytic enzymes, focusing on class II peroxidases (AA2), as well as enzymes that attack crystalline cellulose. Despite previous reports of white rot appearing as early as the Dacrymycetes, our results suggest that white-rot fungi evolved later in the Agaricomycetes, with the first class II peroxidases reconstructed in the ancestor of the Auriculariales and residual Agaricomycetes. The exemplars of the most ancient clades of Agaricomycetes that we sampled all lack class II peroxidases, and are thus concluded to use a combination of plesiomorphic and derived PCW degrading enzymes that predate the evolution of white rot.

July 7, 2019 |

High quality maize centromere 10 sequence reveals evidence of frequent recombination events.

The ancestral centromeres of maize contain long stretches of the tandemly arranged CentC repeat. The abundance of tandem DNA repeats and centromeric retrotransposons (CR) has presented a significant challenge to completely assembling centromeres using traditional sequencing methods. Here, we report a nearly complete assembly of the 1.85 Mb maize centromere 10 from inbred B73 using PacBio technology and BACs from the reference genome project. The error rates estimated from overlapping BAC sequences are 7 × 10⁻⁶ and 5 × 10⁻⁵ for mismatches and indels, respectively. The number of gaps in the region covered by the reassembly was reduced from 140 in the reference genome to three. Three expressed genes are located between 92 and 477 kb from the inferred ancestral CentC cluster, which lies within the region of highest centromeric repeat density. The improved assembly increased the count of full-length CR from 5 to 55 and revealed a 22.7 kb segmental duplication that occurred approximately 121,000 years ago. Our analysis provides evidence of frequent recombination events in the form of partial retrotransposons, deletions within retrotransposons, chimeric retrotransposons, segmental duplications including higher order CentC repeats, a deleted CentC monomer, centromere-proximal inversions, and insertion of mitochondrial sequences. Double-strand DNA break (DSB) repair is the most plausible mechanism for these events and may be the major driver of centromere repeat evolution and diversity. In many cases examined here, DSB repair appears to be mediated by microhomology, suggesting that tandem repeats may have evolved to efficiently repair frequent DSBs in centromeres.
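
For readers more used to Phred-scaled quality values, the per-base error rates quoted in this abstract can be converted with the standard definition Q = -10 · log10(p). The sketch below is an illustration added for convenience, not part of the paper's analysis; the function name is hypothetical.

```python
import math


def phred_quality(error_rate):
    """Convert a per-base error probability to a Phred-scaled quality value."""
    if not 0 < error_rate <= 1:
        raise ValueError("error_rate must be in the interval (0, 1]")
    return -10 * math.log10(error_rate)


# Error rates reported in the abstract (estimated from overlapping BACs).
mismatch_q = phred_quality(7e-6)
indel_q = phred_quality(5e-5)

print(f"mismatches: Q{mismatch_q:.1f}")
print(f"indels:     Q{indel_q:.1f}")
```

Both values fall above Q40, that is, fewer than one error per 10,000 bases for each error class.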

July 7, 2019 |

BAC-pool sequencing and assembly of 19 Mb of the complex sugarcane genome.

Sequencing plant genomes is often challenging because of their complex architecture and high content of repetitive sequences. Sugarcane has one of the most complex genomes. It is highly polyploid, preserves intact homeologous chromosomes from its parental species and contains >55% repetitive sequences. Although bacterial artificial chromosome (BAC) libraries have emerged as an alternative for accessing the sugarcane genome, sequencing individual clones is laborious and expensive. Here, we present a strategy for sequencing and assembling reads produced from the DNA of pooled BAC clones. A set of 178 BAC clones, randomly sampled from the SP80-3280 sugarcane BAC library, was pooled and sequenced using the Illumina HiSeq2000 and PacBio platforms. A hybrid assembly strategy was used to generate 2,451 scaffolds comprising 19.2 Mb of assembled genome sequence. Scaffolds of ≥20 kb corresponded to 80% of the assembled sequences, and the full sequences of forty BACs were recovered in one or two contigs. Alignment of the BAC scaffolds with the chromosome sequences of sorghum showed a high degree of collinearity and gene order. The alignment of the BAC scaffolds to the 10 sorghum chromosomes suggests that the genome of the SP80-3280 sugarcane variety is ~19% contracted in relation to the sorghum genome. In conclusion, our data show that sequencing pools composed of high numbers of BAC clones may help to construct a reference scaffold map of the sugarcane genome.

July 7, 2019 |

Finished genome sequence and methylome of the cyanide-degrading Pseudomonas pseudoalcaligenes strain CECT5344 as resolved by single-molecule real-time sequencing.

Pseudomonas pseudoalcaligenes CECT5344 tolerates cyanide and is also able to utilize cyanide and cyano-derivatives as a nitrogen source under alkaline conditions. The strain is considered a candidate for bioremediation of habitats contaminated with cyanide-containing liquid wastes. Information on the genome sequence of the strain CECT5344 became available previously. The P. pseudoalcaligenes CECT5344 genome was now resequenced by applying the single molecule, real-time (SMRT®) sequencing technique developed by Pacific Biosciences. The complete and finished genome sequence of the strain consists of a 4,696,984 bp chromosome featuring a GC-content of 62.34%. Comparative analyses between the new and previous versions of the P. pseudoalcaligenes CECT5344 genome sequence revealed additional regions in the new sequence that were missed in the older version. These additional regions mostly represent mobile genetic elements. Moreover, five additional genes predicted to play a role in sulfoxide reduction are present in the newly established genome sequence. The P. pseudoalcaligenes CECT5344 genome sequence is highly related to the genome sequences of different Pseudomonas mendocina strains. Approximately 70% of all genes are shared between P. pseudoalcaligenes and P. mendocina. In contrast to P. mendocina, putative pathogenicity genes were not identified in the P. pseudoalcaligenes CECT5344 genome. P. pseudoalcaligenes CECT5344 possesses unique genes for nitrilases and mercury resistance proteins that are of importance for survival in habitats contaminated with cyano- and mercury compounds. As an additional feature of the SMRT sequencing technology, the methylome of P. pseudoalcaligenes was established. Six sequence motifs featuring methylated adenine residues (m6A) were identified in the genome. The genome encodes several methyltransferases, some of which may be considered for methylation of the m6A motifs identified. 
The complete genome sequence of the strain CECT5344 now provides the basis for exploitation of genetic features for biotechnological purposes. Copyright © 2016 Elsevier B.V. All rights reserved.

July 7, 2019 |

Complete genome sequence of Lactobacillus rhamnosus strain BPL5 (CECT 8800), a probiotic for treatment of bacterial vaginosis.

Lactobacillus rhamnosus BPL5 (CECT 8800) is a probiotic strain suitable for the treatment of bacterial vaginosis. Here, we report its complete genome sequence deciphered by PacBio single-molecule real-time (SMRT) technology. Analysis of the sequence may provide insight into its functional activity. Copyright © 2016 Chenoll et al.

July 7, 2019 |

Complete genome sequences of 12 species of stable defined moderately diverse mouse microbiota 2.

We report here the complete genome sequences of 12 bacterial species of stable defined moderately diverse mouse microbiota 2 (sDMDMm2) used to colonize germ-free mice with defined microbes. Whole-genome sequencing of these species was performed using the PacBio sequencing platform yielding circularized genome sequences of all 12 species. Copyright © 2016 Uchimura et al.
