A novel Gram-stain-positive, motile, white-colored, endospore-forming bacterium, designated 18JY67-1T, was isolated from soil on Jeju Island, Korea. The strain grew at 15-42 °C (optimum 30 °C) and at pH 6.0-9.5 (optimum 7.5) in R2A medium. Phylogenetic analysis based on 16S rRNA gene sequences indicated that strain 18JY67-1T formed a distinct lineage within the family Paenibacillaceae (order Bacillales, class Bacilli) and was most closely related to Paenibacillus rhizoryzae (KP675984; 96.9% 16S rRNA gene sequence similarity). The major cellular fatty acids of strain 18JY67-1T were C16:0 and anteiso-C15:0. The predominant respiratory quinone was MK-7. The major polar lipid was identified as diphosphatidylglycerol. The phenotypic, chemotaxonomic and genotypic properties clearly indicated that isolate 18JY67-1T represents a novel species within the genus Paenibacillus, for which the name Paenibacillus flavus sp. nov. is proposed. The type strain of Paenibacillus flavus is 18JY67-1T (=KCTC 33959T =JCM 33184T).
Contaminant sequences that appear in published genomes can cause numerous problems for downstream analyses, particularly for evolutionary studies and metagenomics projects. Our large-scale scan of complete and draft bacterial and archaeal genomes in the NCBI RefSeq database reveals that 2250 genomes are contaminated by human sequence. The contaminant sequences derive primarily from high-copy human repeat regions, which themselves are not adequately represented in the current human reference genome, GRCh38. The absence of the sequences from the human assembly offers a likely explanation for their presence in bacterial assemblies. In some cases, the contaminating contigs have been erroneously annotated as containing protein-coding sequences, which over time have propagated to create spurious protein “families” across multiple prokaryotic and eukaryotic genomes. As a result, 3437 spurious protein entries are currently present in the widely used nr and TrEMBL protein databases. We report here an extensive list of contaminant sequences in bacterial genome assemblies and the proteins associated with them. We found that nearly all contaminants occurred in small contigs in draft genomes, which suggests that filtering out small contigs from draft genome assemblies may mitigate the issue of contamination while still keeping nearly all of the genuine genomic sequences. © 2019 Breitwieser et al.; Published by Cold Spring Harbor Laboratory Press.
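The mitigation suggested above, dropping small contigs from a draft assembly, can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the function names and the 1000 bp cutoff are assumptions chosen for illustration, since the abstract does not state a specific length threshold. The sketch also reports the fraction of bases retained, which is one way to check the claim that filtering keeps nearly all of the genuine sequence.

```python
from typing import Iterable, Iterator, List, Tuple

Record = Tuple[str, str]  # (header without '>', sequence)


def read_fasta(lines: Iterable[str]) -> Iterator[Record]:
    """Yield (header, sequence) pairs from FASTA-formatted lines."""
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def filter_small_contigs(
    records: Iterable[Record], min_len: int = 1000
) -> Tuple[List[Record], List[Record]]:
    """Split contigs into (kept, dropped) by length.

    min_len is a hypothetical cutoff, not a value taken from the paper.
    """
    kept, dropped = [], []
    for header, seq in records:
        (kept if len(seq) >= min_len else dropped).append((header, seq))
    return kept, dropped


def retained_fraction(kept: List[Record], dropped: List[Record]) -> float:
    """Fraction of total assembly bases that survive the filter."""
    kept_bp = sum(len(seq) for _, seq in kept)
    total_bp = kept_bp + sum(len(seq) for _, seq in dropped)
    return kept_bp / total_bp if total_bp else 0.0
```

In practice the lines would come from an open assembly file, and the dropped contigs could be screened against a human reference rather than discarded blindly.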
Complete Genome Sequence of Paenibacillus sp. CAA11: A Promising Microbial Host for Lignocellulosic Biorefinery with Consolidated Processing.
Several bioprocessing technologies, such as separate hydrolysis and fermentation (SHF), simultaneous saccharification and fermentation (SSF), and consolidated bioprocessing (CBP), have been highlighted for producing bio-based fuels and chemicals from lignocellulosic biomass. Successful CBP, a more efficient and economical lignocellulosic biorefinery process than the others, requires microorganisms with sufficient cellulolytic activity and the ability to produce biofuels or chemicals. Here, we report the complete genome of Paenibacillus sp. CAA11, a newly isolated and promising microbial host for CBP that produces ethanol and organic acids from cellulose. The genome of Paenibacillus sp. CAA11 comprises one 4,888,410 bp chromosome with a G + C content of 48.68%, containing 4418 protein-coding genes, 102 tRNA genes, and 39 rRNA genes. The functionally active cellulase encoded by CAA_GH5 was found to belong to glycosyl hydrolase family 5 (GH5) and to consist of a catalytic domain and a cellulose-binding domain 3 (CBM3). When the cellulolytic activity of CAA_GH5 was assayed by the Congo red method, measuring the size of the halo zone, recombinant Bacillus subtilis RIK1285 expressing CAA_GH5 showed cellulolytic activity comparable to that of B. subtilis RIK1285 expressing Cel5, a previously verified powerful bacterial cellulase. This study demonstrates the potential of Paenibacillus sp. CAA11 as a CBP-enabling microbe for cost-effective production of biofuels and chemicals from lignocellulosic biomass.
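The G + C content quoted above is the percentage of guanine and cytosine among the called bases of the sequence. A minimal Python sketch of that calculation follows; the function name `gc_content` is hypothetical, and excluding ambiguous bases such as N from the denominator is an assumption, since the abstract does not say how the figure was computed. For scale, 48.68% of a 4,888,410 bp chromosome corresponds to roughly 2.38 Mb of G and C bases.

```python
def gc_content(seq: str) -> float:
    """Return G + C content as a percentage of unambiguous (A/C/G/T) bases.

    Ambiguous symbols such as N are ignored; this is an assumed convention.
    """
    seq = seq.upper()
    gc = seq.count("G") + seq.count("C")
    acgt = gc + seq.count("A") + seq.count("T")
    return 100.0 * gc / acgt if acgt else 0.0
```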