De novo assembly is the process of reconstructing genomes from DNA fragments (reads), which may contain redundancy and errors. Longer reads simplify assembly and improve contiguity of the output, but current long-read technologies come with high error rates. A crucial step of de novo genome assembly for long reads consists of finding overlapping reads. We present Berkeley Long-Read to Long-Read Aligner and Overlapper (BELLA), which implements a novel approach to compute overlaps using Sparse Generalized Matrix Multiplication (SpGEMM). We present a probabilistic model which demonstrates the soundness of using short, fixed-length k-mers to detect overlaps, avoiding expensive pairwise alignment of all reads against all others. We then introduce a notion of reliable k-mers based on our probabilistic model. The use of reliable k-mers eliminates both the k-mer set explosion that would otherwise happen with highly erroneous reads and the spurious overlaps due to k-mers originating from repetitive regions. Finally, we present a new method to separate true alignments from false positives depending on the alignment score. Using this methodology, which is employed in BELLA's precise mode, the probability of false positives drops exponentially as the length of overlap between sequences increases. On simulated data, BELLA achieves an average of 2.26% higher recall than state-of-the-art tools in its sensitive mode and 18.90% higher precision than state-of-the-art tools in its precise mode, while remaining competitive in performance.
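The overlap-detection idea in the abstract above can be illustrated with a minimal sketch. It is not BELLA's implementation: the function names are hypothetical, the frequency bounds `lo` and `hi` are illustrative placeholders (BELLA derives its reliable k-mer range from its probabilistic model), and the sparse product is emulated with plain dictionaries rather than a real SpGEMM kernel. The sketch builds a sparse reads-by-k-mers matrix A restricted to k-mers in the chosen frequency range, then accumulates C = A·Aᵀ, so that C[i, j] counts the k-mers shared by reads i and j.

```python
from collections import defaultdict


def kmers(seq, k):
    """Return the set of distinct k-mers in seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def reliable_kmers(reads, k, lo, hi):
    """Keep k-mers that occur in at least lo and at most hi reads.

    lo and hi are illustrative assumptions, not values from the paper.
    """
    counts = defaultdict(int)
    for read in reads:
        for km in kmers(read, k):
            counts[km] += 1
    return {km for km, c in counts.items() if lo <= c <= hi}


def candidate_overlaps(reads, k, lo=2, hi=4):
    """Return {(i, j): shared_kmer_count} for read pairs with i < j."""
    keep = reliable_kmers(reads, k, lo, hi)

    # Sparse matrix A stored by column: k-mer -> indices of reads containing it.
    columns = defaultdict(list)
    for i, read in enumerate(reads):
        for km in kmers(read, k):
            if km in keep:
                columns[km].append(i)

    # C = A * A^T as a sum of outer products of the columns of A.
    # Only the strict upper triangle (i < j) is accumulated.
    shared = defaultdict(int)
    for idxs in columns.values():
        for a in range(len(idxs)):
            for b in range(a + 1, len(idxs)):
                shared[(idxs[a], idxs[b])] += 1
    return dict(shared)


reads = ["ACGTACGGA", "TACGGATTC", "GGGGCCCCA"]
print(candidate_overlaps(reads, k=4))  # {(0, 1): 3}
```

Only pairs with a nonzero entry in C would be passed on to pairwise alignment, which is how a k-mer-based filter avoids aligning all reads against all others.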
Genomics and biochemistry investigation on the metabolic pathway of milled wood and alkali lignin-derived aromatic metabolites of Comamonas serinivorans SP-35.
The efficient depolymerization and utilization of lignin is one of the most important goals for the renewable use of lignocelluloses. The degradation and complete mineralization of lignin by bacteria also represent a key step for carbon recycling in land ecosystems. However, many aspects of this process remain unclear, for example, the complex network of metabolic pathways involved in the degradation of lignin and the catabolic pathways of intermediate aromatic metabolites. To address these questions, we characterized the deconstruction and mineralization of lignin, using milled wood lignin (MWL, the most representative molecule of lignin in its native state) and alkali lignin (AL), by the bacterium Comamonas serinivorans SP-35, and elucidated the metabolic pathways of the intermediate metabolites. The degradation rate of MWL reached 30.9%, and its particle size range decreased from 6-30 µm to 2-4 µm when cultured with C. serinivorans SP-35 over 7 days. FTIR analysis showed that the C-C and C-O-C bonds between the phenylpropane structures of lignin were oxidized and cleaved and that the side chain structure was modified. More than twenty intermediate aromatic metabolites were identified in the MWL and AL cultures based on GC-MS analysis. Through genome sequencing and annotation, and from GC-MS analysis, 93 genes encoding 33 enzymes and 5 regulatory factors that may be involved in lignin degradation were identified, and more than nine metabolic pathways of lignin and its intermediates were predicted. Of particular note, the metabolic pathway that forms the powerful antioxidant 3,4-dihydroxyphenylglycol is described for the first time in bacteria. Elucidation of the β-aryl ether cleavage pathway in strain SP-35 indicates that the β-aryl ether catabolic system is present not only in the family Sphingomonadaceae but also in other bacterial species.
These newly elucidated catabolic pathways of lignin in strain SP-35 and the enzymes responsible for them provide exciting biotechnological opportunities for lignin valorization in the future.
Emergence of tigecycline resistance in Escherichia coli co-producing MCR-1 and NDM-5 during tigecycline salvage treatment.
Here, we report a case of severe infection caused by Escherichia coli that harbored mcr-1 and blaNDM-5 and acquired resistance to tigecycline during tigecycline salvage therapy. Antimicrobial susceptibility testing, Southern blot hybridization, and complete genome sequencing of the strains were carried out. The genetic characteristics of the mcr-1 and blaNDM-5 plasmids were analyzed, and the mcr-1-containing plasmid was sequenced in full. Finally, putative single nucleotide polymorphisms and deletion mutations in the tigecycline-resistant strain were predicted. Three E. coli isolates were obtained from the ascites, pleural effusion, and stool of a patient; they were resistant to almost all the tested antibiotics. The first two strains, isolated from ascites (E-FQ) and hydrothorax (E-XS), were susceptible to amikacin and tigecycline; however, the third strain, from stool (E-DB), was resistant to tigecycline after nearly 3 weeks of treatment with tigecycline. All three isolates possessed both mcr-1 and blaNDM-5. The blaNDM-5 gene was found on the IncX3 plasmid, whereas mcr-1, fosA3, and blaCTX-M-14 were located on the IncHI2 plasmid. Mutations in acrB and lon were the reason for the resistance to tigecycline. This is the first report of a colistin-, carbapenem-, and tigecycline-resistant E. coli in China. Tigecycline resistance acquired during tigecycline therapy is of great concern because tigecycline is a drug of last resort for treating carbapenem-resistant Gram-negative bacterial infections. Furthermore, the transmission of such extensively drug-resistant isolates may pose a great threat to public health.
The bacterium Vibrio cholerae exhibits two distinct lifestyles, one as an aquatic bacterium and the other as the etiological agent of the pandemic human disease cholera. Here, we report closed genome sequences of two seventh pandemic V. cholerae O1 El Tor strains, A1552 and N16961, and the environmental strain Sa5Y.
Draft genome sequence of French Guiana Leishmania (Viannia) guyanensis strain 204-365, assembled using long reads.
We present here the draft genome sequence for Leishmania (Viannia) guyanensis. The isolate was obtained from a clinical case of cutaneous leishmaniasis in French Guiana. Genomic DNA was sequenced using PacBio and MiSeq platforms.
Paenibacillus bacteria are recovered from varied niches, including human lung, rhizosphere, marine sediments, and hemolymph. Paenibacilli can have plant growth-promoting activities and be antibiotic producers. They can produce exopolysaccharides and enzymes of industrial interest. Illumina and PacBio reads were used to produce a complete genome sequence of the colistin producer Paenibacillus sp. strain B-LR.
Genome sequence of Oenococcus oeni UNQOe19, the first fully assembled genome sequence of a Patagonian psychrotrophic oenological strain.
Oenococcus oeni UNQOe19 is a native strain isolated from a Patagonian pinot noir wine undergoing spontaneous malolactic fermentation. Here, we present the 1.83-Mb genome sequence of O. oeni UNQOe19, the first fully assembled genome sequence of a psychrotrophic strain from an Argentinean wine.
Fusarium oxysporum is a pathogenic fungus that infects hundreds of plant species. This paper reports the improved genome assembly of a reference strain, F. oxysporum f. sp. lycopersici Fol4287, a tomato pathogen.
Evaluation of the inhibitory effect of Bacillus velezensis L-1 on the pear pathogens Botrytis cinerea and Penicillium, and whole-genome analysis
[Objective] To clarify the inhibitory effect of Bacillus velezensis L-1 on the pear pathogens Botrytis cinerea and Penicillium, and to characterize the stability of the antagonistic activity of the sterile fermentation broth of strain L-1 and its possible antagonistic mechanisms. [Methods] The antagonistic activity of strain L-1 against pear B. cinerea and Penicillium was evaluated by in vitro assays, in vivo assays, and observation of pathogen mycelial morphology. With pear B. cinerea as the test pathogen, the stability of the antagonistic activity of the sterile fermentation broth of strain L-1 was determined by the Oxford cup method. The whole genome of L-1 was sequenced using PacBio RS II third-generation sequencing technology, and the genome sequence and predicted protein sequences were compared against databases with BLAST to predict the possible secondary metabolites of strain L-1 and its potential mechanisms of action. [Results] The in vivo inhibition rates of strain L-1 against pear B. cinerea and Penicillium were 92.88% and 77.47%, respectively, and the strain caused swelling and deformity of the pathogen mycelia. Strain L-1 still grew normally in medium containing 10% NaCl, and the antagonistic activity of its sterile fermentation broth against the pathogens remained stable under high temperature, acid, alkali, UV irradiation, and protease degradation.
Whole-genome analysis showed that strain L-1 has 112 genes involved in the metabolism of various carbon sources and can therefore grow on many carbon sources; it also contains genes for the synthesis of spermidine, trehalose, and other compounds related to stress resistance. Secondary metabolite prediction showed that L-1 contains gene clusters for the synthesis of surfactin, fengycin, bacillibactin, bacillaene, macrolactin, difficidin, bacilysin, and various other peptide and polyketide antagonistic compounds, as well as genes for β-1,3-glucanase and chitinase, which can degrade pathogen cell walls; in addition, strain L-1 contains genes for producing acetoin, which can induce plant resistance. [Conclusion] Strain L-1 effectively antagonizes several postharvest diseases of pear, and its antagonistic activity is strong and stable. Strain L-1 is predicted to control disease by producing multiple antagonistic compounds and cell-wall-hydrolyzing enzymes and by inducing plant resistance, and it has great application potential.
Complete genome sequence of Lactobacillus koreensis 26-25, a ginsenoside converting bacterium, isolated from Korean kimchi
Lactobacillus koreensis 26-25, a Gram-positive, rod-shaped, ivory-colored, motile bacterium, was isolated from Korean kimchi. Because strain 26-25 can convert major ginsenosides into minor ginsenosides, its whole genome was sequenced. The genome of Lactobacillus koreensis 26-25 consists of one circular chromosome of 3,006,812 bp with a DNA G + C content of 49.23%. Whole-genome analysis of strain 26-25 revealed many glycoside hydrolase genes, which may help identify the genes responsible for transforming major ginsenosides into minor ginsenosides, which are valued for their high pharmacological effects.
Complete genome of the multidrug-resistant Escherichia coli strain KBN10P04869 isolated from a patient with acute myeloid leukemia
Recently, we isolated a multidrug-resistant Escherichia coli strain, KBN10P04869, from a patient with acute myeloid leukemia. We report the complete genome of this strain, which consists of 5,104,264 bp with 4,457 protein-coding genes, 88 tRNAs, and 22 rRNAs, and the co-occurrence of multidrug resistance genes including blaCMY-2, blaTEM-1, blaCTX-M-15, blaNDM-5, and blaOXA-18.
Complete Closed Genome Sequences of Three Salmonella enterica subsp. enterica Serovar Dublin Strains Isolated from Cattle at Harvest.
Salmonella enterica subsp. enterica serovar Dublin is a host-adapted pathogen for cattle that can cause invasive disease in humans. To facilitate genomic comparisons characterizing virulence determinants of this pathogen, we present the complete genome sequences of three S. Dublin strains isolated from bovine sources at harvest.
Streptococcus thermophilus is one of the most used dairy starters for the production of yogurt and cheese. We report here the complete genome sequence of the industrial strain S. thermophilus N4L, which is used in dairy technology for its fast-acidifying phenotype.
This tutorial provides a high-level overview of the features contained within the SMRT Link software. SMRT Link is the web-based end-to-end software workflow manager for run design and set-up on…