Bioinformatics Archives

July 7, 2019

SRinversion: a tool for detecting short inversions by splitting and re-aligning poorly mapped and unmapped sequencing reads.

Rapid development in sequencing technologies has dramatically improved our ability to detect genetic variants in the human genome. However, current methods have variable sensitivities in detecting different types of genetic variants. One type of genetic variant that is especially hard to detect is the inversion. Analysis of public databases showed that few short inversions have been reported so far. Unlike reads that contain small insertions or deletions, which are handled through gapped alignment, reads carrying short inversions often have poor mapping quality or are unmapped, and thus are often not considered further. As a result, the majority of short inversions might have been overlooked, and special algorithms are required for their detection. Here, we introduce SRinversion, a framework to analyze poorly mapped or unmapped reads by splitting and re-aligning them for the purpose of inversion detection. SRinversion is very sensitive to small inversions and can detect those less than 10 bp in size. We applied SRinversion to both simulated data and high-coverage sequencing data from the 1000 Genomes Project and compared the results with those from Pindel, BreakDancer, DELLY, Gustaf and MID. SRinversion performed better on both datasets for the detection of small inversions. SRinversion is implemented in Perl and is publicly available at http://paed.hku.hk/genome/software/SRinversion/index.html. Contact: yangwl@hku.hk. Supplementary information: Supplementary data are available at Bioinformatics online. © The Author 2016. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.
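
To make the idea of rescuing a read that carries a short inversion more concrete, here is a minimal Python sketch. It is not SRinversion's actual algorithm (the tool itself is written in Perl and re-aligns split reads with an aligner). It assumes a toy setting in which the read is already anchored against a reference window of the same length, and it simply searches for a segment whose reverse complement restores an exact match. The function names and the `min_len` and `max_len` parameters are hypothetical.

```python
# Toy illustration only: NOT the SRinversion implementation.
# Assumption: read and reference window have equal length and are already anchored.

COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A", "N": "N"}


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA string (upper-case A/C/G/T/N)."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def find_short_inversion(read: str, ref: str, min_len: int = 2, max_len: int = 10):
    """Return (start, end) of a half-open read segment whose reverse complement
    makes the read identical to ref, or None if no such segment exists.

    min_len and max_len are hypothetical bounds on the inversion size.
    """
    if len(read) != len(ref):
        raise ValueError("read and reference window must have the same length")
    if read == ref:
        return None  # nothing to explain

    # The inverted segment must span every mismatching position.
    mismatches = [i for i, (a, b) in enumerate(zip(read, ref)) if a != b]
    first, last = mismatches[0], mismatches[-1]

    for start in range(max(0, first - max_len), first + 1):
        for end in range(last + 1, min(len(read), start + max_len) + 1):
            if end - start < min_len:
                continue
            repaired = read[:start] + reverse_complement(read[start:end]) + read[end:]
            if repaired == ref:
                return (start, end)
    return None


if __name__ == "__main__":
    ref = "ACGTTGCAAGGCTTACG"
    # Simulate a 6 bp inversion of ref[5:11] in the read.
    read = ref[:5] + reverse_complement(ref[5:11]) + ref[11:]
    print(find_short_inversion(read, ref))
```

A read like the simulated one mismatches the reference at several clustered positions, which is why a standard aligner may assign it a low mapping quality; treating the segment as an inversion explains all mismatches at once.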

July 7, 2019

CoLoRMap: Correcting Long Reads by Mapping short reads.

Second-generation sequencing technologies paved the way to an exceptional increase in the number of sequenced genomes, both prokaryotic and eukaryotic. However, short reads are difficult to assemble and often lead to highly fragmented assemblies. Recent developments in long-read sequencing methods offer a promising way to address this issue. However, long reads have so far been characterized by a high error rate, and assembling from long reads requires a high depth of coverage. This motivates the development of hybrid approaches that leverage the high quality of short reads to correct errors in long reads. We introduce CoLoRMap, a hybrid method for correcting noisy long reads, such as the ones produced by PacBio sequencing technology, using high-quality Illumina paired-end reads mapped onto the long reads. Our algorithm is based on two novel ideas: using a classical shortest path algorithm to find a sequence of overlapping short reads that minimizes the edit score to a long read, and extending corrected regions by local assembly of unmapped mates of mapped short reads. Our results on bacterial, fungal and insect data sets show that CoLoRMap compares well with existing hybrid correction methods. The source code of CoLoRMap is freely available for non-commercial use at https://github.com/sfu-compbio/colormap. Contact: ehaghshe@sfu.ca or cedric.chauve@sfu.ca. Supplementary data are available at Bioinformatics online. © The Author 2016. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.
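
The shortest-path idea can be illustrated with a small Python sketch. This is not CoLoRMap's code. It assumes each short-read alignment is summarized by hypothetical fields (`start`, `end`, `edits`) in long-read coordinates, links two alignments when they properly overlap, and uses Dijkstra's algorithm to pick the chain covering a region with the lowest summed edit count. Summing per-read edit counts is a simplification chosen for this example.

```python
# Toy illustration only: NOT the CoLoRMap implementation.
# Assumption: each alignment is (start, end, edits) with half-open coordinates
# on the long read; the cost of a chain is the sum of its reads' edit counts.
import heapq
from typing import NamedTuple, Optional


class Alignment(NamedTuple):
    start: int  # first covered long-read position
    end: int    # one past the last covered position
    edits: int  # edit distance between short read and long read


def cheapest_overlapping_chain(
    alignments: list[Alignment], region_start: int, region_end: int
) -> Optional[tuple[int, list[Alignment]]]:
    """Dijkstra over alignments: find the minimum-cost chain of overlapping
    alignments that covers [region_start, region_end), or None."""
    heap = []
    for i, a in enumerate(alignments):
        if a.start <= region_start < a.end:  # can begin the chain
            heapq.heappush(heap, (a.edits, i, (i,)))

    settled = set()
    while heap:
        cost, i, path = heapq.heappop(heap)
        if i in settled:
            continue
        settled.add(i)
        a = alignments[i]
        if a.end >= region_end:  # region fully covered
            return cost, [alignments[k] for k in path]
        for j, b in enumerate(alignments):
            # Edge i -> j: b starts inside a and extends beyond it.
            if j not in settled and a.start < b.start < a.end < b.end:
                heapq.heappush(heap, (cost + b.edits, j, path + (j,)))
    return None


if __name__ == "__main__":
    reads = [
        Alignment(0, 50, 3),
        Alignment(0, 40, 1),
        Alignment(30, 90, 2),
        Alignment(45, 100, 1),
        Alignment(80, 120, 0),
        Alignment(35, 120, 9),
    ]
    print(cheapest_overlapping_chain(reads, 0, 120))
```

Because every edit count is non-negative, the first covering alignment popped from the heap ends a minimum-cost chain, which is the standard Dijkstra guarantee.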

July 7, 2019

Genome-guided design of a defined mouse microbiota that confers colonization resistance against Salmonella enterica serovar Typhimurium.

Protection against enteric infections, also termed colonization resistance, results from mutualistic interactions of the host and its indigenous microbes. The gut microbiota of humans and mice is highly diverse and it is therefore challenging to assign specific properties to its individual members. Here, we have used a collection of murine bacterial strains and a modular design approach to create a minimal bacterial community that, once established in germ-free mice, provided colonization resistance against the human enteric pathogen Salmonella enterica serovar Typhimurium (S. Tm). Initially, a community of 12 strains, termed Oligo-Mouse-Microbiota (Oligo-MM(12)), representing members of the major bacterial phyla in the murine gut, was selected. This community was stable over consecutive mouse generations and provided colonization resistance against S. Tm infection, albeit not to the degree of a conventional complex microbiota. Comparative (meta)genome analyses identified functions represented in a conventional microbiome but absent from the Oligo-MM(12). By genome-informed design, we created an improved version of the Oligo-MM community harbouring three facultative anaerobic bacteria from the mouse intestinal bacterial collection (miBC) that provided conventional-like colonization resistance. In conclusion, we have established a highly versatile experimental system that showed efficacy in an enteric infection model. Thus, in combination with exhaustive bacterial strain collections and systems-based approaches, genome-guided design can be used to generate insights into microbe-microbe and microbe-host interactions for the investigation of ecological and disease-relevant mechanisms in the intestine.

July 7, 2019

Genomic sequencing-based mutational enrichment analysis identifies motility genes in a genetically intractable gut microbe.

A major roadblock to understanding how microbes in the gastrointestinal tract colonize and influence the physiology of their hosts is our inability to genetically manipulate new bacterial species and experimentally assess the function of their genes. We describe the application of population-based genomic sequencing after chemical mutagenesis to map bacterial genes responsible for motility in Exiguobacterium acetylicum, a representative intestinal Firmicutes bacterium that is intractable to molecular genetic manipulation. We derived strong associations between mutations in 57 E. acetylicum genes and impaired motility. Surprisingly, less than half of these genes were annotated as motility-related based on sequence homologies. We confirmed the genetic link between individual mutations and loss of motility for several of these genes by performing a large-scale analysis of spontaneous suppressor mutations. In the process, we reannotated genes belonging to a broad family of diguanylate cyclases and phosphodiesterases to highlight their specific role in motility and assigned functions to uncharacterized genes. Furthermore, we generated isogenic strains that allowed us to establish that Exiguobacterium motility is important for the colonization of its vertebrate host. These results indicate that genetic dissection of a complex trait, functional annotation of new genes, and the generation of mutant strains to define the role of genes in complex environments can be accomplished in bacteria without the development of species-specific molecular genetic tools.

July 7, 2019

Serinibacter

The genus Serinibacter belongs, based on the phylogenetic analysis of the nearly full-length 16S rRNA gene, to the Beutenbergiaceae together with the genera Beutenbergia, Salana, and Miniimonas. The two species of the genus Serinibacter shared 99.6% 16S rRNA gene sequence similarity but low DNA-DNA relatedness. Cells are irregular rods, Gram-stain positive, not acid-fast. Endospores are not formed. Nonmotile. Aerobic to anaerobic. Oxidase-negative, catalase-positive. The peptidoglycan type is A4α with an L-Ser residue at position 1 of the peptide subunit. The acyl type is acetyl. The major cell-wall sugar is galactose. The predominant menaquinone is MK-8(H4). The major polar lipids consist of phosphatidylglycerol, diphosphatidylglycerol, phosphatidylinositol, and unidentified phospholipids. Phosphatidylethanolamine is absent. The cellular fatty acid profile is dominated by the occurrence of iso- and anteiso-branched-chain acids. Mycolic acids are absent. The genomic G+C content is 70.7 to 72.8 mol%.

July 7, 2019

Complete genome sequence and transcriptome regulation of the pentose utilizing yeast Sugiyamaella lignohabitans.

Efficient conversion of hexoses and pentoses into value-added chemicals represents one core step for establishing economically feasible biorefineries from lignocellulosic material. While extensive research efforts have recently provided advances in the overall process performance, the quest for new microbial cell factories and novel enzyme sources is still open. As demonstrated recently, the yeast Sugiyamaella lignohabitans (formerly Candida lignohabitans) represents a promising microbial cell factory for the production of organic acids from lignocellulosic hydrolysates. We report here the de novo genome assembly of S. lignohabitans using the Single Molecule Real-Time platform, with gene prediction refined by using RNA-seq. The sequencing revealed a 15.98 Mb genome, subdivided into four chromosomes. By phylogenetic analysis, Blastobotrys (Arxula) adeninivorans and Yarrowia lipolytica were found to be close relatives of S. lignohabitans. Differential gene expression was evaluated in typical growth conditions on glucose and xylose and allowed a first insight into the transcriptional response of S. lignohabitans to different carbon sources and different oxygenation conditions. Novel sequences for enzymes and transporters involved in the central carbon metabolism, and therefore of potential biotechnological interest, were identified. These data open the way for a better understanding of the metabolism of S. lignohabitans and provide resources for further metabolic engineering. © FEMS 2016. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.

July 7, 2019

D1FHS, the type strain of the ammonia-oxidizing bacterium Nitrosococcus wardiae spec. nov.: enrichment, isolation, phylogenetic, and growth physiological characterization.

An ammonia-oxidizing bacterium, strain D1FHS, was enriched into pure culture from a sediment sample retrieved in Jiaozhou Bay, a hyper-eutrophic semi-closed water body hosting the metropolitan area of Qingdao, China. Based on initial 16S rRNA gene sequence analysis, strain D1FHS was classified in the genus Nitrosococcus, family Chromatiaceae, order Chromatiales, class Gammaproteobacteria; the 16S rRNA gene sequence with the highest level of identity to that of D1FHS was obtained from Nitrosococcus halophilus Nc4(T). The average nucleotide identity between the genomes of strain D1FHS and N. halophilus strain Nc4 is 89.5%. Known species in the genus Nitrosococcus are obligate aerobic chemolithotrophic ammonia-oxidizing bacteria adapted to and restricted to marine environments. The optimum growth (maximum nitrite production) conditions for D1FHS in a minimal salts medium are 50 mM ammonium and 700 mM NaCl at a pH of 7.5 to 8.0 and at 37°C in the dark. Because the corresponding conditions for other studied Nitrosococcus spp. are 100-200 mM ammonium and <700 mM NaCl at a pH of 7.5 to 8.0 and at 28-32°C, D1FHS is physiologically distinct from other Nitrosococcus spp. in terms of substrate, salt, and thermal tolerance.

July 7, 2019

Genomic and transcriptomic analyses reveal the characterization of a crude oil degrading bacterial strain: Pedobacter steynii DX4

Pedobacter steynii DX4, isolated from the Qinghai-Tibet Plateau, is able to degrade crude oil effectively at low temperature. To elucidate its biodegradation mechanism, whole-genome and transcriptome sequencing were performed. This is the first genome of a crude oil-degrading strain in the genus Pedobacter. The P. steynii DX4 genome consists of a single circular chromosome of 6,581,659 bp with an average G+C content of 41.31% and encodes 5464 genes in total. Genomic islands (GIs) were predicted, and a comparative analysis was performed with related species. Genome annotation predicted several hydrocarbon oxygenases, chemotaxis proteins and biosurfactant synthetases. Transcriptome profiling identified many genes that were differentially expressed between cells grown on crude oil and cells grown on pyruvate. Crude oil significantly stimulated the expression of genes related to hydrocarbon oxidation and the respiratory chain. Genomic and transcriptomic analyses of P. steynii DX4 have revealed its mechanism of crude oil degradation and provide a valuable knowledge base for developing effective strategies to mitigate the ecological damage caused by crude oil pollution.

July 7, 2019

Spontaneous chloroplast mutants mostly occur by replication slippage and show a biased pattern in the plastome of Oenothera.

Spontaneous plastome mutants have been used as a research tool since the beginning of genetics. However, technical restrictions have severely limited their contributions to research in physiology and molecular biology. Here, we used full plastome sequencing to systematically characterize a collection of 51 spontaneous chloroplast mutants in Oenothera (evening primrose). Most mutants carry only a single mutation. Unexpectedly, the vast majority of mutations do not represent single nucleotide polymorphisms but are insertions/deletions originating from DNA replication slippage events. Only very few mutations appear to be caused by imprecise double-strand break repair, nucleotide misincorporation during replication, or incorrect nucleotide excision repair following oxidative damage. U-turn inversions were not detected. Replication slippage is induced at repetitive sequences that can be very small and tend to have high A/T content. Interestingly, the mutations are not distributed randomly in the genome. The underrepresentation of mutations caused by faulty double-strand break repair might explain the high structural conservation of seed plant plastomes throughout evolution. In addition to providing a fully characterized mutant collection for future research on plastid genetics, gene expression, and photosynthesis, our work identifies the spectrum of spontaneous mutations in plastids and reveals that this spectrum is very different from that in the nucleus. © 2016 American Society of Plant Biologists. All rights reserved.

July 7, 2019

Transfer of the potato plant isolates of Pectobacterium wasabiae to Pectobacterium parmentieri sp. nov.

Pectobacterium wasabiae was originally isolated from Japanese horseradish (Eutrema wasabi), but recently some Pectobacterium isolates collected from potato plants and tubers displaying blackleg and soft rot symptoms were also assigned to P. wasabiae. Here, combining genomic and phenotypic data, we re-evaluated their taxonomic position. PacBio and Illumina technologies were used to complete the genome sequences of P. wasabiae CFBP 3304T and RNS 08-42-1A. Multi-locus sequence analysis showed that the P. wasabiae strains RNS 08-42-1A, SCC3193, CFIA1002 and WPP163, which were collected from the potato plant environment, constituted a separate clade from the original Japanese horseradish P. wasabiae. The taxonomic position of these strains was also supported by calculation of the in silico DNA-DNA hybridization, genome average nucleotide identity, alignment fraction and average nucleotide identity values. In addition, they were phenotypically distinguished from P. wasabiae strains by producing acids from (+)-raffinose, α-d(+)-α-lactose, d(+)-galactose and (+)-melibiose but not from methyl α-d-glycopyranoside, (+)-maltose or malonic acid. The name Pectobacterium parmentieri sp. nov. is proposed for this taxon; the type strain is RNS 08-42-1AT (=CFBP 8475T=LMG 29774T).

July 7, 2019

Origins of the current seventh cholera pandemic.

Vibrio cholerae has caused seven cholera pandemics since 1817, imposing terror on much of the world, but bacterial strains are currently only available for the sixth and seventh pandemics. The El Tor biotype seventh pandemic began in 1961 in Indonesia, but did not originate directly from the classical biotype sixth-pandemic strain. Previous studies focused mainly on the spread of the seventh pandemic after 1970. Here, we analyze in unprecedented detail the origin, evolution, and transition to pandemicity of the seventh-pandemic strain. We used high-resolution comparative genomic analysis of strains collected from 1930 to 1964, covering the evolution from the first available El Tor biotype strain to the start of the seventh pandemic. We define six stages leading to the pandemic strain and reveal all key events. The seventh pandemic originated from a nonpathogenic strain in the Middle East, first observed in 1897. It subsequently underwent explosive diversification, including the spawning of the pandemic lineage. This rapid diversification suggests that, when first observed, the strain had only recently arrived in the Middle East, possibly from the Asian homeland of cholera. The lineage migrated to Makassar, Indonesia, where it gained the important virulence-associated elements Vibrio seventh pandemic island I (VSP-I), VSP-II, and El Tor type cholera toxin prophage by 1954, and it then became pandemic in 1961 after only 12 additional mutations. Our data indicate that specific niches in the Middle East and Makassar were important in generating the pandemic strain by providing gene sources and the driving forces for genetic events.

July 7, 2019

Characterization of tet(Y)-carrying LowGC plasmids exogenously captured from cow manure at a conventional dairy farm.

Manure from dairy farms has been shown to contain diverse tetracycline resistance genes that are transferable to soil. Here, we focus on conjugative plasmids that may spread tetracycline resistance at a conventional dairy farm. We performed exogenous plasmid isolation from cattle feces using chlortetracycline for transconjugant selection. The transconjugants obtained harbored LowGC-type plasmids and tet(Y). A representative plasmid (pFK2-7) was fully sequenced and this was compared with previously described LowGC plasmids from piggery manure-treated soil and a GenBank record from Acinetobacter nosocomialis that we also identified as a LowGC plasmid. The pFK2-7 plasmid had the conserved backbone typical of LowGC plasmids, though this region was interrupted with an insert containing the tet(Y)-tet(R) tetracycline resistance genes and the strA-strB streptomycin resistance genes. Despite Acinetobacter populations being considered natural hosts of LowGC plasmids, these plasmids were not found in three Acinetobacter isolates from the study farm. The isolates harbored tet(Y)-tet(R) genes in identical genetic surroundings as pFK2-7, however, suggesting genetic exchange between Acinetobacter and LowGC plasmids. Abundance of LowGC plasmids and tet(Y) was correlated in manure and soil samples from the farm, indicating that LowGC plasmids may be involved in the spread of tet(Y) in the environment. © FEMS 2016. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.

July 7, 2019

Genomic insights into a sustained national outbreak of Yersinia pseudotuberculosis.

In 2014, a sustained outbreak of yersiniosis due to Yersinia pseudotuberculosis occurred across all major cities in New Zealand (NZ), with a total of 220 laboratory-confirmed cases, representing one of the largest ever reported outbreaks of Y. pseudotuberculosis. Here, we performed whole genome sequencing of outbreak-associated isolates to produce the largest population analysis to date of Y. pseudotuberculosis, giving us unprecedented capacity to understand the emergence and evolution of the outbreak clone. Multivariate analysis incorporating our genomic and clinical epidemiological data strongly suggested a single point-source contamination of the food chain, with subsequent nationwide distribution of contaminated produce. We additionally uncovered significant diversity in key determinants of virulence, which we speculate may help explain the high morbidity linked to this outbreak.

July 7, 2019

Systems biology-guided biodesign of consolidated lignin conversion

Lignin is the second most abundant biopolymer on Earth, yet its utilization for fungible products is complicated by its recalcitrant nature and remains a major challenge for sustainable lignocellulosic biorefineries. In this study, we used a systems biology approach to reveal the carbon utilization pattern and lignin degradation mechanisms in a unique lignin-utilizing Pseudomonas putida strain (A514). The mechanistic study further guided the design of three functional modules to enable a consolidated lignin bioconversion route. First, P. putida A514 mobilized a dye peroxidase-based enzymatic system for lignin depolymerization. This system could be enhanced by overexpressing a secreted multifunctional dye peroxidase to promote a two-fold enhancement of cell growth on insoluble kraft lignin. Second, A514 employed a variety of peripheral and central catabolism pathways to metabolize aromatic compounds, which can be optimized by overexpressing key enzymes. Third, the β-oxidation of fatty acid was up-regulated, whereas fatty acid synthesis was down-regulated when A514 was grown on lignin and vanillic acid. Therefore, the functional module for polyhydroxyalkanoate (PHA) production was designed to rechannel β-oxidation products. As a result, PHA content reached 73% per cell dry weight (CDW). Further integrating the three functional modules enhanced the production of PHA from kraft lignin and biorefinery waste. Thus, this study elucidated lignin conversion mechanisms in bacteria with potential industrial implications and laid out the concept for engineering a consolidated lignin conversion route.

July 7, 2019

svclassify: a method to establish benchmark structural variant calls.

The human genome contains variants ranging in size from small single nucleotide polymorphisms (SNPs) to large structural variants (SVs). High-quality benchmark small variant calls for the pilot National Institute of Standards and Technology (NIST) Reference Material (NA12878) have been developed by the Genome in a Bottle Consortium, but no similar high-quality benchmark SV calls exist for this genome. Since SV callers output highly discordant results, we developed methods to combine multiple forms of evidence from multiple sequencing technologies to classify candidate SVs into likely true or false positives. Our method (svclassify) calculates annotations from one or more aligned BAM files from many high-throughput sequencing technologies, and then builds a one-class model using these annotations to classify candidate SVs as likely true or false positives. We first used pedigree analysis to develop a set of high-confidence breakpoint-resolved large deletions. We then used svclassify to cluster and classify these deletions as well as a set of high-confidence deletions from the 1000 Genomes Project and a set of breakpoint-resolved complex insertions from Spiral Genetics. We find that likely SVs cluster separately from likely non-SVs based on our annotations, and that the SVs cluster into different types of deletions. We then developed a supervised one-class classification method that uses a training set of random non-SV regions to determine whether candidate SVs have abnormal annotations different from most of the genome.
To test this classification method, we use our pedigree-based breakpoint-resolved SVs, SVs validated by the 1000 Genomes Project, and assembly-based breakpoint-resolved insertions, along with semi-automated visualization using svviz. We find that candidate SVs with high scores from multiple technologies have high concordance with PCR validation and an orthogonal consensus method MetaSV (99.7% concordant), and candidate SVs with low scores are questionable. We distribute a set of 2676 high-confidence deletions and 68 high-confidence insertions with high svclassify scores from these call sets for benchmarking SV callers. We expect these methods to be particularly useful for establishing high-confidence SV calls for benchmark samples that have been characterized by multiple technologies.
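
The one-class idea described above (train only on random non-SV regions, then ask whether a candidate's annotations look abnormal) can be sketched in a few lines of Python. This is a deliberately simplified stand-in, not the svclassify model: it assumes each region is a tuple of numeric annotations, fits a per-annotation mean and standard deviation on the background regions, and scores a candidate by its largest absolute z-score. The annotation names in the comments and the threshold of 3 are hypothetical.

```python
# Simplified stand-in only: NOT the svclassify model.
# Assumption: each region is a tuple of numeric annotations, for example
# (relative read depth, fraction of soft-clipped reads); names are hypothetical.
from statistics import mean, stdev


def fit_one_class(background: list[tuple[float, ...]]) -> list[tuple[float, float]]:
    """Fit per-annotation (mean, standard deviation) on random non-SV regions."""
    columns = list(zip(*background))
    return [(mean(col), stdev(col)) for col in columns]


def abnormality_score(model: list[tuple[float, float]], region: tuple[float, ...]) -> float:
    """Largest absolute z-score of the region's annotations under the model."""
    return max(
        abs(value - mu) / max(sigma, 1e-9)  # guard against zero spread
        for value, (mu, sigma) in zip(region, model)
    )


if __name__ == "__main__":
    # Hypothetical annotations for random non-SV regions.
    non_sv_regions = [
        (1.00, 0.02),
        (0.90, 0.01),
        (1.10, 0.03),
        (1.05, 0.02),
        (0.95, 0.02),
    ]
    model = fit_one_class(non_sv_regions)

    candidate = (0.50, 0.30)  # halved depth, many soft clips
    threshold = 3.0           # hypothetical cut-off
    score = abnormality_score(model, candidate)
    print("likely SV" if score > threshold else "looks like background", round(score, 2))
```

The design point carried over from the text is that no true-positive SVs are needed for training: the model only learns what ordinary genomic regions look like, and candidates are judged by how far they depart from that.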
