July 7, 2019

De novo yeast genome assemblies from MinION, PacBio and MiSeq platforms.

Long-read sequencing technologies such as Pacific Biosciences and Oxford Nanopore MinION are capable of producing long sequencing reads with average fragment lengths of over 10,000 base pairs and maximum lengths reaching 100,000 base pairs. Compared with short reads, the assemblies obtained from long-read sequencing platforms have much higher contig continuity and genome completeness, as long fragments are able to extend paths into problematic or repetitive regions. Many successful assembly applications of the Pacific Biosciences technology have been reported, ranging from small bacterial genomes to large plant and animal genomes. Recently, genome assemblies using Oxford Nanopore MinION data have attracted much attention due to the portability and low cost of this novel sequencing instrument. In this paper, we re-sequenced a well-characterized genome, the Saccharomyces cerevisiae S288C strain, using three different platforms: MinION, PacBio and MiSeq. We present a comprehensive metric comparison of assemblies generated by various pipelines and discuss how the platform-associated data characteristics affect the assembly quality. With a given read depth of 31X, the assemblies from both Pacific Biosciences and Oxford Nanopore MinION show excellent continuity and completeness for the 16 nuclear chromosomes, but not for the mitochondrial genome, whose reconstruction still represents a significant challenge.
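As an illustration of the two quantities this abstract leans on, contig continuity and read depth, the following minimal Python sketch computes an N50 value and an average depth. It is not the authors' pipeline: the contig lengths are hypothetical, and the genome size of about 12.1 Mb for S. cerevisiae S288C is an approximate figure assumed here, not stated in the abstract.

```python
def n50(contig_lengths):
    """Return the N50: the contig length at which the running total of
    contigs, sorted longest first, reaches half of the assembly size."""
    total = sum(contig_lengths)
    running = 0
    for length in sorted(contig_lengths, reverse=True):
        running += length
        if running * 2 >= total:
            return length
    return 0  # empty assembly


def read_depth(total_read_bases, genome_size):
    """Average depth of coverage: sequenced bases divided by genome size."""
    return total_read_bases / genome_size


# Hypothetical contig lengths, for illustration only.
contigs = [900_000, 700_000, 400_000, 100_000, 50_000]
print(n50(contigs))

# Assumed genome size of roughly 12.1 Mb (approximate, not from the abstract).
print(read_depth(375_100_000, 12_100_000))
```

A higher N50 for the same total assembly size means fewer, longer contigs, which is the sense in which long-read assemblies are described as more continuous.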


July 7, 2019

Genome stability in engineered strains of the extremely thermophilic lignocellulose-degrading bacterium Caldicellulosiruptor bescii.

Caldicellulosiruptor bescii is the most thermophilic cellulose degrader known and is of great interest because of its ability to degrade nonpretreated plant biomass. For biotechnological applications, an efficient genetic system is required to engineer it to convert plant biomass into desired products. To date, two different genetically tractable lineages of C. bescii strains have been generated. The first (JWCB005) is based on a random deletion within the pyrimidine biosynthesis genes pyrFA, and the second (MACB1018) is based on the targeted deletion of pyrE, making use of a kanamycin resistance marker. Importantly, an active insertion element, ISCbe4, was discovered in C. bescii when it disrupted the gene for lactate dehydrogenase (ldh) in strain JWCB018, constructed in the JWCB005 background. Additional instances of ISCbe4 movement in other strains of this lineage are presented herein. These observations raise concerns about the genetic stability of such strains and their use as metabolic engineering platforms. In order to investigate genome stability in engineered strains of C. bescii from the two lineages, genome sequencing and Southern blot analyses were performed. The evidence presented shows a dramatic increase in the number of single nucleotide polymorphisms, insertions/deletions, and ISCbe4 elements within the genome of JWCB005, leading to massive genome rearrangements in its daughter strain, JWCB018. Such dramatic effects were not evident in the newer MACB1018 lineage, indicating that JWCB005 and its daughter strains are not suitable for metabolic engineering purposes in C. bescii. Furthermore, a facile approach for assessing genomic stability in C. bescii has been established. IMPORTANCE Caldicellulosiruptor bescii is a cellulolytic extremely thermophilic bacterium of great interest for metabolic engineering efforts geared toward lignocellulosic biofuel and bio-based chemical production. Genetic technology in C. bescii has led to the development of two uracil auxotrophic genetic background strains for metabolic engineering. We show that strains derived from the genetic background containing a random deletion in uracil biosynthesis genes (pyrFA) have a dramatic increase in the number of single nucleotide polymorphisms, insertions/deletions, and ISCbe4 insertion elements in their genomes compared to the wild type. At least one daughter strain of this lineage also contains large-scale genome rearrangements that are flanked by these ISCbe4 elements. In contrast, strains derived from the second background strain, which was developed using a targeted deletion of the uracil biosynthetic gene pyrE, have a stable genome structure, making them preferable for future metabolic engineering studies. Copyright © 2017 American Society for Microbiology.


July 7, 2019

Complete genome sequence of the cellulose-producing strain Komagataeibacter nataicola RZS01.

Komagataeibacter nataicola is an acetic acid bacterium (AAB) that can produce abundant bacterial cellulose and tolerate high concentrations of acetic acid. To globally understand its fermentation characteristics, we present a high-quality complete genome sequence of K. nataicola RZS01. The genome consists of a 3,485,191-bp chromosome and 6 plasmids, which encode 3,514 proteins and bear three cellulose synthase operons. Phylogenetic analysis at the genome level provides convincing evidence of the evolutionary position of K. nataicola with respect to related taxa. Genomic comparisons revealed that RZS01 shares 36.1% to 75.1% sequence similarity with other AAB. The sequence data were also used for metabolic analysis of biotechnological substrates. Analysis of the resistance to acetic acid at the genomic level indicated a synergistic mechanism responsible for acetic acid tolerance. The genomic data provide a viable platform that can be used to understand and manipulate the phenotype of K. nataicola RZS01 to further improve bacterial cellulose production.


July 7, 2019

Complete genome sequence of Stenotrophomonas sp. KCTC 12332, a biotechnological potential bacterium.

Hydroxy fatty acids are used in various industries due to their availability, and in particular, Stenotrophomonas sp. has been regarded as a potential candidate for biotechnological applications, including biotransformations that hydrate unsaturated fatty acids into their derivatives. Here we complete the genome sequence of Stenotrophomonas sp. KCTC 12332, which has a size of 4,541,594 bp (G+C content of 63.83%) with 3790 coding DNA sequences (CDSs), 67 tRNAs and 3 rRNA operons. The genome contains a gene encoding oleate hydratase, which can convert oleic acid into 10-hydroxyoctadecanoic acid. Copyright © 2017 Elsevier B.V. All rights reserved.
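The G+C content reported for genomes such as this one is the percentage of bases that are G or C. The following is a minimal Python sketch of that calculation, assuming the sequence is available as a plain string. The function name and the choice to ignore ambiguous bases such as N are assumptions made for illustration, not a description of the tool the authors used.

```python
def gc_content(sequence):
    """Return G+C content as a percentage of unambiguous bases (A, C, G, T).

    Ambiguous symbols such as N are excluded from both numerator and
    denominator; this is one common convention, assumed here.
    """
    counted = [base for base in sequence.upper() if base in "ACGT"]
    if not counted:
        return 0.0
    gc = sum(1 for base in counted if base in "GC")
    return 100.0 * gc / len(counted)


# Toy sequence, for illustration only.
print(gc_content("ATGC"))
```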


July 7, 2019

2015 epidemic of severe Streptococcus agalactiae sequence type 283 infections in Singapore associated with the consumption of raw freshwater fish: a detailed analysis of clinical, epidemiological, and bacterial sequencing data.

Streptococcus agalactiae (group B Streptococcus [GBS]) has not been described as a foodborne pathogen. However, in 2015, a large outbreak of severe invasive sequence type (ST) 283 GBS infections in adults epidemiologically linked to the consumption of raw freshwater fish occurred in Singapore. We attempted to determine the scale of the outbreak, define the clinical spectrum of disease, and link the outbreak to contaminated fish. Time-series analysis was performed on microbiology laboratory data. Food handlers and fishmongers were screened for enteric carriage of GBS. A retrospective cohort study was conducted to assess differences in demographic and clinical characteristics of patients with invasive ST283 and non-ST283 infections. Whole-genome sequencing was performed on human and fish ST283 isolates from Singapore, Thailand, and Hong Kong. The outbreak was estimated to have started in late January 2015. Within the study cohort of 408 patients, ST283 accounted for 35.8% of cases. Patients with ST283 infection were younger and had fewer comorbidities but were more likely to develop meningoencephalitis, septic arthritis, and spinal infection. Of 82 food handlers and fishmongers screened, none carried ST283. Culture of 43 fish samples yielded 13 ST283-positive samples. Phylogenomic analysis of 161 ST283 isolates from humans and fish revealed they formed a tight clade distinguished by 93 single-nucleotide polymorphisms. ST283 is a zoonotic GBS clone associated with farmed freshwater fish, capable of causing severe disease in humans. It caused a large foodborne outbreak in Singapore and poses both a regional and potentially more widespread threat.


July 7, 2019

Complete genome sequence of Microbulbifer sp. CCB-MM1, a halophile isolated from Matang Mangrove Forest, Malaysia.

Microbulbifer sp. CCB-MM1 is a halophile isolated from estuarine sediment of Matang Mangrove Forest, Malaysia. Based on 16S rRNA gene sequence analysis, strain CCB-MM1 is a potentially new species of genus Microbulbifer. Here we describe its features and present its complete genome sequence with annotation. The genome sequence is 3.86 Mb in size with GC content of 58.85%, harbouring 3313 protein coding genes and 92 RNA genes. A total of 71 genes associated with carbohydrate active enzymes were found using dbCAN. Ectoine biosynthetic genes, ectABC operon and ask_ect were detected using antiSMASH 3.0. Cell shape determination genes, mreBCD operon, rodA and rodZ were annotated, congruent with the rod-coccus cell cycle of the strain CCB-MM1. In addition, putative mreBCD operon regulatory gene, bolA was detected, which might be associated with the regulation of rod-coccus cell cycle observed from the strain.


July 7, 2019

Genome assembly of Chryseobacterium sp. strain IHBB 10212 from glacier top-surface soil in the Indian trans-Himalayas with potential for hydrolytic enzymes

Cold-active esterases are gaining importance because their catalytic activities find applications in the chemical industry, in food processing and the detergent industry as additives, and in the organic synthesis of unstable compounds as catalysts. In the present study, the complete genome sequence of 4,843,645 bp, with an average G + C content of 34.08% and 4260 protein-coding genes, is reported for a novel low temperature-active esterase-producing strain of Chryseobacterium isolated from the top-surface soil of a glacier in the cold deserts of the Indian trans-Himalayas. The genome contained two plasmids of 16,553 and 11,450 bp with 40.54 and 40.37% G + C contents, respectively. Several genes encoding enzymes that hydrolyze the ester linkages of triglycerides into fatty acids and glycerol were predicted in the genome. The annotation also predicted genes encoding proteases, lipases, amylases, β-glucosidases, endoglucanases and xylanases involved in biotechnological processes. The complete genome sequence of Chryseobacterium sp. strain IHBB 10212 and two plasmids have been deposited under accession numbers CP015199, CP015200 and CP015201 at DDBJ/EMBL/GenBank.


July 7, 2019

De novo whole-genome sequencing of the wood rot fungus Polyporus brumalis, which exhibits potential terpenoid metabolism.

Polyporus brumalis is able to synthesize several sesquiterpenes during fungal growth. Using a single-molecule real-time sequencing platform, we present the 53-Mb draft genome of P. brumalis, which contains 6,231 protein-coding genes. Gene annotation and isolation support genetic information, which can increase the understanding of sesquiterpene metabolism in P. brumalis. Copyright © 2017 Lee et al.


July 7, 2019

Rapid and consistent evolution of colistin resistance in XDR Pseudomonas aeruginosa during morbidostat culture.

Colistin is a last-resort antibiotic commonly used against multidrug-resistant strains of Pseudomonas aeruginosa. To investigate the potential for in-situ evolution of resistance against colistin and to map the molecular targets of colistin resistance, we exposed two P. aeruginosa isolates to colistin using a continuous culture device known as morbidostat. As a result, colistin resistance reproducibly increased 10-fold within ten days, and 100-fold within 20 days, along with highly stereotypic, yet strain-specific mutation patterns. The majority of mutations hit the pmrAB two-component signaling system and genes involved in lipopolysaccharide (LPS) synthesis, including lpxC, pmrE, and migA. We tracked the frequencies of all arising mutations by whole genome deep sequencing every 3-4 days to provide a detailed picture of the dynamics of resistance evolution, including competition and displacement among multiple resistant sub-populations. In seven out of 18 cultures, we observed mutations in mutS along with a mutator phenotype that seemed to facilitate resistance evolution. Copyright © 2017 American Society for Microbiology.


July 7, 2019

Comparative genomic and phylogenetic analysis of a toxigenic clinical isolate of Corynebacterium diphtheriae strain B-D-16-78 from Malaysia.

In this study, we report the comparative genomics and phylogenetic analysis of Corynebacterium diphtheriae strain B-D-16-78 that was isolated from a clinical specimen in 2016. The complete genome of C. diphtheriae strain B-D-16-78 was sequenced using PacBio Single Molecule, Real-Time sequencing technology and consists of a 2,474,151-bp circular chromosome with an average GC content of 53.56%. The core genome of C. diphtheriae was also deduced from a total of 74 strains with complete or draft genome sequences, and the core genome-based phylogenetic analysis revealed a close genetic relationship among strains that shared the same MLST allelic profile. In the context of the CRISPR-Cas system, which confers adaptive immunity against re-invading DNA, 73 out of 86 spacer sequences were found to be unique to Malaysian strains, which harboured only type-II-C and/or type-I-E-a systems. A total of 48 tox genes, which code for the diphtheria toxin, were retrieved from the 74 genomes and, with the exception of one truncated gene, only nucleotide substitutions were detected when compared to the tox gene sequence of PW8. More than half were synonymous substitutions and only two were nonsynonymous substitutions, whereby H24Y was predicted to have a damaging effect on the protein function whilst T262V was predicted to be tolerated. Both toxigenic and non-toxigenic toxin-gene bearing strains have been isolated in Malaysia, but the repeated isolation of toxigenic strains with the same MLST profile suggests that some of these strains may be circulating in the population. Hence, efforts to increase herd immunity should be continued and supported by an effective monitoring and surveillance system to track, manage and control outbreaks. Copyright © 2017 Elsevier B.V. All rights reserved.
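The idea of a core genome deduced from many strains can be sketched as a set intersection: the gene families present in every genome. The Python example below is a simplified illustration only. The strain labels and gene sets are hypothetical, and real analyses first cluster genes into orthologous families, a step the abstract does not describe and that is skipped here.

```python
def core_genome(gene_sets_by_strain):
    """Return the gene families shared by every strain.

    gene_sets_by_strain maps a strain label to an iterable of gene-family
    identifiers; orthology clustering is assumed to have been done already.
    """
    sets = [set(genes) for genes in gene_sets_by_strain.values()]
    if not sets:
        return set()
    return set.intersection(*sets)


# Hypothetical strains and gene families, for illustration only.
strains = {
    "strain_A": {"dnaA", "rpoB", "tox"},
    "strain_B": {"dnaA", "rpoB"},
    "strain_C": {"dnaA", "rpoB", "tox"},
}
print(sorted(core_genome(strains)))
```

In this toy input, tox is absent from one strain and so falls outside the core set, consistent with the abstract's observation that only 48 tox genes were retrieved from the 74 genomes.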


July 7, 2019

Comparative genomic analysis of Acinetobacter strains isolated from murine colonic crypts.

A restricted set of aerobic bacteria dominated by the Acinetobacter genus was identified in murine intestinal colonic crypts. The vicinity of such bacteria with intestinal stem cells could indicate that they protect the crypt against cytotoxic and genotoxic signals. Genome analyses of these bacteria were performed to better appreciate their biodegradative capacities. Two taxonomically different clusters of Acinetobacter were isolated from murine proximal colonic crypts; one was identified as A. modestus and the other as A. radioresistens. Their identification was performed through biochemical parameters and housekeeping gene sequencing. After selection of one strain of each cluster (A. modestus CM11G and A. radioresistens CM38.2), comparative genomic analysis was performed on whole-genome sequencing data. The antibiotic resistance pattern of these two strains is different, in line with the many genes involved in resistance to heavy metals identified in both genomes. Moreover, whereas the operon benABCDE involved in benzoate metabolism is encoded by the two genomes, the operon antABC encoding the anthranilate dioxygenase and the phenol hydroxylase gene cluster are absent in the A. modestus genomic sequence, indicating that the two strains have different capacities to metabolize xenobiotics. A common feature of the two strains is the presence of a type IV pili system, and the presence of genes encoding proteins pertaining to secretion systems such as Type I and Type II secretion systems. Our comparative genomic analysis revealed that different Acinetobacter isolated from the same biological niche, even if they share a large majority of genes, possess unique features that could play a specific role in the protection of the intestinal crypt.


July 7, 2019

Adaptive evolution of a hyperthermophilic archaeon pinpoints a formate transporter as a critical factor for the growth enhancement on formate.

Previously, we reported that the hyperthermophilic archaeon Thermococcus onnurineus NA1 could grow on formate and produce H2. Formate conversion to hydrogen was mediated by a formate-hydrogen lyase complex and was indeed a part of chemiosmotic coupling to ATP generation. In this study, we employed an adaptation approach to enhance cell growth on formate and investigated the molecular changes. As serial transfer continued on formate-containing medium in serum vials, cell growth, H2 production and formate consumption increased remarkably. The strain transferred 156 times, WTF-156T, was demonstrated to have enhanced H2 production from formate in a bioreactor. Whole-genome sequencing of the WTF-156T strain revealed eleven mutations. While no mutation was found among the genes encoding formate hydrogen lyase, a point mutation (G154A) was identified in a formate transporter (TON_1573). The TON_1573 (A52T) mutation, when introduced into the parent strain, conferred an increase in formate consumption and H2 production. Another adaptive passage, carried out by culturing repeatedly in a bioreactor, resulted in a strain with a mutation in TON_1573 (C155A) causing the amino acid change A52E. These results imply that substitution of the A52 residue of a formate transporter might be a critical factor in ensuring the increase in formate uptake and cell growth.
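The link between the nucleotide changes (G154A, C155A) and the amino acid changes (A52T, A52E) follows from codon arithmetic: positions 154 and 155 are the first and second bases of codon 52. The Python sketch below works this through with the standard genetic code. The wild-type codon is not given in the abstract, so GCA is an assumption here (GCG would give the same results), and the function name is hypothetical.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO)
}


def describe_substitution(cds_position, new_base, wild_type_codon):
    """Translate a single-base change at a 1-based coding-sequence position
    into protein notation such as 'A52T'."""
    residue = (cds_position - 1) // 3 + 1   # 1-based codon number
    offset = (cds_position - 1) % 3         # 0-based position within the codon
    mutated = (
        wild_type_codon[:offset] + new_base + wild_type_codon[offset + 1:]
    )
    return f"{CODON_TABLE[wild_type_codon]}{residue}{CODON_TABLE[mutated]}"


# Assumed wild-type codon GCA (alanine); the abstract does not state it.
print(describe_substitution(154, "A", "GCA"))  # A52T
print(describe_substitution(155, "A", "GCA"))  # A52E
```

Under the standard code, only the alanine codons GCA and GCG yield glutamate when the second base becomes A, which is why one of those two is assumed.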


July 7, 2019

Whole genome sequence of the heterozygous clinical isolate Candida krusei 81-B-5.

Candida krusei is a diploid, heterozygous yeast that is an opportunistic fungal pathogen in immunocompromised patients. This species also is utilized for fermenting cocoa beans during chocolate production. One major concern in the clinical setting is the innate resistance of this species to the most commonly used antifungal drug fluconazole. Here we report a high-quality genome sequence and assembly for the first clinical isolate of C. krusei, strain 81-B-5, into 11 scaffolds generated with PacBio sequencing technology. Gene annotation and comparative analysis revealed a unique profile of transporters that could play a role in drug resistance or adaptation to different environments. In addition, we show that while 82% of the genome is highly heterozygous, a 2.0 Mb region of the largest scaffold has undergone loss of heterozygosity. This genome will serve as a reference for further genetic studies of this pathogen. Copyright © 2017 Author et al.


July 7, 2019

ConcatSeq: A method for increasing throughput of single molecule sequencing by concatenating short DNA fragments.

Single molecule sequencing (SMS) platforms enable base sequences to be read directly from individual strands of DNA in real-time. Though capable of long read lengths, SMS platforms currently suffer from low throughput compared to competing short-read sequencing technologies. Here, we present a novel strategy for sequencing library preparation, dubbed ConcatSeq, which increases the throughput of SMS platforms by generating long concatenated templates from pools of short DNA molecules. We demonstrate adaptation of this technique to two target enrichment workflows, commonly used for oncology applications, and feasibility using PacBio single molecule real-time (SMRT) technology. Our approach is capable of increasing the sequencing throughput of the PacBio RSII platform by more than five-fold, while maintaining the ability to correctly call allele frequencies of known single nucleotide variants. ConcatSeq provides a versatile new sample preparation tool for long-read sequencing technologies.


July 7, 2019

Discovery and genotyping of novel sequence insertions in many sequenced individuals

Motivation: Despite recent advances in algorithms design to characterize structural variation using high-throughput short read sequencing (HTS) data, characterization of novel sequence insertions longer than the average read length remains a challenging task. This is mainly due to both computational difficulties and the complexities imposed by genomic repeats in generating reliable assemblies to accurately detect both the sequence content and the exact location of such insertions. Additionally, de novo genome assembly algorithms typically require a very high depth of coverage, which may be a limiting factor for most genome studies. Therefore, characterization of novel sequence insertions is not a routine part of most sequencing projects. There are only a handful of algorithms that are specifically developed for novel sequence insertion discovery that can bypass the need for the whole genome de novo assembly. Still, most such algorithms rely on high depth of coverage, and to our knowledge there is only one method (PopIns) that can use multi-sample data to “collectively” obtain a very high coverage dataset to accurately find insertions common in a given population. Result: Here, we present Pamir, a new algorithm to efficiently and accurately discover and genotype novel sequence insertions using either single or multiple genome sequencing datasets. Pamir is able to detect breakpoint locations of the insertions and calculate their zygosity (i.e. heterozygous versus homozygous) by analyzing multiple sequence signatures, matching one-end-anchored sequences to small-scale de novo assemblies of unmapped reads, and conducting strand-aware local assembly. We test the efficacy of Pamir on both simulated and real data, and demonstrate its potential use in accurate and routine identification of novel sequence insertions in genome projects. Availability and implementation: Pamir is available at https://github.com/vpc-ccg/pamir. 
Contact: fhach@sfu.ca, prostatecentre.com or calkan@cs.bilkent.edu.tr Supplementary information: Supplementary data are available at Bioinformatics online.

