September 22, 2019  |  

The standard operating procedure of the DOE-JGI Microbial Genome Annotation Pipeline (MGAP v.4).

The DOE-JGI Microbial Genome Annotation Pipeline performs structural and functional annotation of microbial genomes that are further included into the Integrated Microbial Genome comparative analysis system. MGAP is applied to assembled nucleotide sequence datasets that are provided via the IMG submission site. Dataset submission for annotation first requires project and associated metadata description in GOLD. The MGAP sequence data processing consists of feature prediction including identification of protein-coding genes, non-coding RNAs and regulatory RNA features, as well as CRISPR elements. Structural annotation is followed by assignment of protein product names and functions.


September 22, 2019  |  

Evaluating approaches to find exon chains based on long reads.

Transcript prediction can be modeled as a graph problem where exons are modeled as nodes and reads spanning two or more exons are modeled as exon chains. Pacific Biosciences third-generation sequencing technology produces significantly longer reads than earlier second-generation sequencing technologies, which gives valuable information about longer exon chains in a graph. However, with the high error rates of third-generation sequencing, aligning long reads correctly around the splice sites is a challenging task. Incorrect alignments lead to spurious nodes and arcs in the graph, which in turn lead to incorrect transcript predictions. We survey several approaches to find the exon chains corresponding to long reads in a splicing graph, and experimentally study the performance of these methods using simulated data to allow for sensitivity/precision analysis. Our experiments show that short reads from second-generation sequencing can be used to significantly improve exon chain correctness either by error-correcting the long reads before splicing graph creation, or by using them to create a splicing graph on which the long-read alignments are then projected. We also study the memory and time consumption of various modules, and show that accurate exon chains lead to significantly increased transcript prediction accuracy. The simulated data and in-house scripts used for this article are available at http://www.cs.helsinki.fi/group/gsa/exon-chains/exon-chains-bib.tar.bz2.
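
The idea of projecting a long-read alignment onto a splicing graph can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `exon_chain`, the half-open interval representation, and the `min_overlap` parameter are assumptions made for illustration. Exons are the nodes, and the ordered list of exons overlapped by a read's aligned blocks is taken as its exon chain.

```python
from typing import List, Tuple

Interval = Tuple[int, int]  # half-open genomic interval [start, end)


def exon_chain(exons: List[Interval],
               blocks: List[Interval],
               min_overlap: int = 1) -> List[int]:
    """Return the indices of the exons overlapped by a read's aligned blocks.

    `exons` are the nodes of a (hypothetical) splicing graph, sorted by
    position and non-overlapping. `blocks` are the aligned segments of one
    long read, also sorted by position.
    """
    chain: List[int] = []
    for b_start, b_end in blocks:
        for idx, (e_start, e_end) in enumerate(exons):
            # Number of bases shared by the block and the exon.
            overlap = min(b_end, e_end) - max(b_start, e_start)
            # Record the exon once, even if several blocks hit it in a row.
            if overlap >= min_overlap and (not chain or chain[-1] != idx):
                chain.append(idx)
    return chain


exons = [(100, 200), (300, 400), (500, 600)]

# A read covering all three exons.
full_read = [(150, 200), (300, 400), (500, 540)]
# A read that skips the middle exon.
skipping_read = [(120, 200), (500, 580)]

print(exon_chain(exons, full_read))
print(exon_chain(exons, skipping_read))
```

Raising `min_overlap` is one simple way to ignore a few misaligned bases near a splice site, which is the kind of spurious node the abstract warns about; the surveyed methods are considerably more involved.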


September 22, 2019  |  

Subaerial biofilms on granitic historic buildings: microbial diversity and development of phototrophic multi-species cultures.

Microbial communities of natural subaerial biofilms developed on granitic historic buildings of a World Heritage Site (Santiago de Compostela, NW Spain) were characterized and cultured in liquid BG11 medium. Environmental barcoding through next-generation sequencing (Pacific Biosciences) revealed that the biofilms were mainly composed of species of Chlorophyta (green algae) and Ascomycota (fungi) commonly associated with rock substrata. Richness and diversity were higher for the fungal than for the algal assemblages and fungi showed higher heterogeneity among samples. Cultures derived from natural biofilms showed the establishment of stable microbial communities mainly composed of Chlorophyta and Cyanobacteria. Although most taxa found in these cultures were not common in the original biofilms, they are likely common pioneer colonizers of building stone surfaces, including granite. Stable phototrophic multi-species cultures of known microbial diversity were thus obtained and their reliability to emulate natural colonization on granite should be confirmed in further experiments.


September 22, 2019  |  

SMRT-Cappable-seq reveals complex operon variants in bacteria.

Current methods for genome-wide analysis of gene expression require fragmentation of original transcripts into small fragments for short-read sequencing. In bacteria, the resulting fragmented information hides operon complexity. Additionally, in vivo processing of transcripts confounds the accurate identification of the 5′ and 3′ ends of operons. Here we develop a methodology called SMRT-Cappable-seq that combines the isolation of un-fragmented primary transcripts with single-molecule long read sequencing. Applied to E. coli, this technology results in an accurate definition of the transcriptome with 34% of known operons from RegulonDB being extended by at least one gene. Furthermore, 40% of transcription termination sites have read-through that alters the gene content of the operons. As a result, most of the bacterial genes are present in multiple operon variants reminiscent of eukaryotic splicing. By providing such granularity in the operon structure, this study represents an important resource for the study of prokaryotic gene network and regulation.


September 22, 2019  |  

Survey of Ixodes pacificus ticks in California reveals a diversity of microorganisms and a novel and widespread Anaplasmataceae species.

Ixodes pacificus ticks can harbor a wide range of human and animal pathogens. To survey the prevalence of tick-borne known and putative pathogens, we tested 982 individual adult and nymphal I. pacificus ticks collected throughout California between 2007 and 2009 using a broad-range PCR and electrospray ionization mass spectrometry (PCR/ESI-MS) assay designed to detect a wide range of tick-borne microorganisms. Overall, 1.4% of the ticks were found to be infected with Borrelia burgdorferi, 2.0% were infected with Borrelia miyamotoi and 0.3% were infected with Anaplasma phagocytophilum. In addition, 3.0% were infected with Babesia odocoilei. About 1.2% of the ticks were co-infected with more than one pathogen or putative pathogen. In addition, we identified a novel Anaplasmataceae species that we characterized by sequencing of its 16S rRNA, groEL, gltA, and rpoB genes. Sequence analysis indicated that this organism is phylogenetically distinct from known Anaplasma species with its closest genetic near neighbors coming from Asia. The prevalence of this novel Anaplasmataceae species was as high as 21% at one site, and it was detected in 4.9% of ticks tested statewide. Based upon this genetic characterization we propose that this organism be called ‘Candidatus Cryptoplasma californiense’. Knowledge of this novel microbe will provide awareness for the community about the breadth of the I. pacificus microbiome, the concept that this bacterium could be more widely spread, and an opportunity to explore whether this bacterium also contributes to human or animal disease burden.


September 22, 2019  |  

Complete genome sequence of multidrug-resistant Staphylococcus cohnii ssp. urealyticus strain SNUDS-2 isolated from farmed duck, Republic of Korea.

Staphylococcus cohnii has become increasingly recognized as a potential pathogen of clinically significant nosocomial and farm animal infections. This study was designed to determine the genome of a multidrug-resistant S. cohnii subsp. urealyticus strain SNUDS-2 isolated from a farmed duck in Korea. Genomic DNA was sequenced using the PacBio RS II system. The complete genome was annotated, and antimicrobial resistance and virulence genes were identified. The annotated 2,625,703 bp genome contained various antimicrobial resistance genes conferring resistance to β-lactams, aminoglycosides, fluoroquinolones, phenicols and trimethoprim. Three virulence-associated synergistic hemolysins were identified in the strain. To the best of our knowledge, this is the first complete genome of S. cohnii, and it will provide important insights into the biodiversity of CoNS and valuable information for the control of this emerging pathogen. Copyright © 2017 International Society for Chemotherapy of Infection and Cancer. Published by Elsevier Ltd. All rights reserved.


September 22, 2019  |  

Daily HIV pre-exposure prophylaxis (PrEP) with tenofovir disoproxil fumarate-emtricitabine reduced Streptococcus and increased Erysipelotrichaceae in rectal microbiota.

Daily PrEP is highly effective at preventing HIV-1 acquisition, but risks of long-term tenofovir disoproxil fumarate plus emtricitabine (TDF-FTC) include renal decline and bone mineral density decrease in addition to initial gastrointestinal side effects. We investigated the impact of TDF-FTC on the enteric microbiome using rectal swabs collected from healthy MSM before PrEP initiation and after 48 to 72 weeks of adherent PrEP use. The V4 region of the 16S ribosomal RNA gene sequencing showed that Streptococcus was significantly reduced from 12.0% to 1.2% (p = 0.036) and Erysipelotrichaceae family was significantly increased from 0.79% to 3.3% (p = 0.028) after 48-72 weeks of daily PrEP. Catenibacterium mitsuokai, Holdemanella biformis and Turicibacter sanguinis were increased within the Erysipelotrichaceae family and Streptococcus agalactiae, Streptococcus oralis, Streptococcus mitis were reduced. These changes were not associated with host factors including PrEP duration, age, race, tenofovir diphosphate blood level, any drug use and drug abuse, suggesting that the observed microbiome shifts were likely induced by daily PrEP use. Long-term PrEP resulted in increases of Catenibacterium mitsuokai and Holdemanella biformis, which have been associated with gut microbiome dysbiosis. Our observations can aid in characterizing PrEP’s side effects, which is likely to improve PrEP adherence, and thus HIV-1 prevention.


September 22, 2019  |  

Long-read, Single Molecule, Real-Time (SMRT) DNA Sequencing for metagenomic applications

In this chapter, we describe applications of single molecule, real-time (SMRT) DNA sequencing toward metagenomic research. The long sequence reads, combined with a lack of bias with respect to DNA sequence context or GC content, facilitate a more comprehensive analysis of the genomic constitution of microbial communities. Full-length 16S rRNA gene sequencing at high (>99%) accuracy allows for species-level characterization of community members concomitant with the determination of community structure. The application of SMRT sequencing to whole-community shotgun microbial metagenomics is also discussed.


September 22, 2019  |  

Complete genome sequence of Petrimonas sp. strain IBARAKI, assembled from the metagenome data of a culture containing Dehalococcoides spp.

The complete genome sequence of Petrimonas sp. strain IBARAKI in a Dehalococcoides-containing culture was determined using the PacBio RS II platform. The genome is a single circular chromosome of 3,693,233 nucleotides (nt), with a GC content of 44%. This is the first genome sequence of a Petrimonas species. Copyright © 2018 Ikegami et al.


September 22, 2019  |  

Metataxonomics reveal vultures as a reservoir for Clostridium perfringens.

The Old World vulture may carry and spread pathogens for emerging infections since they feed on the carcasses of dead animals and participate in the sky burials of humans, some of whom have died from communicable diseases. Therefore, we studied the precise fecal microbiome of the Old World vulture with metataxonomics, integrating the high-throughput sequencing of almost full-length small subunit ribosomal RNA (16S rRNA) gene amplicons in tandem with the operational phylogenetic unit (OPU) analysis strategy. Nine vultures of three species were sampled using rectal swabs on the Qinghai-Tibet Plateau, China. Using the Pacific Biosciences sequencing platform, we obtained 54 135 high-quality reads of 16S rRNA amplicons with an average of 1442±6.9 bp in length and 6015±1058 reads per vulture. Those sequences were classified into 314 OPUs, including 102 known species, 50 yet to be described species and 161 unknown new lineages of uncultured representatives. Forty-five species have been reported to be responsible for human outbreaks or infections, and 23 yet to be described species belong to genera that include pathogenic species. Only six species were common to all vultures. Clostridium perfringens was the most abundant and present in all vultures, accounting for 30.8% of the total reads. Therefore, using the new technology, we found that vultures are an important reservoir for C. perfringens as evidenced by the isolation of 107 strains encoding for virulence genes, representing 45 sequence types. Our study suggests that the soil-related C. perfringens and other pathogens could have a reservoir in vultures and other animals.


September 22, 2019  |  

Single-cell (meta-)genomics of a dimorphic Candidatus Thiomargarita nelsonii reveals genomic plasticity.

The genus Thiomargarita includes the world’s largest bacteria. But as uncultured organisms, their physiology, metabolism, and basis for their gigantism are not well understood. Thus, a genomics approach, applied to a single Candidatus Thiomargarita nelsonii cell, was employed to explore the genetic potential of one of these enigmatic giant bacteria. The Thiomargarita cell was obtained from an assemblage of budding Ca. T. nelsonii attached to a provannid gastropod shell from Hydrate Ridge, a methane seep offshore of Oregon, USA. Here we present a manually curated genome of Bud S10 resulting from a hybrid assembly of long Pacific Biosciences and short Illumina sequencing reads. With respect to inorganic carbon fixation and sulfur oxidation pathways, the Ca. T. nelsonii Hydrate Ridge Bud S10 genome was similar to marine sister taxa within the family Beggiatoaceae. However, the Bud S10 genome contains genes suggestive of the genetic potential for lithotrophic growth on arsenite and perhaps hydrogen. The genome also revealed that Bud S10 likely respires nitrate via two pathways: a complete denitrification pathway and a dissimilatory nitrate reduction to ammonia pathway. Both pathways have been predicted, but not previously fully elucidated, in the genomes of other large, vacuolated, sulfur-oxidizing bacteria. Surprisingly, the genome also had a high number of unusual features for a bacterium, including the largest number of metacaspases and introns ever reported in a bacterium. Also present are a large number of other mobile genetic elements, such as insertion sequence (IS) transposable elements and miniature inverted-repeat transposable elements (MITEs). In some cases, mobile genetic elements disrupted key genes in metabolic pathways. For example, a MITE interrupts hupL, which encodes the large subunit of the hydrogenase in hydrogen oxidation. Moreover, we detected a group I intron in one of the most critical genes in the sulfur oxidation pathway, dsrA. The dsrA group I intron also carried a MITE sequence that, like the hupL MITE family, occurs broadly across the genome. The presence of a high degree of mobile elements in genes central to Thiomargarita’s core metabolism has not been previously reported in free-living bacteria and suggests a highly mutable genome.


September 22, 2019  |  

Automated broad range molecular detection of bacteria in clinical samples.

Molecular detection methods, such as quantitative PCR (qPCR), have found their way into clinical microbiology laboratories for the detection of an array of pathogens. Most routinely used methods, however, are directed at specific species. Thus, anything that is not explicitly searched for will be missed. This greatly limits the flexibility and universal application of these techniques. We investigated the application of a rapid universal bacterial molecular identification method, IS-pro, to routine patient samples received in a clinical microbiology laboratory. IS-pro is a eubacterial technique based on the detection and categorization of 16S-23S rRNA gene interspace regions with lengths that are specific for each microbial species. As this is an open technique, clinicians do not need to decide in advance what to look for. We compared routine culture to IS-pro using 66 samples sent in for routine bacterial diagnostic testing. The samples were obtained from patients with infections in normally sterile sites (without a resident microbiota). The results were identical in 20 (30%) samples, IS-pro detected more bacterial species than culture in 31 (47%) samples, and five of the 10 culture-negative samples were positive with IS-pro. The case histories of the five patients from whom these culture-negative/IS-pro-positive samples were obtained suggest that the IS-pro findings are highly clinically relevant. Our findings indicate that an open molecular approach, such as IS-pro, may have a high added value for clinical practice. Copyright © 2016, American Society for Microbiology. All Rights Reserved.
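
The percentages quoted in this abstract can be reproduced from the stated counts. The short check below is only an arithmetic illustration; the helper name `pct` is hypothetical, and all counts are copied from the abstract.

```python
def pct(count: int, total: int) -> int:
    """Percentage of `count` out of `total`, rounded to the nearest integer."""
    return round(100 * count / total)


total_samples = 66      # samples compared by culture and IS-pro
identical = 20          # identical results in both methods
ispro_more = 31         # IS-pro detected more species than culture
culture_negative = 10   # culture-negative samples
ispro_positive = 5      # of those, positive with IS-pro

print(pct(identical, total_samples))          # share with identical results
print(pct(ispro_more, total_samples))         # share where IS-pro found more
print(pct(ispro_positive, culture_negative))  # culture-negative but IS-pro-positive
```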


September 22, 2019  |  

Bacteroides dorei dominates gut microbiome prior to autoimmunity in Finnish children at high risk for type 1 diabetes.

The incidence of the autoimmune disease, type 1 diabetes (T1D), has increased dramatically over the last half century in many developed countries and is particularly high in Finland and other Nordic countries. Along with genetic predisposition, environmental factors are thought to play a critical role in this increase. As with other autoimmune diseases, the gut microbiome is thought to play a potential role in controlling progression to T1D in children with high genetic risk, but we know little about how the gut microbiome develops in children with high genetic risk for T1D. In this study, the early development of the gut microbiomes of 76 children at high genetic risk for T1D was determined using high-throughput 16S rRNA gene sequencing. Stool samples from children born in the same hospital in Turku, Finland were collected at monthly intervals beginning at 4-6 months after birth until 2.2 years of age. Of those 76 children, 29 seroconverted to T1D-related autoimmunity (cases), including 22 who later developed T1D; the remaining 47 subjects remained healthy (controls). While several significant compositional differences in low abundant species prior to seroconversion were found, one highly abundant group composed of two closely related species, Bacteroides dorei and Bacteroides vulgatus, was significantly higher in cases compared to controls prior to seroconversion. Metagenomic sequencing of samples high in the abundance of the B. dorei/vulgatus group before seroconversion, as well as longer 16S rRNA sequencing, identified this group as Bacteroides dorei. The abundance of B. dorei peaked at 7.6 months in cases, over 8 months prior to the appearance of the first islet autoantibody, suggesting that early changes in the microbiome may be useful for predicting T1D autoimmunity in genetically susceptible infants. The cause of increased B. dorei abundance in cases is not known but its timing appears to coincide with the introduction of solid food.


September 22, 2019  |  

Analyses of intestinal microbiota: culture versus sequencing.

Analyzing human as well as animal microbiota composition has gained growing interest because structural components and metabolites of microorganisms fundamentally influence all aspects of host physiology. Originally dominated by culture-dependent methods for exploring these ecosystems, the development of molecular techniques such as high throughput sequencing has dramatically increased our knowledge. Because many studies of the microbiota are based on the bacterial 16S ribosomal RNA (rRNA) gene targets, they can, at least in principle, be compared to determine the role of the microbiome composition for developmental processes, host metabolism, and physiology as well as different diseases. In our review, we will summarize differences and pitfalls in current experimental protocols, including all steps from nucleic acid extraction to bioinformatical analysis which may produce variation that outweighs subtle biological differences. Future developments, such as integration of metabolomic, transcriptomic, and metagenomic data sets and standardization of the procedures, will be discussed. © The Author 2015. Published by Oxford University Press on behalf of the Institute for Laboratory Animal Research. All rights reserved. For permissions, please email: journals.permissions@oup.com.


September 22, 2019  |  

LSCplus: a fast solution for improving long read accuracy by short read alignment.

The single molecule, real time (SMRT) sequencing technology of Pacific Biosciences enables the acquisition of transcripts from end to end due to its ability to produce extraordinarily long reads (>10 kb). This new method of transcriptome sequencing has been applied to several projects on humans and model organisms. However, the raw data from SMRT sequencing are of relatively low quality, with a random error rate of approximately 15%, for which error correction using next-generation sequencing (NGS) short reads is typically necessary. Few tools have been designed that apply a hybrid sequencing approach that combines NGS and SMRT data, and the most popular existing tool for error correction, LSC, has computing resource requirements that are too intensive for most laboratory and research groups. These shortcomings severely limit the application of SMRT long reads for transcriptome analysis. Here, we report an improved tool (LSCplus) for error correction with the LSC program as a reference. LSCplus overcomes the disadvantage of LSC’s time consumption and improves quality. Only 1/3-1/4 of the time and 1/20-1/25 of the error correction time is required using LSCplus compared with that required for using LSC. LSCplus is freely available at http://www.herbbol.org:8001/lscplus/. Sample calculations are provided illustrating the precision and efficiency of this method regarding error correction and isoform detection.

