Despite apparent carbon limitation, anoxic deep subsurface brines at the Soudan Underground Iron Mine harbor active microbial communities. To characterize these assemblages, we performed shotgun metagenomics of native and enriched samples. Following enrichment on poised electrodes and long read sequencing, we recovered from the metagenome the closed, circular genome of a novel Desulfuromonas sp. with remarkable genomic features that were not fully resolved by short read assembly alone. This organism was essentially absent in unenriched Soudan communities, indicating that electrodes are highly selective for putative metal reducers. Native community metagenomes suggest that carbon cycling is driven by methyl-C1 metabolism, in particular methylotrophic methanogenesis. Our results highlight the promising potential for long reads in metagenomic surveys of low-diversity environments.
Reference quality de novo genome assemblies were once solely the domain of large, well-funded genome projects. While next-generation short read technology removed some of the cost barriers, accurate chromosome-scale assembly remains a real challenge. Here we present efforts to de novo assemble the goat (Capra hircus) genome. Through the combination of single-molecule technologies from Pacific Biosciences (sequencing) and BioNano Genomics (optical mapping) coupled with high-throughput chromosome conformation capture sequencing (Hi-C), an inbred San Clemente goat genome has been sequenced and assembled to a high degree of completeness at a relatively modest cost. Starting with 38 million PacBio reads, we integrated the MinHash Alignment Process (MHAP) with the Celera Assembler (CA) to produce an assembly composed of 3110 contigs with a contig N50 size of 4.7 Mb. This assembly was scaffolded with BioNano genome maps derived from a single IrysChip into 333 scaffolds with an N50 of 23.1 Mb including the complete scaffolding of chromosome 20. Finally, cis-chromosome associations were determined by Hi-C, yielding complete reconstruction of all autosomes into single scaffolds with a final N50 of 91.7 Mb. We hope to demonstrate that our methods are not only cost effective, but improve our ability to annotate challenging genomic regions such as highly repetitive immune gene clusters.
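The contig and scaffold N50 values quoted above summarize assembly contiguity: N50 is the length of the shortest sequence in the smallest set of longest sequences that together cover at least half of the total assembly length. A minimal sketch of that calculation follows. The function name `n50` and the toy lengths are illustrative assumptions, not data or code from the goat assembly.

```python
def n50(lengths):
    """Return the N50 of a collection of contig or scaffold lengths."""
    if not lengths:
        raise ValueError("need at least one sequence length")
    ordered = sorted(lengths, reverse=True)  # longest first
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        # The first length at which the running sum reaches half
        # of the total assembly length is the N50.
        if running >= half_total:
            return length


# Hypothetical toy lengths, not taken from the goat assembly.
toy_contigs = [80, 70, 50, 40, 30, 20, 10]
print(n50(toy_contigs))
```

Because the statistic depends only on the length distribution, merging contigs into longer scaffolds raises N50 even when no new sequence is added, which is why the value climbs at each scaffolding stage described above.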
The complex immune regions of the genome, including MHC and KIR, contain large copy number variants (CNVs), a high density of genes, hyper-polymorphic gene alleles, and conserved extended haplotypes (CEHs) with enormous linkage disequilibrium (LD). This level of complexity, together with the inherent biases of short-read sequencing, makes it challenging to extract immune region haplotype information from reference-reliant shotgun sequencing and GWAS methods. As NGS-based genome and exome sequencing and SNP arrays have become routine for population studies, numerous efforts are being made to develop software that extracts and/or imputes immune gene information from these datasets. Despite these efforts, fine mapping of the causal variants in immune genes that underlie their well-documented associations with cancer, drug-induced hypersensitivity and immune-related diseases has been slower than expected. This has in many ways limited our understanding of the mechanisms leading to immune disease. In the present work, we demonstrate the advantages of long reads delivered by SMRT Sequencing for assembling complete haplotypes of MHC and KIR gene clusters, as well as for calling correct genotypes of the genes within them. All genotype information is detected at the allele level with full phasing information across SNP-poor regions. Genotypes were called correctly from targeted gene amplicons, haplotypes, as well as from a completely assembled 5 Mb contig of the MHC region from a de novo assembly of whole genome shotgun data. The de novo analysis pipeline used in all these approaches allowed for reference-free analysis without imputation, which is key for interrogation without prior knowledge of ethnic background. These methods are thus easily adoptable for previously uncharacterized human or non-human species.
The killer immunoglobulin-like receptor (KIR) genes belong to the immunoglobulin superfamily and are widely studied due to the critical role they play in coordinating the innate immune response to infection and disease. Highly accurate, contiguous, long reads, like those generated by SMRT Sequencing, when combined with target-enrichment protocols, provide a straightforward strategy for generating complete de novo assembled KIR haplotypes. We have explored two different methods to capture the KIR region: one using fosmid clones and one using Nimblegen capture.
Fecal samples were obtained from human subjects in the first blinded, placebo-controlled trial to evaluate the efficacy and safety of fecal microbiota transplant (FMT) for treatment of recurrent C. difficile infection. Samples included pre- and post-FMT, post-placebo transplant, and the donor control; samples were taken at 2 and 8 weeks post-FMT. Sequencing was done on the PacBio Sequel System, with the goal of obtaining high quality sequences covering whole genes or gene clusters, which will be used to better understand the relationship between the composition and functional capabilities of intestinal microbiomes and patient health. Methods: Samples were randomly sheared to 2-3 kb fragments, a sufficient length to cover most genes, and SMRTbell libraries were prepared using standard protocols. Libraries were run on the Sequel System, which has a throughput of hundreds of thousands of reads per SMRT Cell, adequate yield to sample the complex microbiomes of post-transplant and donor samples. Results: Here we characterize samples, describe library prep methods and detail Sequel System operation, including run conditions. Descriptive statistics of data output (primary analysis) are presented, along with SMRT Analysis reports on circular consensus sequence (CCS) reads generated using an updated algorithm (CCS2). Final sequencing yields are filtered at various levels of predicted accuracy from 90% to 99.9%. Previous studies done using the PacBio RS II System demonstrated the ability to profile at the species level, and in some cases the strain level, and provided functional insight. Conclusions: These results demonstrate that the Sequel System is well-suited for characterization of complex microbial communities, with the ability for high-throughput generation of extremely accurate single-molecule sequences, each several kilobases in length.
The entire process from shearing and library prep through sequencing and CCS analysis can be completed in less than 48 hours.
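The predicted-accuracy thresholds of 90% to 99.9% mentioned in the FMT abstract above correspond to Phred-scaled qualities of Q10 to Q30 under the standard relation Q = -10 log10(1 - accuracy). The sketch below illustrates that conversion and a simple threshold filter. The read identifiers, accuracy values, and function names are hypothetical, and this is not the CCS2 algorithm or any SMRT Analysis code.

```python
import math


def accuracy_to_phred(accuracy):
    """Convert a predicted accuracy in [0, 1) to a Phred-scaled quality."""
    if not 0.0 <= accuracy < 1.0:
        raise ValueError("accuracy must be in the range [0, 1)")
    return -10.0 * math.log10(1.0 - accuracy)


def filter_reads(reads, min_accuracy):
    """Keep (read_id, accuracy) pairs at or above the accuracy threshold."""
    return [(read_id, acc) for read_id, acc in reads if acc >= min_accuracy]


# Hypothetical reads with made-up predicted accuracies.
reads = [("read_a", 0.95), ("read_b", 0.992), ("read_c", 0.9995)]

for threshold in (0.90, 0.99, 0.999):
    kept = filter_reads(reads, threshold)
    print(f"{threshold:.3f}: Q{accuracy_to_phred(threshold):.0f}, {len(kept)} reads kept")
```

Raising the threshold trades yield for per-read accuracy, which is why yields are typically reported at several cut-offs rather than one.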
Understanding interactions among plants and the complex communities of organisms living on, in and around them requires more than one experimental approach. A new method for de novo metagenome assembly,…
Webinar: Bioinformatics lunch & learn – Better assemblies of bacterial genomes and plasmids with the new microbial assembly pipeline in SMRT Link v8.0
Microbial Assembly is our latest pipeline, specifically designed to assemble bacterial genomes (between 2 and 10 Mb) and plasmids. This pipeline includes the implementation of a new, circular-aware read alignment…
Analyses of the Complete Genome Sequence of the Strain Bacillus pumilus ZB201701 Isolated from Rhizosphere Soil of Maize under Drought and Salt Stress.
Bacillus pumilus ZB201701 is a rhizobacterium with the potential to promote plant growth and tolerance to drought and salinity stress. We herein present the complete genome sequence of the Gram-positive bacterium B. pumilus ZB201701, which consists of a linear chromosome with 3,640,542 base pairs, 3,608 protein-coding sequences, 24 ribosomal RNAs, and 80 transfer RNAs. Genome analyses using bioinformatics revealed some of the putative gene clusters involved in defense mechanisms. In addition, activity analyses of the strain under salt and simulated drought stress suggested its potential tolerance to abiotic stress. Plant growth-promoting bacteria-based experiments indicated that the strain promotes the salt tolerance of maize. The complete genome of B. pumilus ZB201701 provides valuable insights into rhizobacteria-mediated salt and drought tolerance and rhizobacteria-based solutions for abiotic stress in agriculture.
Phylogenetic reconciliation reveals the natural history of glycopeptide antibiotic biosynthesis and resistance.
Glycopeptide antibiotics are produced by Actinobacteria through biosynthetic gene clusters that include genes supporting their regulation, synthesis, export and resistance. The chemical and biosynthetic diversities of glycopeptides are the product of an intricate evolutionary history. Extracting this history from genome sequences is difficult as conservation of the individual components of these gene clusters is variable and each component can have a different trajectory. We show that glycopeptide biosynthesis and resistance in Actinobacteria maps to approximately 150-400 million years ago. Phylogenetic reconciliation reveals that the precursors of glycopeptide biosynthesis are far older than other components, implying that these clusters arose from a pre-existing pool of genes. We find that resistance appeared contemporaneously with biosynthetic genes, raising the possibility that the mechanism of action of glycopeptides was a driver of diversification in these gene clusters. Our results put antibiotic biosynthesis and resistance into an evolutionary context and can guide the future discovery of compounds possessing new mechanisms of action, which are especially needed as the usefulness of the antibiotics available at present is imperilled by human activity.
Genomic analysis of Marinobacter sp. NP-4 and NP-6 isolated from the deep-sea oceanic crust on the western flank of the Mid-Atlantic Ridge
Two Marinobacter strains, NP-4 and NP-6, were isolated from deep oceanic basaltic crust at North Pond, located on the western flank of the Mid-Atlantic Ridge. These two strains are capable of using multiple carbon sources such as acetate, succinate, glucose and sucrose while using oxygen as the primary electron acceptor. Strain NP-4 is also able to grow anaerobically under 20 MPa with nitrate as the electron acceptor, and is thus piezotolerant. To explore the metabolic potential of Marinobacter sp. NP-4 and NP-6, the complete genome of NP-4 and the close-to-complete genome of NP-6 were sequenced. The genome of NP-4 contains one chromosome and two plasmids with a total size of 4.6 Mb and an average GC content of 57.0%. The genome of NP-6 is 4.5 Mb and consists of 6 scaffolds, with an average GC content of 57.1%. Complete glycolysis, citrate cycle and aromatic compound degradation pathways were identified in the genomes of these two strains, suggesting that they possess a heterotrophic lifestyle. Additionally, one plasmid of NP-4 contains genes for alkane degradation, a phosphonate ABC transporter and a cation efflux system, giving NP-4 additional survival capabilities. In total, the genomic information of these two strains provides insights into the physiological features and adaptation strategies of Marinobacter spp. in the deep oceanic crust biosphere.
Detection of transferable oxazolidinone resistance determinants in Enterococcus faecalis and Enterococcus faecium of swine origin in Sichuan Province, China.
The aim of this study was to detect the transferable oxazolidinone resistance determinants (cfr, optrA and poxtA) in E. faecalis and E. faecium of swine origin in Sichuan Province, China. A total of 158 enterococci strains (93 E. faecalis and 65 E. faecium) isolated from 25 large-scale swine farms were screened for the presence of cfr, optrA and poxtA by PCR. The genetic environments of cfr, optrA and poxtA were characterized by whole genome sequencing. Transfer of oxazolidinone resistance determinants was determined by conjugation or electrotransformation experiments. The transferable oxazolidinone resistance determinants, cfr, optrA and poxtA, were detected in zero, six, and one enterococci strains, respectively. The poxtA in one E. faecalis strain was located on a 37,990 bp plasmid, which co-harbored fexB, cat, tet(L) and tet(M), and could be conjugated to E. faecalis JH2-2. One E. faecalis strain harbored two different OptrA variants, including one variant with a single substitution, Q219H, which has not been reported previously. Two optrA-carrying plasmids, pC25-1, with a size of 45,581 bp, and pC54, with a size of 64,500 bp, shared a 40,494 bp identical region that contained genetic context IS1216E-fexA-optrA-erm(A)-IS1216E, which could be electrotransformed into Staphylococcus aureus. Four different chromosomal optrA gene clusters were found in five strains, in which optrA was associated with Tn554 or Tn558 that were inserted into the radC gene. Our study highlights the fact that mobile genetic elements, such as plasmids, IS1216E, Tn554 and Tn558, may facilitate the horizontal transmission of optrA or poxtA. Copyright © 2019. Published by Elsevier Ltd.
In the wake of constant improvements in sequencing technologies, numerous insect genomes have been sequenced. Currently, 1219 insect genome-sequencing projects have been registered with the National Center for Biotechnology Information, including 401 that have genome assemblies and 155 with an official gene set of annotated protein-coding genes. Comparative genomics analysis showed that the expansion or contraction of gene families was associated with well-studied physiological traits such as immune system, metabolic detoxification, parasitism and polyphagy in insects. Here, we summarize the progress of insect genome sequencing, with an emphasis on how this impacts research on pest control. We begin with a brief introduction to the basic concepts of genome assembly, annotation and metrics for evaluating the quality of draft assemblies. We then provide an overview of genome information for numerous insect species, highlighting examples from prominent model organisms, agricultural pests and disease vectors. We also introduce the major insect genome databases. The increasing availability of insect genomic resources is beneficial for developing alternative pest control methods. However, many opportunities remain for developing data-mining tools that make maximal use of the available insect genome resources. Although rapid progress has been achieved, many challenges remain in the field of insect genomics. © 2019 The Royal Entomological Society.
Janibacter limosus P3-3-X1, a psychrotolerant deep-sea actinobacterium isolated from the Southern Ocean, was completely sequenced and analyzed for its biotechnological potential in bioremediation and natural product biosynthesis. The circular genome contained 3.5 Mb with a high GC content of 70.44 mol%. Genomic data mining revealed a gene cluster for degrading phenol and its derivatives, including a multi-component phenol hydroxylase and a meta-cleavage pathway. The strain was shown to grow on phenol as its sole carbon source, supporting the findings of genomic analysis. Many more genes encoding monooxygenases, dioxygenases and other aromatic compound degradation proteins involved in xenobiotics degradation were detected. Multiple natural product biosynthesis gene clusters were predicted as well. The genome sequencing and data mining provide insights into the bioremediation ability and biosynthetic potential of the Antarctic actinobacterium, and promote further experimental verification and exploration.
The use of Online Tools for Antimicrobial Resistance Prediction by Whole Genome Sequencing in MRSA and VRE.
The antimicrobial resistance (AMR) crisis represents a serious threat to public health and has resulted in concentrated efforts to accelerate development of rapid molecular diagnostics for AMR. In combination with publicly-available web-based AMR databases, whole genome sequencing (WGS) offers the capacity for rapid detection of antibiotic resistance genes. Here we studied the concordance between WGS-based resistance prediction and phenotypic susceptibility testing results for methicillin-resistant Staphylococcus aureus (MRSA) and vancomycin-resistant Enterococcus (VRE) clinical isolates using publicly-available tools and databases. Clinical isolates prospectively collected at the University of Pittsburgh Medical Center between December 2016 and December 2017 underwent WGS. Antibiotic resistance gene content was assessed from assembled genomes by BLASTn search of online databases. Concordance between WGS-predicted resistance profile and phenotypic susceptibility as well as sensitivity, specificity, positive and negative predictive values (PPV, NPV) were calculated for each antibiotic/organism combination, using the phenotypic results as the gold standard. Phenotypic susceptibility testing and WGS results were available for 1242 isolate/antibiotic combinations. Overall concordance was 99.3% with a sensitivity, specificity, PPV, NPV of 98.7% (95% CI, 97.2-99.5%), 99.6% (95% CI, 98.8-99.9%), 99.3% (95% CI, 98.0-99.8%), 99.2% (95% CI, 98.3-99.7%), respectively. Additional identification of point mutations in housekeeping genes increased the concordance to 99.4% and the sensitivity to 99.3% (95% CI, 98.2-99.8%) and NPV to 99.4% (95% CI, 98.4-99.8%). WGS can be used as a reliable predictor of phenotypic resistance for both MRSA and VRE using readily-available online tools. Copyright © 2019. Published by Elsevier Ltd.
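The concordance, sensitivity, specificity, PPV and NPV reported above all derive from a two-by-two table comparing WGS predictions against the phenotypic gold standard. A minimal sketch of those definitions follows, assuming that "positive" means resistant. The function name and the counts are hypothetical and are not the study's data, and confidence intervals are not computed here.

```python
def diagnostic_metrics(tp, fp, tn, fn):
    """Compute agreement statistics from a 2x2 table.

    tp: predicted resistant, phenotypically resistant
    fp: predicted resistant, phenotypically susceptible
    tn: predicted susceptible, phenotypically susceptible
    fn: predicted susceptible, phenotypically resistant
    """
    total = tp + fp + tn + fn
    if min(tp + fn, tn + fp, tp + fp, tn + fn) == 0:
        raise ValueError("each row and column of the table must be non-empty")
    return {
        "concordance": (tp + tn) / total,   # overall agreement
        "sensitivity": tp / (tp + fn),      # resistant isolates detected
        "specificity": tn / (tn + fp),      # susceptible isolates confirmed
        "ppv": tp / (tp + fp),              # predicted resistant that are resistant
        "npv": tn / (tn + fn),              # predicted susceptible that are susceptible
    }


# Hypothetical counts for illustration only.
for name, value in diagnostic_metrics(tp=90, fp=5, tn=95, fn=10).items():
    print(f"{name}: {value:.3f}")
```

Sensitivity and specificity are properties of the test against the gold standard, whereas PPV and NPV also depend on how common resistance is among the isolates tested.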
Complete genome sequence of Bacillus velezensis JT3-1, a microbial germicide isolated from yak feces
Bacillus velezensis JT3-1 is a probiotic strain isolated from feces of the domestic yak (Bos grunniens) in the Gansu province of China. It has strong antagonistic activity against Listeria monocytogenes, Staphylococcus aureus, Escherichia coli, Salmonella Typhimurium, Mannheimia haemolytica, Staphylococcus hominis, Clostridium perfringens, and Mycoplasma bovis. These properties have made the JT3-1 strain the focus of commercial interest. In this study, we describe the complete genome sequence of JT3-1, with a genome size of 3,929,799 bp, 3761 encoded genes and an average GC content of 46.50%. Whole genome sequencing of Bacillus velezensis JT3-1 will lay a good foundation for elucidation of the mechanisms of its antimicrobial activity, and for its future application.
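Several of the genome reports above quote an average GC content alongside genome size. As a rough illustration of how that figure is defined, the sketch below computes the percentage of G and C among unambiguous bases. The function name and the toy sequence are assumptions for illustration and are not taken from any of the genomes described.

```python
def gc_percent(sequence):
    """Return GC content as a percentage of the A, C, G and T bases."""
    seq = sequence.upper()
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T")
    if gc + at == 0:
        raise ValueError("sequence contains no A, C, G or T bases")
    # Ambiguous symbols such as N are left out of the denominator.
    return 100.0 * gc / (gc + at)


# Hypothetical toy sequence, not from any genome described here.
print(f"{gc_percent('ATGCGCNNAT'):.2f}%")
```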