September 22, 2019

PBHoover and CigarRoller: a method for confident haploid variant calling on Pacific Biosciences data and its application to heterogeneous population analysis

Motivation: Single Molecule Real-Time (SMRT) sequencing has important and underutilized advantages that amplification-based platforms lack, such as the absence of systematic error (e.g. GC-bias) and complete de novo assembly (including large repetitive regions) without scaffolding. SMRT sequencing, however, suffers from a high random error rate and low sequencing depth (older chemistries). Here, we introduce PBHoover, software that uses a heuristic calling algorithm in order to make base calls with high certainty in low coverage regions. This software is also capable of mixed population detection with high sensitivity. PBHoover's CigarRoller attachment improves sequencing depth in low-coverage regions through CIGAR-string correction. Results: We tested both modules on 348 M. tuberculosis clinical isolates sequenced on C1 or C2 chemistries. On average, CigarRoller improved the percentage of usable read count from 68.9% to 99.98% in C1 runs and from 50% to 99% in C2 runs. Using the greater depth provided by CigarRoller, PBHoover was able to make base and variant calls 99.95% concordant with Sanger calls (QV33). PBHoover also detected antibiotic-resistant subpopulations that went undetected by Sanger. Using C1 chemistry, subpopulations as small as 9% of the total colony can be detected by PBHoover. This provides the most sensitive amplification-free molecular method for heterogeneity analysis and is in line with phenotypic methods' sensitivity. This sensitivity significantly improves with the greater depth and lower error rate of the newer chemistries. Availability and Implementation: Executables are freely available under GNU GPL v3+ at http://www.gitlab.com/LPCDRP/pbhoover and http://www.gitlab.com/LPCDRP/CigarRoller. PBHoover is also available on bioconda: https://anaconda.org/bioconda/pbhoover.


September 22, 2019

Periodic variation of mutation rates in bacterial genomes associated with replication timing

The causes and consequences of spatiotemporal variation in mutation rates remain to be explored in nearly all organisms. Here we examine relationships between local mutation rates and replication timing in three bacterial species whose genomes have multiple chromosomes: Vibrio fischeri, Vibrio cholerae, and Burkholderia cenocepacia. Following five mutation accumulation experiments with these bacteria conducted in the near absence of natural selection, the genomes of clones from each lineage were sequenced and analyzed to identify variation in mutation rates and spectra. In lineages lacking mismatch repair, base substitution mutation rates vary in a mirrored wave-like pattern on opposing replichores of the large chromosomes of V. fischeri and V. cholerae, where concurrently replicated regions experience similar base substitution mutation rates. The base substitution mutation rates on the small chromosome are less variable in both species but occur at similar rates to those in the concurrently replicated regions of the large chromosome. Neither nucleotide composition nor frequency of nucleotide motifs differed among regions experiencing high and low base substitution rates, which along with the inferred ~800-kb wave period suggests that the source of the periodicity is not sequence specific but rather a systematic process related to the cell cycle. These results support the notion that base substitution mutation rates are likely to vary systematically across many bacterial genomes, which exposes certain genes to elevated deleterious mutational load.

IMPORTANCE: That mutation rates vary within bacterial genomes is well known, but the detailed study of these biases has been made possible only recently with contemporary sequencing methods. We applied these methods to understand how bacterial genomes with multiple chromosomes, like those of Vibrio and Burkholderia, might experience heterogeneous mutation rates because of their unusual replication and the greater genetic diversity found on smaller chromosomes. This study captured thousands of mutations and revealed wave-like rate variation that is synchronized with replication timing and not explained by sequence context. The scale of this rate variation over hundreds of kilobases of DNA strongly suggests that a temporally regulated cellular process may generate wave-like variation in mutation risk. These findings add to our understanding of how mutation risk is distributed across bacterial and likely also eukaryotic genomes, owing to their highly conserved replication and repair machinery. Copyright © 2018 Dillon et al.


September 22, 2019

A reference genome of the Chinese hamster based on a hybrid assembly strategy.

Accurate and complete genome sequences are essential in biotechnology to facilitate genome-based cell engineering efforts. The current genome assemblies for Cricetulus griseus, the Chinese hamster, are fragmented and replete with gap sequences and misassemblies, consistent with most short-read-based assemblies. Here, we completely resequenced C. griseus using single molecule real time sequencing and merged this with Illumina-based assemblies. This generated a more contiguous and complete genome assembly than either technology alone, reducing the number of scaffolds by >28-fold, with 90% of the sequence in the 122 longest scaffolds. Most genes are now found in single scaffolds, including up- and downstream regulatory elements, enabling improved study of noncoding regions. With >95% of the gap sequence filled, important Chinese hamster ovary cell mutations have been detected in draft assembly gaps. This new assembly will be an invaluable resource for continued basic and pharmaceutical research. © 2018 The Authors. Biotechnology and Bioengineering Published by Wiley Periodicals, Inc.


September 22, 2019

Analysis of the draft genome of the red seaweed Gracilariopsis chorda provides insights into genome size evolution in Rhodophyta.

Red algae (Rhodophyta) underwent two phases of large-scale genome reduction during their early evolution. The red seaweeds did not attain genome sizes or gene inventories typical of other multicellular eukaryotes. We generated a high-quality 92.1 Mb draft genome assembly from the red seaweed Gracilariopsis chorda, including methylation and small (s)RNA data. We analyzed these and other Archaeplastida genomes to address three questions: 1) What is the role of repeats and transposable elements (TEs) in explaining Rhodophyta genome size variation, 2) what is the history of genome duplication and gene family expansion/reduction in these taxa, and 3) is there evidence for TE suppression in red algae? We find that the number of predicted genes in red algae is relatively small (4,803-13,125 genes), particularly when compared with land plants, with no evidence of polyploidization. Genome size variation is primarily explained by TE expansion with the red seaweeds having the largest genomes. Long terminal repeat elements and DNA repeats are the major contributors to genome size growth. About 8.3% of the G. chorda genome undergoes cytosine methylation among gene bodies, promoters, and TEs, and 71.5% of TEs contain methylated-DNA with 57% of these regions associated with sRNAs. These latter results suggest a role for TE-associated sRNAs in RNA-dependent DNA methylation to facilitate silencing. We postulate that the evolution of genome size in red algae is the result of the combined action of TE spread and the concomitant emergence of its epigenetic suppression, together with other important factors such as changes in population size.


September 22, 2019

The plant growth-promoting rhizobacterium Variovorax boronicumulans CGMCC 4969 regulates the level of indole-3-acetic acid synthesized from indole-3-acetonitrile.

Variovorax is a metabolically diverse genus of plant growth-promoting rhizobacteria (PGPR) that engages in mutually beneficial interactions between plants and microbes. Unlike most PGPR, Variovorax cannot synthesize the phytohormone indole-3-acetic acid (IAA) via tryptophan. However, we found that V. boronicumulans strain CGMCC 4969 could produce IAA using indole-3-acetonitrile (IAN) as the precursor. Thus, in the present study, the IAA synthesis mechanism of V. boronicumulans CGMCC 4969 was investigated. V. boronicumulans CGMCC 4969 metabolized IAN to IAA through both a nitrilase-dependent pathway and a nitrile hydratase (NHase) and amidase-dependent pathway. Cobalt enhanced the metabolic flux via the NHase/amidase, by which IAN was rapidly converted to indole-3-acetamide (IAM) and in turn to IAA. IAN stimulated the metabolic flux via the nitrilase, by which IAN was rapidly converted to IAA. Subsequently, the IAA was degraded. V. boronicumulans CGMCC 4969 could use IAN as the sole carbon and nitrogen source for growth. Genome sequencing confirmed the IAA synthesis pathways. Gene cloning and overexpression in Escherichia coli indicated that NitA has the nitrilase activity, and IamA has the amidase activity to respectively transform IAN and IAM to IAA. Interestingly, NitA showed a close genetic relationship with the nitrilase of the phytopathogen Pseudomonas syringae. Quantitative PCR analysis indicated that the NHase/amidase system is constitutively expressed, whereas the nitrilase is inducible. The present study helps our understanding of the versatile functions of Variovorax nitrile-converting enzymes that mediate IAA synthesis and the interactions between plants and these bacteria.

IMPORTANCE: We demonstrated that Variovorax boronicumulans CGMCC 4969 has two enzymatic systems (nitrilase and nitrile hydratase/amidase) that convert indole-3-acetonitrile (IAN) to the important plant hormone indole-3-acetic acid (IAA). The two IAA synthesis systems have very different regulatory mechanisms, affecting the IAA synthesis rate and duration. The nitrilase was induced by IAN, which was rapidly converted to IAA; subsequently IAA was rapidly consumed for cell growth. The NHase and amidase system was constitutively expressed and slowly but continuously synthesized IAA. In addition to synthesizing IAA from IAN, CGMCC 4969 has a rapid IAA degradation system, which would be helpful for a host plant to eliminate redundant IAA. This study indicates that the plant growth-promoting rhizobacterium V. boronicumulans CGMCC 4969 has the potential to be used by host plants to regulate the IAA level. Copyright © 2018 American Society for Microbiology.


September 22, 2019

Whole-genome sequencing and comparative analysis of two plant-associated strains of Rhodopseudomonas palustris (PS3 and YSC3).

Rhodopseudomonas palustris strains PS3 and YSC3 are purple non-sulfur phototrophic bacteria isolated from Taiwanese paddy soils. PS3 has beneficial effects on plant growth and enhances the uptake efficiency of applied fertilizer nutrients. In contrast, YSC3 has no significant effect on plant growth. The genomic structures of PS3 and YSC3 are similar; each contains one circular chromosome that is 5,269,926 or 5,371,816 bp in size, with 4,799 or 4,907 protein-coding genes, respectively. In this study, a large class of genes involved in chemotaxis and motility was identified in both strains, and genes associated with plant growth promotion, such as nitrogen fixation-, IAA synthesis- and ACC deamination-associated genes, were also identified. We noticed that the growth rate, the amount of biofilm formation, and the relative expression levels of several chemotaxis-associated genes were significantly higher for PS3 than for YSC3 upon treatment with root exudates. These results indicate that PS3 responds better to the presence of plant hosts, which may contribute to the successful interactions of PS3 with plant hosts. Moreover, these findings indicate that the existence of gene clusters associated with plant growth promotion is required but not sufficient for a bacterium to exhibit phenotypes associated with plant growth promotion.


September 22, 2019

Genomic analysis of the insect-killing fungus Beauveria bassiana JEF-007 as a biopesticide.

Insect-killing fungi have high potential in pest management. A deeper insight into the fungal genes at the whole genome level is necessary to understand the inter-species or intra-species genetic diversity of fungal genes, and to select excellent isolates. In this work, we conducted a whole genome sequencing of Beauveria bassiana (Bb) JEF-007 and characterized pathogenesis-related features and compared with other isolates including Bb ARSEF2860. A large number of Bb JEF-007 genes showed high identity with Bb ARSEF2860, but some genes showed moderate or low identity. The two Bb isolates showed a significant difference in vegetative growth, antibiotic-susceptibility, and virulence against Tenebrio molitor larvae. When highly identical genes between the two Bb isolates were subjected to real-time PCR, their transcription levels were different, particularly in heat shock protein 30 (hsp30) gene which is related to conidial thermotolerance. In several B. bassiana isolates, chitinases and trypsin-like protease genes involved in pathogenesis were highly conserved, but other genes showed noticeable sequence variation within the same species. Given the transcriptional and genetic diversity in B. bassiana, a selection of virulent isolates with industrial advantages is a pre-requisite, and this genetic approach could support the development of excellent biopesticides with intellectual property protection.


September 22, 2019

A gene-rich fraction analysis of the Passiflora edulis genome reveals highly conserved microsyntenic regions with two related Malpighiales species.

Passiflora edulis is the most widely cultivated species of passionflowers, cropped mainly for industrialized juice production and fresh fruit consumption. Despite its commercial importance, little is known about the genome structure of P. edulis. To fill in this gap in our knowledge, a genomic library was built, and over 100 large inserts have now been completely sequenced. Sequencing data were assembled from long sequence reads, and structural sequence annotation resulted in the prediction of about 1,900 genes, providing data for subsequent functional analysis. The richness of repetitive elements was also evaluated. Microsyntenic regions of P. edulis common to Populus trichocarpa and Manihot esculenta, two related Malpighiales species with available fully sequenced genomes, were examined. Overall, gene order was well conserved, with some disruptions of collinearity identified as rearrangements, such as inversion and translocation events. The microsynteny level observed between the P. edulis sequences and the compared genomes is surprising, given the long divergence time that separates them from the common ancestor. P. edulis gene-rich segments are more compact than those of the other two species, even though its genome is much larger. This study provides a first accurate gene set for P. edulis, opening the way for new studies on the evolutionary issues in Malpighiales genomes.


September 22, 2019

First draft genome assembly of the Argane tree (Argania spinosa)

Background: The Argane tree (Argania spinosa L. Skeels) is an endemic tree of southwestern Morocco that plays an important socioeconomic and ecologic role for a dense human population in an arid zone. Several studies confirmed the importance of this species as a food and feed source and as a resource for both pharmaceutical and cosmetic compounds. Unfortunately, the argane tree ecosystem is facing significant threats from environmental changes (global warming, over-population) and over-exploitation. Limited research has been conducted, however, on argane tree genetics and genomics, which hinders its conservation and genetic improvement. Methods: Here, we present a draft genome assembly of A. spinosa. A reliable reference genome of A. spinosa was created using a hybrid de novo assembly approach combining short and long sequencing reads. Results: In total, 144 Gb Illumina HiSeq reads and 7.2 Gb PacBio reads were produced and assembled. The final draft genome comprises 75 327 scaffolds totaling 671 Mb with an N50 of 49 916 kb. The draft assembly is close to the genome size estimated by k-mer distribution and covers 89% of complete and 4.3% of partial Arabidopsis orthologous groups in BUSCO. Conclusion: The A. spinosa genome will be useful for assessing biodiversity leading to efficient conservation of this endangered endemic tree. Furthermore, the genome may enable genome-assisted cultivar breeding, and provide a better understanding of important metabolic pathways and their underlying genes for both cosmetic and pharmacological purposes.


September 22, 2019

Comparative genomics reveals diverse capsular polysaccharide synthesis gene clusters in emerging Raoultella planticola.

Raoultella planticola is an emerging zoonotic pathogen that is associated with rare but life-threatening cases of bacteremia, biliary tract infections, and urinary tract infections. Moreover, increasing antimicrobial resistance in the organism poses a potential threat to public health. In spite of its importance as a human pathogen, the genome of R. planticola remains largely unexplored and little is known about its virulence factors. Although lipopolysaccharides have been detected in R. planticola and implicated in virulence in earlier studies, the genetic background is unknown. Here, we report the complete genome and comparative analysis of the multidrug-resistant clinical isolate R. planticola GODA. The complete genome of R. planticola GODA was sequenced using single-molecule real-time DNA sequencing. Comparative genomic analysis reveals distinct capsular polysaccharide synthesis gene clusters in R. planticola GODA. In addition, we found bla TEM-57 and multiple transporters related to multidrug resistance. The availability of genomic data in open databases of this emerging zoonotic pathogen, in tandem with our comparative study, provides better understanding of R. planticola and the basis for future work.


September 22, 2019

Distinct genomic features characterize two clades of Corynebacterium diphtheriae: Proposal of Corynebacterium diphtheriae subsp. diphtheriae subsp. nov. and Corynebacterium diphtheriae subsp. lausannense subsp. nov.

Corynebacterium diphtheriae is the etiological agent of diphtheria, a disease caused by the presence of the diphtheria toxin. However, an increasing number of records report non-toxigenic C. diphtheriae infections. Here, a C. diphtheriae strain was recovered from a patient with a past history of bronchiectasis who developed a severe tracheo-bronchitis with multiple whitish lesions of the distal trachea and the mainstem bronchi. Whole-genome sequencing (WGS), performed in parallel with PCR targeting the toxin gene and the Elek test, provided clinically relevant results in a short turnaround time, showing that the isolate was non-toxigenic. A comparative genomic analysis of the new strain (CHUV2995) with 56 other publicly available genomes of C. diphtheriae revealed that the strains CHUV2995, CCUG 5865 and CMCNS703 share a lower average nucleotide identity (ANI) (95.24 to 95.39%) with the C. diphtheriae NCTC 11397T reference genome than all other C. diphtheriae genomes (>98.15%). Core genome phylogeny confirmed the presence of two monophyletic clades. Based on these findings, we propose here two new C. diphtheriae subspecies to replace the lineage denomination used in previous multilocus sequence typing studies: C. diphtheriae subsp. lausannense subsp. nov. (instead of lineage-2), regrouping strains CHUV2995, CCUG 5865, and CMCNS703, and C. diphtheriae subsp. diphtheriae subsp. nov., regrouping all other C. diphtheriae in the dataset (instead of lineage-1). Interestingly, members of subspecies lausannense displayed a larger genome size than subspecies diphtheriae and were enriched in COG categories related to transport and metabolism of lipids (I) and inorganic ion (P). Conversely, they lacked all genes involved in the synthesis of pili (SpaA-type, SpaD-type and SpaH-type), molybdenum cofactor and of the nitrate reductase. Finally, the CHUV2995 genome is particularly enriched in mobility genes and harbors several prophages. The genome encodes a type II-C CRISPR-Cas locus with 2 spacers that lacks csn2 or cas4, which could hamper the acquisition of new spacers and render strain CHUV2995 more susceptible to bacteriophage infections and gene acquisition through various mechanisms of horizontal gene transfer.


September 22, 2019

Natural selection in bats with historical exposure to white-nose syndrome

Hibernation allows animals to survive periods of resource scarcity by reducing their energy expenditure through decreased metabolism. However, hibernators become susceptible to psychrophilic pathogens if they cannot mount an efficient immune response to infection. While Nearctic bats infected with white-nose syndrome (WNS) suffer high mortality, related Palearctic taxa are better able to survive the disease than their Nearctic counterparts. We hypothesised that WNS exerted historical selective pressure in Palearctic bats, resulting in genomic changes that promote infection tolerance.


September 22, 2019

Comparative genomics reveal a flagellar system, a type VI secretion system and plant growth-promoting gene clusters unique to the endophytic bacterium Kosakonia radicincitans.

The recent worldwide discovery of plant growth-promoting (PGP) Kosakonia radicincitans in a large variety of crop plants suggests that this species confers significant influence on plants, both in terms of yield increase and product quality improvement. We provide a comparative genome analysis which helps to unravel the genetic basis for K. radicincitans’ motility, competitiveness and plant growth-promoting capacities. We discovered that K. radicincitans carries multiple copies of complex gene clusters, among them two flagellar systems and three type VI secretion systems (T6SSs). We speculate that host invasion may be facilitated by different flagella, and bacterial competitor suppression by effector proteins ejected via T6SSs. We found a large plasmid in K. radicincitans DSM 16656T, the species type strain, that confers the potential to exploit plant-derived carbon sources. We propose that multiple copies of complex gene clusters in K. radicincitans are metabolically expensive but provide competitive advantage over other bacterial strains in nutrient-rich environments. The comparison of the DSM 16656T genome to genomes of other genera of enteric plant growth-promoting bacteria (PGPB) exhibits traits unique to DSM 16656T and K. radicincitans, respectively, and traits shared between genera. We used the output of the in silico analysis for predicting the purpose of genomic features unique to K. radicincitans and performed microarray, PhyloChip, and microscopical analyses to gain deeper insight into the interaction of DSM 16656T, plants and associated microbiota. The comparative genome analysis will facilitate the future search for promising candidates of PGPB for sustainable crop production.


September 22, 2019

Hepacivirus A infection in horses defines distinct envelope hypervariable regions and elucidates potential roles of viral strain and adaptive immune status in determining envelope diversity and infection outcome.

Hepacivirus A (also known as nonprimate hepacivirus and equine hepacivirus) is a hepatotropic virus that can cause both transient and persistent infections in horses. The evolution of intrahost viral populations (quasispecies) has not been studied in detail for hepacivirus A, and its roles in immune evasion and persistence are unknown. To address these knowledge gaps, we first evaluated the envelope gene (E1 and E2) diversity of two different hepacivirus A strains (WSU and CU) in longitudinal blood samples from experimentally infected adult horses, juvenile horses (foals), and foals with severe combined immunodeficiency (SCID). Persistent infection with the WSU strain was associated with significantly greater quasispecies diversity than that observed in horses who spontaneously cleared infection (P = 0.0002) or in SCID foals (P < 0.0001). In contrast, the CU strain was able to persist despite significantly lower (P < 0.0001) and relatively static envelope diversity. These findings indicate that envelope diversity is a poor predictor of hepacivirus A infection outcomes and could be dependent on strain-specific factors. Next, entropy analysis was performed on all E1/E2 genes entered into GenBank. This analysis defined three novel hypervariable regions (HVRs) in E2, at residues 391 to 402 (HVR1), 450 to 461 (HVR2), and 550 to 562 (HVR3). For the experimentally infected horses, entropy analysis focusing on the HVRs demonstrated that these regions were under increased selective pressure during persistent infection. Increased diversity in the HVRs was also temporally associated with seroconversion in some horses, suggesting that these regions may be targets of neutralizing antibody and may play a role in immune evasion.

IMPORTANCE: Hepacivirus C (hepatitis C virus) is estimated to infect 150 million people worldwide and is a leading cause of cirrhosis and hepatocellular carcinoma. In contrast, its closest relative, hepacivirus A, causes relatively mild disease in horses and is frequently cleared. The relationship between quasispecies evolution and infection outcome has not been explored for hepacivirus A. To address this knowledge gap, we examined envelope gene diversity in horses with resolving and persistent infections. Interestingly, two strain-specific patterns of quasispecies diversity emerged. Persistence of the WSU strain was associated with increased quasispecies diversity and the accumulation of amino acid changes within three novel hypervariable regions following seroconversion. These findings provided evidence that envelope gene mutation is influenced by adaptive immune pressure and may contribute to hepacivirus persistence. However, the CU strain persisted despite relative evolutionary stasis, suggesting that some hepacivirus strains may use alternative mechanisms to persist in the host. Copyright © 2018 American Society for Microbiology.


September 22, 2019

Long-term colonization dynamics of Enterococcus faecalis in implanted devices in research macaques.

Enterococcus faecalis is a common opportunistic pathogen that colonizes cephalic recording chambers (CRCs) of macaques used in cognitive neuroscience research. We previously characterized 15 E. faecalis strains isolated from macaques at the Massachusetts Institute of Technology (MIT) in 2011. The goal of this study was to examine how a 2014 protocol change prohibiting the use of antimicrobials within CRCs affected colonizing E. faecalis strains. We collected 20 E. faecalis isolates from 10 macaques between 2013 and 2017 for comparison to 4 isolates previously characterized in 2011 with respect to the sequence type (ST) distribution, antimicrobial resistance, biofilm formation, and changes in genes that might confer a survival advantage. ST4 and ST55 were predominant among the isolates characterized in 2011, whereas the less antimicrobial-resistant lineage ST48 emerged to dominance after 2013. Two macaques remained colonized by ST4 and ST55 strains for 5 and 4 years, respectively. While the antimicrobial resistance and virulence factors identified in these ST4 and ST55 strains remained relatively stable, we detected an increase in biofilm formation ability over time in both isolates. We also found that ST48 strains were typically robust biofilm formers, which could explain why this ST increased in prevalence. Finally, we identified mutations in the DNA mismatch repair genes mutS and mutL in separate ST55 and ST4 strains and confirmed that strains bearing these mutations displayed a hypermutator phenotype. The presence of a hypermutator phenotype may complicate future antimicrobial treatment for clinically relevant E. faecalis infections in macaques.

IMPORTANCE: Enterococcus faecalis is a common cause of health care-associated infections in humans, largely due to its ability to persist in the hospital environment, colonize patients, acquire antimicrobial resistance, and form biofilms. Understanding how enterococci evolve in health care settings provides insight into factors affecting enterococcal survival and persistence. Macaques used in neuroscience research have long-term cranial implants that, despite best practices, often become colonized by E. faecalis. This provides a unique opportunity to noninvasively examine the evolution of enterococci on a long-term indwelling device. We collected E. faecalis strains from cephalic implants over a 7-year period and characterized the sequence type, antimicrobial resistance, virulence factors, biofilm production, and hypermutator phenotypes. Improved antimicrobial stewardship allowed a less-antimicrobial-resistant E. faecalis strain to predominate at the implant interface, potentially improving antimicrobial treatment outcomes if future clinical infections occur. Biofilm formation appears to play an important role in the persistence of the E. faecalis strains associated with these implants. Copyright © 2018 American Society for Microbiology.

