July 7, 2019

Recombination-dependent replication and gene conversion homogenize repeat sequences and diversify plastid genome structure

There is a misinterpretation in the literature regarding the variable orientation of the small single copy region of plastid genomes (plastomes). The common phenomenon of small and large single copy inversion, hypothesized to occur through intramolecular recombination between inverted repeats (IR) in a circular, single unit-genome, in fact, more likely occurs through recombination-dependent replication (RDR) of linear plastome templates. If RDR can be primed through both intra- and intermolecular recombination, then this mechanism could not only create inversion isomers of so-called single copy regions, but also an array of alternative sequence arrangements. We used Illumina paired-end and PacBio single-molecule real-time (SMRT) sequences to characterize repeat structure in the plastome of Monsonia emarginata (Geraniaceae). We used OrgConv and inspected nucleotide alignments to infer ancestral nucleotides and identify gene conversion among repeats and mapped long (>1 kb) SMRT reads against the unit-genome assembly to identify alternative sequence arrangements. Although M. emarginata lacks the canonical IR, we found that large repeats (>1 kilobase; kb) represent ~22% of the plastome nucleotide content. Among the largest repeats (>2 kb), we identified GC-biased gene conversion and mapping filtered, long SMRT reads to the M. emarginata unit-genome assembly revealed alternative, substoichiometric sequence arrangements. We offer a model based on RDR and gene conversion between long repeated sequences in the M. emarginata plastome and provide support that both intra- and intermolecular recombination between large repeats, particularly in repeat-rich plastomes, varies unit-genome structure while homogenizing the nucleotide sequence of repeats. © 2017 Botanical Society of America.


July 7, 2019

Phenotypic and genomic comparison of Mycobacterium aurum and surrogate model species to Mycobacterium tuberculosis: implications for drug discovery.

Tuberculosis (TB) is caused by Mycobacterium tuberculosis and represents one of the major challenges facing drug discovery initiatives worldwide. The considerable rise in bacterial drug resistance in recent years has led to the need for new drugs and drug regimens. Model systems are regularly used to speed up the drug discovery process and circumvent biosafety issues associated with manipulating M. tuberculosis. These include the use of strains such as Mycobacterium smegmatis and Mycobacterium marinum that can be handled in biosafety level 2 facilities, making high-throughput screening feasible. However, each of these model species has its own limitations. We report and describe the first complete genome sequence of Mycobacterium aurum ATCC23366, an environmental mycobacterium that can also grow in the gut of humans and animals as part of the microbiota. This species shows a comparable resistance profile to that of M. tuberculosis for several anti-TB drugs. The aims of this study were to (i) determine the drug resistance profile of a recently proposed model species, Mycobacterium aurum, strain ATCC23366, for anti-TB drug discovery as well as Mycobacterium smegmatis and Mycobacterium marinum; (ii) sequence and annotate the complete genome sequence of this species obtained using Pacific Bioscience technology; (iii) perform comparative genomics analyses of the various surrogate strains with M. tuberculosis; and (iv) discuss how the choice of the surrogate model used for drug screening can affect the drug discovery process. We describe the complete genome sequence of M. aurum, a surrogate model for anti-tuberculosis drug discovery. Most of the genes already reported to be associated with drug resistance are shared between all the surrogate strains and M. tuberculosis. We consider that M. aurum might be used in high-throughput screening for tuberculosis drug discovery. We also highly recommend the use of different model species during the drug discovery screening process.


July 7, 2019

Trichoderma reesei complete genome sequence, repeat-induced point mutation, and partitioning of CAZyme gene clusters.

Trichoderma reesei (Ascomycota, Pezizomycotina) QM6a is a model fungus for a broad spectrum of physiological phenomena, including plant cell wall degradation, industrial production of enzymes, light responses, conidiation, sexual development, polyketide biosynthesis, and plant-fungal interactions. The genomes of QM6a and its high enzyme-producing mutants have been sequenced by second-generation-sequencing methods and are publicly available from the Joint Genome Institute. While these genome sequences have offered useful information for genomic and transcriptomic studies, their limitations and especially their short read lengths make them poorly suited for some particular biological problems, including assembly, genome-wide determination of chromosome architecture, and genetic modification or engineering. We integrated Pacific Biosciences and Illumina sequencing platforms for the highest-quality genome assembly yet achieved, revealing seven telomere-to-telomere chromosomes (34,922,528 bp; 10877 genes) with 1630 newly predicted genes and >1.5 Mb of new sequences. Most new sequences are located on AT-rich blocks, including 7 centromeres, 14 subtelomeres, and 2329 interspersed AT-rich blocks. The seven QM6a centromeres separately consist of 24 conserved repeats and 37 putative centromere-encoded genes. These findings open up a new perspective for future centromere and chromosome architecture studies. Next, we demonstrate that sexual crossing readily induced cytosine-to-thymine point mutations on both tandem and unlinked duplicated sequences. We also show by bioinformatic analysis that T. reesei has evolved a robust repeat-induced point mutation (RIP) system to accumulate AT-rich sequences, with longer AT-rich blocks having more RIP mutations. The widespread distribution of AT-rich blocks correlates genome-wide partitions with gene clusters, explaining why clustering of genes has been reported to not influence gene expression in T. reesei. Compartmentation of ancestral gene clusters by AT-rich blocks might promote flexibilities that are evolutionarily advantageous in this fungus’ soil habitats and other natural environments. Our analyses, together with the complete genome sequence, provide a better blueprint for biotechnological and industrial applications.


July 7, 2019

Comparative genomic analysis of Acinetobacter strains isolated from murine colonic crypts.

A restricted set of aerobic bacteria dominated by the Acinetobacter genus was identified in murine intestinal colonic crypts. The vicinity of such bacteria with intestinal stem cells could indicate that they protect the crypt against cytotoxic and genotoxic signals. Genome analyses of these bacteria were performed to better appreciate their biodegradative capacities. Two taxonomically different clusters of Acinetobacter were isolated from murine proximal colonic crypts; one was identified as A. modestus and the other as A. radioresistens. Their identification was performed through biochemical parameters and housekeeping gene sequencing. After selection of one strain of each cluster (A. modestus CM11G and A. radioresistens CM38.2), comparative genomic analysis was performed on whole-genome sequencing data. The antibiotic resistance pattern of these two strains is different, in line with the many genes involved in resistance to heavy metals identified in both genomes. Moreover, whereas the operon benABCDE involved in benzoate metabolism is encoded by the two genomes, the operon antABC encoding the anthranilate dioxygenase, and the phenol hydroxylase gene cluster are absent in the A. modestus genomic sequence, indicating that the two strains have different capacities to metabolize xenobiotics. A common feature of the two strains is the presence of a type IV pili system, and the presence of genes encoding proteins pertaining to secretion systems such as Type I and Type II secretion systems. Our comparative genomic analysis revealed that different Acinetobacter isolated from the same biological niche, even if they share a large majority of genes, possess unique features that could play a specific role in the protection of the intestinal crypt.


July 7, 2019

Genomic and transcriptomic analyses of Agrobacterium tumefaciens S33 reveal the molecular mechanism of a novel hybrid nicotine-degrading pathway.

Agrobacterium tumefaciens S33 is able to degrade nicotine via a novel hybrid of the pyridine and pyrrolidine pathways. It can be utilized to remove nicotine from tobacco wastes and transform nicotine into important functionalized pyridine precursors for some valuable drugs and insecticides. However, the molecular mechanism of the hybrid pathway is still not completely clear. Here we report the genome analysis of strain S33 and its transcriptomes grown in glucose-ammonium medium and nicotine medium. The complete gene cluster involved in nicotine catabolism was found to be located on a genomic island composed of genes functionally similar but not in sequences to those of the pyridine and pyrrolidine pathways, as well as genes encoding plasmid partitioning and replication initiation proteins, conjugal transfer proteins and transposases. This suggests that the evolution of this hybrid pathway is not a simple fusion of the genes involved in the two pathways, but the result of a complicated lateral gene transfer. In addition, other genes potentially involved in the hybrid pathway could include those responsible for substrate sensing and transport, transcription regulation and electron transfer during nicotine degradation. This study provides new insights into the molecular mechanism of the novel hybrid pathway for nicotine degradation.


July 7, 2019

Novel chaperonins are prevalent in the virioplankton and demonstrate links to viral biology and ecology.

Chaperonins are protein-folding machinery found in all cellular life. Chaperonin genes have been documented within a few viruses, yet, surprisingly, analysis of metagenome sequence data indicated that chaperonin-carrying viruses are common and geographically widespread in marine ecosystems. Also unexpected was the discovery of viral chaperonin sequences related to thermosome proteins of archaea, indicating the presence of virioplankton populations infecting marine archaeal hosts. Virioplankton large subunit chaperonin sequences (GroELs) were divergent from bacterial sequences, indicating that viruses have carried this gene over long evolutionary time. Analysis of viral metagenome contigs indicated that: the order of large and small subunit genes was linked to the phylogeny of GroEL; both lytic and temperate phages may carry group I chaperonin genes; and viruses carrying a GroEL gene likely have large double-stranded DNA (dsDNA) genomes (>70 kb). Given these connections, it is likely that chaperonins are critical to the biology and ecology of virioplankton populations that carry these genes. Moreover, these discoveries raise the intriguing possibility that viral chaperonins may more broadly alter the structure and function of viral and cellular proteins in infected host cells.


July 7, 2019

Adaptive evolution of a hyperthermophilic archaeon pinpoints a formate transporter as a critical factor for the growth enhancement on formate.

Previously, we reported that the hyperthermophilic archaeon Thermococcus onnurineus NA1 could grow on formate and produce H2. Formate conversion to hydrogen was mediated by a formate-hydrogen lyase complex and was indeed a part of chemiosmotic coupling to ATP generation. In this study, we employed an adaptation approach to enhance the cell growth on formate and investigated molecular changes. As serial transfer continued on formate-containing medium in serum vials, cell growth, H2 production and formate consumption increased remarkably. The strain obtained after 156 transfers, WTF-156T, was demonstrated to enhance H2 production using formate in a bioreactor. The whole-genome sequencing of the WTF-156T strain revealed eleven mutations. While no mutation was found among the genes encoding formate hydrogen lyase, a point mutation (G154A) was identified in a formate transporter (TON_1573). The TON_1573 (A52T) mutation, when introduced into the parent strain, conferred an increase in formate consumption and H2 production. Another adaptive passage, carried out by culturing repeatedly in a bioreactor, resulted in a strain with a mutation in TON_1573 (C155A) causing the amino acid change A52E. These results imply that substitution of the A52 residue of a formate transporter might be a critical factor in ensuring the increase in formate uptake and cell growth.
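The relationship between the nucleotide changes (G154A, C155A) and the amino acid changes at residue 52 reported above can be checked with simple codon arithmetic. The following is a minimal sketch, not the authors' analysis: it assumes 1-based positions within the coding sequence and the standard genetic code, and because the abstract does not give the actual wild-type codon of TON_1573 residue 52, it simply tries all four alanine codons (the function and variable names are hypothetical).

```python
def codon_position(nt_pos):
    """Map a 1-based CDS nucleotide position to (codon number, position in codon), both 1-based."""
    return (nt_pos - 1) // 3 + 1, (nt_pos - 1) % 3 + 1

# Partial standard genetic code: only the codon families needed for this example.
CODON_TABLE = {}
for third in "ACGT":
    CODON_TABLE["GC" + third] = "A"  # alanine
    CODON_TABLE["AC" + third] = "T"  # threonine
CODON_TABLE.update({"GAA": "E", "GAG": "E", "GAC": "D", "GAT": "D"})

def substitute(codon, pos_in_codon, new_base):
    """Return the codon with the base at the given 1-based position replaced."""
    i = pos_in_codon - 1
    return codon[:i] + new_base + codon[i + 1:]

def outcomes(nt_pos, new_base, wild_type_codons):
    """Amino acid produced by the substitution for each candidate wild-type codon."""
    _, pos = codon_position(nt_pos)
    return {c: CODON_TABLE[substitute(c, pos, new_base)] for c in wild_type_codons}

# Assumption: the real codon 52 of TON_1573 is not stated, so all alanine codons are tried.
ALA_CODONS = ["GCA", "GCC", "GCG", "GCT"]

if __name__ == "__main__":
    print(codon_position(154), outcomes(154, "A", ALA_CODONS))
    print(codon_position(155), outcomes(155, "A", ALA_CODONS))
```

Under these assumptions, positions 154 and 155 are the first and second bases of codon 52. A G-to-A change at the first base of any alanine codon gives threonine, whereas a C-to-A change at the second base gives glutamate only when the third base is A or G, which would be consistent with the reported A52E.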


July 7, 2019

Genomic insights into the virulence and salt tolerance of Staphylococcus equorum.

To shed light on the genetic background behind the virulence and salt tolerance of Staphylococcus equorum, we performed comparative genome analysis of six S. equorum strains. Data on four previously published genome sequences were obtained from the NCBI database, while those on strain KM1031 displaying resistance to multiple antibiotics and strain C2014 causing haemolysis were determined in this study. Examination of the pan-genome of five of the six S. equorum strains showed that the conserved core genome retained the genes for general physiological processes and survival of the species. In this comparative genomic analysis, the factors that distinguish the strains from each other, including acquired genomic factors in mobile elements, were identified. Additionally, the high salt tolerance of strains enabling growth at a NaCl concentration of 25% (w/v) was attributed to the genes encoding potassium voltage-gated channels. Among the six strains, KS1039 does not possess any of the functional virulence determinants expressed in the other strains.


July 7, 2019

Whole genome sequence of the heterozygous clinical isolate Candida krusei 81-B-5.

Candida krusei is a diploid, heterozygous yeast that is an opportunistic fungal pathogen in immunocompromised patients. This species also is utilized for fermenting cocoa beans during chocolate production. One major concern in the clinical setting is the innate resistance of this species to the most commonly used antifungal drug fluconazole. Here we report a high-quality genome sequence and assembly for the first clinical isolate of C. krusei, strain 81-B-5, into 11 scaffolds generated with PacBio sequencing technology. Gene annotation and comparative analysis revealed a unique profile of transporters that could play a role in drug resistance or adaptation to different environments. In addition, we show that while 82% of the genome is highly heterozygous, a 2.0 Mb region of the largest scaffold has undergone loss of heterozygosity. This genome will serve as a reference for further genetic studies of this pathogen. Copyright © 2017 Author et al.


July 7, 2019

Plasmid composition in Aeromonas salmonicida subsp. salmonicida 01-B526 unravels unsuspected type three secretion system loss patterns.

Aeromonas salmonicida subsp. salmonicida is a ubiquitous psychrophilic waterborne bacterium and a fish pathogen. The numerous mobile elements, especially insertion sequences (IS), in its genome promote rearrangements that impact its phenotype. One of the main virulence factors of this bacterium, its type three secretion system (TTSS), is affected by these rearrangements. In Aeromonas salmonicida subsp. salmonicida most of the TTSS genes are encoded in a single locus on a large plasmid called pAsa5, and may be lost when the bacterium is cultivated at a higher temperature (25 °C), producing non-virulent mutants. In a previous study, pAsa5-rearranged strains that lacked the TTSS locus on pAsa5 were produced using parental strains, including 01-B526. Some of the generated deletions were explained by homologous recombination between ISs found on pAsa5, whereas the others remained unresolved. To investigate those rearrangements, short- and long-read high-throughput sequencing technologies were used on the A. salmonicida subsp. salmonicida 01-B526 whole genome. Whole genome sequencing of the 01-B526 strain revealed that its pAsa5 has an additional IS copy, an ISAS5, compared to the reference strain (A449) sequence, which allowed for a previously unknown rearrangement to occur. It also appeared that 01-B526 bears a second large plasmid, named pAsa9, which shares 40 kbp of highly similar sequences with pAsa5. Following these discoveries, previously unexplained deletions were elucidated by genotyping. Furthermore, in one of the derived strains a fusion of pAsa5 and pAsa9, involving the newly discovered ISAS5 copy, was observed. The loss of TTSS and hence virulence is explained by one consistent mechanism: IS-driven homologous recombination. The similarities between pAsa9 and pAsa5 also provide another example of genetic diversity driven by ISs.


July 7, 2019

Discovery and genotyping of novel sequence insertions in many sequenced individuals

Motivation: Despite recent advances in algorithms design to characterize structural variation using high-throughput short read sequencing (HTS) data, characterization of novel sequence insertions longer than the average read length remains a challenging task. This is mainly due to both computational difficulties and the complexities imposed by genomic repeats in generating reliable assemblies to accurately detect both the sequence content and the exact location of such insertions. Additionally, de novo genome assembly algorithms typically require a very high depth of coverage, which may be a limiting factor for most genome studies. Therefore, characterization of novel sequence insertions is not a routine part of most sequencing projects. There are only a handful of algorithms that are specifically developed for novel sequence insertion discovery that can bypass the need for the whole genome de novo assembly. Still, most such algorithms rely on high depth of coverage, and to our knowledge there is only one method (PopIns) that can use multi-sample data to “collectively” obtain a very high coverage dataset to accurately find insertions common in a given population. Result: Here, we present Pamir, a new algorithm to efficiently and accurately discover and genotype novel sequence insertions using either single or multiple genome sequencing datasets. Pamir is able to detect breakpoint locations of the insertions and calculate their zygosity (i.e. heterozygous versus homozygous) by analyzing multiple sequence signatures, matching one-end-anchored sequences to small-scale de novo assemblies of unmapped reads, and conducting strand-aware local assembly. We test the efficacy of Pamir on both simulated and real data, and demonstrate its potential use in accurate and routine identification of novel sequence insertions in genome projects. Availability and implementation: Pamir is available at https://github.com/vpc-ccg/pamir. 
Contact: fhach@sfu.ca, prostatecentre.com or calkan@cs.bilkent.edu.tr
Supplementary information: Supplementary data are available at Bioinformatics online.


July 7, 2019

The third restriction-modification system from Thermus aquaticus YT-1: solving the riddle of two TaqII specificities.

Two restriction-modification systems have been previously discovered in Thermus aquaticus YT-1. TaqI is a 263-amino acid (aa) Type IIP restriction enzyme that recognizes and cleaves within the symmetric sequence 5′-TCGA-3′. TaqII, in contrast, is a 1105-aa Type IIC restriction-and-modification enzyme, one of a family of Thermus homologs. TaqII was originally reported to recognize two different asymmetric sequences: 5′-GACCGA-3′ and 5′-CACCCA-3′. We previously cloned the taqIIRM gene, purified the recombinant protein from Escherichia coli, and showed that TaqII recognizes the 5′-GACCGA-3′ sequence only. Here, we report the discovery, isolation, and characterization of TaqIII, the third R-M system from T. aquaticus YT-1. TaqIII is a 1101-aa Type IIC/IIL enzyme and recognizes the 5′-CACCCA-3′ sequence previously attributed to TaqII. The cleavage site is 11/9 nucleotides downstream of the A residue. The enzyme exhibits striking biochemical similarity to TaqII. The 93% identity between their aa sequences suggests that they have a common evolutionary origin. The genes are located on two separate plasmids, and are probably paralogs or pseudoparalogs. Putative positions and aa that specify DNA recognition were identified and recognition motifs for 6 uncharacterized Thermus-family enzymes were predicted. © The Author(s) 2017. Published by Oxford University Press on behalf of Nucleic Acids Research.


July 7, 2019

Unravelling the complete genome of Archangium gephyra DSM 2261T and evolutionary insights into myxobacterial chitinases.

Family Cystobacteraceae is a group of eubacteria within order Myxococcales and class Deltaproteobacteria that includes more than 20 species belonging to 6 genera, that is, Angiococcus, Archangium, Cystobacter, Hyalangium, Melittangium, and Stigmatella. Earlier these members have been classified based on chitin degrading efficiency such as Cystobacter fuscus and Stigmatella aurantiaca, which are efficient chitin degraders, C. violaceus a partial chitin degrader and Archangium gephyra a chitin nondegrader. Here we report the 12.5 Mbp complete genome of A. gephyra DSM 2261T and compare it with four available genomes within the family Cystobacteraceae. Phylogeny and DNA-DNA hybridization studies reveal that A. gephyra is closest to Angiococcus disciformis, C. violaceus and C. ferrugineus, which are partial chitin degraders of the family Cystobacteraceae. Homology studies reveal the conservation of approximately half of the proteins in these genomes, with about 15% unique proteins in each genome. The total carbohydrate-active enzymes (CAZome) analysis reveals the presence of one GH18 chitinase in the A. gephyra genome whereas eight copies are present in C. fuscus and S. aurantiaca. Evolutionary studies of myxobacterial GH18 chitinases reveal that most of them are likely related to Terrabacteria and Proteobacteria whereas the Archangium GH18 homolog shares maximum similarity with those of chitin nondegrading Acidobacteria. © The Author(s) 2017. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.


July 7, 2019

Evidence for contemporary switching of the O-antigen gene cluster between Shiga toxin-producing Escherichia coli strains colonizing cattle.

Shiga toxin-producing Escherichia coli (STEC) comprise a group of zoonotic enteric pathogens with ruminants, especially cattle, as the main reservoir. O-antigens are instrumental for host colonization and bacterial niche adaptation. They are highly immunogenic and, therefore, targeted by the adaptive immune system. The O-antigen is one of the most diverse bacterial cell constituents and variation not only exists between different bacterial species, but also between individual isolates/strains within a single species. We recently identified STEC persistently infecting cattle and belonging to the different serotypes O156:H25 (n = 21) and O182:H25 (n = 15) that were of the MLST sequence types ST300 or ST688. These STs differ by a single nucleotide in purA only. Fitness-, virulence-associated genome regions, and CRISPR/CAS (clustered regularly interspaced short palindromic repeats/CRISPR associated sequence) arrays of these STEC O156:H25 and O182:H25 isolates were highly similar, and identical genomic integration sites for the stx converting bacteriophages and the core LEE, identical Shiga toxin converting bacteriophage genes for stx1a, identical complete LEE loci, and identical sets of chemotaxis and flagellar genes were identified. In contrast to this genomic similarity, the nucleotide sequences of the O-antigen gene cluster (O-AGC) regions between galF and gnd and very few flanking genes differed fundamentally and were specific for the respective serotype. Sporadic aEPEC O156:H8 isolates (n = 5) were isolated in temporal and spatial proximity. While the O-AGC and the corresponding 5′ and 3′ flanking regions of these aEPEC isolates were identical to the respective region in the STEC O156:H25 isolates, the core genome, the virulence associated genome regions and the CRISPR/CAS elements differed profoundly. Our cumulative epidemiological and molecular data suggest a recent switch of the O-AGC between isolates with O156:H8 strains having served as DNA donors. Such O-antigen switches can affect the evaluation of a strain’s pathogenic and virulence potential, suggesting that NGS methods might lead to a more reliable risk assessment.


July 7, 2019

A novel inversion in the chloroplast genome of marama (Tylosema esculentum).

Tylosema esculentum (marama bean) is being developed as a possible crop for resource-poor farmers in arid regions of Southern Africa. As part of the molecular characterization of this species, the chloroplast genome has been assembled from next-generation sequencing using both Illumina and Pac-Bio data. The genome is of typical organization with a large single-copy region and a small single-copy region separated by a pair of inverted repeats and covers 161537 bp. It contains a unique inversion not present in any other legumes, even in the closest relatives for which the complete chloroplast genome is available, and two complete copies of the ycf1 gene. These data extend the range of variability of legume chloroplast genomes. The sequencing of multiple individuals has identified two different chloroplast genomes which were geographically separated. The current sampling is limited so that the extent of the intraspecific variation is still to be determined, leaving open the question of legume chloroplast genomes adapted to particular arid environments. © The Author 2017. Published by Oxford University Press on behalf of the Society for Experimental Biology.

