DNA is under constant stress from both endogenous and exogenous sources. DNA base modifications resulting from various types of DNA damage are widespread and play important roles in affecting physiological states and disease phenotypes. Examples include oxidative damage (8-oxoguanine, 8-oxoadenine; aging, Alzheimer’s, Parkinson’s), alkylation (1-methyladenine, 6-O-methylguanine; cancer), adduct formation (benzo[a]pyrene diol epoxide (BPDE), pyrimidine dimers; smoking, industrial chemical exposure, chemical UV light exposure, cancer), and ionizing radiation damage (5-hydroxycytosine, 5-hydroxyuracil, 5-hydroxymethyluracil; cancer). Currently, these and other products of DNA damage cannot be sequenced with existing sequencing methods. In contrast, single molecule, real-time (SMRT) DNA sequencing can report on modified DNA bases through an analysis of the DNA polymerase kinetics, which are affected by a modified base in the template. We demonstrate the DNA strand-resolved sequencing of over 8 different DNA damage-associated base modifications, with base pair resolution and single DNA molecule sensitivity. We also report on the application of this sequencing capability to biological samples and the development of a generic, open-source algorithm to analyze kinetic information from SMRT sequencing.
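The abstract above does not spell out how the kinetic signal is analyzed. One widely used summary of polymerase kinetics in SMRT data is the interpulse duration (IPD) ratio: the mean IPD at a template position in the native sample divided by the mean IPD at the same position in an unmodified control. The sketch below illustrates that idea only; the function names, the threshold of 2.0, and the toy IPD values are assumptions made for illustration and are not taken from the authors' algorithm.

```python
from statistics import mean


def ipd_ratios(native, control):
    """Return {position: mean native IPD / mean control IPD}.

    `native` and `control` map a template position to a list of IPD
    observations (seconds). Positions missing from either input, or
    with a zero control mean, are skipped.
    """
    ratios = {}
    for pos, native_ipds in native.items():
        control_ipds = control.get(pos)
        if not native_ipds or not control_ipds:
            continue
        control_mean = mean(control_ipds)
        if control_mean == 0:
            continue
        ratios[pos] = mean(native_ipds) / control_mean
    return ratios


def flag_positions(ratios, threshold=2.0):
    """Return sorted positions whose IPD ratio meets the (assumed) threshold."""
    return sorted(pos for pos, ratio in ratios.items() if ratio >= threshold)


# Toy data (hypothetical): the polymerase pauses at position 11 in the native sample.
native = {10: [0.4, 0.5, 0.6], 11: [2.0, 2.4, 2.2], 12: [0.5, 0.5]}
control = {10: [0.5, 0.5, 0.5], 11: [0.5, 0.6, 0.55], 12: [0.5, 0.5]}

ratios = ipd_ratios(native, control)
flagged = flag_positions(ratios)
print(flagged)
```

A real analysis would model the expected IPD from sequence context and apply a statistical test per strand and position rather than a fixed cutoff; the fixed threshold here only keeps the example short.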
A novel Gram-stain-positive, motile, white-colored, endospore-forming bacterium, designated 18JY67-1T, was isolated from soil in Jeju Island, Korea. The strain grew at 15-42 °C (optimum 30 °C) in R2A medium at pH 6.0-9.5 (optimum 7.5). Phylogenetic analysis based on 16S rRNA gene sequences indicated that strain 18JY67-1T formed a distinct lineage within the family Paenibacillaceae (order Bacillales, class Bacilli), and was closely related to Paenibacillus rhizoryzae (KP675984; 96.9% 16S rRNA gene sequence similarity). The major cellular fatty acids of strain 18JY67-1T were C16:0 and anteiso-C15:0. The predominant respiratory quinone was MK-7. The major polar lipid was identified as diphosphatidylglycerol. Phenotypic, chemotaxonomic and genotypic properties clearly indicated that isolate 18JY67-1T represents a novel species within the genus Paenibacillus, for which the name Paenibacillus flavus sp. nov. is proposed. The type strain of Paenibacillus flavus is 18JY67-1T (= KCTC 33959T = JCM 33184T).
Over the past decade, RNA sequencing (RNA-seq) has become an indispensable tool for transcriptome-wide analysis of differential gene expression and differential splicing of mRNAs. However, as next-generation sequencing technologies have developed, so too has RNA-seq. Now, RNA-seq methods are available for studying many different aspects of RNA biology, including single-cell gene expression, translation (the translatome) and RNA structure (the structurome). Exciting new applications are being explored, such as spatial transcriptomics (spatialomics). Together with new long-read and direct RNA-seq technologies and better computational tools for data analysis, innovations in RNA-seq are contributing to a fuller understanding of RNA biology, from questions such as when and where transcription occurs to the folding and intermolecular interactions that govern RNA function.
Harnessing long-read amplicon sequencing to uncover NRPS and Type I PKS gene sequence diversity in polar desert soils.
The severity of environmental conditions at Earth’s frigid zones presents attractive opportunities for microbial biomining due to their heightened potential as reservoirs for novel secondary metabolites. Arid soil microbiomes within the Antarctic and Arctic circles are remarkably rich in Actinobacteria and Proteobacteria, bacterial phyla known to be prolific producers of natural products. Yet the diversity of secondary metabolite genes within these cold, extreme environments remains largely unknown. Here, we employed amplicon sequencing using PacBio RS II, a third generation long-read platform, to survey over 200 soils spanning twelve east Antarctic and high Arctic sites for natural product-encoding genes, specifically targeting non-ribosomal peptides (NRPS) and Type I polyketides (PKS). NRPS-encoding genes were more widespread across the Antarctic, whereas PKS genes were only recoverable from a handful of sites. Many recovered sequences were deemed novel due to their low amino acid sequence similarity to known protein sequences, particularly throughout the east Antarctic sites. Phylogenetic analysis revealed that a high proportion were most similar to antifungal and biosurfactant-type clusters. Multivariate analysis showed that soil fertility factors of carbon, nitrogen and moisture displayed significant negative relationships with natural product gene richness. Our combined results suggest that secondary metabolite production is likely to be an important physiological component of survival for microorganisms inhabiting arid, nutrient-starved soils. © FEMS 2019.
Genomic and transcriptomic insights into the survival of the subaerial cyanobacterium Nostoc flagelliforme in arid and exposed habitats.
The cyanobacterium Nostoc flagelliforme is an extremophile that thrives under extraordinary desiccation and ultraviolet (UV) radiation conditions. To investigate its survival strategies, we performed whole-genome sequencing of N. flagelliforme CCNUN1 and transcriptional profiling of its field populations upon rehydration in BG11 medium. The genome of N. flagelliforme is 10.23 Mb in size and contains 10 825 predicted protein-encoding genes, making it one of the largest complete genomes of cyanobacteria reported to date. Comparative genomics analysis among 20 cyanobacterial strains revealed that genes related to DNA replication, recombination and repair had disproportionately high contributions to the genome expansion. The ability of N. flagelliforme to thrive under extreme abiotic stresses is supported by the acquisition of genes involved in the protection of photosynthetic apparatus, the formation of monounsaturated fatty acids, responses to UV radiation, and a peculiar role of ornithine metabolism. Transcriptome analysis revealed a distinct acclimation strategy to rehydration, including the strong constitutive expression of genes encoding photosystem I assembly factors and the involvement of post-transcriptional control mechanisms of photosynthetic resuscitation. Our results provide insights into the adaptive mechanisms of subaerial cyanobacteria in their harsh habitats and have important implications for understanding the evolutionary transition of cyanobacteria from aquatic environments to terrestrial ecosystems. © 2019 Society for Applied Microbiology and John Wiley & Sons Ltd.
Genome of Crucihimalaya himalaica, a close relative of Arabidopsis, shows ecological adaptation to high altitude.
Crucihimalaya himalaica, a close relative of Arabidopsis and Capsella, grows on the Qinghai-Tibet Plateau (QTP) about 4,000 m above sea level and represents an attractive model system for studying speciation and ecological adaptation in extreme environments. We assembled a draft genome sequence of 234.72 Mb encoding 27,019 genes and investigated its origin and adaptive evolutionary mechanisms. Phylogenomic analyses based on 4,586 single-copy genes revealed that C. himalaica is most closely related to Capsella (estimated divergence 8.8 to 12.2 Mya), whereas both species form a sister clade to Arabidopsis thaliana and Arabidopsis lyrata, from which they diverged between 12.7 and 17.2 Mya. LTR retrotransposons in C. himalaica proliferated shortly after the dramatic uplift and climatic change of the Himalayas from the Late Pliocene to Pleistocene. Compared with closely related species, C. himalaica showed significant contraction and pseudogenization in gene families associated with disease resistance and also significant expansion in gene families associated with ubiquitin-mediated proteolysis and DNA repair. We identified hundreds of genes involved in DNA repair, ubiquitin-mediated proteolysis, and reproductive processes with signs of positive selection. Gene families showing dramatic changes in size and genes showing signs of positive selection are likely candidates for C. himalaica’s adaptation to intense radiation, low temperature, and pathogen-depauperate environments in the QTP. Loss of function at the S-locus, the reason for the transition to self-fertilization of C. himalaica, might have enabled its QTP occupation. Overall, the genome sequence of C. himalaica provides insights into the mechanisms of plant adaptation to extreme environments. Copyright © 2019 the Author(s). Published by PNAS.
Complete genome sequence of Pseudomonas frederiksbergensis ERDD5:01 revealed genetic bases for survivability at high altitude ecosystem and bioprospection potential.
Pseudomonas frederiksbergensis ERDD5:01 is a psychrotrophic bacterium isolated from the glacial stream flowing from East Rathong glacier in Sikkim Himalaya. The strain survived high-altitude stress conditions such as freezing, frequent freeze-thaw cycles, and UV-C radiation. The complete genome, comprising a circular chromosome of 5,746,824 bp and a plasmid of 371,027 bp, was sequenced to understand the genetic basis of its survival strategy. Multiple copies of cold-associated genes involved in cold-active chaperones, general stress response, osmotic stress, oxidative stress, membrane/cell wall alteration, carbon storage/starvation, and DNA repair mechanisms supported its survivability under extreme cold and radiation, corroborating the bacterial physiological findings. The molecular cold adaptation analysis in comparison with the genomes of 15 mesophilic Pseudomonas species revealed functional insight into the strategies of cold adaptation. The genomic data also revealed the presence of industrially important enzymes. Copyright © 2018 Elsevier Inc. All rights reserved.
RADAR-seq: A RAre DAmage and Repair sequencing method for detecting DNA damage on a genome-wide scale.
RAre DAmage and Repair sequencing (RADAR-seq) is a highly adaptable sequencing method that enables the identification and detection of rare DNA damage events for a wide variety of DNA lesions at single-molecule resolution on a genome-wide scale. In RADAR-seq, DNA lesions are replaced with a patch of modified bases that can be directly detected by Pacific Biosciences Single Molecule Real-Time (SMRT) sequencing. RADAR-seq enables dynamic detection over a wide range of DNA damage frequencies, including low physiological levels. Furthermore, without the need for DNA amplification and enrichment steps, RADAR-seq provides sequencing coverage of damaged and undamaged DNA across an entire genome. Here, we use RADAR-seq to measure the frequency and map the location of ribonucleotides in wild-type and RNaseH2-deficient Escherichia coli (E. coli) and Thermococcus kodakarensis strains. Additionally, by tracking ribonucleotides incorporated during in vivo lagging strand DNA synthesis, we determined the replication initiation point in E. coli, and its relation to the origin of replication (oriC). RADAR-seq was also used to map cyclobutane pyrimidine dimers (CPDs) in E. coli genomic DNA exposed to UV radiation. On a broader scale, RADAR-seq can be applied to understand formation and repair of DNA damage, the correlation between DNA damage and disease initiation and progression, and complex biological pathways, including DNA replication. Copyright © 2019 The Authors. Published by Elsevier B.V. All rights reserved.
Chromosome-level genome assembly of Triplophysa tibetana, a fish adapted to the harsh high-altitude environment of the Tibetan Plateau.
Triplophysa is an endemic fish genus of the Tibetan Plateau in China. Triplophysa tibetana, which lives at a recorded altitude of ~4,000 m and plays an important role in the highland aquatic ecosystem, serves as an excellent model for investigating high-altitude environmental adaptation. However, evolutionary and conservation studies of T. tibetana have been limited by scarce genomic resources for the genus Triplophysa. In the present study, we applied PacBio sequencing and the Hi-C technique to assemble the T. tibetana genome. A 652-Mb genome comprising 1,325 contigs with an N50 length of 3.1 Mb was obtained. Of these, 1,137 contigs were further assembled into 25 chromosomes, representing 98.7% and 80.47% of all contigs at the base and sequence number level, respectively. Approximately 260 Mb of sequence, accounting for ~39.8% of the genome, was identified as repetitive elements. DNA transposons (16.3%), long interspersed nuclear elements (12.4%) and long terminal repeats (11.0%) were the most abundant repeat types. In total, 24,372 protein-coding genes were predicted in the genome, and ~95% of the genes were functionally annotated via a search in public databases. Using whole genome sequence information, we found that T. tibetana diverged from its common ancestor with Danio rerio ~121.4 million years ago. The high-quality genome assembled in this work not only provides a valuable genomic resource for future population and conservation studies of T. tibetana, but it also lays a solid foundation for further investigation into the mechanisms of environmental adaptation of endemic fishes in the Tibetan Plateau. © 2019 John Wiley & Sons Ltd.
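For readers unfamiliar with the contig N50 statistic quoted in the abstract above, it is the length of the shortest contig in the smallest set of longest contigs that together cover at least half of the total assembly length. The sketch below computes it under that standard definition; the function name and the toy contig lengths are illustrative assumptions and are not data from the T. tibetana assembly.

```python
def n50(contig_lengths):
    """Return the N50 of a list of contig lengths (in bases).

    Contigs are visited from longest to shortest; the N50 is the length
    of the contig at which the running total first reaches at least half
    of the summed length of all contigs.
    """
    if not contig_lengths:
        raise ValueError("contig_lengths must not be empty")
    total = sum(contig_lengths)
    running = 0
    for length in sorted(contig_lengths, reverse=True):
        running += length
        # Integer comparison avoids floating-point rounding on total / 2.
        if 2 * running >= total:
            return length


# Toy example (hypothetical lengths): total = 290, half = 145.
# 80 + 70 = 150 reaches the halfway point, so the N50 is 70.
toy_lengths = [30, 80, 20, 70, 40, 50]
print(n50(toy_lengths))
```

A larger N50 means that more of the assembly sits in long contigs, which is why the statistic is commonly reported alongside contig count and total assembly size.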
Complete Assembly of the Genome of an Acidovorax citrulli Strain Reveals a Naturally Occurring Plasmid in This Species.
Acidovorax citrulli is the causal agent of bacterial fruit blotch (BFB), a serious threat to cucurbit crop production worldwide. Based on genetic and phenotypic properties, A. citrulli strains are divided into two major groups: group I strains have been generally isolated from melon and other non-watermelon cucurbits, while group II strains are closely associated with watermelon. In a previous study, we reported the genome of the group I model strain, M6. At that time, the M6 genome was sequenced by Illumina MiSeq technology, with reads assembled into 139 contigs. Here, we report the assembly of the M6 genome following sequencing with PacBio technology. This approach not only allowed full assembly of the M6 genome, but it also revealed the occurrence of a ~53 kb plasmid. The M6 plasmid, named pACM6, was further confirmed by plasmid extraction, Southern-blot analysis of restricted fragments and generation of M6-derivative cured strains. pACM6 occurs at low copy numbers (average of ~4.1 ± 1.3 chromosome equivalents) in A. citrulli M6 and contains 63 open reading frames (ORFs), most of which (55.6%) encode hypothetical proteins. The plasmid contains several genes encoding type IV secretion components, and typical plasmid-borne genes involved in plasmid maintenance, replication and transfer. The plasmid also carries an operon encoding homologs of a Fic-VbhA toxin-antitoxin (TA) module. Transcriptome data from A. citrulli M6 revealed that, under the tested conditions, the genes encoding the components of this TA system are among the most highly expressed genes in pACM6. Whether this TA module plays a role in pACM6 maintenance is still to be determined. Leaf infiltration and seed transmission assays revealed that, under the tested conditions, the loss of pACM6 did not affect the virulence of A. citrulli M6. We also show that pACM6 or similar plasmids are present in several group I strains, but absent in all tested group II strains of A. citrulli.
Adaptive Strategies in a Poly-Extreme Environment: Differentiation of Vegetative Cells in Serratia ureilytica and Resistance to Extreme Conditions.
Poly-extreme terrestrial habitats are often used as analogs to extra-terrestrial environments. Understanding the adaptive strategies allowing bacteria to thrive and survive under these conditions could help in our quest for extra-terrestrial planets suitable for life and in understanding how life evolved in the harsh early Earth conditions. A prime example of such a survival strategy is the modification of vegetative cells into resistant resting structures. These differentiated cells are often observed in response to harsh environmental conditions. The environmental strain (strain Lr5/4) belonging to Serratia ureilytica was isolated from a geothermal spring in Lirima, Atacama Desert, Chile. The Atacama Desert is the driest habitat on Earth and, owing to its high altitude, is exposed to an increased amount of UV radiation. The geothermal spring from which the strain was isolated is oligotrophic, and its temperature of 54°C exceeds mesophilic conditions (15 to 45°C). Although the vegetative cells were tolerant to various environmental insults (desiccation, extreme pH, glycerol), a modified cell type was formed in response to nutrient deprivation, UV radiation and thermal shock. Scanning (SEM) and Transmission Electron Microscopy (TEM) analyses of vegetative cells and the modified cell structures were performed. In SEM, a change toward a circular shape with reduced size was observed. Under TEM, these circular cells possessed what appear to be extra coating layers. The resistance of the modified cells was also investigated: they were resistant to wet heat, UV radiation and desiccation, while vegetative cells did not withstand any of those conditions. A phylogenomic analysis was undertaken to investigate the presence of known genes involved in dormancy in other bacterial clades. Genes related to spore formation in Myxococcus and Firmicutes were found in the S. ureilytica Lr5/4 genome; however, these genes were not sufficient for a full sporulation pathway resembling that of either group. Although the molecular pathway of cell differentiation in S. ureilytica Lr5/4 is not fully defined, the identified genes may contribute to the modified phenotype in the Serratia genus. Here, we show that a modified cell structure can occur as a response to extreme conditions in a species that was previously not known to deploy this strategy. This strategy may be widespread in bacteria, but only expressed under poly-extreme environmental conditions.