N,N-Dimethylformamide (DMF) is one of the most common xenobiotic chemicals, and it can be easily emitted into the environment, where it causes harm to human beings. Herein, an efficient DMF-degrading strain, DM1, was isolated and identified as Methylobacterium sp. This strain can use DMF as the sole source of carbon and nitrogen. Whole-genome sequencing of strain DM1 revealed that it has a 5.66-Mbp chromosome and a 200-kbp megaplasmid. The plasmid pLVM1 specifically harbors the genes essential for the initial steps of DMF degradation, and the chromosome carries the genes facilitating subsequent methylotrophic metabolism. Through analysis of the transcriptome sequencing data, the complete mineralization pathway and redundant gene clusters of DMF degradation were elucidated. The dimethylformamidase (DMFase) gene was heterologously expressed, and DMFase was purified and characterized. Plasmid pLVM1 is catabolically crucial for DMF utilization, as evidenced by the phenotype identification of the plasmid-free strain. This study systematically elucidates the molecular mechanisms of DMF degradation by Methylobacterium. IMPORTANCE DMF is a hazardous pollutant that has been used in the chemical industry, pharmaceutical manufacturing, and agriculture. Biodegradation as a method for removing DMF has received increasing attention. Here, we identified an efficient DMF degrader, Methylobacterium sp. strain DM1, and characterized the complete DMF mineralization pathway and enzymatic properties of DMFase in this strain. This study provides insights into the molecular mechanisms and evolutionary advantage of DMF degradation facilitated by plasmid pLVM1 and redundant genes in strain DM1, suggesting the emergence of new ecotypes of Methylobacterium. Copyright © 2019 American Society for Microbiology.
Tandem repeat (TR) expansions have been implicated in dozens of genetic diseases, including Huntington’s Disease, Fragile X Syndrome, and hereditary ataxias. Furthermore, TRs have recently been implicated in a range of complex traits, including gene expression and cancer risk. While the human genome harbors hundreds of thousands of TRs, analysis of TR expansions has been mainly limited to known pathogenic loci. A major challenge is that expanded repeats are beyond the read length of most next-generation sequencing (NGS) datasets and are not profiled by existing genome-wide tools. We present GangSTR, a novel algorithm for genome-wide genotyping of both short and expanded TRs. GangSTR extracts information from paired-end reads into a unified model to estimate maximum likelihood TR lengths. We validate GangSTR on real and simulated data and show that GangSTR outperforms alternative methods in both accuracy and speed. We apply GangSTR to a deeply sequenced trio to profile the landscape of TR expansions in a healthy family and validate novel expansions using orthogonal technologies. Our analysis reveals that healthy individuals harbor dozens of long TR alleles not captured by current genome-wide methods. GangSTR will likely enable discovery of novel disease-associated variants not currently accessible from NGS. © The Author(s) 2019. Published by Oxford University Press on behalf of Nucleic Acids Research.
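The abstract above does not spell out GangSTR's model, so the following is only a toy illustration of what estimating a maximum likelihood TR length can look like in the simplest case. It assumes a single (haploid) allele, uses only reads that fully enclose the repeat, and applies a made-up geometric stutter error model; the function names `log_likelihood` and `ml_repeat_length` and the parameter `stutter_prob` are hypothetical and are not part of GangSTR.

```python
import math


def log_likelihood(allele, observed_counts, stutter_prob=0.1):
    """Log-likelihood of per-read repeat counts given a candidate allele length.

    Toy error model (an assumption, not GangSTR's): a read reports the true
    repeat count with probability 1 - stutter_prob; otherwise it is off by
    k >= 1 units, with the error mass split evenly between both directions
    and halving for each additional unit of error.
    """
    total = 0.0
    for count in observed_counts:
        diff = abs(count - allele)
        if diff == 0:
            p = 1.0 - stutter_prob
        else:
            p = stutter_prob * 0.5 * (0.5 ** diff)
        total += math.log(p)
    return total


def ml_repeat_length(observed_counts, min_len=1, max_len=100, stutter_prob=0.1):
    """Grid search for the allele length that maximizes the toy likelihood."""
    candidates = range(min_len, max_len + 1)
    return max(
        candidates,
        key=lambda allele: log_likelihood(allele, observed_counts, stutter_prob),
    )


# Hypothetical repeat counts read off six enclosing reads at one locus.
reads = [12, 12, 11, 12, 13, 12]
estimate = ml_repeat_length(reads)
```

This sketch cannot handle the case the abstract highlights, namely repeats longer than the read length, because no read would enclose them. According to the abstract, GangSTR addresses this by combining information from paired-end reads in a unified model, which this example does not attempt to reproduce.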
The wide implementation of next-generation sequencing (NGS) technologies has revolutionized the field of medical genetics. However, the short read lengths of currently used sequencing approaches pose a limitation for identification of structural variants, sequencing repetitive regions, phasing alleles and distinguishing highly homologous genomic regions. These limitations may significantly contribute to the diagnostic gap in patients with genetic disorders who have undergone standard NGS, like whole exome or even genome sequencing. Now, the emerging long-read sequencing (LRS) technologies may offer improvements in the characterization of genetic variation and regions that are difficult to assess with the currently prevailing NGS approaches. LRS has so far mainly been used to investigate genetic disorders with previously known or strongly suspected disease loci. While these targeted approaches already show the potential of LRS, it remains to be seen whether LRS technologies can soon enable true whole genome sequencing routinely. Ultimately, this could allow the de novo assembly of individual whole genomes used as a generic test for genetic disorders. In this article, we summarize the current LRS-based research on human genetic disorders and discuss the potential of these technologies to facilitate the next major advancements in medical genetics.