Clostridium spp. can synthesize valuable chemicals and fuels by utilizing diverse waste-stream substrates, including starchy biomass, lignocellulose, and industrial waste gases. However, metabolic engineering in Clostridium spp. is challenging due to the low efficiency of gene transfer and genomic integration of entire biosynthetic pathways. We have developed a reliable gene transfer and genomic integration system for the syngas-fermenting bacterium Clostridium ljungdahlii based on the conjugal transfer of donor plasmids containing large transgene cassettes (> 5 kb) followed by the inducible activation of Himar1 transposase to promote integration. We established a conjugation protocol for the efficient generation of transconjugants using the Gram-positive origins of…
The human genome contains “dark” gene regions that cannot be adequately assembled or aligned using standard short-read sequencing technologies, preventing researchers from identifying mutations within these gene regions that may be relevant to human disease. Here, we identify regions with few mappable reads that we call dark by depth, and others that have ambiguous alignment, called camouflaged. We assess how well long-read or linked-read technologies resolve these regions. Based on standard whole-genome Illumina sequencing data, we identify 36,794 dark regions in 6054 gene bodies from pathways important to human health, development, and reproduction. Of these gene bodies, 8.7% are completely dark…
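The distinction between the two kinds of dark region can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the `Window` type, the `classify` function, and all three thresholds are hypothetical and chosen only for illustration. It assumes each genomic window has already been summarized as a total read depth and a count of reads with low mapping quality.

```python
from dataclasses import dataclass

# Hypothetical thresholds for illustration only; the abstract does not state any.
MAX_DARK_DEPTH = 5        # at most this many reads: "dark by depth"
LOW_MAPQ_FRACTION = 0.9   # at least this share of low-MAPQ reads: "camouflaged"


@dataclass
class Window:
    """Per-window summary; both fields are assumed to be precomputed."""
    depth: int            # total reads overlapping the window
    low_mapq_reads: int   # reads whose mapping quality is below a chosen cutoff


def classify(window: Window) -> str:
    """Label a window as dark by depth, camouflaged, or normal."""
    # Too few reads map here at all.
    if window.depth <= MAX_DARK_DEPTH:
        return "dark_by_depth"
    # Reads are present, but most of them align ambiguously.
    if window.low_mapq_reads / window.depth >= LOW_MAPQ_FRACTION:
        return "camouflaged"
    return "normal"
```

For example, under these assumed thresholds a window with 3 reads would be labeled dark by depth, while a window with 40 reads of which 38 have low mapping quality would be labeled camouflaged.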
Tandemly repeated DNA is highly mutable and causes at least 31 diseases, but it is hard to detect pathogenic repeat expansions genome-wide. Here, we report robust detection of human repeat expansions from careful alignments of long but error-prone (PacBio and nanopore) reads to a reference genome. Our method is robust to systematic sequencing errors, inexact repeats with fuzzy boundaries, and low sequencing coverage. By comparing to healthy controls, we prioritize pathogenic expansions within the top 10 out of 700,000 tandem repeats in whole genome sequencing data. This may help to elucidate the many genetic diseases whose causes remain unknown.
Species of Populus section Leuce are distributed throughout most parts of the Northern Hemisphere and have important economic and ecological significance. However, due to frequent hybridization within Leuce, the phylogenetic relationship between species has not been clarified. The chloroplast (cp) genome is characterized by maternal inheritance and relatively conservative mutation rates; thus, it is a powerful tool for building phylogenetic trees. In this study, we used the PacBio SEQUEL software to determine that the cp genome of Populus tomentosa has a length of 156,558 bp including a large single-copy region (84,717 bp), a small single-copy region (16,555 bp), and a…
Transposable elements (TEs) are genomic parasites with major impacts on host genome architecture and host adaptation. A proper evaluation of their evolutionary significance has been hampered by the paucity of short scale phylogenetic comparisons between closely related species. Here, we characterized the dynamics of TE accumulation at the micro-evolutionary scale by comparing two closely related plant species, Arabidopsis lyrata and A. halleri. Joint genome annotation in these two outcrossing species confirmed that both contain two distinct populations of TEs with either ‘recent’ or ‘old’ insertion histories. Identification of rare segregating insertions suggests that diverse TE families contribute to the ongoing dynamics…
Vertebrate genomes contain a record of retroviruses that invaded the germlines of ancestral hosts and are passed to offspring as endogenous retroviruses (ERVs). ERVs can impact host function since they contain the necessary sequences for expression within the host. Dogs are an important system for the study of disease and evolution, yet no substantiated reports of infectious retroviruses in dogs exist. Here, we utilized Illumina whole genome sequence data to assess the origin and evolution of a recently active gammaretroviral lineage in domestic and wild canids. We identified numerous recently integrated loci of a canid-specific ERV-Fc sublineage within Canis, including 58…