Propionic acid has traditionally been derived from fossil fuels, but its biological production has recently gained interest. Propionibacterium species produce propionic acid as their main fermentation product. Production of other organic acids reduces propionic acid yield and productivity, pointing to knockout of by-product genes as a logical strategy to increase yield. However, removing by-product formation has seen limited success due to our inability to genetically engineer the best producing strains (i.e. Propionibacterium acidipropionici). To overcome this limitation, random mutagenesis continues to be the best path towards improving strains for biological propionic acid production. Recent advances in next generation sequencing have opened new avenues to understand improved strains. In this work, we use genome shuffling on two wild type strains to generate a better propionic acid producing strain. Using next generation sequencing, we mapped the genomic changes leading to the improved phenotype. The best strain produced 25% more propionic acid than the wild type strain. Sequencing of the strains showed that genomic changes were restricted to single point mutations and gene duplications in well-conserved regions in the genomes. Such results confirm the involvement of gene conversion in genome shuffling as opposed to long genomic insertions. © 2016 The Authors. Biotechnology Journal published by WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
Draft genome sequence of Sulfurospirillum sp. strain MES, reconstructed from the metagenome of a microbial electrosynthesis system.
A draft genome of Sulfurospirillum sp. strain MES was isolated through taxonomic binning of a metagenome sequenced from a microbial electrosynthesis system (MES) actively producing acetate and hydrogen. The genome contains the nosZDFLY genes, which are involved in nitrous oxide reduction, suggesting the potential role of this strain in denitrification. Copyright © 2015 Ross et al.
The methylome of the gut microbiome: disparate Dam methylation patterns in intestinal Bacteroides dorei
Despite the large interest in the human microbiome in recent years, there are no reports of bacterial DNA methylation in the microbiome. Here, metagenomic sequencing using the Pacific Biosciences platform allowed for rapid identification of the GATC methylation status of a bacterial species in human stool samples. For this work, two stool samples were chosen that were dominated by a single species, Bacteroides dorei. Based on 16S rRNA analysis, this species represented over 45% of the bacteria present in these two samples. The B. dorei genome sequence from these samples was determined and the GATC methylation sites mapped. The B. dorei genome from one subject lacked any GATC methylation and lacked the DNA adenine methyltransferase genes. In contrast, B. dorei from another subject contained 20,551 methylated GATC sites. Of the 4970 open reading frames identified in the GATC methylated B. dorei genome, 3184 were methylated, and a further 1735 GATC methylations were found in intergenic regions. These results suggest that DNA methylation patterns are important to consider in multi-omic analyses of microbiome samples seeking to discover the diversity of bacterial functions and may differ between disease states.
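Mapping GATC methylation starts from knowing where GATC motifs occur in the genome. The following is a minimal Python sketch of that first step only, locating candidate GATC sites in a sequence. The function name `gatc_sites` and the toy sequence are illustrative assumptions and are not taken from the study. Whether a given site is actually methylated cannot be read from the sequence itself; it has to come from the sequencing data.

```python
def gatc_sites(sequence):
    """Return 0-based start positions of every GATC motif in a DNA sequence.

    GATC is its own reverse complement, so scanning one strand
    finds the candidate sites on both strands.
    """
    seq = sequence.upper()
    positions = []
    start = seq.find("GATC")
    while start != -1:
        positions.append(start)
        start = seq.find("GATC", start + 1)
    return positions


# Toy sequence (hypothetical, for illustration only)
toy_sequence = "ttGATCaaGATCGATC"
print(gatc_sites(toy_sequence))  # [2, 8, 12]
```

Such a list of candidate positions could then be compared against per-base modification calls to count methylated and unmethylated sites.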
A report on the International Plant and Animal Genomes (PAG) conference held in San Diego, USA, 13-17 January 2018.
Most current approaches to analyse metagenomic data rely on reference genomes. Novel microbial communities extend far beyond the coverage of reference databases and de novo metagenome assembly from complex microbial communities remains a great challenge. Here we present a novel experimental and bioinformatic framework, metaSort, for effective construction of bacterial genomes from metagenomic samples. MetaSort provides a sorted mini-metagenome approach based on flow cytometry and single-cell sequencing methodologies, and employs new computational algorithms to efficiently recover high-quality genomes from the sorted mini-metagenome, using the original metagenome as a complement. Through extensive evaluations, we demonstrated that metaSort has an excellent and unbiased performance on genome recovery and assembly. Furthermore, we applied metaSort to an unexplored microflora colonizing the surface of marine kelp and successfully recovered 75 high-quality genomes at one time. This approach will greatly improve access to microbial genomes from complex or novel communities.
Alternative polyadenylation (APA), a phenomenon in which RNA molecules with different 3′ ends originate from distinct polyadenylation sites of a single gene, is emerging as a mechanism widely used to regulate gene expression. In the present review, we first summarize the methods most commonly adopted in APA studies, focusing mainly on the next-generation sequencing (NGS)-based techniques specially designed for APA identification, the related bioinformatics methods, and the strategies for APA study in single cells. We then summarize the main findings and advances so far based on these methods, including the preferences of alternative polyA (pA) sites, the biological processes involved, and the corresponding consequences. We especially categorize the APA changes discovered so far and discuss their potential functions under given conditions, along with the possible underlying molecular mechanisms. With more in-depth studies on extensive samples, more signatures and functions of APA will be revealed, and its diverse roles will gradually come into view. Copyright © 2017 The Authors. Production and hosting by Elsevier B.V. All rights reserved.
Parallel sequencing of a single cell’s genome and transcriptome provides a powerful tool for dissecting genetic variation and its relationship with gene expression. Here we present a detailed protocol for G&T-seq, a method for separation and parallel sequencing of genomic DNA and full-length polyA(+) mRNA from single cells. We provide step-by-step instructions for the isolation and lysis of single cells; the physical separation of polyA(+) mRNA from genomic DNA using a modified oligo-dT bead capture and the respective whole-transcriptome and whole-genome amplifications; and library preparation and sequence analyses of these amplification products. The method allows the detection of thousands of transcripts in parallel with the genetic variants captured by the DNA-seq data from the same single cell. G&T-seq differs from other currently available methods for parallel DNA and RNA sequencing from single cells, as it involves physical separation of the DNA and RNA and does not require bespoke microfluidics platforms. The process can be implemented manually or through automation. When performed manually, paired genome and transcriptome sequencing libraries from eight single cells can be produced in ~3 d by researchers experienced in molecular laboratory work. For users with experience in the programming and operation of liquid-handling robots, paired DNA and RNA libraries from 96 single cells can be produced in the same time frame. Sequence analysis and integration of single-cell G&T-seq DNA and RNA data requires a high level of bioinformatics expertise and familiarity with a wide range of informatics tools.
While it has long been thought that all genomic novelties are derived from existing material, many genes lacking homology to known genes were found in recent genome projects. Some of these novel genes were proposed to have evolved de novo, i.e., out of noncoding sequences, whereas some have been shown to follow a duplication and divergence process. Their discovery called for an extension of the historical hypotheses about gene origination. Besides the theoretical breakthrough, increasing evidence has accumulated that novel genes play important roles in evolutionary processes, including adaptation and speciation events. Different techniques are available to identify genes and classify them as novel. Their classification as novel is usually based on their similarity to known genes, or lack thereof, detected by comparative genomics or by searches against databases. Computational approaches are further prime methods, which can be based on existing models or can leverage biological evidence from experiments. Identification of novel genes remains, however, a challenging task. With constant updates to software and technologies, no gold standard, and no available benchmark, evaluation and characterization of genomic novelty is a vibrant field. In this review, the classical and state-of-the-art tools for gene prediction are introduced. The current methods for novel gene detection are presented; the methodological strategies and their limits are discussed along with prospective approaches for further studies.
Evaluating the mobility potential of antibiotic resistance genes in environmental resistomes without metagenomics.
Antibiotic resistance genes are ubiquitous in the environment. However, only a fraction of them are mobile and able to spread to pathogenic bacteria. Until now, studying the mobility of antibiotic resistance genes in environmental resistomes has been challenging due to the inadequate sensitivity of metagenome-based methods and their difficulties with contig assembly. We developed a new cost- and labor-efficient method based on Inverse PCR and long read sequencing for studying the mobility potential of environmental resistance genes. We applied Inverse PCR to sediment samples and identified 79 different MGE clusters associated with the studied resistance genes, including novel mobile genetic elements, co-selected resistance genes and a new putative antibiotic resistance gene. The results show that the method can be used in antibiotic resistance early warning systems. In comparison to metagenomics, Inverse PCR was markedly more sensitive and provided more data on resistance gene mobility and co-selected resistances.
A response to Lindsey et al. “Wolbachia pipientis should not be split into multiple species: A response to Ramírez-Puebla et al.”.
In Ramírez-Puebla et al. we compared 34 Wolbachia genomes and constructed phylogenetic trees using genomic data. In general, our results were congruent with previously reported phylogenetic trees [5,9]. Our datasets were carefully selected, checked and analyzed avoiding horizontally transferred genes. In the case of the wAna genome we did not use the raw data, but the assembled genome and 31 genes were used to compare in a dataset of conserved proteins. To confirm our conclusions a new phylogenomic analysis was performed excluding the wAna strain in the dataset (Fig. 1). The same topology was obtained, therefore indicating that the results were not affected by the presence of this particular strain.
Candidatus Dactylopiibacterium carminicum, a nitrogen-fixing symbiont of Dactylopius cochineal insects (Hemiptera: Coccoidea: Dactylopiidae)
The domesticated carmine cochineal Dactylopius coccus (scale insect) has commercial value and has been used for more than 500 years for natural red pigment production. Besides the domesticated cochineal, other wild Dactylopius species such as Dactylopius opuntiae are found in the Americas, all feeding on nutrient-poor sap from native cacti. To compensate for nutritional deficiencies, many insects harbor symbiotic bacteria which provide essential amino acids or vitamins to their hosts. Here, we characterized a symbiont from the carmine cochineal insects, Candidatus Dactylopiibacterium carminicum (betaproteobacterium, Rhodocyclaceae family) and found it in D. coccus and in D. opuntiae ovaries by fluorescent in situ hybridization, suggesting maternal inheritance. Bacterial genomes recovered from metagenomic data derived from whole insects or tissues both from D. coccus and from D. opuntiae were around 3.6 Mb in size. Phylogenomics showed that dactylopiibacteria constituted a closely related clade neighbor to nitrogen-fixing bacteria from soil or from various plants including rice and other grass endophytes. Metabolic capabilities were inferred from genomic analyses, showing a complete operon for nitrogen fixation, biosynthesis of amino acids and vitamins and putative traits of anaerobic or microoxic metabolism as well as genes for plant interaction. Dactylopiibacterium nif gene expression and acetylene reduction activity detecting nitrogen fixation were evidenced in D. coccus hemolymph and ovaries, in congruence with the endosymbiont fluorescent in situ hybridization location. Dactylopiibacterium symbionts may compensate for the nitrogen deficiency in the cochineal diet. In addition, this symbiont may provide essential amino acids, recycle uric acid, and increase the cochineal life span.
Complex rearrangements and oncogene amplifications revealed by long-read DNA and RNA sequencing of a breast cancer cell line.
The SK-BR-3 cell line is one of the most important models for HER2+ breast cancers, which affect one in five breast cancer patients. SK-BR-3 is known to be highly rearranged, although much of the variation is in complex and repetitive regions that may be underreported. Addressing this, we sequenced SK-BR-3 using long-read single molecule sequencing from Pacific Biosciences and developed one of the most detailed maps of structural variations (SVs) in a cancer genome available, with nearly 20,000 variants present, most of which were missed by short-read sequencing. Surrounding the important ERBB2 oncogene (also known as HER2), we discovered a complex sequence of nested duplications and translocations, suggesting a punctuated progression. Full-length transcriptome sequencing further revealed several novel gene fusions within the nested genomic variants. Combining long-read genome and transcriptome sequencing enables an in-depth analysis of how SVs disrupt the genome and sheds new light on the complex mechanisms involved in cancer genome evolution. © 2018 Nattestad et al.; Published by Cold Spring Harbor Laboratory Press.
Single cell genomic study of Dehalococcoidetes species from deep-sea sediments of the Peruvian Margin.
The phylum Chloroflexi is one of the most frequently detected phyla in the subseafloor of the Pacific Ocean margins. Dehalogenating Chloroflexi (Dehalococcoidetes) were originally discovered as the key microorganisms mediating reductive dehalogenation via their key enzymes, reductive dehalogenases (Rdh), as their sole mode of energy conservation in terrestrial environments. The frequent detection of Dehalococcoidetes-related 16S rRNA and rdh genes in the marine subsurface implies a role for dissimilatory dehalorespiration in this environment; however, the two genes have never been linked to each other. To provide fundamental insights into the metabolism, genomic population structure and evolution of marine subsurface Dehalococcoidetes sp., we used a single cell genomic approach to analyze a non-contaminated deep-sea sediment core sample from the Peruvian Margin Ocean Drilling Program (ODP) site 1230, collected 7.3 m below the seafloor. We present for the first time single cell genomic data on three deep-sea Chloroflexi (Dsc) single cells from a marine subsurface environment. Two of the single cells were considered to be part of a local Dehalococcoidetes population and assembled together into a 1.38-Mb genome, which appears to be at least 85% complete. Despite a high degree of sequence-level similarity between the shared proteins in the Dsc and terrestrial Dehalococcoidetes, no evidence for catabolic reductive dehalogenation was found in Dsc. The genome content is, however, consistent with a strictly anaerobic organotrophic or lithotrophic lifestyle.
Genetic studies of human evolution require high-quality contiguous ape genome assemblies that are not guided by the human reference. We coupled long-read sequence assembly and full-length complementary DNA sequencing with a multiplatform scaffolding approach to produce ab initio chimpanzee and orangutan genome assemblies. By comparing these with two long-read de novo human genome assemblies and a gorilla genome assembly, we characterized lineage-specific and shared great ape genetic variation ranging from single- to mega-base pair-sized variants. We identified ~17,000 fixed human-specific structural variants identifying genic and putative regulatory changes that have emerged in humans since divergence from nonhuman apes. Interestingly, these variants are enriched near genes that are down-regulated in human compared to chimpanzee cerebral organoids, particularly in cells analogous to radial glial neural progenitors. Copyright © 2018 The Authors, some rights reserved; exclusive licensee American Association for the Advancement of Science. No claim to original U.S. Government Works.
Long-read sequencing technologies enable high-quality, contiguous genome assemblies. Here we used SMRT sequencing to assemble the genome of a Drosophila simulans strain originating from Madagascar, the ancestral range of the species. We generated 8 Gb of raw data (~50x coverage) with a mean read length of 6,410 bp, an NR50 of 9,125 bp and the longest subread at 49 kb. We benchmarked six different assemblers and merged the best two assemblies, from Canu and Falcon. Our final assembly was 127.41 Mb with an N50 of 5.38 Mb and 305 contigs. We anchored more than 4 Mb of novel sequence to the major chromosome arms, and significantly improved the assembly of peri-centromeric and telomeric regions. Finally, we performed full-length transcript sequencing and used these data in conjunction with short-read RNAseq data to annotate 13,422 genes in the genome, improving the annotation in regions with complex, nested gene structures.
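The N50 quoted for this assembly is a contiguity statistic: the length of the contig at which contigs of that length or longer together cover at least half of the total assembly. A minimal Python sketch of the calculation follows. The function name `n50` and the toy contig lengths are illustrative assumptions, not code or data from the study.

```python
def n50(lengths):
    """Return the N50 of a collection of contig (or read) lengths.

    Contigs are visited from longest to shortest; the N50 is the length
    of the contig at which the running total first reaches at least half
    of the summed length of all contigs.
    """
    if not lengths:
        raise ValueError("need at least one length")
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length


# Toy contig lengths (hypothetical): total = 150, half = 75
toy_contigs = [50, 40, 30, 20, 10]
print(n50(toy_contigs))  # 40, because 50 + 40 = 90 >= 75
```

The same function applies to read lengths, so a read-length statistic can be computed in the same way from a list of read or subread lengths.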