In this AGBT 2017 poster, the University of Helsinki’s Petri Auvinen reports on efforts to understand bacteria that grow on, and subsequently spoil, food. The analysis monitored DNA modifications and transcriptomic changes in three species of lactic acid bacteria. The scientists found that the organisms’ metabolic profiles change substantially when the species are grown together rather than cultured individually, and they are now studying how Cas protein activity changes under these conditions.
At AGBT 2017, Lars Paulin from the University of Helsinki presented this poster on whole genome sequencing of the virus responsible for progressive multifocal leukoencephalopathy, a rare and dangerous brain infection. His team used long amplicon analysis to resolve the whole virus genome from three patient samples, pooled them for SMRT Sequencing, and identified variants and rearrangements. This work represents the first time the viral genome was sequenced from patients.
Bipolar disorder (BD) is a phenotypically and genetically complex neurological disorder that affects 1% of the worldwide population. There is compelling evidence from family, twin and adoption studies supporting the involvement of a genetic predisposition, with estimated heritability up to ~80%. The risk in first-degree relatives is ten times higher than in the general population. Linkage and association studies have implicated multiple putative chromosomal loci for BD susceptibility; however, no disease genes have yet been identified. Here, we have fully characterized a ~12 Mb significantly linked (LOD score = 3.54) genomic region on chromosome Xq24-q27 in an extended family from…
Bipolar disorder (BD) is a phenotypically and genetically complex and debilitating neurological disorder that affects 1% of the worldwide population. There is compelling evidence from family, twin and adoption studies supporting the involvement of a genetic predisposition in BD, with estimated heritability up to ~80%. The risk in first-degree relatives is ten times higher than in the general population. Linkage and association studies have implicated multiple putative chromosomal loci for BD susceptibility; however, no disease genes have been identified to date.
The pathogenic extended-spectrum-beta-lactamase (ESBL)-producing Escherichia coli lineage ST648 is increasingly reported from multiple origins. Our study of a large and global ST648 collection from various hosts (87 whole-genome sequences) combining core and accessory genomics with functional analyses and in vivo experiments suggests that ST648 is a nascent and generalist lineage, lacking clear phylogeographic and host association signals. By including large numbers of ST131 (n = 107) and ST10 (n = 96) strains for comparative genomics and phenotypic analysis, we demonstrate that the combination of multidrug resistance and high-level virulence is the hallmark of ST648, similar to international high-risk clonal lineage ST131. Specifically, our in…
Unknown sequences, or gaps, are present in many published genomes across public databases. Gap filling is an important finishing step in de novo genome assembly, especially in large genomes. The gap filling problem is nontrivial and while there are many computational tools partially solving the problem, several have shortcomings as to the reliability and correctness of the output, i.e. the gap filled draft genome. SSPACE-LongRead is a scaffolding tool that utilizes long reads from multiple third-generation sequencing platforms in finding links between contigs and combining them. The long reads potentially contain sequence information to fill the gaps created in the…
Mushroom-forming fungi (Agaricomycetes) have the greatest morphological diversity and complexity of any group of fungi. They have radiated into most niches and fulfil diverse roles in the ecosystem, including wood decomposers, pathogens or mycorrhizal mutualists. Despite the importance of mushroom-forming fungi, large-scale patterns of their evolutionary history are poorly known, in part due to the lack of a comprehensive and dated molecular phylogeny. Here, using multigene and genome-based data, we assemble a 5,284-species phylogenetic tree and infer ages and broad patterns of speciation/extinction and morphological innovation in mushroom-forming fungi. Agaricomycetes started a rapid class-wide radiation in the Jurassic, coinciding with…
The human gut microbiome matures towards the adult composition during the first years of life and is implicated in early immune development. Here, we investigate the effects of microbial genomic diversity on gut microbiome development using integrated early childhood data sets collected in the DIABIMMUNE study in Finland, Estonia and Russian Karelia. We show that gut microbial diversity is associated with household location and linear growth of children. Single nucleotide polymorphism- and metagenomic assembly-based strain tracking revealed large and highly dynamic microbial pangenomes, especially in the genus Bacteroides, in which we identified evidence of variability deriving from Bacteroides-targeting bacteriophages. Our…
The Baltic Sea is a shallow basin of brackish water in which the spatial salinity gradient is one of the most important factors contributing to species distribution. The Baltic Sea is infamous for its annual cyanobacterial blooms comprised of Nodularia spumigena, Aphanizomenon spp., and Dolichospermum spp. that cause harm, especially for recreational users. To broaden our knowledge of the cyanobacterial adaptation strategies for brackish water environments, we sequenced the entire genome of Dolichospermum sp. UHCC 0315, a species occurring not only in freshwater environments but also in brackish water. Comparative genomics analyses revealed a close association with Dolichospermum sp. UHCC…
Cereal grasses of the Triticeae tribe have been the major food source in temperate regions since the dawn of agriculture. Their large genomes are characterized by a high content of repetitive elements and large pericentromeric regions that are virtually devoid of meiotic recombination. Here we present a high-quality reference genome assembly for barley (Hordeum vulgare L.). We use chromosome conformation capture mapping to derive the linear order of sequences across the pericentromeric space and to investigate the spatial organization of chromatin in the nucleus at megabase resolution. The composition of genes and repetitive elements differs between distal and proximal regions.…
Transcript prediction can be modeled as a graph problem where exons are modeled as nodes and reads spanning two or more exons are modeled as exon chains. Pacific Biosciences third-generation sequencing technology produces significantly longer reads than earlier second-generation sequencing technologies, which gives valuable information about longer exon chains in a graph. However, with the high error rates of third-generation sequencing, aligning long reads correctly around the splice sites is a challenging task. Incorrect alignments lead to spurious nodes and arcs in the graph, which in turn lead to incorrect transcript predictions. We survey several approaches to find the exon…
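The graph model described in the abstract above can be illustrated with a minimal sketch. It assumes exons are given as hypothetical `(start, end)` tuples and that each read has already been reduced to an error-free exon chain ordered along the genome, so the graph is acyclic. The function names `build_splicing_graph` and `candidate_transcripts` are illustrative only, and the path enumeration is a naive stand-in, not one of the approaches the survey covers.

```python
from collections import defaultdict


def build_splicing_graph(exon_chains):
    """Exons become nodes; consecutive exons in a chain become arcs."""
    nodes = set()
    arcs = defaultdict(set)
    for chain in exon_chains:
        nodes.update(chain)
        for left, right in zip(chain, chain[1:]):
            arcs[left].add(right)
    return nodes, arcs


def candidate_transcripts(nodes, arcs):
    """Enumerate every source-to-sink path (assumes an acyclic graph)."""
    targets = {t for outs in arcs.values() for t in outs}
    sources = sorted(n for n in nodes if n not in targets)
    paths = []

    def walk(node, path):
        successors = sorted(arcs.get(node, ()))
        if not successors:  # sink exon: the path is complete
            paths.append(path)
            return
        for nxt in successors:
            walk(nxt, path + [nxt])

    for source in sources:
        walk(source, [source])
    return paths


# Hypothetical exon chains: one read covers three exons, one skips the middle exon.
toy_chains = [
    [(100, 200), (300, 400), (500, 600)],
    [(100, 200), (500, 600)],
]
toy_nodes, toy_arcs = build_splicing_graph(toy_chains)
toy_paths = candidate_transcripts(toy_nodes, toy_arcs)
```

In this toy input the exon-skipping read adds an arc from the first exon to the last, so the graph yields two candidate paths. A misaligned splice site would show up here as an extra node or arc, which is the failure mode the abstract describes.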
Despite the large interest in the human microbiome in recent years, there are no reports of bacterial DNA methylation in the microbiome. Here metagenomic sequencing using the Pacific Biosciences platform allowed for rapid identification of bacterial GATC methylation status of a bacterial species in human stool samples. For this work, two stool samples were chosen that were dominated by a single species, Bacteroides dorei. Based on 16S rRNA analysis, this species represented over 45% of the bacteria present in these two samples. The B. dorei genome sequence from these samples was determined and the GATC methylation sites mapped. The Bacteroides…
Antibiotic resistance genes are ubiquitous in the environment. However, only a fraction of them are mobile and able to spread to pathogenic bacteria. Until now, studying the mobility of antibiotic resistance genes in environmental resistomes has been challenging due to inadequate sensitivity and difficulties in contig assembly of metagenome-based methods. We developed a new cost- and labor-efficient method based on Inverse PCR and long-read sequencing for studying the mobility potential of environmental resistance genes. We applied Inverse PCR on sediment samples and identified 79 different MGE clusters associated with the studied resistance genes, including novel mobile genetic elements,…
Long-read sequencing technologies enable high-quality, contiguous genome assemblies. Here we used SMRT sequencing to assemble the genome of a Drosophila simulans strain originating from Madagascar, the ancestral range of the species. We generated 8 Gb of raw data (~50x coverage) with a mean read length of 6,410 bp, an NR50 of 9,125 bp and the longest subread at 49 kb. We benchmarked six different assemblers and merged the best two assemblies from Canu and Falcon. Our final assembly was 127.41 Mb with an N50 of 5.38 Mb and 305 contigs. We anchored more than 4 Mb of novel sequence to…
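For readers unfamiliar with the N50 statistic quoted in the abstract above, the following sketch shows the standard definition: the length of the contig at which the running total, taken from longest to shortest, first reaches half of the assembly size. The function name `n50` and the contig lengths are hypothetical and are not taken from the study.

```python
def n50(lengths):
    """Return the contig length at which the cumulative sum, taken from the
    longest contig downward, first reaches half of the total length."""
    if not lengths:
        raise ValueError("need at least one contig length")
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


# Hypothetical contig lengths (arbitrary units), not data from the abstract.
toy_contigs = [80, 70, 50, 40, 30, 20]
toy_n50 = n50(toy_contigs)
```

With these toy lengths the total is 290, so half is 145. The two longest contigs (80 + 70 = 150) already pass that threshold, which makes the N50 equal to 70.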
The incidence of the autoimmune disease, type 1 diabetes (T1D), has increased dramatically over the last half century in many developed countries and is particularly high in Finland and other Nordic countries. Along with genetic predisposition, environmental factors are thought to play a critical role in this increase. As with other autoimmune diseases, the gut microbiome is thought to play a potential role in controlling progression to T1D in children with high genetic risk, but we know little about how the gut microbiome develops in children with high genetic risk for T1D. In this study, the early development of the…