July 7, 2019

An integrated map of structural variation in 2,504 human genomes.

Structural variants are implicated in numerous diseases and make up the majority of varying nucleotides among human genomes. Here we describe an integrated set of eight structural variant classes comprising both balanced and unbalanced variants, which we constructed using short-read DNA sequencing data and statistically phased onto haplotype blocks in 26 human populations. Analysing this set, we identify numerous gene-intersecting structural variants exhibiting population stratification and describe naturally occurring homozygous gene knockouts that suggest the dispensability of a variety of human genes. We demonstrate that structural variants are enriched on haplotypes identified by genome-wide association studies and exhibit enrichment for expression quantitative trait loci. Additionally, we uncover appreciable levels of structural variant complexity at different scales, including genic loci subject to clusters of repeated rearrangement and complex structural variants with multiple breakpoints likely to have formed through individual mutational events. Our catalogue will enhance future studies into structural variant demography, functional impact and disease association.


July 7, 2019

Potential mechanisms of attenuation for rifampicin-passaged strains of Flavobacterium psychrophilum.

Flavobacterium psychrophilum is the etiologic agent of bacterial coldwater disease in salmonids. Earlier research showed that a rifampicin-passaged strain of F. psychrophilum (CSF 259-93B.17) caused no disease in rainbow trout (Oncorhynchus mykiss, Walbaum) while inducing a protective immune response against challenge with the virulent CSF 259-93 strain. We hypothesized that rifampicin passage leads to an accumulation of genomic mutations that, by chance, reduce virulence. To assess the pattern of phenotypic and genotypic changes associated with passage, we examined proteomic, LPS and single-nucleotide polymorphism (SNP) differences for two F. psychrophilum strains (CSF 259-93 and THC 02-90) that were passaged with and without rifampicin selection. Rifampicin resistance was conferred by expected mutations in rpoB, although these affected different DNA bases depending on the strain. One rifampicin-passaged CSF 259-93 strain (CR) was attenuated (4% mortality) in challenged fish, but accumulated only eight nonsynonymous SNPs compared to the parent strain. A CSF 259-93 strain passaged without rifampicin (CN) accumulated five nonsynonymous SNPs and was partially attenuated (28% mortality) compared to the parent strain (54.5% mortality). In contrast, there was no significant change in fish mortality among THC 02-90 wild-type and passaged strains, despite numerous SNPs accumulated during passage with (n = 174) and without rifampicin (n = 126). While only three missense SNPs were associated with attenuation, a Ser492Phe rpoB mutation in the CR strain may contribute to further attenuation. All strains except CR retained a gliding motility phenotype. Few proteomic differences were observed by 2D SDS-PAGE and there were no apparent changes in LPS between strains. Comparative methylome analysis of two strains (CR and TR) identified no shared methylation motifs for these two strains. Multiple genomic changes arose during passage experiments with rifampicin selection pressure.
Consistent with our hypothesis, unique strain-specific mutations were detected for the fully attenuated (CR), partially attenuated (CN) and another fully attenuated strain (B17).


July 7, 2019

Insights on virulence from the complete genome of Staphylococcus capitis.

Staphylococcus capitis is an opportunistic pathogen of the coagulase negative staphylococci (CoNS). Functional genomic studies of S. capitis have thus far been limited by a lack of available complete genome sequences. Here, we determined the closed S. capitis genome and methylome using Single Molecule Real Time (SMRT) sequencing. The strain, AYP1020, harbors a single circular chromosome of 2.44 Mb encoding 2304 predicted proteins, which is the smallest of all complete staphylococcal genomes sequenced to date. AYP1020 harbors two large mobile genetic elements: a plasmid designated pAYP1020 (59.6 Kb) and a prophage, ΦAYP1020 (48.5 Kb). Methylome analysis identified significant adenine methylation across the genome involving two distinct methylation motifs (1972 putative 6-methyladenine (m6A) residues identified). Putative adenine methyltransferases were also identified. Comparative analysis of AYP1020 and the closely related CoNS, S. epidermidis RP62a, revealed a host of virulence factors that likely contribute to S. capitis pathogenicity, most notably genes important for biofilm formation and a suite of phenol soluble modulins (PSMs); the expression or production of these factors was corroborated by functional assays. The complete S. capitis genome will aid future studies on the evolution and pathogenesis of the coagulase negative staphylococci.


July 7, 2019

Completing the human genome: the progress and challenge of satellite DNA assembly.

Genomic studies rely on accurate chromosome assemblies to explore sequence-based models of cell biology, evolution and biomedical disease. However, even the extensively studied human genome has not yet reached a complete, ‘telomere-to-telomere’, chromosome assembly. The largest assembly gaps remain in centromeric regions and acrocentric short arms, sites known to contain megabase-sized arrays of tandem repeats, or satellite DNAs. This review aims to briefly address the progress and challenges of generating correct assemblies of satellite DNA arrays. Although the focus is placed on the human genome, many concepts presented here are applicable to other genomes.


July 7, 2019

DNA N(6)-methyladenine: a new epigenetic mark in eukaryotes?

DNA N(6)-adenine methylation (N(6)-methyladenine; 6mA) in prokaryotes functions primarily in the host defence system. The prevalence and significance of this modification in eukaryotes had been unclear until recently. Here, we discuss recent publications documenting the presence of 6mA in Chlamydomonas reinhardtii, Drosophila melanogaster and Caenorhabditis elegans; consider possible roles for this DNA modification in regulating transcription, the activity of transposable elements and transgenerational epigenetic inheritance; and propose 6mA as a new epigenetic mark in eukaryotes.


July 7, 2019

Fosfomycin resistance in Escherichia coli, Pennsylvania, USA.

Fosfomycin resistance in Escherichia coli is rare in the United States. An extended-spectrum β-lactamase-producing E. coli clinical strain identified in Pennsylvania, USA, showed high-level fosfomycin resistance caused by the fosA3 gene. The IncFII plasmid carrying this gene had a structure similar to those found in China, where fosfomycin resistance is commonly described.


July 7, 2019

Coupling of mRNA structure rearrangement to ribosome movement during bypassing of non-coding regions.

Nearly half of the ribosomes translating a particular bacteriophage T4 mRNA bypass a region of 50 nt, resuming translation 3′ of this gap. How this large-scale, specific hop occurs and what determines whether a ribosome bypasses remain unclear. We apply single-molecule fluorescence with zero-mode waveguides to track individual Escherichia coli ribosomes during translation of T4’s gene 60 mRNA. Ribosomes that bypass are characterized by a 10- to 20-fold longer pause in a non-canonical rotated state at the take-off codon. During the pause, mRNA secondary structure rearrangements are coupled to ribosome forward movement, facilitated by nascent peptide interactions that disengage the ribosome anticodon-codon interactions for slippage. Close to the landing site, the ribosome then scans mRNA in search of optimal base-pairing interactions. Our results provide a mechanistic and conformational framework for bypassing, highlighting a non-canonical ribosomal state to allow for mRNA structure refolding to drive large-scale ribosome movements. Copyright © 2015 Elsevier Inc. All rights reserved.


July 7, 2019

De novo assembly of Dekkera bruxellensis: a multi technology approach using short and long-read sequencing and optical mapping.

It remains a challenge to perform de novo assembly using next-generation sequencing (NGS). Despite the availability of multiple sequencing technologies and tools (e.g., assemblers) it is still difficult to assemble new genomes at chromosome resolution (i.e., one sequence per chromosome). Obtaining high quality draft assemblies is extremely important in the case of yeast genomes to better characterise major events in their evolutionary history. The aim of this work is two-fold: on the one hand we want to show how combining different and somewhat complementary technologies is key to improving assembly quality and correctness, and on the other hand we present a de novo assembly pipeline we believe to be beneficial to core facility bioinformaticians. To demonstrate both the effectiveness of combining technologies and the simplicity of the pipeline, here we present the results obtained using the Dekkera bruxellensis genome. In this work we used short-read Illumina data and long-read PacBio data combined with the extreme long-range information from OpGen optical maps in the task of de novo genome assembly and finishing. Moreover, we developed NouGAT, a semi-automated pipeline for read-preprocessing, de novo assembly and assembly evaluation, which was instrumental for this work. We obtained a high quality draft assembly of a yeast genome, resolved on a chromosomal level. Furthermore, this assembly was corrected for mis-assembly errors as demonstrated by resolving a large collapsed repeat and by receiving higher scores by assembly evaluation tools. With the inclusion of PacBio data we were able to fill about 5% of the optical mapped genome not covered by the Illumina data.


July 7, 2019

Wham: Identifying structural variants of biological consequence.

Existing methods for identifying structural variants (SVs) from short read datasets are inaccurate. This complicates disease-gene identification and efforts to understand the consequences of genetic variation. In response, we have created Wham (Whole-genome Alignment Metrics) to provide a single, integrated framework for both structural variant calling and association testing, thereby bypassing many of the difficulties that currently frustrate attempts to employ SVs in association testing. Here we describe Wham, benchmark it against three other widely used SV identification tools (Lumpy, Delly and SoftSearch) and demonstrate Wham’s ability to identify and associate SVs with phenotypes using data from humans, domestic pigeons, and vaccinia virus. Wham and all associated software are covered under the MIT License and can be freely downloaded from GitHub (https://github.com/zeeev/wham), with documentation on a wiki (http://zeeev.github.io/wham/). For community support please post questions to https://www.biostars.org/.


July 7, 2019

Complete genome sequence of Salinicoccus halodurans H3B36, isolated from the Qaidam Basin in China.

Salinicoccus halodurans H3B36 is a moderately halophilic bacterium isolated from a sediment sample taken at 3.2 m vertical depth in the Qaidam Basin. Strain H3B36 accumulates Nα-acetyl-α-lysine as a compatible solute against salinity and heat stresses and may have potential applications in industrial biotechnology. In this study, we sequenced the genome of strain H3B36 using single molecule, real-time sequencing technology on a PacBio RS II instrument. The complete genome of strain H3B36 was 2,778,379 bp and contained 2,853 protein-coding genes, 12 rRNA genes, and 61 tRNA genes with 58 tandem repeats, six minisatellite DNA sequences, 11 genome islands, and no CRISPR repeat region. Further analysis of epigenetic modifications revealed the presence of 11,000 m4C-type modified bases, 7,545 m6A-type modified bases, and 89,064 other modified bases. The data on the genome of this strain may provide an insight into the metabolism of Nα-acetyl-α-lysine.


July 7, 2019

Complete genome sequence of Pseudomonas aeruginosa PA1, isolated from a patient with a respiratory tract infection.

We report the 6,498,072-bp complete genome sequence of Pseudomonas aeruginosa PA1, which was isolated from a patient with a respiratory tract infection in Chongqing, People’s Republic of China. Whole-genome sequencing was performed using single-molecule real-time (SMRT) technology, and de novo assembly revealed a single contig with 396-fold sequence coverage. Copyright © 2015 Lu et al.


July 7, 2019

The functions of DNA methylation by CcrM in Caulobacter crescentus: a global approach.

DNA methylation is involved in a diversity of processes in bacteria, including maintenance of genome integrity and regulation of gene expression. Here, using Caulobacter crescentus as a model, we exploit genome-wide experimental methods to uncover the functions of CcrM, a DNA methyltransferase conserved in most Alphaproteobacteria. Using single molecule sequencing, we provide evidence that most CcrM target motifs (GANTC) switch from a fully methylated to a hemi-methylated state when they are replicated, and back to a fully methylated state at the onset of cell division. We show that DNA methylation by CcrM is not required for the control of the initiation of chromosome replication or for DNA mismatch repair. By contrast, our transcriptome analysis shows that >10% of the genes are misexpressed in cells lacking or constitutively over-expressing CcrM. Strikingly, GANTC methylation is needed for the efficient transcription of dozens of genes that are essential for cell cycle progression, in particular for DNA metabolism and cell division. Many of them are controlled by promoters methylated by CcrM and co-regulated by other global cell cycle regulators, demonstrating an extensive cross talk between DNA methylation and the complex regulatory network that controls the cell cycle of C. crescentus and, presumably, of many other Alphaproteobacteria.
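
As a rough illustration of what a CcrM target motif looks like in sequence terms, the sketch below scans a DNA string for GANTC sites, where N stands for any base. It is not the authors' pipeline: the function name `find_gantc_sites` and the example sequence are hypothetical, and the code only locates candidate sites. It says nothing about their methylation state, which the study measured by single molecule sequencing.

```python
import re

def find_gantc_sites(seq: str) -> list[int]:
    """Return 0-based start positions of GANTC (N = A, C, G or T) in seq."""
    # A lookahead is used so that overlapping sites are not skipped.
    pattern = re.compile(r"(?=GA[ACGT]TC)")
    return [m.start() for m in pattern.finditer(seq.upper())]

# Hypothetical example sequence containing three GANTC sites.
example = "TTGAATCCGGACTCAAGATTCG"
print(find_gantc_sites(example))
```

Because the reverse complement of GANTC is itself GANTC, scanning one strand is enough to locate the sites on both strands.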


July 7, 2019

Methylome diversification through changes in DNA methyltransferase sequence specificity.

Epigenetic modifications such as DNA methylation have large effects on gene expression and genome maintenance. Helicobacter pylori, a human gastric pathogen, has a large number of DNA methyltransferase genes, with different strains having unique repertoires. Previous genome comparisons suggested that these methyltransferases often change DNA sequence specificity through domain movement–the movement between and within genes of coding sequences of target recognition domains. Using single-molecule real-time sequencing technology, which detects N6-methyladenines and N4-methylcytosines with single-base resolution, we studied methylated DNA sites throughout the H. pylori genome for several closely related strains. Overall, the methylome was highly variable among closely related strains. Hypermethylated regions were found, for example, in rpoB gene for RNA polymerase. We identified DNA sequence motifs for methylation and then assigned each of them to a specific homology group of the target recognition domains in the specificity-determining genes for Type I and other restriction-modification systems. These results supported proposed mechanisms for sequence-specificity changes in DNA methyltransferases. Knocking out one of the Type I specificity genes led to transcriptome changes, which suggested its role in gene expression. These results are consistent with the concept of evolution driven by DNA methylation, in which changes in the methylome lead to changes in the transcriptome and potentially to changes in phenotype, providing targets for natural or artificial selection.


July 7, 2019

The effects of read length, quality and quantity on microsatellite discovery and primer development: from Illumina to PacBio.

The advent of next-generation sequencing (NGS) technologies has transformed the way microsatellites are isolated for ecological and evolutionary investigations. Recent attempts to employ NGS for microsatellite discovery have used the 454, Illumina, and Ion Torrent platforms, but other methods including single-molecule real-time DNA sequencing (Pacific Biosciences or PacBio) remain viable alternatives. We outline a workflow from sequence quality control to microsatellite marker validation in three plant species using PacBio circular consensus sequencing (CCS). We then evaluate the performance of PacBio CCS in comparison with other NGS platforms for microsatellite isolation, through simulations that focus on variations in read length, read quantity and sequencing error rate. Although quality control of CCS reads reduced microsatellite yield by around 50%, hundreds of microsatellite loci that are expected to have improved conversion efficiency to functional markers were retrieved for each species. The simulations quantitatively validate the advantages of long reads and emphasize the detrimental effects of sequencing errors on NGS-enabled microsatellite development. In view of the continuing improvement in read length on NGS platforms, sequence quality and the corresponding strategies of quality control will become the primary factors to consider for effective microsatellite isolation. Among current options, PacBio CCS may be optimal for rapid, small-scale microsatellite development due to its flexibility in scaling sequencing effort, while platforms such as Illumina MiSeq will provide cost-efficient solutions for multispecies microsatellite projects. © 2014 John Wiley & Sons Ltd.
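
To make the idea of microsatellite discovery in reads concrete, here is a minimal sketch that finds perfect tandem repeats with a regular expression. It is an assumption-laden illustration, not the workflow described in the abstract: the function name `find_microsatellites` and the thresholds (repeat units of 2 to 6 bases, at least 5 copies) are hypothetical choices, and imperfect repeats and sequencing errors are ignored.

```python
import re

def find_microsatellites(seq, min_unit=2, max_unit=6, min_repeats=5):
    """Return (start, unit, copies) for perfect tandem repeats in seq.

    start is 0-based; unit is the shortest repeating block found by the
    regex at that position; copies is how many times it occurs in a row.
    """
    # Lazy quantifier prefers the shortest unit; \1 requires exact copies.
    pattern = re.compile(
        r"([ACGT]{%d,%d}?)\1{%d,}" % (min_unit, max_unit, min_repeats - 1)
    )
    hits = []
    for m in pattern.finditer(seq.upper()):
        unit = m.group(1)
        hits.append((m.start(), unit, len(m.group(0)) // len(unit)))
    return hits

# Hypothetical read with six copies of the dinucleotide AC.
read = "GGG" + "AC" * 6 + "TTT"
print(find_microsatellites(read))
```

With these settings a long homopolymer run would be reported with a two-base unit such as AA, so a real pipeline would likely filter such hits and would also check flanking sequence for primer design.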

