July 7, 2019

The value of new genome references.

Genomic information has become a ubiquitous and almost essential aspect of biological research. Over the last 10-15 years, the cost of generating sequence data from DNA or RNA samples has dramatically declined and our ability to interpret those data has increased just as remarkably. Although it is still possible for biologists to conduct interesting and valuable research on species for which genomic data are not available, the impact of having access to a high quality whole genome reference assembly for a given species is nothing short of transformational. Research on a species for which we have no DNA or RNA sequence data is restricted in fundamental ways. In contrast, even access to an initial draft quality genome (see below for definitions) opens a wide range of opportunities that are simply not available without that reference genome assembly. Although a complete discussion of the impact of genome sequencing and assembly is beyond the scope of this short paper, the goal of this review is to summarize the most common and highest impact contributions that whole genome sequencing and assembly have made to comparative and evolutionary biology. Copyright © 2016. Published by Elsevier Inc.


July 7, 2019

Deep sequencing in the management of hepatitis virus infections.

The hepatitis viruses represent a major public health problem worldwide. Procedures for characterization of the genomic composition of their populations, accurate diagnosis, identification of multiple infections, and information on inhibitor-escape mutants for treatment decisions are needed. Deep sequencing methodologies are extremely useful for these viruses since they replicate as complex and dynamic quasispecies swarms whose complexity and mutant composition are biologically relevant traits. Population complexity is a major challenge for disease prevention and control, but also an opportunity to distinguish among related but phenotypically distinct variants that might anticipate disease progression and treatment outcome. Detailed characterization of mutant spectra should permit choosing better treatment options, given the increasing number of new antiviral inhibitors available. In the present review we briefly summarize our experience on the use of deep sequencing for the management of hepatitis virus infections, particularly for hepatitis B and C viruses, and outline some possible new applications of deep sequencing for these important human pathogens. Copyright © 2016 Elsevier B.V. All rights reserved.


July 7, 2019

Epigenetic origin of evolutionary novel centromeres.

Most evolutionary new centromeres (ENC) are composed of large arrays of satellite DNA and surrounded by segmental duplications. However, the hypothesis is that ENCs are seeded in an anonymous sequence and only over time have acquired the complexity of “normal” centromeres. Until now, evidence to test this hypothesis has been lacking. We recently discovered that the well-known polymorphism of orangutan chromosome 12 was due to the presence of an ENC. We sequenced the genome of an orangutan homozygous for the ENC, and we focused our analysis on the comparison of the ENC domain with its wild-type (WT) counterpart. No significant variations were found. This finding is the first clear evidence that ENC seedings are epigenetic in nature. The compaction of the ENC domain was found to be significantly higher than that of the corresponding WT region and, interestingly, the expression of the only gene embedded in the region was significantly repressed.


July 7, 2019

RelA mutant Enterococcus faecium with multiantibiotic tolerance arising in an immunocompromised host.

Serious bacterial infections in immunocompromised patients require highly effective antibacterial therapy for cure, and thus, this setting may reveal novel mechanisms by which bacteria circumvent antibiotics in the absence of immune pressure. Here, an infant with leukemia developed vancomycin-resistant Enterococcus faecium (VRE) bacteremia that persisted for 26 days despite appropriate antibiotic therapy. Sequencing of 22 consecutive VRE isolates identified the emergence of a single missense mutation (L152F) in relA, which constitutively activated the stringent response, resulting in elevated baseline levels of the alarmone guanosine tetraphosphate (ppGpp). Although the mutant remained susceptible to both linezolid and daptomycin in clinical MIC testing and during planktonic growth, it demonstrated tolerance to high doses of both antibiotics when growing in a biofilm. This biofilm-specific gain in resistance was reflected in the broad shift in transcript levels caused by the mutation. Only an experimental biofilm-targeting ClpP-activating antibiotic was able to kill the mutant strain in an established biofilm. The relA mutation was associated with a fitness trade-off, forming smaller and less-well-populated biofilms on biological surfaces. We conclude that clinically relevant relA mutations can emerge during prolonged VRE infection, causing baseline activation of the stringent response, subsequent antibiotic tolerance, and delayed eradication in an immunocompromised state. The increasing prevalence of antibiotic-resistant bacterial pathogens is a major challenge currently facing the medical community. Such pathogens are of particular importance in immunocompromised patients as these individuals may favor emergence of novel resistance determinants due to lack of innate immune defenses and intensive antibiotic exposure. 
During the course of chemotherapy, a patient developed prolonged bacteremia with vancomycin-resistant Enterococcus faecium that failed to clear despite multiple front-line antibiotics. The consecutive bloodstream isolates were sequenced, and a single missense mutation was identified in the relA gene, the mediator of the stringent response. Strains harboring the mutation had elevated baseline levels of the alarmone and displayed heightened resistance to the bactericidal activity of multiple antibiotics, particularly in a biofilm. Using a new class of compounds that modulate ClpP activity, the biofilms were successfully eradicated. These data represent the first clinical emergence of mutations in the stringent response in vancomycin-resistant enterococci. Copyright © 2017 Honsa et al.


July 7, 2019

Complete genome sequences of three multidrug-resistant clinical isolates of Streptococcus pneumoniae serotype 19A with different susceptibilities to the myxobacterial metabolite carolacton.

The full-genome sequences of three drug- and multidrug-resistant Streptococcus pneumoniae clinical isolates of serotype 19A were determined by PacBio single-molecule real-time sequencing, in combination with Illumina MiSeq sequencing. A comparison to the genomes of other pneumococci indicates a high nucleotide sequence identity to strains Hungary19A-6 and TCH8431/19A. Copyright © 2017 Donner et al.


July 7, 2019

The mitochondrial genome sequences of the round goby and the sand goby reveal patterns of recent evolution in gobiid fish.

Vertebrate mitochondrial genomes are optimized for fast replication and low cost of RNA expression. Accordingly, they are devoid of introns, are transcribed as polycistrons and contain very little intergenic sequence. Usually, vertebrate mitochondrial genomes measure between 16.5 and 17 kilobases (kb). During genome sequencing projects for two novel vertebrate models, the invasive round goby and the sand goby, we found that the sand goby genome is exceptionally small (16.4 kb), while the mitochondrial genome of the round goby is much larger than expected for a vertebrate. It is 19 kb in size and is thus one of the largest fish and even vertebrate mitochondrial genomes known to date. The expansion is attributable to a sequence insertion downstream of the putative transcriptional start site. This insertion carries traces of repeats from the control region, but is mostly novel. To get more information about this phenomenon, we gathered all available mitochondrial genomes of Gobiidae and of nine gobioid species, performed phylogenetic analyses, analysed gene arrangements, and compared gobiid mitochondrial genome sizes, ecological information and other species characteristics with respect to the mitochondrial phylogeny. Among other findings, this allowed us to identify a unique arrangement of tRNAs among Ponto-Caspian gobies. Our results indicate that the round goby mitochondrial genome may contain novel features. Since mitochondrial genome organisation is tightly linked to energy metabolism, these features may be linked to its invasion success. Also, the unique tRNA arrangement among Ponto-Caspian gobies may be helpful in studying the evolution of this highly adaptive and invasive species group. Finally, we find that the phylogeny of gobiids can be further refined by the use of longer stretches of linked DNA sequence.


July 7, 2019

Proteomic analysis of Pemphigus autoantibodies indicates a larger, more diverse, and more dynamic repertoire than determined by B cell genetics.

In autoantibody-mediated diseases such as pemphigus, serum antibodies lead to disease. Genetic analysis of B cells has allowed characterization of antibody repertoires in such diseases but would be complemented by proteomic analysis of serum autoantibodies. Here, we show using proteomic analysis that the serum autoantibody repertoire in pemphigus is much more polyclonal than that found by genetic studies of B cells. In addition, many B cells encode pemphigus autoantibodies that are not secreted into the serum. Heavy chain variable gene usage of serum autoantibodies is not shared among patients, implying targeting of the coded proteins will not be a useful therapeutic strategy. Analysis of autoantibodies in individual patients over several years indicates that many antibody clones persist but the proportion of each changes. These studies indicate a dynamic and diverse autoantibody response not revealed by genetic studies and explain why similar overall autoantibody titers may give variable disease activity. Copyright © 2017 The Author(s). Published by Elsevier Inc. All rights reserved.


July 7, 2019

Fallacy of the unique genome: sequence diversity within single Helicobacter pylori strains.

Many bacterial genomes are highly variable but nonetheless are typically published as a single assembled genome. Experiments tracking bacterial genome evolution have not looked at the variation present at a given point in time. Here, we analyzed the mouse-passaged Helicobacter pylori strain SS1 and its parent PMSS1 to assess intra- and intergenomic variability. Using high sequence coverage depth and experimental validation, we detected extensive genome plasticity within these H. pylori isolates, including movement of the transposable element IS607, large and small inversions, multiple single nucleotide polymorphisms, and variation in cagA copy number. The cagA gene was found as 1 to 4 tandem copies located off the cag island in both SS1 and PMSS1; this copy number variation correlated with protein expression. To gain insight into the changes that occurred during mouse adaptation, we also compared SS1 and PMSS1 and observed 46 differences that were distinct from the within-genome variation. The most substantial was an insertion in cagY, which encodes a protein required for a type IV secretion system function. We detected modifications in genes coding for two proteins known to affect mouse colonization, the HpaA neuraminyllactose-binding protein and the FutB α-1,3 lipopolysaccharide (LPS) fucosyltransferase, as well as genes predicted to modulate diverse properties. In sum, our work suggests that data from consensus genome assemblies from single colonies may be misleading by failing to represent the variability present. Furthermore, we show that high-depth genomic sequencing data of a population can be analyzed to gain insight into the normal variation within bacterial strains. IMPORTANCE: Although it is well known that many bacterial genomes are highly variable, it is nonetheless traditional to refer to, analyze, and publish “the genome” of a bacterial strain. 
Variability is usually reduced (“only sequence from a single colony”), ignored (“just publish the consensus”), or placed in the “too-hard” basket (“analysis of raw read data is more robust”). Now that whole-genome sequences are regularly used to assess virulence and track outbreaks, a better understanding of the baseline genomic variation present within single strains is needed. Here, we describe the variability seen in typical working stocks and colonies of pathogen Helicobacter pylori model strains SS1 and PMSS1 as revealed by use of high-coverage mate pair next-generation sequencing (NGS) and confirmed by traditional laboratory techniques. This work demonstrates that reliance on a consensus assembly as “the genome” of a bacterial strain may be misleading. Copyright © 2017 Draper et al.


July 7, 2019

Improving and correcting the contiguity of long-read genome assemblies of three plant species using optical mapping and chromosome conformation capture data.

Long-read sequencing can overcome the weaknesses of short reads in the assembly of eukaryotic genomes; however, at present additional scaffolding is needed to achieve chromosome-level assemblies. We generated PacBio long-read data of the genomes of three relatives of the model plant Arabidopsis thaliana and assembled all three genomes into only a few hundred contigs. To improve the contiguities of these assemblies, we generated BioNano Genomics optical mapping and Dovetail Genomics chromosome conformation capture data for genome scaffolding. Despite their technical differences, optical mapping and chromosome conformation capture performed similarly and doubled N50 values. After improving both integration methods, assembly contiguity reached the chromosome-arm level. We rigorously assessed the quality of contigs and scaffolds using Illumina mate-pair libraries and genetic map information. This showed that PacBio assemblies have high sequence accuracy but can contain several misassemblies, which join unlinked regions of the genome. Most, but not all, of these misjoins were removed during the integration of the optical mapping and chromosome conformation capture data. Even though none of the centromeres was fully assembled, the scaffolds revealed large parts of some centromeric regions, even including some of the heterochromatic regions, which are not present in gold standard reference sequences. Published by Cold Spring Harbor Laboratory Press.
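
The abstract above reports contiguity as N50 values. As a reminder of what that metric measures, here is a minimal Python sketch of the standard N50 definition: the length of the shortest sequence in the smallest set of longest sequences that together cover at least half of the total assembly length. The function name and the contig and scaffold lengths are made up for illustration and are not data from the study.

```python
def n50(lengths):
    """Return the N50 of a collection of contig or scaffold lengths."""
    if not lengths:
        raise ValueError("need at least one sequence length")
    total = sum(lengths)
    running = 0
    # Walk from the longest sequence down until half the assembly is covered.
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length


# Hypothetical lengths (arbitrary units), not taken from the paper.
contigs = [100, 80, 60, 40, 20]
# The same 300 units after joining some contigs into scaffolds.
scaffolds = [180, 100, 20]

print("contig N50:", n50(contigs))
print("scaffold N50:", n50(scaffolds))
```

Because scaffolding joins sequences without changing the total length, N50 can only stay the same or rise, which is why it is a common (if coarse) measure of the improvement from optical mapping or chromosome conformation capture.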


July 7, 2019

Variant tolerant read mapping using min-hashing

DNA read mapping is a ubiquitous task in bioinformatics, and many tools have been developed to solve the read mapping problem. However, there are two trends that are changing the landscape of read mapping: First, new sequencing technologies provide very long reads with high error rates (up to 15%). Second, many genetic variants in the population are known, so the reference genome is not considered as a single string over ACGT, but as a complex object containing these variants. Most existing read mappers do not handle these new circumstances appropriately.
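
The title refers to min-hashing, but the abstract does not describe how the method is applied. The following is therefore only a generic sketch of the MinHash idea, estimating the Jaccard similarity between the k-mer sets of two sequences from short signatures. It is not the authors' mapper, and the function names, the seeded BLAKE2b hashing scheme, and the values of `k` and `num_hashes` are all assumptions chosen for illustration.

```python
import hashlib


def kmers(seq, k):
    """Return the set of all length-k substrings of seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def _hash(kmer, seed):
    """Seeded 64-bit hash of a k-mer (illustrative choice of hash function)."""
    data = f"{seed}:{kmer}".encode()
    return int.from_bytes(hashlib.blake2b(data, digest_size=8).digest(), "big")


def minhash_signature(seq, k=5, num_hashes=64):
    """For each seeded hash function, keep the minimum hash over all k-mers."""
    ks = kmers(seq, k)
    if not ks:
        raise ValueError("sequence is shorter than k")
    return [min(_hash(km, seed) for km in ks) for seed in range(num_hashes)]


def estimate_jaccard(sig_a, sig_b):
    """Fraction of signature positions that agree; estimates Jaccard similarity."""
    if len(sig_a) != len(sig_b):
        raise ValueError("signatures must have the same length")
    return sum(a == b for a, b in zip(sig_a, sig_b)) / len(sig_a)


# Hypothetical sequences: the read differs from the reference window by one base.
reference = "ACGTACGTTGCAAGCTTAGGCTAACGT"
read = "ACGTACGTTGCAAGGTTAGGCTAACGT"
similarity = estimate_jaccard(minhash_signature(reference), minhash_signature(read))
print(f"estimated k-mer Jaccard similarity: {similarity:.2f}")
```

Signatures of fixed length can be compared much faster than full sequences, which is the general appeal of min-hashing for long, error-prone reads; how variants in the reference are incorporated is specific to the paper and is not shown here.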


July 7, 2019

Whole-genome sequence of Escherichia coli serotype O157:H7 strain PA20.

Escherichia coli serotype O157:H7 strain PA20 is a Pennsylvania Department of Health clinical isolate. It has been used to study biofilm formation in O157:H7 clinical isolates, where the high incidence of prophage insertions in the mlrA transcription factor disrupts traditional csgD biofilm regulation. Here, we report the complete PA20 genome sequence. Copyright © 2017 Uhlich et al.


July 7, 2019

First complete genome sequence of Haemophilus influenzae serotype a.

Haemophilus influenzae is an important human pathogen that primarily infects small children. In recent years, H. influenzae serotype a has emerged as a significant cause of invasive disease among indigenous populations. Here, we present the first complete whole-genome sequence of H. influenzae serotype a. © Crown copyright 2017.


July 7, 2019

Whole-genome sequences of Mycobacterium tuberculosis TB282 and TB284, a widespread and a unique strain, respectively, identified in a previous study of tuberculosis transmission in central Los Angeles, California, USA.

We report here the genome sequences, determined using a PacBio platform, of two Mycobacterium tuberculosis clinical isolates previously identified in central Los Angeles, CA, in the 1990s. Isolate TB282 represents a large-cluster strain that caused 27% of the tuberculosis cases, while TB284 represents a strain that caused disease in only one patient. Copyright © 2017 Zhang and Yang.

