July 7, 2019

Hunting structural variants: Population by population

Until recently, most population-scale genome sequencing studies have focused on identifying single nucleotide variants (SNVs) to explore genetic differences between individuals. Like so many SNV-based genome-wide association studies, however, these efforts have had difficulty identifying causative genetic mechanisms underlying most complex functions. More and more, the genomics community has realised that structural variation is likely responsible for many of the traits and phenotypes that scientists have not been able to attribute to SNVs. This class of variants, defined as genetic differences of 50 bp or larger, accounts for most of the DNA sequence differences between any two people. Structural variants (SVs) are also already known to cause many common and rare diseases including ALS, schizophrenia, leukemia, Carney complex, and Huntington’s disease. Despite the importance of SVs, these larger variants have been understudied and underreported compared to their single-nucleotide counterparts. One reason is that they remain difficult to detect. Their length often means they cannot be fully spanned using short sequencing reads. They also often occur in highly repetitive or GC-rich regions of the genome, making them challenging targets. As such, this class of human genetic variation has remained vastly under-explored in global populations and is now ripe for discovery.
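To make the size-based definition above concrete, here is a minimal sketch that labels a variant from its reference and alternate alleles. The 50 bp threshold comes from the text; the function names, the labels, and the way length is measured are illustrative assumptions, not a standard from any particular variant caller.

```python
# Minimal sketch: label a variant by size, using the 50 bp threshold from the text.
# Function names and labels are hypothetical, chosen only for illustration.

SV_MIN_LENGTH_BP = 50  # structural variants: differences of 50 bp or larger


def variant_length(ref: str, alt: str) -> int:
    """Approximate size of the difference between two alleles, in bp.

    Assumption: for insertions and deletions the size is the length
    difference; for same-length substitutions it is the allele length.
    """
    if len(ref) != len(alt):
        return abs(len(alt) - len(ref))
    return len(ref)


def classify_variant(ref: str, alt: str) -> str:
    """Return 'SNV', 'SV', or 'small variant' for a ref/alt allele pair."""
    if len(ref) == 1 and len(alt) == 1:
        return "SNV"
    if variant_length(ref, alt) >= SV_MIN_LENGTH_BP:
        return "SV"
    return "small variant"


if __name__ == "__main__":
    print(classify_variant("A", "G"))             # single-base substitution
    print(classify_variant("A", "A" + "T" * 50))  # 50 bp insertion
    print(classify_variant("A", "A" + "T" * 49))  # 49 bp insertion
```

Real call sets also contain inversions, translocations and other events that a ref/alt length comparison cannot describe, so this covers only the simplest insertion and deletion cases.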


July 7, 2019

Convergence of plasmid architectures drives emergence of multi-drug resistance in a clonally diverse Escherichia coli population from a veterinary clinical care setting.

The purpose of this study was to determine the plasmid architecture and context of resistance genes in multi-drug resistant (MDR) Escherichia coli strains isolated from urinary tract infections in dogs. Illumina and single-molecule real-time (SMRT) sequencing were applied to assemble the complete genomes of E. coli strains associated with clinical urinary tract infections, which were either phenotypically MDR or drug susceptible. This revealed that multiple distinct families of plasmids were associated with building an MDR phenotype. Plasmid-mediated AmpC (CMY-2) beta-lactamase resistance was associated with a clonal group of IncI1 plasmids that has remained stable in isolates collected up to a decade apart. Other plasmids, in particular those with an IncF replicon type, contained other resistance gene markers, so that the emergence of these MDR strains was driven by the accumulation of multiple plasmids, up to 5 replicons in specific cases. This study indicates that vulnerable patients, often with complex clinical histories, provide a setting leading to the emergence of MDR E. coli strains in clonally distinct commensal backgrounds. While it is known that horizontally transferred resistance supplements uropathogenic strains of E. coli such as ST131, our study demonstrates that the selection of an MDR phenotype in commensal E. coli strains can result in opportunistic infections in vulnerable patient populations. These strains provide a reservoir for the onward transfer of resistance alleles into more typically pathogenic strains and provide opportunities for the coalition of resistance and virulence determinants on plasmids, as evidenced by the IncF replicons characterised in this study. Copyright © 2017 The Authors. Published by Elsevier B.V. All rights reserved.


July 7, 2019

Mechanisms of surface antigenic variation in the human pathogenic fungus Pneumocystis jirovecii.

Microbial pathogens commonly escape the human immune system by varying surface proteins. We investigated the mechanisms used for that purpose by Pneumocystis jirovecii. This uncultivable fungus is an obligate pulmonary pathogen that in immunocompromised individuals causes pneumonia, a major life-threatening infection. Long-read PacBio sequencing was used to assemble a core of subtelomeres of a single P. jirovecii strain from a bronchoalveolar lavage fluid specimen from a single patient. A total of 113 genes encoding surface proteins were identified, including 28 pseudogenes. These genes formed a subtelomeric gene superfamily, which included five families encoding adhesive glycosylphosphatidylinositol (GPI)-anchored glycoproteins and one family encoding excreted glycoproteins. Numerical analyses suggested that diversification of the glycoproteins relies on mosaic genes created by ectopic recombination and occurs only within each family. DNA motifs suggested that all genes are expressed independently, except those of the family encoding the most abundant surface glycoproteins, which are subject to mutually exclusive expression. PCR analyses showed that exchange of the expressed gene of the latter family occurs frequently, possibly favored by the location of the genes proximal to the telomere because this allows concomitant telomere exchange. Our observations suggest that (i) the P. jirovecii cell surface is made of a complex mixture of different surface proteins, with a majority of a single isoform of the most abundant glycoprotein, (ii) genetic mosaicism within each family ensures variation of the glycoproteins, and (iii) the strategy of the fungus consists of the continuous production of new subpopulations composed of cells that are antigenically different.

IMPORTANCE: Pneumocystis jirovecii is a fungus causing severe pneumonia in immunocompromised individuals. It is the second most frequent life-threatening invasive fungal infection. We have studied the mechanisms of antigenic variation used by this pathogen to escape the human immune system, a strategy commonly used by pathogenic microorganisms. Using a new DNA sequencing technology generating long reads, we could characterize the highly repetitive gene families encoding the proteins that are present on the cellular surface of this pest. These gene families are localized in the regions close to the ends of all chromosomes, the subtelomeres. Such chromosomal localization was found to favor genetic recombinations between members of each gene family and to allow diversification of these proteins continuously over time. This pathogen seems to use a strategy of antigenic variation consisting of the continuous production of new subpopulations composed of cells that are antigenically different. Such a strategy is unique among human pathogens. Copyright © 2017 Schmid-Siegert et al.


July 7, 2019

Genomic variation and evolution of Vibrio parahaemolyticus ST36 over the course of a transcontinental epidemic expansion.

Vibrio parahaemolyticus is the leading cause of seafood-related infections, with illnesses undergoing a geographic expansion. In this process of expansion, the most fundamental change has been the transition from infections caused by local strains to the surge of pandemic clonal types. Pandemic clone sequence type 3 (ST3) was the only example of transcontinental spreading until 2012, when ST36 was detected outside the region where it is endemic in the U.S. Pacific Northwest, causing infections along the U.S. northeast coast and in Spain. Here, we used genome-wide analyses to reconstruct the evolutionary history of the V. parahaemolyticus ST36 clone over the course of its geographic expansion during the previous 25 years. The origin of this lineage was estimated to be in ~1985. By 1995, a new variant emerged in the region and quickly replaced the old clone, which has not been detected since 2000. The new Pacific Northwest (PNW) lineage was responsible for the first cases associated with this clone outside the Pacific Northwest region. After several introductions into the northeast coast, the new PNW clone differentiated into a highly dynamic group that continues to cause illness on the northeast coast of the United States. Surprisingly, the strains detected in Europe in 2012 diverged from this ancestral group around 2000 and have conserved genetic features present only in the old PNW lineage. Recombination was identified as the major driver of diversification, with some preliminary observations suggesting a trend toward a more specialized lifestyle, which may represent a critical element in the expansion of epidemics under scenarios of coastal warming.

IMPORTANCE: Vibrio parahaemolyticus and Vibrio cholerae represent the only two instances of pandemic expansions of human pathogens originating in the marine environment. However, while the current pandemic of V. cholerae emerged more than 50 years ago, the global expansion of V. parahaemolyticus is a recent phenomenon. These modern expansions provide an exceptional opportunity to study the evolutionary process of these pathogens at first hand and gain an understanding of the mechanisms shaping the epidemic dynamics of these diseases, in particular, the emergence, dispersal, and successful introduction in new regions facilitating global spreading of infections. In this study, we used genomic analysis to examine the evolutionary divergence that has occurred over the course of the most recent transcontinental expansion of a pathogenic Vibrio, the spreading of the V. parahaemolyticus sequence type 36 clone from the region where it is endemic on the Pacific coast of North America to the east coast of the United States and finally to the west coast of Europe.


July 7, 2019

Genomic comparison between Staphylococcus aureus GN strains clinically isolated from a familial infection case: IS1272 transposition through a novel inverted repeat-replacing mechanism.

A bacterial insertion sequence (IS) is a mobile DNA sequence carrying only the transposase gene (tnp) that acts as a mutator to disrupt genes, alter gene expression, and cause genomic rearrangements. “Canonical” ISs have historically been characterized by their terminal inverted repeats (IRs), which may form a stem-loop structure, and duplications of a short (non-IR) target sequence at both ends, called target site duplications (TSDs). The IS distributions and virulence potentials of Staphylococcus aureus genomes in familial infection cases are unclear. Here, we determined the complete circular genome sequences of familial strains from a Panton-Valentine leukocidin (PVL)-positive ST50/agr4 S. aureus (GN) infection of a 4-year-old boy with skin abscesses. The genomes of the patient strain (GN1) and parent strain (GN3) were rich in “canonical” IS1272 with terminal IRs, both having 13 commonly-existing copies (ce-IS1272). Moreover, GN1 had a newly-inserted IS1272 (ni-IS1272) on the PVL-converting prophage, while GN3 had two copies of ni-IS1272 within the DNA helicase gene and near rot. The GN3 genome also had a small deletion. The targets of ni-IS1272 transposition were IR structures, in contrast with previous “canonical” ISs. There were no TSDs. Based on a database search, the targets for ce-IS1272 were IRs or “non-IRs”. IS1272 included a larger structure with tandem duplications of the left (IRL) side sequence; tnp included minor cases of a long fusion form and truncated form. One ce-IS1272 was associated with the segments responsible for immune evasion and drug resistance. Regarding virulence, GN1 expressed cytolytic peptides (phenol-soluble modulin α and δ-hemolysin) and PVL more strongly than some other familial strains. These results suggest that IS1272 transposes through an IR-replacing mechanism, with an irreversible process unlike that of “canonical” transpositions, resulting in genomic variations, and that, among the familial strains, the patient strain has strong virulence potential based on community-associated virulence factors.


July 7, 2019

Complete genome sequence of Salmonella enterica subsp. enterica serovar Minnesota strain

Mango has been implicated as a food vehicle in several foodborne Salmonella outbreaks. Here, Salmonella enterica subsp. enterica serovar Minnesota was isolated from fresh mango fruit imported from Mexico in 2014. The complete genome sequence of S. Minnesota CFSAN017963 was determined using single-molecule real-time DNA sequencing. Distinct prophage regions, Salmonella pathogenicity islands, and fimbrial gene clusters were observed in a comparative genomic analysis of S. Minnesota CFSAN017963 with other phylogenetically closely related Salmonella serovars. Core genome multilocus sequence typing analysis of all the S. Minnesota isolates in GenBank and EnteroBase also revealed high genomic diversity among the genomes analyzed.


July 7, 2019

Detection of complex structural variation from paired-end sequencing data

Detecting structural variants (SVs) from sequencing data is a key problem in genome analysis, but the full diversity of SVs is not captured by most methods. We introduce the Automated Reconstruction of Complex Structural Variants (ARC-SV) method, which detects a broad class of structural variants from paired-end whole genome sequencing (WGS) data. Analysis of samples from NA12878 and HuRef suggests that complex SVs are often misclassified by traditional methods. We validated our results both experimentally and by comparison to whole genome assembly and PacBio data; ARC-SV compares favorably to existing algorithms in general and gives state-of-the-art results on complex SV detection. By expanding the range of detectable SVs compared to commonly-used algorithms, ARC-SV allows additional information to be extracted from existing WGS data.


July 7, 2019

Dissemination and characteristics of a novel plasmid-encoded carbapenem-hydrolyzing class D beta-lactamase, OXA-436 from four patients involving six different hospitals in Denmark.

The diversity of OXA-48-like carbapenemases is continually expanding. In this study, we describe the dissemination and characteristics of a novel carbapenem-hydrolyzing class D carbapenemase (CHDL) named OXA-436. In total, six OXA-436-producing Enterobacteriaceae isolates including Enterobacter asburiae (n=3), Citrobacter freundii (n=2) and Klebsiella pneumoniae (n=1) were identified in four patients in the period between September 2013 and April 2015. All three species of OXA-436-producing Enterobacteriaceae were found in one patient. The amino acid sequence of OXA-436 showed 90.4-92.8% identity to other acquired OXA-48-like variants. Expression of OXA-436 in Escherichia coli and kinetic analysis of purified OXA-436 revealed an activity profile similar to OXA-48 and OXA-181, with activity against penicillins including temocillin, limited or no activity against extended-spectrum cephalosporins, and activity against carbapenems. The blaOXA-436 gene was located on a conjugative ~314 kb IncHI2/IncHI2A plasmid belonging to pMLST ST1, in a region surrounded by chromosomal genes previously identified adjacent to blaOXA-genes in Shewanella spp. In conclusion, OXA-436 is a novel CHDL with functional properties similar to those of OXA-48-like CHDLs. The described geographical spread among different Enterobacteriaceae and the plasmid location of blaOXA-436 illustrate its potential for further dissemination. Copyright © 2017 American Society for Microbiology.


July 7, 2019

Copy number variation probes inform diverse applications

A major contributor to inter-individual genomic variability is copy number variation (CNV). CNVs change the diploid status of the DNA, involve one or multiple genes, and may disrupt coding regions, affect regulatory elements, or change gene dosage. While some of these changes may have no phenotypic consequences, others underlie disease, explain evolutionary processes, or impact the response to medication.


July 7, 2019

N6-adenine DNA methylation is associated with the linker DNA of H2A.Z-containing well-positioned nucleosomes in Pol II-transcribed genes in Tetrahymena.

DNA N6-methyladenine (6mA) is newly rediscovered as a potential epigenetic mark across a more diverse range of eukaryotes than previously realized. As a unicellular model organism, Tetrahymena thermophila is among the first eukaryotes reported to contain 6mA modification. However, lack of comprehensive information about 6mA distribution hinders further investigations into its function and regulatory mechanism. In this study, we provide the first genome-wide, base pair-resolution map of 6mA in Tetrahymena by applying single-molecule real-time (SMRT) sequencing. We provide evidence that 6mA occurs mostly in the AT motif of the linker DNA regions. More strikingly, these linker DNA regions with 6mA are usually flanked by well-positioned nucleosomes and/or H2A.Z-containing nucleosomes. We also find that 6mA is exclusively associated with RNA polymerase II (Pol II)-transcribed genes, but is not an unambiguous mark for active transcription. These results support that 6mA is an integral part of the chromatin landscape shaped by adenosine triphosphate (ATP)-dependent chromatin remodeling and transcription. © The Author(s) 2017. Published by Oxford University Press on behalf of Nucleic Acids Research.


July 7, 2019

DNA methylation profiling using long-read Single Molecule Real-Time bisulfite sequencing (SMRT-BS).

For the past two decades, bisulfite sequencing has been a widely used method for quantitative CpG methylation detection of genomic DNA. Coupled with PCR amplicon cloning, bisulfite Sanger sequencing allows for allele-specific CpG methylation assessment; however, its time-consuming protocol and inability to multiplex have recently been overcome by next-generation bisulfite sequencing techniques. Although high-throughput sequencing platforms have enabled greater accuracy in CpG methylation quantitation as a result of increased bisulfite sequencing depth, most common sequencing platforms generate reads that are similar in length to the typical bisulfite PCR size range (~300-500 bp). Using the Pacific Biosciences (PacBio) sequencing platform, we developed single molecule real-time bisulfite sequencing (SMRT-BS), which is an accurate targeted CpG methylation analysis method capable of a high degree of multiplexing and long read lengths. SMRT-BS is reproducible and was found to be concordant with other lower throughput quantitative CpG methylation methods. Moreover, the ability to sequence up to ~1.5-2.0 kb amplicons, when coupled with an optimized bisulfite-conversion protocol, allows for more thorough assessment of CpG islands and increases the capacity for studying the relationship between single nucleotide variants and allele-specific CpG methylation.
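As a rough illustration of how bisulfite reads yield quantitative CpG methylation, the sketch below relies on the general principle that bisulfite treatment converts unmethylated cytosine so that it reads as T, while methylated cytosine still reads as C. It is not the SMRT-BS pipeline: the function names are hypothetical, and it assumes top-strand reads that are already aligned to the amplicon without gaps and are the same length as the reference.

```python
# Minimal sketch of per-CpG methylation quantitation from bisulfite reads.
# Assumptions (not from the source): reads are ungapped, top-strand, and
# aligned base-for-base to the unconverted reference amplicon.

def cpg_positions(reference: str) -> list[int]:
    """Return 0-based positions of the C in every CpG of the reference."""
    return [i for i in range(len(reference) - 1) if reference[i:i + 2] == "CG"]


def methylation_fractions(reference: str, reads: list[str]) -> dict[int, float]:
    """Fraction of reads showing C (methylated) at each CpG position.

    Reads showing T at a CpG are counted as unmethylated; any other base
    is ignored as uninformative. CpGs with no informative reads are omitted.
    """
    fractions = {}
    for pos in cpg_positions(reference):
        methylated = sum(1 for read in reads if read[pos] == "C")
        unmethylated = sum(1 for read in reads if read[pos] == "T")
        covered = methylated + unmethylated
        if covered:
            fractions[pos] = methylated / covered
    return fractions


if __name__ == "__main__":
    reference = "ACGTTCGA"  # toy amplicon with CpGs at positions 1 and 5
    reads = ["ACGTTTGA", "ATGTTTGA", "ACGTTCGA", "ACGTTTGA"]
    for pos, fraction in methylation_fractions(reference, reads).items():
        print(pos, fraction)
```

Because each read comes from a single molecule, the same per-read calls could in principle be grouped by a nearby variant to look at allele-specific methylation, which is the kind of analysis that longer amplicons make easier.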


July 7, 2019

Comparative and population genomic landscape of Phellinus noxius: A hypervariable fungus causing root rot in trees.

The order Hymenochaetales of white rot fungi contains some of the most aggressive wood decayers causing tree deaths around the world. Despite their ecological importance and the impact of diseases they cause, little is known about the evolution and transmission patterns of these pathogens. Here, we sequenced and undertook comparative genomic analyses of Hymenochaetales genomes using brown root rot fungus Phellinus noxius, wood-decomposing fungus Phellinus lamaensis, laminated root rot fungus Phellinus sulphurascens and trunk pathogen Porodaedalea pini. Many gene families of lignin-degrading enzymes were identified from these fungi, reflecting their ability as white rot fungi. Comparing against distant fungi highlighted the expansion of 1,3-beta-glucan synthases in P. noxius, which may account for its fast-growing attribute. We identified 13 linkage groups conserved within Agaricomycetes, suggesting the evolution of stable karyotypes. We determined that P. noxius has a bipolar heterothallic mating system, with an unusual, highly expanded ~60 kb A locus as a result of accumulating gene transposition. We investigated the population genomics of 60 P. noxius isolates across multiple islands of the Asia Pacific region. Whole-genome sequencing showed this multinucleate species contains abundant poly-allelic single nucleotide polymorphisms with atypical allele frequencies. Different patterns of intra-isolate polymorphism reflect mono-/heterokaryotic states which are both prevalent in nature. We have shown two genetically separated lineages with one spanning across many islands despite the geographical barriers. Both populations possess extraordinary genetic diversity and show contrasting evolutionary scenarios. These results provide a framework to further investigate the genetic basis underlying the fitness and virulence of white rot fungi. © 2017 John Wiley & Sons Ltd.


July 7, 2019

Genomics of parallel adaptation at two timescales in Drosophila.

Two interesting unanswered questions are the extent to which both the broad patterns and genetic details of adaptive divergence are repeatable across species, and the timescales over which parallel adaptation may be observed. Drosophila melanogaster is a key model system for population and evolutionary genomics. Findings from genetics and genomics suggest that recent adaptation to latitudinal environmental variation (on the timescale of hundreds or thousands of years) associated with Out-of-Africa colonization plays an important role in maintaining biological variation in the species. Additionally, studies of interspecific differences between D. melanogaster and its sister species D. simulans have revealed that a substantial proportion of proteins and amino acid residues exhibit adaptive divergence on a timescale of roughly a few million years. Here we use population genomic approaches to attack the problem of parallelism between D. melanogaster and a highly diverged congener, D. hydei, on two timescales. D. hydei, a member of the repleta group of Drosophila, is similar to D. melanogaster in that it too appears to be a recently cosmopolitan species and recent colonizer of high latitude environments. We observed parallelism both for genes exhibiting latitudinal allele frequency differentiation within species and for genes exhibiting recurrent adaptive protein divergence between species. Greater parallelism was observed for long-term adaptive protein evolution, and this parallelism includes not only the specific genes/proteins that exhibit adaptive evolution, but extends even to the magnitudes of the selective effects on interspecific protein differences. Thus, despite the roughly 50 million years of time separating D. melanogaster and D. hydei, and despite their considerably divergent biology, they exhibit substantial parallelism, suggesting the existence of a fundamental predictability of adaptive evolution in the genus.


July 7, 2019

Meeting report on experimental approaches to evolution and ecology using yeast and other model systems.

The fourth EMBO-sponsored conference on Experimental Approaches to Evolution and Ecology Using Yeast and Other Model Systems (https://www.embl.de/training/events/2016/EAE16-01/) was held at the EMBL in Heidelberg, Germany, October 19-23, 2016. The conference was organized by Judith Berman (Tel Aviv University), Maitreya Dunham (University of Washington), Jun-Yi Leu (Academia Sinica), and Lars Steinmetz (EMBL Heidelberg and Stanford University). The meeting attracted ~120 researchers from 28 countries and covered a wide range of topics in the fields of genetics, evolutionary biology, and ecology with a unifying focus on yeast as a model system. Attendees enjoyed the Keith Haring-inspired yeast fluorescence microscopy artwork (Figure 1), a unique feature of the meeting since its inception, and the one-minute flash talks that catalyzed discussions at two vibrant poster sessions. The meeting coincided with the 20th anniversary of the publication describing the sequence of the first eukaryotic genome, Saccharomyces cerevisiae (Goffeau et al. 1996). Many of the conference talks focused on important questions about what is contained in the genome, how genomes evolve, and the architecture and behavior of communities of phenotypically and genotypically diverse microorganisms. Here, we summarize highlights of the research talks around these themes. Nearly all presentations focused on novel findings, and we refer the reader to relevant manuscripts that have subsequently been published. Copyright © 2017, G3: Genes, Genomes, Genetics.

