July 19, 2019

The genome of Schmidtea mediterranea and the evolution of core cellular mechanisms.

The planarian Schmidtea mediterranea is an important model for stem cell research and regeneration, but adequate genome resources for this species have been lacking. Here we report a highly contiguous genome assembly of S. mediterranea, using long-read sequencing and a de novo assembler (MARVEL) enhanced for low-complexity reads. The S. mediterranea genome is highly polymorphic and repetitive, and harbours a novel class of giant retroelements. Furthermore, the genome assembly lacks a number of highly conserved genes, including critical components of the mitotic spindle assembly checkpoint, but planarians maintain checkpoint function. Our genome assembly provides a key model system resource that will be useful for studying regeneration and the evolutionary plasticity of core cell biological mechanisms.


July 19, 2019

Survey on the use of whole-genome sequencing for infectious diseases surveillance: Rapid expansion of European national capacities, 2015-2016.

Whole-genome sequencing (WGS) has become an essential tool for public health surveillance and molecular epidemiology of infectious diseases and antimicrobial drug resistance. It provides precise geographical delineation of spread and enables incidence monitoring of pathogens at genotype level. Coupled with epidemiological and environmental investigations, it delivers ultimate resolution for tracing sources of epidemic infections. To ascertain the level of implementation of WGS-based typing for national public health surveillance and investigation of prioritized diseases in the European Union (EU)/European Economic Area (EEA), two surveys were conducted in 2015 and 2016. The surveys were designed to determine the national public health reference laboratories’ access to WGS and operational WGS-based typing capacity for national surveillance of selected foodborne pathogens, antimicrobial-resistant pathogens, and vaccine-preventable diseases identified as priorities for European genomic surveillance. Twenty-eight and twenty-nine out of the 30 EU/EEA countries participated in the survey in 2015 and 2016, respectively. National public health reference laboratories in 22 and 25 countries had access to WGS-based typing for public health applications in 2015 and 2016, respectively. Reported reasons for limited or no access were lack of funding, staff, and expertise. Illumina technology was the most frequently used followed by Ion Torrent technology. The access to bioinformatics expertise and competence for routine WGS data analysis was limited. By mid-2016, half of the EU/EEA countries were using WGS analysis either as first- or second-line typing method for surveillance of the pathogens and antibiotic resistance issues identified as EU priorities. The sampling frame as well as bioinformatics analysis varied by pathogen/resistance issue and country. 
Core genome multilocus allelic profiling, also called cgMLST, was the most frequently used annotation approach for typing bacterial genomes suggesting potential bioinformatics pipeline compatibility. Further capacity development for WGS-based typing is ongoing in many countries and upon consolidation and harmonization of methods should enable pan-EU data exchange for genomic surveillance in the medium-term subject to the development of suitable data management systems and appropriate agreements for data sharing.
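As an illustration of how cgMLST allelic profiles are typically compared, the following minimal Python sketch counts the core-genome loci at which two isolates carry different allele numbers. It is not drawn from any surveyed laboratory's pipeline; the locus names, allele numbers, and the function `allelic_distance` are hypothetical, and loci with a missing call in either isolate are simply skipped, which is one common convention among several.

```python
from typing import Dict, Optional

# A profile maps each core-genome locus to an allele number (None = no call).
Profile = Dict[str, Optional[int]]


def allelic_distance(a: Profile, b: Profile) -> int:
    """Count loci called in both profiles whose allele numbers differ."""
    distance = 0
    for locus, allele_a in a.items():
        allele_b = b.get(locus)
        if allele_a is None or allele_b is None:
            continue  # ignore loci missing in either isolate
        if allele_a != allele_b:
            distance += 1
    return distance


# Hypothetical profiles for three isolates (illustrative values only).
isolate_1 = {"locus_A": 1, "locus_B": 4, "locus_C": 2, "locus_D": None}
isolate_2 = {"locus_A": 1, "locus_B": 7, "locus_C": 2, "locus_D": 3}
isolate_3 = {"locus_A": 2, "locus_B": 7, "locus_C": 5, "locus_D": 3}

print(allelic_distance(isolate_1, isolate_2))  # 1
print(allelic_distance(isolate_1, isolate_3))  # 3
print(allelic_distance(isolate_2, isolate_3))  # 2
```

Because the comparison depends only on locus names and allele numbers, profiles produced by different pipelines are comparable only when they share the same scheme and allele nomenclature.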


July 19, 2019

Genomic repeats, misassembly and reannotation: a case study with long-read resequencing of Porphyromonas gingivalis reference strains.

Without knowledge of their genomic sequences, it is impossible to make functional models of the bacteria that make up human and animal microbiota. Unfortunately, the vast majority of publicly available genomes are only working drafts, an incompleteness that causes numerous problems and constitutes a major obstacle to genotypic and phenotypic interpretation. In this work, we began with an example from the class Bacteroidia in the phylum Bacteroidetes, which is preponderant among human orodigestive microbiota. We successfully identify the genetic loci responsible for assembly breaks and misassemblies and demonstrate the importance and usefulness of long-read sequencing and curated reannotation. We showed that the fragmentation in Bacteroidia draft genomes assembled from massively parallel sequencing linearly correlates with genomic repeats of the same or greater size than the reads. We also demonstrated that some of these repeats, especially the long ones, correspond to misassembled loci in three reference Porphyromonas gingivalis genomes marked as circularized (thus complete or finished). We prove that even at modest coverage (30X), long-read resequencing together with PCR contiguity verification (rrn operons and an integrative and conjugative element or ICE) can be used to identify and correct the wrongly combined or assembled regions. Finally, although time-consuming and labor-intensive, consistent manual biocuration of three P. gingivalis strains allowed us to compare and correct the existing genomic annotations, resulting in a more accurate interpretation of the genomic differences among these strains. In this study, we demonstrate the usefulness and importance of long-read sequencing in verifying published genomes (even when complete) and generating assemblies for new bacterial strains/species with high genomic plasticity.
We also show that when combined with biological validation processes and diligent biocurated annotation, this strategy helps reduce the propagation of errors in shared databases, thus limiting false conclusions based on incomplete or misleading information.


July 19, 2019

Neofunctionalization of duplicated P450 genes drives the evolution of insecticide resistance in the brown planthopper.

Gene duplication is a major source of genetic variation that has been shown to underpin the evolution of a wide range of adaptive traits [1, 2]. For example, duplication or amplification of genes encoding detoxification enzymes has been shown to play an important role in the evolution of insecticide resistance [3-5]. In this context, gene duplication performs an adaptive function as a result of its effects on gene dosage and not as a source of functional novelty [3, 6-8]. Here, we show that duplication and neofunctionalization of a cytochrome P450, CYP6ER1, led to the evolution of insecticide resistance in the brown planthopper. Considerable genetic variation was observed in the coding sequence of CYP6ER1 in populations of brown planthopper collected from across Asia, but just two sequence variants are highly overexpressed in resistant strains and metabolize imidacloprid. Both variants are characterized by profound amino-acid alterations in substrate recognition sites, and the introduction of these mutations into a susceptible P450 sequence is sufficient to confer resistance. CYP6ER1 is duplicated in resistant strains with individuals carrying paralogs with and without the gain-of-function mutations. Despite numerical parity in the genome, the susceptible and mutant copies exhibit marked asymmetry in their expression with the resistant paralogs overexpressed. In the primary resistance-conferring CYP6ER1 variant, this results from an extended region of novel sequence upstream of the gene that provides enhanced expression. Our findings illustrate the versatility of gene duplication in providing opportunities for functional and regulatory innovation during the evolution of an adaptive trait. Copyright © 2017 The Authors. Published by Elsevier Ltd. All rights reserved.


July 19, 2019

Full-Length Envelope Analyzer (FLEA): A tool for longitudinal analysis of viral amplicons.

Next generation sequencing of viral populations has advanced our understanding of viral population dynamics, the development of drug resistance, and escape from host immune responses. Many applications require complete gene sequences, which can be impossible to reconstruct from short reads. HIV env, the protein of interest for HIV vaccine studies, is exceptionally challenging for long-read sequencing and analysis due to its length, high substitution rate, and extensive indel variation. While long-read sequencing is attractive in this setting, the analysis of such data is not well handled by existing methods. To address this, we introduce FLEA (Full-Length Envelope Analyzer), which performs end-to-end analysis and visualization of long-read sequencing data. FLEA consists of both a pipeline (optionally run on a high-performance cluster), and a client-side web application that provides interactive results. The pipeline transforms FASTQ reads into high-quality consensus sequences (HQCSs) and uses them to build a codon-aware multiple sequence alignment. The resulting alignment is then used to infer phylogenies, selection pressure, and evolutionary dynamics. The web application provides publication-quality plots and interactive visualizations, including an annotated viral alignment browser, time series plots of evolutionary dynamics, visualizations of gene-wide selective pressures (such as dN/dS) across time and across protein structure, and a phylogenetic tree browser. We demonstrate how FLEA may be used to process Pacific Biosciences HIV env data and describe recent examples of its use. Simulations show how FLEA dramatically reduces the error rate of this sequencing platform, providing an accurate portrait of complex and variable HIV env populations. A public instance of FLEA is hosted at http://flea.datamonkey.org. The Python source code for the FLEA pipeline can be found at https://github.com/veg/flea-pipeline. 
The client-side application is available at https://github.com/veg/flea-web-app. A live demo of the P018 results can be found at http://flea.murrell.group/view/P018.


July 19, 2019

Ultradeep single-molecule real-time sequencing of HIV envelope reveals complete compartmentalization of highly macrophage-tropic R5 proviral variants in brain and CXCR4-using variants in immune and peripheral tissues.

Despite combined antiretroviral therapy (cART), HIV+ patients still develop neurological disorders, which may be due to persistent HIV infection and selective evolution in brain tissues. Single-molecule real-time (SMRT) sequencing technology offers an improved opportunity to study the relationship among HIV isolates in the brain and lymphoid tissues because it is capable of generating thousands of long sequence reads in a single run. Here, we used SMRT sequencing to generate ~50,000 high-quality full-length HIV envelope sequences (>2200 bp) from seven autopsy tissues from an HIV+/cART+ subject, including three brain and four non-brain sites. Sanger sequencing was used for comparison with SMRT data and to clone functional pseudoviruses for in vitro tropism assays. Phylogenetic analysis demonstrated that brain-derived HIV was compartmentalized from HIV outside the brain and that the variants from each of the three brain tissues grouped independently. Variants from all peripheral tissues were intermixed on the tree but independent of the brain clades. Due to the large number of sequences, a clustering analysis at three similarity thresholds (99, 99.5, and 99.9%) was also performed. All brain sequences clustered exclusive of any non-brain sequences at all thresholds; however, frontal lobe sequences clustered independently of occipital and parietal lobes. Translated sequences revealed potentially functional differences between brain and non-brain sequences in the location of putative N-linked glycosylation sites (N-sites), V1 length, V3 charge, and the number of V4 N-sites. All brain sequences were predicted to use the CCR5 co-receptor, while most non-brain sequences were predicted to use CXCR4 co-receptor. Tropism results were confirmed by in vitro infection assays. The study is the first to use a SMRT sequencing approach to study HIV compartmentalization in tissues and supports other reports of limited trafficking between brain and non-brain sequences during cART.
Due to the long sequence length, we could observe changes along the entire envelope gene, likely caused by differential selective pressure in the brain that may contribute to neurological disease.


July 19, 2019

Advances in Sequencing and Resequencing in Crop Plants.

DNA sequencing technologies have changed the face of biological research over the last 20 years. From reference genomes to population level resequencing studies, these technologies have made significant contributions to our understanding of plant biology and evolution. As the technologies have increased in power, the breadth and complexity of the questions that can be asked has increased. Along with this, the challenges of managing unprecedented quantities of sequence data are mounting. This chapter describes a few aspects of the journey so far and looks forward to what may lie ahead.


July 19, 2019

A near-complete haplotype-phased genome of the dikaryotic wheat stripe rust fungus Puccinia striiformis f. sp. tritici reveals high interhaplotype diversity.

A long-standing biological question is how evolution has shaped the genomic architecture of dikaryotic fungi. To answer this, high-quality genomic resources that enable haplotype comparisons are essential. Short-read genome assemblies for dikaryotic fungi are highly fragmented and lack haplotype-specific information due to the high heterozygosity and repeat content of these genomes. Here, we present a diploid-aware assembly of the wheat stripe rust fungus Puccinia striiformis f. sp. tritici based on long reads using the FALCON-Unzip assembler. Transcriptome sequencing data sets were used to infer high-quality gene models and identify virulence genes involved in plant infection referred to as effectors. This represents the most complete Puccinia striiformis f. sp. tritici genome assembly to date (83 Mb, 156 contigs, N50 of 1.5 Mb) and provides phased haplotype information for over 92% of the genome. Comparisons of the phase blocks revealed high interhaplotype diversity of over 6%. More than 25% of all genes lack a clear allelic counterpart. When we investigated genome features that potentially promote the rapid evolution of virulence, we found that candidate effector genes are spatially associated with conserved genes commonly found in basidiomycetes. Yet, candidate effectors that lack an allelic counterpart are more distant from conserved genes than allelic candidate effectors and are less likely to be evolutionarily conserved within the P. striiformis species complex and Pucciniales. In summary, this haplotype-phased assembly enabled us to discover novel genome features of a dikaryotic plant-pathogenic fungus previously hidden in collapsed and fragmented genome assemblies. IMPORTANCE: Current representations of eukaryotic microbial genomes are haploid, hiding the genomic diversity intrinsic to diploid and polyploid life forms. This hidden diversity contributes to the organism’s evolutionary potential and ability to adapt to stress conditions.
Yet, it is challenging to provide haplotype-specific information at a whole-genome level. Here, we take advantage of long-read DNA sequencing technology and a tailored-assembly algorithm to disentangle the two haploid genomes of a dikaryotic pathogenic wheat rust fungus. The two genomes display high levels of nucleotide and structural variations, which lead to allelic variation and the presence of genes lacking allelic counterparts. Nonallelic candidate effector genes, which likely encode important pathogenicity factors, display distinct genome localization patterns and are less likely to be evolutionarily conserved than those which are present as allelic pairs. This genomic diversity may promote rapid host adaptation and/or be related to the age of the sequenced isolate since last meiosis. Copyright © 2018 Schwessinger et al.
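The N50 figure quoted for this assembly is a standard contiguity statistic: the length of the shortest contig among the longest contigs that together cover at least half of the total assembly length. A minimal Python sketch of the calculation follows; the function name `n50` and the contig lengths are hypothetical and are not the values from this assembly.

```python
from typing import Iterable


def n50(contig_lengths: Iterable[int]) -> int:
    """Return the N50 of a set of contig lengths."""
    lengths = sorted(contig_lengths, reverse=True)
    if not lengths:
        raise ValueError("at least one contig length is required")
    total = sum(lengths)
    running = 0
    for length in lengths:
        running += length
        # Stop at the first contig that brings the running sum to half the total.
        if running * 2 >= total:
            return length
    raise AssertionError("unreachable: running sum always reaches the total")


# Hypothetical contig lengths (arbitrary units); total = 290, half = 145.
contigs = [80, 70, 50, 40, 30, 20]
print(n50(contigs))  # 70, because 80 + 70 = 150 >= 145
```

A higher N50 for the same total length indicates that more of the assembly sits in long contigs, which is why the statistic is usually reported alongside the contig count and total size.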


July 19, 2019

Utility of DNA, RNA, protein, and functional approaches to solve cryptic immunodeficiencies.

We report a female infant identified by newborn screening for severe combined immunodeficiencies (NBS SCID) with T cell lymphopenia (TCL). The patient had persistently elevated alpha-fetoprotein (AFP) with IgA deficiency, and elevated IgM. Gene sequencing for a SCID panel was uninformative. We sought to determine the cause of the immunodeficiency in this infant. We performed whole-exome sequencing (WES) on the patient and parents to identify a genetic diagnosis. Based on the WES result, we developed a novel flow cytometric panel for rapid assessment of DNA repair defects using blood samples. We also performed whole transcriptome sequencing (WTS) on fibroblast RNA from the patient and father for abnormal transcript analysis. WES revealed a pathogenic paternally inherited indel in ATM. We used the flow panel to assess several proteins in the DNA repair pathway in lymphocyte subsets. The patient had absent phosphorylation of ATM, resulting in absent or aberrant phosphorylation of downstream proteins, including γH2AX. However, ataxia-telangiectasia (AT) is an autosomal recessive condition, and the abnormal functional data did not correspond with a single ATM variant. WTS revealed in-frame reciprocal fusion transcripts involving ATM and SLC35F2 indicating a chromosome 11 inversion within 11q22.3, of maternal origin. Inversion breakpoints were identified within ATM intron 16 and SLC35F2 intron 7. We identified a novel ATM-breaking chromosome 11 inversion in trans with a pathogenic indel (compound heterozygote) resulting in non-functional ATM protein, consistent with a diagnosis of AT. Utilization of several molecular and functional assays allowed successful resolution of this case.


July 19, 2019

The Florida manatee (Trichechus manatus latirostris) T cell receptor loci exhibit V subgroup synteny and chain-specific evolution.

The Florida manatee (Trichechus manatus latirostris) has limited diversity in the immunoglobulin heavy chain. We therefore investigated the antigen receptor loci of the other arm of the adaptive immune system: the T cell receptor. Manatees are the first species from Afrotheria, a basal eutherian superorder, to have an in-depth characterization of all T cell receptor loci. By annotating the genome and expressed transcripts, we found that each chain has distinct features that correlate with its individual function. The genomic organization also plays a role in modulating sequence conservation between species. There were extensive V subgroup synteny blocks in the TRA and TRB loci between T. m. latirostris and human. Increased genomic locus complexity correlated with increased locus synteny. We also identified evidence for a VHD pseudogene for the first time in a eutherian mammal. These findings emphasize the value of including species within this basal eutherian radiation in comparative studies. Copyright © 2018. Published by Elsevier Ltd.


July 19, 2019

HIV envelope glycoform heterogeneity and localized diversity govern the initiation and maturation of a V2 apex broadly neutralizing antibody lineage.

Understanding how broadly neutralizing antibodies (bnAbs) to HIV envelope (Env) develop during natural infection can help guide the rational design of an HIV vaccine. Here, we described a bnAb lineage targeting the Env V2 apex and the Ab-Env co-evolution that led to development of neutralization breadth. The lineage Abs bore an anionic heavy chain complementarity-determining region 3 (CDRH3) of 25 amino acids, among the shortest known for this class of Abs, and achieved breadth with only 10% nucleotide somatic hypermutation and no insertions or deletions. The data suggested a role for Env glycoform heterogeneity in the activation of the lineage germline B cell. Finally, we showed that localized diversity at key V2 epitope residues drove bnAb maturation toward breadth, mirroring the Env evolution pattern described for another donor who developed V2-apex targeting bnAbs. Overall, these findings suggest potential strategies for vaccine approaches based on germline-targeting and serial immunogen design. Copyright © 2017 The Authors. Published by Elsevier Inc. All rights reserved.


July 19, 2019

Genome sequence of the progenitor of wheat A subgenome Triticum urartu.

Triticum urartu (diploid, AA) is the progenitor of the A subgenome of tetraploid (Triticum turgidum, AABB) and hexaploid (Triticum aestivum, AABBDD) wheat [1, 2]. Genomic studies of T. urartu have been useful for investigating the structure, function and evolution of polyploid wheat genomes. Here we report the generation of a high-quality genome sequence of T. urartu by combining bacterial artificial chromosome (BAC)-by-BAC sequencing, single molecule real-time whole-genome shotgun sequencing [3], linked reads and optical mapping [4, 5]. We assembled seven chromosome-scale pseudomolecules and identified protein-coding genes, and we suggest a model for the evolution of T. urartu chromosomes. Comparative analyses with genomes of other grasses showed gene loss and amplification in the numbers of transposable elements in the T. urartu genome. Population genomics analysis of 147 T. urartu accessions from across the Fertile Crescent showed clustering of three groups, with differences in altitude and biostress, such as powdery mildew disease. The T. urartu genome assembly provides a valuable resource for studying genetic variation in wheat and related grasses, and promises to facilitate the discovery of genes that could be useful for wheat improvement.


July 19, 2019

Discordant inheritance of chromosomal and extrachromosomal DNA elements contributes to dynamic disease evolution in glioblastoma.

To understand how genomic heterogeneity of glioblastoma (GBM) contributes to poor therapy response, we performed DNA and RNA sequencing on GBM samples and the neurospheres and orthotopic xenograft models derived from them. We used the resulting dataset to show that somatic driver alterations including single-nucleotide variants, focal DNA alterations and oncogene amplification on extrachromosomal DNA (ecDNA) elements were in majority propagated from tumor to model systems. In several instances, ecDNAs and chromosomal alterations demonstrated divergent inheritance patterns and clonal selection dynamics during cell culture and xenografting. We infer that ecDNA was unevenly inherited by offspring cells, a characteristic that affects the oncogenic potential of cells with more or fewer ecDNAs. Longitudinal patient tumor profiling found that oncogenic ecDNAs are frequently retained throughout the course of disease. Our analysis shows that extrachromosomal elements allow rapid increase of genomic heterogeneity during GBM evolution, independently of chromosomal DNA alterations.


July 19, 2019

Resequencing of 243 diploid cotton accessions based on an updated A genome identifies the genetic basis of key agronomic traits.

The ancestors of Gossypium arboreum and Gossypium herbaceum provided the A subgenome for the modern cultivated allotetraploid cotton. Here, we upgraded the G. arboreum genome assembly by integrating different technologies. We resequenced 243 G. arboreum and G. herbaceum accessions to generate a map of genome variations and found that they are equally diverged from Gossypium raimondii. Independent analysis suggested that Chinese G. arboreum originated in South China and was subsequently introduced to the Yangtze and Yellow River regions. Most accessions with domestication-related traits experienced geographic isolation. Genome-wide association study (GWAS) identified 98 significant peak associations for 11 agronomically important traits in G. arboreum. A nonsynonymous substitution (cysteine-to-arginine substitution) of GaKASIII seems to confer substantial fatty acid composition (C16:0 and C16:1) changes in cotton seeds. Resistance to fusarium wilt disease is associated with activation of GaGSTF9 expression. Our work represents a major step toward understanding the evolution of the A genome of cotton.


July 19, 2019

Male-killing toxin in a bacterial symbiont of Drosophila.

Several lineages of symbiotic bacteria in insects selfishly manipulate host reproduction to spread in a population [1], often by distorting host sex ratios. Spiroplasma poulsonii [2, 3] is a helical and motile, Gram-positive symbiotic bacterium that resides in a wide range of Drosophila species [4]. A notable feature of S. poulsonii is male killing, whereby the sons of infected female hosts are selectively killed during development [1, 2]. Although male killing caused by S. poulsonii has been studied since the 1950s, its underlying mechanism is unknown. Here we identify an S. poulsonii protein, designated Spaid, whose expression induces male killing. Overexpression of Spaid in D. melanogaster kills males but not females, and induces massive apoptosis and neural defects, recapitulating the pathology observed in S. poulsonii-infected male embryos [5-11]. Our data suggest that Spaid targets the dosage compensation machinery on the male X chromosome to mediate its effects. Spaid contains ankyrin repeats and a deubiquitinase domain, which are required for its subcellular localization and activity. Moreover, we found a laboratory mutant strain of S. poulsonii with reduced male-killing ability and a large deletion in the spaid locus. Our study has uncovered a bacterial protein that affects host cellular machinery in a sex-specific way, which is likely to be the long-searched-for factor responsible for S. poulsonii-induced male killing.

