April 21, 2020

Atlas of group A streptococcal vaccine candidates compiled using large-scale comparative genomics.

Group A Streptococcus (GAS; Streptococcus pyogenes) is a bacterial pathogen for which a commercial vaccine for humans is not available. Employing the advantages of high-throughput DNA sequencing technology to vaccine design, we have analyzed 2,083 globally sampled GAS genomes. The global GAS population structure reveals extensive genomic heterogeneity driven by homologous recombination and overlaid with high levels of accessory gene plasticity. We identified the existence of more than 290 clinically associated genomic phylogroups across 22 countries, highlighting challenges in designing vaccines of global utility. To determine vaccine candidate coverage, we investigated all of the previously described GAS candidate antigens for gene carriage and gene sequence heterogeneity. Only 15 of 28 vaccine antigen candidates were found to have both low naturally occurring sequence variation and high (>99%) coverage across this diverse GAS population. This technological platform for vaccine coverage determination is equally applicable to prospective GAS vaccine antigens identified in future studies.


April 21, 2020

Antibiotic resistance and heavy metal tolerance plasmids: the antimicrobial bulletproof properties of Escherichia fergusonii isolated from poultry.

We describe the mobilome of Escherichia fergusonii 40A isolated from poultry, consisting of four different plasmids, p46_40A (IncX1, 45,869 bp), p80_40A (non-typable, 79,635 bp), p150_40A (IncI1-ST1, 148,340 bp) and p280_40A (IncHI2A-ST2, 279,537 bp). The mobilome-40A carries a blend of several different resistance and virulence genes, heavy metal tolerance operons and a conjugation system. This mobilome is a perfect tool to preserve and disseminate antimicrobial resistance and makes the bacterial isolate incredibly adapted to survive under constant antimicrobial pressure.


April 21, 2020

Contrasting Roles of Transcription Factors Spineless and EcR in the Highly Dynamic Chromatin Landscape of Butterfly Wing Metamorphosis.

Development requires highly coordinated changes in chromatin accessibility in order for proper gene regulation to occur. Here, we identify factors associated with major, discrete changes in chromatin accessibility during butterfly wing metamorphosis. By combining mRNA sequencing (mRNA-seq), assay for transposase-accessible chromatin using sequencing (ATAC-seq), and machine learning analysis of motifs, we show that distinct sets of transcription factors are predictive of chromatin opening at different developmental stages. Our data suggest an important role for nuclear hormone receptors early in metamorphosis, whereas PAS-domain transcription factors are strongly associated with later chromatin opening. Chromatin immunoprecipitation sequencing (ChIP-seq) validation of select candidate factors showed spineless binding to be a major predictor of opening chromatin. Surprisingly, binding of ecdysone receptor (EcR), a candidate accessibility factor in Drosophila, was not predictive of opening but instead marked persistent sites. This work characterizes the chromatin dynamics of insect wing metamorphosis, identifies candidate chromatin remodeling factors in insects, and presents a genome assembly of the model butterfly Junonia coenia. Copyright © 2019 The Authors. Published by Elsevier Inc. All rights reserved.


April 21, 2020

Chromosome-level genome assembly of Triplophysa tibetana, a fish adapted to the harsh high-altitude environment of the Tibetan Plateau.

Triplophysa is an endemic fish genus of the Tibetan Plateau in China. Triplophysa tibetana, which lives at a recorded altitude of ~4,000 m and plays an important role in the highland aquatic ecosystem, serves as an excellent model for investigating high-altitude environmental adaptation. However, evolutionary and conservation studies of T. tibetana have been limited by scarce genomic resources for the genus Triplophysa. In the present study, we applied PacBio sequencing and the Hi-C technique to assemble the T. tibetana genome. A 652-Mb genome with 1,325 contigs with an N50 length of 3.1 Mb was obtained. The 1,137 contigs were further assembled into 25 chromosomes, representing 98.7% and 80.47% of all contigs at the base and sequence number level, respectively. Approximately 260 Mb of sequence, accounting for ~39.8% of the genome, was identified as repetitive elements. DNA transposons (16.3%), long interspersed nuclear elements (12.4%) and long terminal repeats (11.0%) were the most abundant repeat types. In total, 24,372 protein-coding genes were predicted in the genome, and ~95% of the genes were functionally annotated via a search in public databases. Using whole genome sequence information, we found that T. tibetana diverged from its common ancestor with Danio rerio ~121.4 million years ago. The high-quality genome assembled in this work not only provides a valuable genomic resource for future population and conservation studies of T. tibetana, but it also lays a solid foundation for further investigation into the mechanisms of environmental adaptation of endemic fishes in the Tibetan Plateau. © 2019 John Wiley & Sons Ltd.
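
The contig N50 quoted in the abstract above is a standard contiguity statistic: the length L such that contigs of length L or longer together contain at least half of the assembled bases. The following is a minimal sketch of that calculation, not the authors' pipeline; the function name `n50` and the toy lengths are hypothetical and chosen only for illustration.

```python
def n50(lengths):
    """Return the N50 of a collection of contig or scaffold lengths.

    N50 is the length of the shortest sequence in the smallest set of
    longest sequences that together cover at least half of the total.
    """
    if not lengths:
        raise ValueError("need at least one sequence length")
    total = sum(lengths)
    running = 0
    # Walk from the longest sequence downwards, accumulating bases.
    for length in sorted(lengths, reverse=True):
        running += length
        # Compare 2 * running with total to avoid fractional halves.
        if 2 * running >= total:
            return length


# Hypothetical toy assembly (lengths in arbitrary units, total = 300).
toy_contigs = [80, 70, 50, 40, 30, 20, 10]
print(n50(toy_contigs))
```

In this toy case the two longest sequences (80 + 70) already reach half of the total, so the statistic equals the shorter of the two. Real assemblies would read the lengths from a FASTA index or similar rather than a hard-coded list.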


April 21, 2020

Adaptation and Phenotypic Diversification in Arabidopsis through Loss-of-Function Mutations in Protein-Coding Genes.

According to the less-is-more hypothesis, gene loss is an engine for evolutionary change. Loss-of-function (LoF) mutations resulting in the natural knockout of protein-coding genes not only provide information about gene function but also play important roles in adaptation and phenotypic diversification. Although the less-is-more hypothesis was proposed two decades ago, it remains to be explored on a large scale. In this study, we identified 60,819 LoF variants in 1071 Arabidopsis (Arabidopsis thaliana) genomes and found that 34% of Arabidopsis protein-coding genes annotated in the Columbia-0 genome do not have any LoF variants. We found that nucleotide diversity, transposable element density, and gene family size are strongly correlated with the presence of LoF variants. Intriguingly, 0.9% of LoF variants with minor allele frequency larger than 0.5% are associated with climate change. In addition, in the Yangtze River basin population, 1% of genes with LoF mutations were under positive selection, providing important insights into the contribution of LoF mutations to adaptation. In particular, our results demonstrate that LoF mutations shape diverse phenotypic traits. Overall, our results highlight the importance of the LoF variants for the adaptation and phenotypic diversification of plants. © 2019 American Society of Plant Biologists. All rights reserved.


April 21, 2020

The vaginal microbiome and preterm birth.

The incidence of preterm birth exceeds 10% worldwide. There are significant disparities in the frequency of preterm birth among populations within countries, and women of African ancestry disproportionately bear the burden of risk in the United States. In the present study, we report a community resource that includes ‘omics’ data from approximately 12,000 samples as part of the integrative Human Microbiome Project. Longitudinal analyses of 16S ribosomal RNA, metagenomic, metatranscriptomic and cytokine profiles from 45 preterm and 90 term birth controls identified harbingers of preterm birth in this cohort of women predominantly of African ancestry. Women who delivered preterm exhibited significantly lower vaginal levels of Lactobacillus crispatus and higher levels of BVAB1, Sneathia amnii, TM7-H1, a group of Prevotella species and nine additional taxa. The first representative genomes of BVAB1 and TM7-H1 are described. Preterm-birth-associated taxa were correlated with proinflammatory cytokines in vaginal fluid. These findings highlight new opportunities for assessment of the risk of preterm birth.


April 21, 2020

Recompleting the Caenorhabditis elegans genome.

Caenorhabditis elegans was the first multicellular eukaryotic genome sequenced to apparent completion. Although this assembly employed a standard C. elegans strain (N2), it used sequence data from several laboratories, with DNA propagated in bacteria and yeast. Thus, the N2 assembly has many differences from any C. elegans available today. To provide a more accurate C. elegans genome, we performed long-read assembly of VC2010, a modern strain derived from N2. Our VC2010 assembly has 99.98% identity to N2 but with an additional 1.8 Mb including tandem repeat expansions and genome duplications. For 116 structural discrepancies between N2 and VC2010, 97 structures matching VC2010 (84%) were also found in two outgroup strains, implying deficiencies in N2. Over 98% of N2 genes encoded unchanged products in VC2010; moreover, we predicted ≥53 new genes in VC2010. The recompleted genome of C. elegans should be a valuable resource for genetics, genomics, and systems biology. © 2019 Yoshimura et al.; Published by Cold Spring Harbor Laboratory Press.


April 21, 2020

Detection of Fusarium oxysporum f. sp. fragariae from Infected Strawberry Plants.

Isolates of the Fusarium oxysporum species complex have been characterized as plant pathogens that commonly cause vascular wilt, stunting, and yellowing of the leaves in a variety of hosts. F. oxysporum species complex isolates have been grouped into formae speciales based on their ability to cause disease on a specific host. F. oxysporum f. sp. fragariae is the causal agent of Fusarium wilt of strawberry and has become a threat to production as fumigation practices have changed in California. F. oxysporum f. sp. fragariae is polyphyletic and limited genetic markers are available for its detection. In this study, next-generation sequencing and comparative genomics were used to identify a unique genetic locus that can detect all of the somatic compatibility groups of F. oxysporum f. sp. fragariae identified in California. This locus was used to develop a TaqMan quantitative polymerase chain reaction assay and an isothermal recombinase polymerase amplification (RPA) assay that have very high sensitivity and specificity for more than 180 different isolates of the pathogen tested. RPA assay results from multiple field samples were validated with pathogenicity tests of recovered isolates.


April 21, 2020

SMRT long reads and Direct Label and Stain optical maps allow the generation of a high-quality genome assembly for the European barn swallow (Hirundo rustica rustica).

The barn swallow (Hirundo rustica) is a migratory bird that has been the focus of a large number of ecological, behavioral, and genetic studies. To facilitate further population genetics and genomic studies, we present a reference genome assembly for the European subspecies (H. r. rustica). As part of the Genome10K effort on generating high-quality vertebrate genomes (Vertebrate Genomes Project), we have assembled a highly contiguous genome assembly using single molecule real-time (SMRT) DNA sequencing and several Bionano optical map technologies. We compared and integrated optical maps derived from both the Nick, Label, Repair, and Stain technology and from the Direct Label and Stain (DLS) technology. As proposed by Bionano, DLS more than doubled the scaffold N50 with respect to the nickase. The dual enzyme hybrid scaffold led to a further marginal increase in scaffold N50 and an overall increase of confidence in the scaffolds. After removal of haplotigs, the final assembly is approximately 1.21 Gbp in size, with a scaffold N50 value of more than 25.95 Mbp. This high-quality genome assembly represents a valuable resource for future studies of population genetics and genomics in the barn swallow and for studies concerning the evolution of avian genomes. It also represents one of the very first genomes assembled by combining SMRT long-read sequencing with the new Bionano DLS technology for scaffolding. The quality of this assembly demonstrates the potential of this methodology to substantially increase the contiguity of genome assemblies.


April 21, 2020

The history, genome and biology of NCTC 30: a non-pandemic Vibrio cholerae isolate from World War One.

The sixth global cholera pandemic lasted from 1899 to 1923. However, despite widespread fear of the disease and of its negative effects on troop morale, very few soldiers in the British Expeditionary Forces contracted cholera between 1914 and 1918. Here, we have revived and sequenced the genome of NCTC 30, a 102-year-old Vibrio cholerae isolate, which we believe is the oldest publicly available live V. cholerae strain in existence. NCTC 30 was isolated in 1916 from a British soldier convalescent in Egypt. We found that this strain does not encode cholera toxin, thought to be necessary to cause cholera, and is not part of V. cholerae lineages responsible for the pandemic disease. We also show that NCTC 30, which predates the introduction of penicillin-based antibiotics, harbours a functional β-lactamase antibiotic resistance gene. Our data corroborate and provide molecular explanations for previous phenotypic studies of NCTC 30 and provide a new high-quality genome sequence for historical, non-pandemic V. cholerae.


April 21, 2020

Competition between mobile genetic elements drives optimization of a phage-encoded CRISPR-Cas system: insights from a natural arms race.

CRISPR-Cas systems function as adaptive immune systems by acquiring nucleotide sequences called spacers that mediate sequence-specific defence against competitors. Uniquely, the phage ICP1 encodes a Type I-F CRISPR-Cas system that is deployed to target and overcome PLE, a mobile genetic element with anti-phage activity in Vibrio cholerae. Here, we exploit the arms race between ICP1 and PLE to examine spacer acquisition and interference under laboratory conditions to reconcile findings from wild populations. Natural ICP1 isolates encode multiple spacers directed against PLE, but we find that single spacers do not interfere equally with PLE mobilization. High-throughput sequencing to assay spacer acquisition reveals that ICP1 can also acquire spacers that target the V. cholerae chromosome. We find that targeting the V. cholerae chromosome proximal to PLE is sufficient to block PLE and is dependent on Cas2-3 helicase activity. We propose a model in which indirect chromosomal spacers are able to circumvent PLE by Cas2-3-mediated processive degradation of the V. cholerae chromosome before PLE mobilization. Generally, laboratory-acquired spacers are much more diverse than the subset of spacers maintained by ICP1 in nature, showing how evolutionary pressures can constrain CRISPR-Cas targeting in ways that are often not appreciated through in vitro analyses. This article is part of a discussion meeting issue ‘The ecology and evolution of prokaryotic CRISPR-Cas adaptive immune systems’.


April 21, 2020

Comparative Genome Characterization of a Petroleum-Degrading Bacillus subtilis Strain DM2.

The complete genome sequence of Bacillus subtilis strain DM2 isolated from petroleum-contaminated soil on the Tibetan Plateau was determined. The genome of strain DM2 consists of a circular chromosome of 4,238,631 bp for 4458 protein-coding genes and a plasmid of 84,240 bp coding for 103 genes. Thirty-four genomic islands coding for 330 proteins and 5 prophages are found in the genome. The DDH value shows that strain DM2 belongs to B. subtilis subsp. subtilis subspecies, but significant variations of the genome are also present. Comparative analysis showed that the genome of strain DM2 encodes some strain-specific proteins in comparison with B. subtilis subsp. subtilis str. 168, such as carboxymuconolactone decarboxylase family protein, gfo/Idh/MocA family oxidoreductases, GlsB/YeaQ/YmgE family stress response membrane protein, HlyC/CorC family transporters, LLM class flavin-dependent oxidoreductase, and LPXTG cell wall anchor domain-containing protein. Most of the common strain-specific proteins in DM2 and MJ01 strains, or proteins unique to DM2 strain, are involved in the pathways related to stress response, signaling, and hydrocarbon degradation. Furthermore, the strain DM2 genome contains 122 genes coding for developed two-component systems and 138 genes coding for ABC transporter systems. The prominent features of the strain DM2 genome reflect the evolutionary fitness of this strain to harsh conditions and hydrocarbon utilization.


April 21, 2020

Complete Genome Sequence of Photobacterium damselae Subsp. damselae Strain SSPD1601 Isolated from Deep-Sea Cage-Cultured Sebastes schlegelii with Septic Skin Ulcer.

Photobacterium damselae subsp. damselae (PDD) is a Gram-negative bacterium that can infect a variety of aquatic organisms and humans. Based on an epidemiological investigation conducted over the past 3 years, PDD is one of the most important pathogens causing septic skin ulcer in deep-sea cage-cultured Sebastes schlegelii in the Huang-Bohai Sea area and is present throughout the year with high abundance. To further understand the pathogenicity of this species, the pathogenic properties and genome of PDD strain SSPD1601 were analyzed. The results revealed that PDD strain SSPD1601 is a rod-shaped cell with a single polar flagellum, and the clinical symptoms were replicated during artificial infection. The SSPD1601 genome consists of two chromosomes and two plasmids, totaling 4,252,294 bp with 3,751 coding sequences (CDSs), 196 tRNA genes, and 47 rRNA genes. Common virulence factors including flagellin, Fur, RstB, hcpA, OMPs, htpB-Hsp60, VasK, and vgrG were found in strain SSPD1601. Furthermore, SSPD1601 is a pPHDD1-negative strain containing the hemolysin gene hlyAch and three putative hemolysins (emrA, yoaF, and VPA0226), which are likely responsible for the pathogenicity of SSPD1601. The phylogenetic analysis revealed SSPD1601 to be most closely related to Phdp Wu-1. In addition, the antibiotic resistance phenotype indicated that SSPD1601 was not sensitive to ceftazidime, pipemidic, streptomycin, cefalexin, bacitracin, cefoperazone sodium, acetylspiramycin, clarithromycin, amikacin, gentamycin, kanamycin, oxacillin, ampicillin, and trimethoprim-sulfamethoxazole, but only the bacitracin resistance gene bacA was detected based on the Antibiotic Resistance Genes Database. These results expand our understanding of PDD, setting the stage for further studies of its pathogenesis and disease prevention.


April 21, 2020

Analysis of the Complete Genome Sequence of a Novel, Pseudorabies Virus Strain Isolated in Southeast Europe.

Pseudorabies virus (PRV) is the causative agent of Aujeszky’s disease giving rise to significant economic losses worldwide. Many countries have implemented national programs for the eradication of this virus. In this study, long-read sequencing was used to determine the nucleotide sequence of the genome of a novel PRV strain (PRV-MdBio) isolated in Serbia. In this study, a novel PRV strain was isolated and characterized. PRV-MdBio was found to exhibit similar growth properties to those of another wild-type PRV, the strain Kaplan. Single-molecule real-time (SMRT) sequencing has revealed that the new strain differs significantly in base composition even from strain Kaplan, to which it otherwise exhibits the highest similarity. We compared the genetic composition of PRV-MdBio to strain Kaplan and the China reference strain Ea and found that radical base replacements were the most common point mutations preceding conservative and silent mutations. We also found that the adaptation of PRV to cell culture does not lead to any tendentious genetic alteration in the viral genome. PRV-MdBio is a wild-type virus, which differs in base composition from other PRV strains to a relatively large extent.


April 21, 2020

A draft genome assembly of the solar-powered sea slug Elysia chlorotica.

Elysia chlorotica, a sacoglossan sea slug found off the East Coast of the United States, is well-known for its ability to sequester chloroplasts from its algal prey and survive by photosynthesis for up to 12 months in the absence of food supply. Here we present a draft genome assembly of E. chlorotica that was generated using a hybrid assembly strategy with Illumina short reads and PacBio long reads. The genome assembly comprised 9,989 scaffolds, with a total length of 557 Mb and a scaffold N50 of 442 kb. BUSCO assessment indicated that 93.3% of the expected metazoan genes were completely present in the genome assembly. Annotation of the E. chlorotica genome assembly identified 176 Mb (32.6%) of repetitive sequences and a total of 24,980 protein-coding genes. We anticipate that the annotated draft genome assembly of the E. chlorotica sea slug will promote the investigation of sacoglossan genetics, evolution, and particularly, the genetic signatures accounting for the long-term functioning of algal chloroplasts in an animal.

