July 7, 2019

Genome graphs

There is increasing recognition that a single, monoploid reference genome is a poor universal reference structure for human genetics, because it represents only a tiny fraction of human variation. Adding this missing variation results in a structure that can be described as a mathematical graph: a genome graph. We demonstrate that, in comparison to the existing reference genome (GRCh38), genome graphs can substantially improve the fractions of reads that map uniquely and perfectly. Furthermore, we show that this fundamental simplification of read mapping transforms the variant calling problem from one in which many non-reference variants must be discovered de novo to one in which the vast majority of variants are simply re-identified within the graph. Using standard benchmarks as well as a novel reference-free evaluation, we show that a simplistic variant calling procedure on a genome graph can already call variants at least as well as, and in many cases better than, a state-of-the-art method on the linear human reference genome. We anticipate that graph-based references will supplant linear references in humans and in other applications where cohorts of sequenced individuals are available.
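As a rough illustration of the idea in this abstract, the sketch below builds a toy genome graph with a single variant "bubble" and checks whether a read matches some path exactly. The node names, sequences, and helper functions are hypothetical and chosen only for illustration; this is not the method or software used in the paper, and real graph mappers do not enumerate paths exhaustively.

```python
from typing import Dict, List

# Hypothetical toy graph: node id -> sequence carried by that node.
NODES: Dict[str, str] = {
    "n1": "ACGT",
    "n2": "A",     # allele present in the linear reference
    "n3": "G",     # alternate allele added to the graph
    "n4": "TTCA",
}
# Directed edges: n1 branches into the two alleles, which rejoin at n4.
EDGES: Dict[str, List[str]] = {
    "n1": ["n2", "n3"],
    "n2": ["n4"],
    "n3": ["n4"],
    "n4": [],
}

# The linear reference is the single path through the reference allele.
LINEAR_REFERENCE = "ACGTATTCA"


def haplotypes(start: str) -> List[str]:
    """Enumerate every sequence spelled by a path from `start` to a sink."""
    successors = EDGES[start]
    if not successors:
        return [NODES[start]]
    return [NODES[start] + tail for nxt in successors for tail in haplotypes(nxt)]


def maps_perfectly(read: str, start: str = "n1") -> bool:
    """True if the read occurs exactly on at least one path of the graph."""
    return any(read in hap for hap in haplotypes(start))


read = "GTGTT"  # a read that carries the alternate allele
print(maps_perfectly(read))      # exact match exists in the graph
print(read in LINEAR_REFERENCE)  # no exact match on the linear reference
```

In this toy case the read carrying the alternate allele has an exact match in the graph but not on the linear reference, which is the sense in which a variant is re-identified rather than discovered.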


July 7, 2019

Assessment of bacterial profiles in aged, home-made Sichuan paocai brine with varying titratable acidity by PacBio SMRT Sequencing technology

Sichuan paocai, a traditional Chinese fermented vegetable, is rife with lactic acid bacteria (LAB). However, the precise bacterial profiles of home-made Sichuan paocai brine (HSPB) remain unclear. In this study, the bacterial compositions of 38 aged HSPB samples with varying titratable acidity (TA) were determined by SMRT sequencing of the full-length 16S rRNA gene. The lactic and acetic acids of HSPBs were also measured to determine any relationship to the bacterial profiles. The SMRT sequencing results reveal that the HSPB bacterial communities comprised numerous phylogenetic taxa, including 35 phyla, 371 genera, and 593 species; the bacterial diversity decreased as HSPB acidity increased. Lactobacillus acetotolerans, which was positively correlated with HSPB acidity, was the most dominant species, followed by Lactobacillus brevis, which was positively related to acetic acid in the samples. A few opportunistic pathogens (e.g. Serratia marcescens and Stenotrophomonas maltophilia) were also detected. Sample groups with lower acidity had higher bacterial diversity, more Lactobacillus species with relative abundance >1%, and more opportunistic pathogens than higher-acidity samples. The results presented here report the comprehensive bacterial profiles of home-made Sichuan paocai for the first time via SMRT sequencing technology, along with the correlation between TA and bacterial compositions. It is necessary to further investigate the opportunistic pathogens detected in this work as they relate to the safety and quality of paocai.


July 7, 2019

Comparison of pseudorabies virus China reference strain with emerging variants reveals independent virus evolution within specific geographic regions.

Pseudorabies virus (PRV) China reference strain Ea is genetically closely related to newly emerged variants; however, there is limited information about PRV Ea. Here, we compared PRV Ea with new variant strains by growth kinetics, genome sequencing, and protein expression analysis. Growth analysis showed that strain Ea forms smaller plaques than strain HNX. The full-length genome sequence of Ea revealed that it is clustered in the same subgroup as HNX. Ea and HNX strains exhibited similar extracellular virion protein polymorphisms, whereas strain Bartha expressed less VP26 and more GAPDH. In infected cells, strain Ea expressed high levels of IE180 protein, and Ea and HNX produced higher levels of UL21 protein than strain Bartha. These findings provide evidence that PRV China reference strain Ea is genetically closely related to the newly emerged variant strains, indicating that strain PRV China may have evolved independently leading to the emergence of a variant strain. Copyright © 2017 Elsevier Inc. All rights reserved.


July 7, 2019

Complete genome sequence of a denitrifying bacterium, Pseudomonas sp. CC6-YY-74, isolated from Arctic Ocean sediment

Pseudomonas sp. CC6-YY-74, a psychrotrophic, denitrifying bacterium isolated from Arctic Ocean sediment, uses NO3- or NH4+ as the sole nitrogen source to grow at low temperatures. Here we describe the complete genome of Pseudomonas sp. CC6-YY-74. The genome has one circular chromosome of 5,040,792 bp (61.73 mol% G + C content), consisting of 4747 coding genes, 68 tRNA genes, and six rRNA operons arranged as 16S-23S-5S rRNA. According to the annotation results, strain CC6-YY-74 encodes 52 proteins related to nitrogen metabolism, including a complete denitrifying pathway, and more than 20 kinds of hydrolytic enzymes.


July 7, 2019

Genomic analysis of factors associated with low prevalence of antibiotic resistance in extraintestinal pathogenic Escherichia coli sequence type 95 strains.

Extraintestinal pathogenic Escherichia coli (ExPEC) strains belonging to multilocus sequence type 95 (ST95) are globally distributed and a common cause of infections in humans and domestic fowl. ST95 isolates generally show a lower prevalence of acquired antimicrobial resistance than other pandemic ExPEC lineages. We took a genomic approach to identify factors that may underlie reduced resistance. We fully assembled genomes for four ST95 isolates representing the four major fimH-based lineages within ST95 and also analyzed draft-level genomes from another 82 ST95 isolates, largely from the western United States. The fully assembled genomes of antibiotic-resistant isolates carried resistance genes exclusively on large (>90-kb) IncFIB/IncFII plasmids. These replicons were common in the draft genomes as well, particularly in antibiotic-resistant isolates, but we also observed multiple instances of a smaller (8.3-kb) ampicillin resistance plasmid that had been previously identified in Salmonella enterica. Among ST95 isolates, pansusceptibility to antibiotics was significantly associated with the fimH6 lineage and the presence of homologs of the previously identified 114-kb IncFIB/IncFII plasmid pUTI89, both of which were also associated with reduced carriage of other plasmids. Potential mechanistic explanations for lineage- and plasmid-specific effects on the prevalence of antibiotic resistance within the ST95 group are discussed. IMPORTANCE Antibiotic resistance in bacterial pathogens is a major public health concern. This work was motivated by the observation that only a small proportion of ST95 isolates, a major pandemic lineage of extraintestinal pathogenic E. coli, have acquired antibiotic resistance, in contrast to many other pandemic lineages. Understanding bacterial genetic factors that may prevent acquisition of resistance could contribute to the development of new biological, medical, or public health strategies to reduce antibiotic-resistant infections.


July 7, 2019

BusyBee Web: metagenomic data analysis by bootstrapped supervised binning and annotation.

Metagenomics-based studies of mixed microbial communities are impacting biotechnology, life sciences and medicine. Computational binning of metagenomic data is a powerful approach for the culture-independent recovery of population-resolved genomic sequences, i.e. from individual or closely related, constituent microorganisms. Existing binning solutions often require a priori characterized reference genomes and/or dedicated compute resources. Extending currently available reference-independent binning tools, we developed the BusyBee Web server for the automated deconvolution of metagenomic data into population-level genomic bins using assembled contigs (Illumina) or long reads (Pacific Biosciences, Oxford Nanopore Technologies). A reversible compression step as well as bootstrapped supervised binning enable quick turnaround times. The binning results are represented in interactive 2D scatterplots. Moreover, bin quality estimates, taxonomic annotations and annotations of antibiotic resistance genes are computed and visualized. Ground truth-based benchmarks of BusyBee Web demonstrate comparably high performance to state-of-the-art binning solutions for assembled contigs and markedly improved performance for long reads (median F1 scores: 70.02-95.21%). Furthermore, the applicability to real-world metagenomic datasets is shown. In conclusion, our reference-independent approach automatically bins assembled contigs or long reads, exhibits high sensitivity and precision, enables intuitive inspection of the results, and only requires FASTA-formatted input. The web-based application is freely accessible at: https://ccb-microbe.cs.uni-saarland.de/busybee. © The Author(s) 2017. Published by Oxford University Press on behalf of Nucleic Acids Research.
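For readers unfamiliar with the F1 score cited in this abstract, it is the harmonic mean of precision and recall. The sketch below shows the standard formula only; the function name and the example counts are hypothetical and are not taken from the BusyBee Web benchmarks, whose exact per-bin evaluation procedure is not described here.

```python
def f1_score(true_pos: int, false_pos: int, false_neg: int) -> float:
    """Harmonic mean of precision and recall; 0.0 when undefined."""
    if true_pos == 0:
        return 0.0
    precision = true_pos / (true_pos + false_pos)
    recall = true_pos / (true_pos + false_neg)  # recall is also called sensitivity
    return 2 * precision * recall / (precision + recall)


# Hypothetical bin: 90 sequences correctly assigned, 10 wrongly included,
# 30 that belong to the bin but were missed.
print(f"{f1_score(90, 10, 30):.4f}")
```

With these made-up counts, precision is 0.90 and recall is 0.75, so the score falls between the two and closer to the lower value.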


July 7, 2019

Gas fermentation: cellular engineering possibilities and scale up.

Low carbon fuels and chemicals can be sourced from renewable materials such as biomass or from industrial and municipal waste streams. Gasification of these materials allows all of the carbon to become available for product generation, a clear advantage over partial biomass conversion into fermentable sugars. Gasification results in a synthesis gas stream (syngas) containing carbon monoxide (CO), carbon dioxide (CO2), hydrogen (H2) and nitrogen (N2). Autotrophy, the ability to fix carbon such as CO2, is present in all domains of life, but photosynthesis alone is not keeping up with anthropogenic CO2 output. One strategy is to curtail the release of these gases into the atmosphere by developing waste and syngas conversion technologies. Historically, microorganisms have contributed to major, albeit slow, atmospheric composition changes. This review covers the current status and future potential of anaerobic gas-fermenting bacteria, with a special focus on acetogens.


July 7, 2019

Automated structural variant verification in human genomes using single-molecule electronic DNA mapping.

The importance of structural variation in human disease and the difficulty of detecting structural variants larger than 50 base pairs have led to the development of several long-read sequencing technologies and optical mapping platforms. Frequently, multiple technologies and ad hoc methods are required to obtain a consensus regarding the location, size and nature of a structural variant, with no approach able to reliably bridge the gap of variant sizes between the domain of short-read approaches and the largest rearrangements observed with optical mapping. To address this unmet need, we have developed a new software package, SV-Verify™, which utilizes data collected with the Nabsys High Definition Mapping (HD-Mapping™) system, to perform hypothesis-based verification of putative deletions. We demonstrate that whole genome maps, constructed from electronic detection of tagged DNA, hundreds of kilobases in length, can be used effectively to facilitate calling of structural variants ranging in size from 300 base pairs to hundreds of kilobase pairs. SV-Verify implements hypothesis-based verification of putative structural variants using a set of support vector machines and is capable of concurrently testing several thousand independent hypotheses. We describe support vector machine training, utilizing a well-characterized human genome, and application of the resulting classifiers to another human genome, demonstrating high sensitivity and specificity for deletions >= 300 base pairs.


July 7, 2019

Antibody-independent mechanisms regulate the establishment of chronic Plasmodium infection.

Malaria is caused by parasites of the genus Plasmodium. All human-infecting Plasmodium species can establish long-lasting chronic infections(1-5), creating an infectious reservoir to sustain transmission(1,6). It is widely accepted that the maintenance of chronic infection involves evasion of adaptive immunity by antigenic variation(7). However, genes involved in this process have been identified in only two of five human-infecting species: Plasmodium falciparum and Plasmodium knowlesi. Furthermore, little is understood about the early events in the establishment of chronic infection in these species. Using a rodent model we demonstrate that from the infecting population, only a minority of parasites, expressing one of several clusters of virulence-associated pir genes, establishes a chronic infection. This process occurs in different species of parasites and in different hosts. Establishment of chronicity is independent of adaptive immunity and therefore different from the mechanism proposed for maintenance of chronic P. falciparum infections(7-9). Furthermore, we show that the proportions of parasites expressing different types of pir genes regulate the time taken to establish a chronic infection. Because pir genes are common to most, if not all, species of Plasmodium(10), this process may be a common way of regulating the establishment of chronic infections.


July 7, 2019

Chromosome end repair and genome stability in Plasmodium falciparum.

The human malaria parasite Plasmodium falciparum replicates within circulating red blood cells, where it is subjected to conditions that frequently cause DNA damage. The repair of DNA double-stranded breaks (DSBs) is thought to rely almost exclusively on homologous recombination (HR), due to a lack of efficient nonhomologous end joining. However, given that the parasite is haploid during this stage of its life cycle, the mechanisms involved in maintaining genome stability are poorly understood. Of particular interest are the subtelomeric regions of the chromosomes, which contain the majority of the multicopy variant antigen-encoding genes responsible for virulence and disease severity. Here, we show that parasites utilize a competitive balance between de novo telomere addition, also called “telomere healing,” and HR to stabilize chromosome ends. Products of both repair pathways were observed in response to DSBs that occurred spontaneously during routine in vitro culture or resulted from experimentally induced DSBs, demonstrating that both pathways are active in repairing DSBs within subtelomeric regions and that the pathway utilized was determined by the DNA sequences immediately surrounding the break. In combination, these two repair pathways enable parasites to efficiently maintain chromosome stability while also contributing to the generation of genetic diversity. IMPORTANCE Malaria is a major global health threat, causing approximately 430,000 deaths annually. This mosquito-transmitted disease is caused by Plasmodium parasites, with infection with the species Plasmodium falciparum being the most lethal. Mechanisms underlying DNA repair and maintenance of genome integrity in P. falciparum are not well understood and represent a gap in our understanding of how parasites survive the hostile environment of their vertebrate and insect hosts.
Our work examines DNA repair in real time by using single-molecule real-time (SMRT) sequencing focused on the subtelomeric regions of the genome that harbor the multicopy gene families important for virulence and the maintenance of infection. We show that parasites utilize two competing molecular mechanisms to repair double-strand breaks, homologous recombination and de novo telomere addition, with the pathway used being determined by the surrounding DNA sequence. In combination, these two pathways balance the need to maintain genome stability with the selective advantage of generating antigenic diversity. Copyright © 2017 Calhoun et al.


July 7, 2019

Complete genome sequencing and targeted mutagenesis reveal virulence contributions of Tal2 and Tal4b of Xanthomonas translucens pv. undulosa ICMP11055 in bacterial leaf streak of wheat

Bacterial leaf streak caused by Xanthomonas translucens pv. undulosa (Xtu) is an important disease of wheat (Triticum aestivum) and barley (Hordeum vulgare) worldwide. Transcription activator-like effectors (TALEs) play determinative roles in many of the plant diseases caused by the different species and pathovars of Xanthomonas, but their role in this disease has not been characterized. ICMP11055 is a highly virulent Xtu strain from Iran. The aim of this study was to better understand genetic diversity of Xtu and to assess the role of TALEs in bacterial leaf streak of wheat by comparing the genome of this strain to the recently completely sequenced genome of a U.S. Xtu strain, and to several other draft X. translucens genomes, and by carrying out mutational analyses of the TALE (tal) genes the Iranian strain might harbor. The ICMP11055 genome, including its repeat-rich tal genes, was completely sequenced using single molecule, real-time technology (Pacific Biosciences). It consists of a single circular chromosome of 4,561,583 bp, containing 3,953 genes. Whole genome alignment with the genome of the United States Xtu strain XT4699 showed two major re-arrangements, nine genomic regions unique to ICMP11055, and one region unique to XT4699. ICMP11055 harbors 26 non-TALE type III effector genes and seven tal genes, compared to 25 and eight for XT4699. The tal genes occur singly or in pairs across five scattered loci. Four are identical to tal genes in XT4699. In addition to common repeat-variable diresidues (RVDs), the tal genes of ICMP11055, like those of XT4699, encode several RVDs rarely observed in Xanthomonas, including KG, NF, Y*, YD, and YK. Insertion and deletion mutagenesis of ICMP11055 tal genes followed by genetic complementation analysis in wheat cv. Chinese Spring revealed that Tal2 and Tal4b of ICMP11055 each contribute individually to the extent of disease caused by this strain.
A largely conserved ortholog of tal2 is present in XT4699, but for tal4b, only a gene with partial, fragmented RVD sequence similarity can be found. Our results lay the foundation for identification of important host genes activated by Xtu TALEs as targets for the development of disease resistant varieties.


July 7, 2019

Comparative sequence analysis of multidrug-resistant IncA/C plasmids from Salmonella enterica

Determinants of multidrug resistance (MDR) are often encoded on mobile elements, such as plasmids, transposons, and integrons, which have the potential to transfer among foodborne pathogens, as well as to other virulent pathogens, increasing the threats these traits pose to human and veterinary health. Our understanding of MDR among Salmonella has been limited by the lack of closed plasmid genomes for comparisons across resistance phenotypes, due to difficulties in effectively separating the DNA of these high-molecular weight, low-copy-number plasmids from chromosomal DNA. To resolve this problem, we demonstrate an efficient protocol for isolating, sequencing and closing IncA/C plasmids from Salmonella sp. using single molecule real-time sequencing on a Pacific Biosciences (PacBio) RS II Sequencer. We obtained six Salmonella enterica isolates from poultry, representing six different serovars, each exhibiting the MDR-AmpC resistance profile. Salmonella plasmids were obtained using a modified mini preparation and transformed with Escherichia coli DH10Br. A Qiagen Large-Construct kit™ was used to recover highly concentrated and purified plasmid DNA that was sequenced using PacBio technology. These six closed IncA/C plasmids ranged in size from 104 to 191 kb and shared a stable, conserved backbone containing 98 core genes, with only six differences among those core genes. The plasmids encoded a number of antimicrobial resistance genes, including those for quaternary ammonium compounds and mercury. We then compared our six IncA/C plasmid sequences: first with 14 IncA/C plasmids derived from S. enterica available at the National Center for Biotechnology Information (NCBI), and then with an additional 38 IncA/C plasmids derived from different taxa. These comparisons allowed us to build an evolutionary picture of how antimicrobial resistance may be mediated by this common plasmid backbone.
Our project provides detailed genetic information about resistance genes in plasmids, advances in plasmid sequencing, and phylogenetic analyses, and important insights about how MDR evolution occurs across diverse serotypes from different animal sources, particularly in agricultural settings where antimicrobial drug use practices vary.


July 7, 2019

Comparative genomic and regulatory analyses of natamycin production of Streptomyces lydicus A02.

Streptomyces lydicus A02 is used by industry because it has a higher natamycin-producing capacity than the reference strain S. natalensis ATCC 27448. We sequenced the complete genome of A02 using next-generation sequencing platforms, and to achieve better sequence coverage and genome assembly, we utilized single-molecule real-time (SMRT) sequencing. The assembled genome comprises a 9,307,519-bp linear chromosome with a GC content of 70.67% and contains 8,888 predicted genes. Comparative genomics and natamycin biosynthetic gene cluster (BGC) analysis showed that the BGCs are highly conserved among evolutionarily diverse strains, and they also shared closer genome evolution compared with other Streptomyces species. Forty gene clusters were predicted to be involved in the secondary metabolism of A02, and the genome is rich in two-component signal transduction systems (TCS), indicating a complex regulatory system and a high diversity of metabolites. Disruption of the phoP gene of the phoR-phoP TCS and of the nsdA gene confirmed phosphate sensitivity and global negative regulation of natamycin production. The genome sequence and analyses presented in this study provide an important molecular basis for research on natamycin production in Streptomyces, which could facilitate rational genome modification to improve the industrial use of A02.


July 7, 2019

MECAT: fast mapping, error correction, and de novo assembly for single-molecule sequencing reads.

We present a tool that combines fast mapping, error correction, and de novo assembly (MECAT; accessible at https://github.com/xiaochuanle/MECAT) for processing single-molecule sequencing (SMS) reads. MECAT’s computing efficiency is superior to that of current tools, while the results MECAT produces are comparable or improved. MECAT enables reference mapping or de novo assembly of large genomes using SMS reads on a single computer.


July 7, 2019

Emergence and genomic diversification of a virulent serogroup W: ST-2881 (CC175) Neisseria meningitidis clone in the African meningitis belt

Countries of the African ‘meningitis belt’ are susceptible to meningococcal meningitis outbreaks. While in the past major epidemics have been primarily caused by serogroup A meningococci, W strains are currently responsible for most of the cases. After an epidemic in Mecca in 2000, W:ST-11 strains have caused many outbreaks worldwide. An unrelated W:ST-2881 clone was described for the first time in 2002, with the first meningitis cases caused by these bacteria reported in 2003. Here we describe results of a comparative whole-genome analysis of 74 W:ST-2881 strains isolated within the framework of two longitudinal colonization and disease studies conducted in Ghana and Burkina Faso. Genomic data indicate that the W:ST-2881 clone has emerged from Y:ST-175(CC175) bacteria by capsule switching. The circulating W:ST-2881 populations were composed of a variety of closely related but distinct genomic variants with no systematic differences between colonization and disease isolates. Two distinct and geographically clustered phylogenetic clonal variants were identified in Burkina Faso and a third in Ghana. On the basis of the presence or absence of 17 recombination fragments, the Ghanaian variant could be differentiated into five clusters. All 25 Ghanaian disease isolates clustered together with 23 out of 40 Ghanaian isolates associated with carriage within one cluster, indicating that W:ST-2881 clusters differ in virulence. More than half of the genes affected by horizontal gene transfer encoded proteins of the ‘cell envelope’ and the ‘transport/binding protein’ categories, which indicates that exchange of non-capsular antigens plays an important role in immune evasion.

