September 22, 2019

Metataxonomics reveal vultures as a reservoir for Clostridium perfringens.

The Old World vulture may carry and spread pathogens for emerging infections since they feed on the carcasses of dead animals and participate in the sky burials of humans, some of whom have died from communicable diseases. Therefore, we studied the precise fecal microbiome of the Old World vulture with metataxonomics, integrating the high-throughput sequencing of almost full-length small subunit ribosomal RNA (16S rRNA) gene amplicons in tandem with the operational phylogenetic unit (OPU) analysis strategy. Nine vultures of three species were sampled using rectal swabs on the Qinghai-Tibet Plateau, China. Using the Pacific Biosciences sequencing platform, we obtained 54 135 high-quality reads of 16S rRNA amplicons with an average of 1442±6.9 bp in length and 6015±1058 reads per vulture. Those sequences were classified into 314 OPUs, including 102 known species, 50 yet to be described species and 161 unknown new lineages of uncultured representatives. Forty-five species have been reported to be responsible for human outbreaks or infections, and 23 yet to be described species belong to genera that include pathogenic species. Only six species were common to all vultures. Clostridium perfringens was the most abundant and present in all vultures, accounting for 30.8% of the total reads. Therefore, using the new technology, we found that vultures are an important reservoir for C. perfringens as evidenced by the isolation of 107 strains encoding virulence genes, representing 45 sequence types. Our study suggests that the soil-related C. perfringens and other pathogens could have a reservoir in vultures and other animals.
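The read counts quoted in this abstract can be cross-checked with a few lines of arithmetic. The sketch below uses only the figures stated above (54 135 reads, nine vultures, 30.8% of reads assigned to C. perfringens); the variable names are illustrative, and the derived C. perfringens read count is an approximation implied by the rounded percentage, not a number reported by the authors.

```python
# Figures taken from the abstract.
total_reads = 54_135          # high-quality 16S rRNA amplicon reads
n_vultures = 9                # birds sampled
cperf_fraction = 0.308        # share of reads assigned to C. perfringens

# Mean reads per bird; matches the reported 6015 reads per vulture.
mean_reads_per_vulture = total_reads / n_vultures

# Approximate number of reads assigned to C. perfringens
# (derived here from the rounded percentage, so only an estimate).
approx_cperf_reads = round(total_reads * cperf_fraction)

print(mean_reads_per_vulture, approx_cperf_reads)
```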


September 22, 2019

Long-read isoform sequencing reveals a hidden complexity of the transcriptional landscape of Herpes Simplex Virus Type 1.

In this study, we used the amplified isoform sequencing technique from Pacific Biosciences to characterize the poly(A)(+) fraction of the lytic transcriptome of the herpes simplex virus type 1 (HSV-1). Our analysis detected 34 formerly unidentified protein-coding genes, 10 non-coding RNAs, as well as 17 polycistronic and complex transcripts. This work also led us to identify many transcript isoforms, including 13 splice and 68 transcript end variants, as well as several transcript overlaps. Additionally, we determined previously unascertained transcriptional start and polyadenylation sites. We analyzed the transcriptional activity from the complementary DNA strand in five convergent HSV gene pairs with quantitative RT-PCR and detected antisense RNAs in each gene. This part of the study revealed an inverse correlation between the expressions of convergent partners. Our work adds new insights for understanding the complexity of the pervasive transcriptional overlaps by suggesting that there is a crosstalk between adjacent and distal genes through interaction between their transcription apparatuses. We also identified transcripts overlapping the HSV replication origins, which may indicate an interplay between the transcription and replication machineries. The relative abundance of HSV-1 transcripts has also been established by using a novel method based on the calculation of sequencing reads for the analysis.


September 22, 2019

Comparative Annotation Toolkit (CAT): simultaneous clade and personal genome annotation.

The recent introductions of low-cost, long-read, and read-cloud sequencing technologies coupled with intense efforts to develop efficient algorithms have made affordable, high-quality de novo sequence assembly a realistic proposition. The result is an explosion of new, ultracontiguous genome assemblies. To compare these genomes, we need robust methods for genome annotation. We describe the fully open source Comparative Annotation Toolkit (CAT), which provides a flexible way to simultaneously annotate entire clades and identify orthology relationships. We show that CAT can be used to improve annotations on the rat genome, annotate the great apes, annotate a diverse set of mammals, and annotate personal, diploid human genomes. We demonstrate the resulting discovery of novel genes, isoforms, and structural variants, even in genomes as well studied as rat and the great apes, and how these annotations improve cross-species RNA expression experiments. © 2018 Fiddes et al.; Published by Cold Spring Harbor Laboratory Press.


September 22, 2019

Effect of Chinese rice wine sludge on the production of Chinese steamed buns

Chinese rice wine sludge (CRWS), analogous to beer yeast sludge, is the filter cake remaining after squeezing the fermentation mash of Chinese rice wine. CRWS contains high levels of protein (44.74%), nonstructural carbohydrates (37.33%), crude fiber (13.5%), and essential amino acids, which could enhance the trophic value of Chinese steamed buns. In our research, the microbiota of CRWS (mainly Saccharomyces cerevisiae and Lactobacillus sp.) was analyzed at the species level by single-molecule real-time DNA sequencing technology. Interestingly, the microbiota of CRWS was similar to that of the starter dough typically used to prepare Chinese steamed buns. Incorporation of CRWS significantly influenced the pasting properties and farinograph characteristics of the dough, which control the texture of the Chinese steamed buns, and supplementation with 5~30% CRWS caused the properties of the resulting buns to be more similar to those of northern-style steamed buns. CRWS addition also significantly enhanced the content of aroma compounds in the Chinese steamed buns.


September 22, 2019

Somatic mosaicism of an intragenic FANCB duplication in both fibroblast and peripheral blood cells observed in a Fanconi anemia patient leads to milder phenotype.

Fanconi anemia (FA) is a rare disorder characterized by congenital malformations, progressive bone marrow failure, and predisposition to cancer. Patients harboring X-linked FANCB pathogenic variants usually present with severe congenital malformations resembling VACTERL syndrome with hydrocephalus. We employed the diepoxybutane (DEB) test for FA diagnosis, arrayCGH for detection of duplication, targeted capture and next-gen sequencing for defining the duplication breakpoint, PacBio sequencing of full-length FANCB aberrant transcript, FANCD2 ubiquitination and foci formation assays for the evaluation of FANCB protein function by viral transduction of FANCB-null cells with lentiviral FANCB WT and mutant expression constructs, and droplet digital PCR for quantitation of the duplication in the genomic DNA and cDNA. We describe here an FA-B patient with a mild phenotype. The DEB diagnostic test for FA revealed somatic mosaicism. We identified a 9154 bp intragenic duplication in FANCB, covering the first coding exon 3 and the flanking regions. A four bp homology (GTAG) present at both ends of the breakpoint is consistent with a microhomology-mediated duplication mechanism. The duplicated allele gives rise to an aberrant transcript containing exon 3 duplication, predicted to introduce a stop codon in FANCB protein (p.A319*). Duplication levels in the peripheral blood DNA declined from 93% to 7.9% in the span of eleven years. Moreover, the patient's fibroblasts showed 8% wild-type (WT) allele and his carrier mother showed higher than expected levels of WT allele (79% vs. 50%) in peripheral blood, suggesting that the duplication was highly unstable. Unlike sequence point variants, intragenic duplications are difficult to define precisely and to quantify accurately, and may be very unstable, which complicates proper diagnosis. The reversion of genomic duplication to the WT allele results in somatic mosaicism and may explain the relatively milder phenotype displayed by the FA-B patient described here. © 2017 The Authors. Molecular Genetics & Genomic Medicine published by Wiley Periodicals, Inc.


September 22, 2019

Multi-platform sequencing approach reveals a novel transcriptome profile in pseudorabies virus.

Third-generation sequencing is an emerging technology that is capable of solving several problems that earlier approaches were not able to, including the identification of transcript isoforms and overlapping transcripts. In this study, we used long-read sequencing for the analysis of pseudorabies virus (PRV) transcriptome, including Oxford Nanopore Technologies MinION, PacBio RS-II, and Illumina HiScanSQ platforms. We also used data from our previous short-read and long-read sequencing studies to compare the results and confirm the obtained data. Our investigations identified 19 formerly unknown putative protein-coding genes, all of which are 5′ truncated forms of earlier annotated longer PRV genes. Additionally, we detected 19 non-coding RNAs, including 5′ and 3′ truncated transcripts without in-frame ORFs, antisense RNAs, as well as RNA molecules encoded by those parts of the viral genome where no transcription had been detected before. This study has also led to the identification of three complex transcripts and 50 distinct length isoforms, including transcription start and end variants. We also detected 121 novel transcript overlaps, and two transcripts that overlap the replication origins of PRV. Furthermore, in silico analysis revealed 145 upstream ORFs, many of which are located on the longer 5′ isoforms of the transcripts.


September 22, 2019

100K Pathogen Genome Project.

The 100K Pathogen Genome Project is producing draft and closed genome sequences from diverse pathogens. This project expanded globally to include a snapshot of global bacterial genome diversity. The genomes form a sequence database that has a variety of uses from systematics to public health. Copyright © 2017 Weimer.


September 22, 2019

Single-cell (meta-)genomics of a dimorphic Candidatus Thiomargarita nelsonii reveals genomic plasticity.

The genus Thiomargarita includes the world’s largest bacteria. But as uncultured organisms, their physiology, metabolism, and basis for their gigantism are not well understood. Thus, a genomics approach applied to a single Candidatus Thiomargarita nelsonii cell was employed to explore the genetic potential of one of these enigmatic giant bacteria. The Thiomargarita cell was obtained from an assemblage of budding Ca. T. nelsonii attached to a provannid gastropod shell from Hydrate Ridge, a methane seep offshore of Oregon, USA. Here we present a manually curated genome of Bud S10 resulting from a hybrid assembly of long Pacific Biosciences and short Illumina sequencing reads. With respect to inorganic carbon fixation and sulfur oxidation pathways, the Ca. T. nelsonii Hydrate Ridge Bud S10 genome was similar to marine sister taxa within the family Beggiatoaceae. However, the Bud S10 genome contains genes suggestive of the genetic potential for lithotrophic growth on arsenite and perhaps hydrogen. The genome also revealed that Bud S10 likely respires nitrate via two pathways: a complete denitrification pathway and a dissimilatory nitrate reduction to ammonia pathway. Both pathways have been predicted, but not previously fully elucidated, in the genomes of other large, vacuolated, sulfur-oxidizing bacteria. Surprisingly, the genome also had a high number of features unusual for a bacterium, including the largest number of metacaspases and introns ever reported in a bacterium. Also present are a large number of other mobile genetic elements, such as insertion sequence (IS) transposable elements and miniature inverted-repeat transposable elements (MITEs). In some cases, mobile genetic elements disrupted key genes in metabolic pathways. For example, a MITE interrupts hupL, which encodes the large subunit of the hydrogenase in hydrogen oxidation. Moreover, we detected a group I intron in one of the most critical genes in the sulfur oxidation pathway, dsrA. The dsrA group I intron also carried a MITE sequence that, like the hupL MITE family, occurs broadly across the genome. The presence of a high degree of mobile elements in genes central to Thiomargarita’s core metabolism has not been previously reported in free-living bacteria and suggests a highly mutable genome.


September 22, 2019

GMAP and GSNAP for genomic sequence alignment: enhancements to speed, accuracy, and functionality.

The programs GMAP and GSNAP, for aligning RNA-Seq and DNA-Seq datasets to genomes, have evolved along with advances in biological methodology to handle longer reads, larger volumes of data, and new types of biological assays. The genomic representation has been improved to include linear genomes that can compare sequences using single-instruction multiple-data (SIMD) instructions, compressed genomic hash tables with fast access using SIMD instructions, handling of large genomes with more than four billion bp, and enhanced suffix arrays (ESAs) with novel data structures for fast access. Improvements to the algorithms have included a greedy match-and-extend algorithm using suffix arrays, segment chaining using genomic hash tables, diagonalization using segmental hash tables, and nucleotide-level dynamic programming procedures that use SIMD instructions and eliminate the need for F-loop calculations. Enhancements to the functionality of the programs include standardization of indel positions, handling of ambiguous splicing, clipping and merging of overlapping paired-end reads, and alignments to circular chromosomes and alternate scaffolds. The programs have been adapted for use in pipelines by integrating their usage into R/Bioconductor packages such as gmapR and HTSeqGenie, and these pipelines have facilitated the discovery of numerous biological phenomena.
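To illustrate the general idea behind suffix-array lookup mentioned in this abstract, here is a minimal sketch of exact pattern matching by binary search over a sorted list of suffix positions. It is not GMAP/GSNAP code: the construction is a naive sort, there are no SIMD instructions or enhanced suffix array structures, and the function names `build_suffix_array` and `find_occurrences` are hypothetical.

```python
from typing import List


def build_suffix_array(text: str) -> List[int]:
    """Return start positions of all suffixes of text in lexicographic order.

    Naive construction for illustration only; real aligners use far more
    efficient algorithms and compressed representations.
    """
    return sorted(range(len(text)), key=lambda i: text[i:])


def find_occurrences(text: str, suffix_array: List[int], pattern: str) -> List[int]:
    """Return sorted start positions where pattern occurs exactly in text."""
    m = len(pattern)

    # Lower bound: first suffix whose m-character prefix is >= pattern.
    lo, hi = 0, len(suffix_array)
    while lo < hi:
        mid = (lo + hi) // 2
        if text[suffix_array[mid]:suffix_array[mid] + m] < pattern:
            lo = mid + 1
        else:
            hi = mid
    start = lo

    # Upper bound: first suffix whose m-character prefix is > pattern.
    hi = len(suffix_array)
    while lo < hi:
        mid = (lo + hi) // 2
        if text[suffix_array[mid]:suffix_array[mid] + m] <= pattern:
            lo = mid + 1
        else:
            hi = mid

    return sorted(suffix_array[start:lo])


genome = "ACGTACGTTACG"
sa = build_suffix_array(genome)
hits = find_occurrences(genome, sa, "ACG")
print(hits)
```

All suffixes that begin with the query sit next to each other in the sorted array, so two binary searches are enough to find every occurrence. A match-and-extend strategy would build on lookups of this kind, but that step is not shown here.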


September 22, 2019

Shift in fungal communities and associated enzyme activities along an age gradient of managed Pinus sylvestris stands.

Forestry reshapes ecosystems with respect to tree age structure, soil properties and vegetation composition. These changes are likely to be paralleled by shifts in microbial community composition with potential feedbacks on ecosystem functioning. Here, we assessed fungal communities across a chronosequence of managed Pinus sylvestris stands and investigated correlations between taxonomic composition and extracellular enzyme activities. Not surprisingly, clear-cutting had a negative effect on ectomycorrhizal fungal abundance and diversity. In contrast, clear-cutting favoured proliferation of saprotrophic fungi correlated with enzymes involved in holocellulose decomposition. During stand development, the re-establishing ectomycorrhizal fungal community shifted in composition from dominance by Atheliaceae in younger stands to Cortinarius and Russula species in older stands. Late successional ectomycorrhizal taxa correlated with enzymes involved in mobilisation of nutrients from organic matter, indicating intensified nutrient limitation. Our results suggest that maintenance of functional diversity in the ectomycorrhizal fungal community may sustain long-term forest production by retaining a capacity for symbiosis-driven recycling of organic nutrient pools.


September 22, 2019

Accurate determination of bacterial abundances in human metagenomes using full-length 16S sequencing reads

DNA sequencing of PCR-amplified marker genes, especially but not limited to the 16S rRNA gene, is perhaps the most common approach for profiling microbial communities. Due to technological constraints of commonly available DNA sequencing, these approaches usually take the form of short reads sequenced from a narrow, targeted variable region, with a corresponding loss of taxonomic resolution relative to the full length marker gene. We use Pacific Biosciences single-molecule, real-time circular consensus sequencing to sequence amplicons spanning the entire length of the 16S rRNA gene. However, this sequencing technology suffers from a high sequencing error rate that needs to be addressed in order to take full advantage of the longer sequence. Here, we present a method to model the sequencing error process using a generalized pair hidden Markov chain model and estimate bacterial abundances in microbial samples. We demonstrate, with simulated and real data, that our model and its associated estimation procedure are able to give accurate estimates at the species (or subspecies) level, and are more flexible than existing methods like SImple Non-Bayesian TAXonomy (SINTAX).
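The abstract does not spell out the estimation procedure, so the following is only a generic sketch of how abundances can be estimated once per-read likelihoods are available. It assumes a precomputed matrix in which entry [r][s] is the likelihood of read r given reference sequence s (in the paper's setting such values would come from the error model; here they are simply supplied as input), and it uses a standard expectation-maximization update for mixture proportions. The function name `estimate_abundances` and its parameters are hypothetical, and this should not be read as the authors' algorithm.

```python
from typing import List, Sequence


def estimate_abundances(
    likelihoods: Sequence[Sequence[float]],
    n_iter: int = 200,
    tol: float = 1e-9,
) -> List[float]:
    """Estimate mixture proportions of reference taxa by EM.

    likelihoods[r][s] is the (assumed precomputed) likelihood of read r
    under reference sequence s. Returns proportions that sum to 1.
    """
    n_taxa = len(likelihoods[0])
    proportions = [1.0 / n_taxa] * n_taxa  # start from a uniform guess

    for _ in range(n_iter):
        # E-step: fractional assignment of each read to each taxon.
        expected_counts = [0.0] * n_taxa
        for row in likelihoods:
            weighted = [p * l for p, l in zip(proportions, row)]
            total = sum(weighted)
            if total == 0.0:
                continue  # read matches no reference; skip it
            for s, w in enumerate(weighted):
                expected_counts[s] += w / total

        assigned = sum(expected_counts)
        if assigned == 0.0:
            break  # nothing could be assigned; keep the current estimate

        # M-step: proportions are the normalized expected counts.
        updated = [c / assigned for c in expected_counts]
        change = max(abs(a - b) for a, b in zip(proportions, updated))
        proportions = updated
        if change < tol:
            break

    return proportions


# Toy input: three reads that only fit taxon 0 and one that only fits taxon 1.
toy_likelihoods = [[1.0, 0.0], [1.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
toy_estimate = estimate_abundances(toy_likelihoods)
print(toy_estimate)
```

With unambiguous reads, as in the toy input, the estimate reduces to simple read counting. The EM weighting only matters when a read is plausibly explained by several closely related references.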


September 22, 2019

Meeting report: 31st International Mammalian Genome Conference, Mammalian Genetics and Genomics: From Molecular Mechanisms to Translational Applications.

High on the Heidelberg hills, inside the Advanced Training Centre of the European Molecular Biology Laboratory (EMBL) campus with its unique double-helix staircase, scientists gathered for the EMBL conference “Mammalian Genetics and Genomics: From Molecular Mechanisms to Translational Applications,” organized in cooperation with the International Mammalian Genome Society (IMGS) and the Mouse Molecular Genetics (MMG) group. The conference attracted 205 participants from 30 countries, representing 6 of the 7 continents (all except Antarctica). It was a richly diverse group of geneticists, clinicians, and bioinformaticians, with presentations by established and junior investigators, including many trainees. From the 24th to the 27th of October 2017, they shared exciting advances in mammalian genetics and genomics research, from the introduction of cutting-edge technologies to descriptions of translational studies involving highly relevant models of human disease.


September 22, 2019

Metabolism of toxic sugars by strains of the bee gut symbiont Gilliamella apicola.

Social bees collect carbohydrate-rich food to support their colonies, and yet, certain carbohydrates present in their diet or produced through the breakdown of pollen are toxic to bees. The gut microbiota of social bees is dominated by a few core bacterial species, including the Gram-negative species Gilliamella apicola. We isolated 42 strains of G. apicola from guts of honey bees and bumble bees and sequenced their genomes. All of the G. apicola strains share high 16S rRNA gene similarity, but they vary extensively in gene repertoires related to carbohydrate metabolism. Predicted abilities to utilize different sugars were verified experimentally. Some strains can utilize mannose, arabinose, xylose, or rhamnose (monosaccharides that can cause toxicity in bees) as their sole carbon and energy source. All of the G. apicola strains possess a manO-associated mannose family phosphotransferase system; phylogenetic analyses suggest that this was acquired from Firmicutes through horizontal gene transfer. The metabolism of mannose is specifically dependent on the presence of mannose-6-phosphate isomerase (MPI). Neither growth rates nor the utilization of glucose and fructose are affected in the presence of mannose when the gene encoding MPI is absent from the genome, suggesting that mannose is not taken up by G. apicola strains which harbor the phosphotransferase system but do not encode the MPI. Given their ability to simultaneously utilize glucose, fructose, and mannose, as well as the ability of many strains to break down other potentially toxic carbohydrates, G. apicola bacteria may have key roles in improving dietary tolerances and maintaining the health of their bee hosts. Bees are important pollinators of agricultural plants. Our study documents the ability of Gilliamella apicola, a dominant gut bacterium in honey bees and bumble bees, to utilize several sugars that are harmful to bee hosts. Using genome sequencing and growth assays, we found that the ability to metabolize certain toxic carbohydrates is directly correlated with the presence of their respective degradation pathways, indicating that metabolic potential can be accurately predicted from genomic data in these gut symbionts. Strains vary considerably in their range of utilizable carbohydrates, which likely reflects historical horizontal gene transfer and gene deletion events. Unlike their bee hosts, G. apicola bacteria are not detrimentally affected by growth on mannose-containing medium, even in strains that cannot metabolize this sugar. These results suggest that G. apicola may be an important player in modulating nutrition in the bee gut, with ultimate effects on host health. Copyright © 2016 Zheng et al.


September 22, 2019

Transcriptional fates of human-specific segmental duplications in brain.

Despite the importance of duplicate genes for evolutionary adaptation, accurate gene annotation is often incomplete, incorrect, or lacking in regions of segmental duplication. We developed an approach combining long-read sequencing and hybridization capture to yield full-length transcript information and confidently distinguish between nearly identical genes/paralogs. We used biotinylated probes to enrich for full-length cDNA from duplicated regions, which were then amplified, size-fractionated, and sequenced using single-molecule, long-read sequencing technology, permitting us to distinguish between highly identical genes by virtue of multiple paralogous sequence variants. We examined 19 gene families as expressed in developing and adult human brain, selected for their high sequence identity (average >99%) and overlap with human-specific segmental duplications (SDs). We characterized the transcriptional differences between related paralogs to better understand the birth-death process of duplicate genes and particularly how the process leads to gene innovation. In 48% of the cases, we find that the expressed duplicates have changed substantially from their ancestral models due to novel sites of transcription initiation, splicing, and polyadenylation, as well as fusion transcripts that connect duplication-derived exons with neighboring genes. We detect unannotated open reading frames in genes currently annotated as pseudogenes, while relegating other duplicates to nonfunctional status. Our method significantly improves gene annotation, specifically defining full-length transcripts, isoforms, and open reading frames for new genes in highly identical SDs. The approach will be more broadly applicable to genes in structurally complex regions of other genomes where the duplication process creates novel genes important for adaptive traits. © 2018 Dougherty et al.; Published by Cold Spring Harbor Laboratory Press.


September 22, 2019

Fungal ITS1 deep-sequencing strategies to reconstruct the composition of a 26-species community and evaluation of the gut mycobiota of healthy Japanese individuals.

The study of mycobiota remains relatively unexplored due to the lack of sufficient available reference strains and databases compared to those of bacterial microbiome studies. Deep sequencing of Internal Transcribed Spacer (ITS) regions is the de facto standard for fungal diversity analysis. However, results are often biased because of the wide variety of sequence lengths in the ITS regions and the complexity of high-throughput sequencing (HTS) technologies. In this study, a curated ITS database, ntF-ITS1, was constructed. This database can be utilized for the taxonomic assignment of fungal community members. We evaluated the efficacy of strategies for mycobiome analysis by using this database and characterizing a mock fungal community consisting of 26 species representing 15 genera using ITS1 sequencing with three HTS platforms: Illumina MiSeq (MiSeq), Ion Torrent Personal Genome Machine (IonPGM), and Pacific Biosciences (PacBio). Our evaluation demonstrated that PacBio’s circular consensus sequencing with greater than 8 full-passes most accurately reconstructed the composition of the mock community. Using this strategy for deep-sequencing analysis of the gut mycobiota in healthy Japanese individuals revealed two major mycobiota types: a single-species type composed of Candida albicans or Saccharomyces cerevisiae and a multi-species type. In this study, we proposed the best possible processing strategies for the three sequencing platforms, of which, the PacBio platform allowed for the most accurate estimation of the fungal community. The database and methodology described here provide critical tools for the emerging field of mycobiome studies.

