September 22, 2019

Deciphering highly similar multigene family transcripts from Iso-Seq data with IsoCon

A significant portion of genes in vertebrate genomes belongs to multigene families, with each family containing several gene copies whose presence/absence, as well as isoform structure, can be highly variable across individuals. Existing de novo techniques for assaying the sequences of such highly-similar gene families fall short of reconstructing end-to-end transcripts with nucleotide-level precision or assigning alternatively spliced transcripts to their respective gene copies. We present IsoCon, a high-precision method using long PacBio Iso-Seq reads to tackle this challenge. We apply IsoCon to nine Y chromosome ampliconic gene families and show that it outperforms existing methods on both experimental and simulated data. IsoCon has allowed us to detect an unprecedented number of novel isoforms and has opened the door for unraveling the structure of many multigene families and gaining a deeper understanding of genome evolution and human diseases.


September 22, 2019

Construction of a draft reference transcripts of onion (Allium cepa) using long-read sequencing

To obtain intact and full-length RNA transcripts of onion (Allium cepa), long-read sequencing technology was first applied. Total RNAs extracted from four tissues (flowers, leaves, bulbs and roots) of red–purple and yellow-colored onions (A. cepa) were sequenced using long-read sequencing (RSII platform, P4-C2 chemistry). A total of 99,247 polished high-quality isoforms were produced by sequence correction processes of consensus calling, quality filtering, orientation verification, misread-nucleotide correction and dot-matrix view. The dot-matrix view was subsequently used to identify artificial inverted repeats (IRs), and 421 IRs were removed as a result. The remaining 98,826 isoforms were condensed to 35,505 by removing redundant isoforms. To assess the completeness of the 35,505 isoforms, the ratio of full-length isoforms, short-read mapping to the isoforms, and differentially expressed genes among the four tissues were analyzed along with the gene ontology across the tissues. As a result, the 35,505 isoforms were verified as a collection of isoforms with high completeness, and designated as draft reference transcripts (DRTs, ver 1.0) constructed by long-read sequencing.


September 22, 2019

Bacterial diversity and community structure in Chongqing radish paocai brines revealed using PacBio single-molecule real-time sequencing technology.

Traditional Chongqing radish paocai fermented with aged brine is considered to have the most intense flavor and authentic taste. Eight ‘Yanzhi’ (red, RRPB group) and ‘Chunbulao’ (white, WRPB) radish paocai brine samples were collected from Chongqing peasant households, and the diversity and community structures of bacteria present in these brines were determined using PacBio single-molecule real-time sequencing of their full-length 16S rRNA genes. In total, 30 phyla, 218 genera, and 306 species were identified from the RRPB group, with 20 phyla, 261 genera, and 420 species present in the WRPB group. Obvious differences in bacterial profiles between the RRPB and WRPB groups were found, with the bacterial diversity of the WRPB group shown to be greater than that of the RRPB group. This study revealed several characteristics of the bacterial composition, including the predominance of heterofermentative lactic acid bacteria, the species diversity of genus Pseudomonas, and the presence of three opportunistic pathogenic species. This study provides detailed information on the bacterial diversity and community structure of Chongqing radish paocai brine samples, and suggests it may be necessary to analyze paocai brine for potential sources of bacterial contamination and take appropriate measures to exclude any pathogenic species. © 2018 Society of Chemical Industry.


September 22, 2019

Electrosynthesis of commodity chemicals by an autotrophic microbial community.

A microbial community originating from brewery waste produced methane, acetate, and hydrogen when selected on a granular graphite cathode poised at -590 mV versus the standard hydrogen electrode (SHE) with CO₂ as the only carbon source. This is the first report on the simultaneous electrosynthesis of these commodity chemicals and the first description of electroacetogenesis by a microbial community. Deep sequencing of the active community 16S rRNA revealed a dynamic microbial community composed of an invariant Archaea population of Methanobacterium spp. and a shifting Bacteria population. Acetobacterium spp. were the most abundant Bacteria on the cathode when acetogenesis dominated. Methane was generally the dominant product with rates increasing from <1 to 7 mM day⁻¹ (per cathode liquid volume) and was concomitantly produced with acetate and hydrogen. Acetogenesis increased to >4 mM day⁻¹ (accumulated to 28.5 mM over 12 days), and methanogenesis ceased following the addition of 2-bromoethanesulfonic acid. Traces of hydrogen accumulated during initial selection and subsequently accelerated to >11 mM day⁻¹ (versus 0.045 mM day⁻¹ abiotic production). The hypothesis of electrosynthetic biocatalysis occurring at the microbe-electrode interface was supported by a catalytic wave (midpoint potential of -460 mV versus SHE) in cyclic voltammetry scans of the biocathode, the lack of redox active components in the medium, and the generation of comparatively high amounts of products (even after medium exchange). In addition, the volumetric production rates of these three commodity chemicals are marked improvements for electrosynthesis, advancing the process toward economic feasibility.


September 22, 2019

Long non-coding RNA identification: comparing machine learning based tools for long non-coding transcripts discrimination

Long noncoding RNAs (lncRNAs) are noncoding RNAs longer than 200 nucleotides that have attracted growing interest in recent years. Many studies have confirmed that the human genome contains many thousands of lncRNAs, which exert great influence over some critical regulators of cellular processes. With the advent of high-throughput sequencing technologies, a great quantity of sequences awaits analysis. Thus, many programs have been developed to distinguish coding from long noncoding transcripts. Different programs are generally designed to be used under different circumstances, and it is sensible and practical to select an appropriate method for a given situation. In this review, several popular methods and their advantages, disadvantages, and application scopes are summarised to help readers choose a suitable method and obtain more reliable results.


September 22, 2019

Abiotic stresses modulate landscape of poplar transcriptome via alternative splicing, differential intron retention, and isoform ratio switching.

Abiotic stresses affect plant physiology, development, and growth, and alter pre-mRNA splicing. Western poplar is a model woody tree and a potential bioenergy feedstock. To investigate the extent of stress-regulated alternative splicing (AS), we conducted an in-depth survey of leaf, root, and stem xylem transcriptomes under drought, salt, or temperature stress. Analysis of approximately one billion genome-aligned RNA-Seq reads from tissue- or stress-specific libraries revealed over fifteen million novel splice junctions. Transcript models supported by both RNA-Seq and single molecule isoform sequencing (Iso-Seq) data revealed a broad array of novel stress- and/or tissue-specific isoforms. Analysis of Iso-Seq data also resulted in the discovery of 15,087 novel transcribed regions, of which 164 show AS. Our findings demonstrate that abiotic stresses profoundly perturb transcript isoform profiles and trigger widespread intron retention (IR) events. Stress treatments often increased or decreased retention of specific introns – a phenomenon described here as differential intron retention (DIR). Many differentially retained introns were regulated in a stress- and/or tissue-specific manner. A subset of transcripts harboring super stress-responsive DIR events showed persisting fluctuations in the degree of IR across all treatments and tissue types. To investigate coordinated dynamics of intron-containing transcripts in the study, we quantified the absolute copy number of isoforms of two conserved transcription factors (TFs) using Droplet Digital PCR. This case study suggests that stress treatments can be associated with coordinated switches in relative ratios between fully spliced and intron-retaining isoforms and may play a role in adjusting the transcriptome to abiotic stresses.


September 22, 2019

Capturing single cell genomes of active polysaccharide degraders: an unexpected contribution of Verrucomicrobia.

Microbial hydrolysis of polysaccharides is critical to ecosystem functioning and is of great interest in diverse biotechnological applications, such as biofuel production and bioremediation. Here we demonstrate the use of a new, efficient approach to recover genomes of active polysaccharide degraders from natural, complex microbial assemblages, using a combination of fluorescently labeled substrates, fluorescence-activated cell sorting, and single cell genomics. We employed this approach to analyze freshwater and coastal bacterioplankton for degraders of laminarin and xylan, two of the most abundant storage and structural polysaccharides in nature. Our results suggest that a few phylotypes of Verrucomicrobia make a considerable contribution to polysaccharide degradation, although they constituted only a minor fraction of the total microbial community. Genomic sequencing of five cells, representing the most predominant, polysaccharide-active Verrucomicrobia phylotype, revealed significant enrichment in genes encoding a wide spectrum of glycoside hydrolases, sulfatases, peptidases, carbohydrate lyases and esterases, confirming that these organisms were well equipped for the hydrolysis of diverse polysaccharides. Remarkably, this enrichment was on average higher than in the sequenced representatives of Bacteroidetes, which are frequently regarded as highly efficient biopolymer degraders. These findings shed light on the ecological roles of uncultured Verrucomicrobia and suggest specific taxa as promising bioprospecting targets. The employed method offers a powerful tool to rapidly identify and recover discrete genomes of active players in polysaccharide degradation, without the need for cultivation.


September 22, 2019

Emergence, retention and selection: A trilogy of origination for functional de novo proteins from ancestral lncRNAs in primates.

While some human-specific protein-coding genes have been proposed to originate from ancestral lncRNAs, the transition process remains poorly understood. Here we identified 64 hominoid-specific de novo genes and report a mechanism for the origination of functional de novo proteins from ancestral lncRNAs with precise splicing structures and specific tissue expression profiles. Whole-genome sequencing of dozens of rhesus macaque animals revealed that these lncRNAs are generally not more selectively constrained than other lncRNA loci. The existence of these newly-originated de novo proteins is also not beyond anticipation under neutral expectation, as they generally have a longer theoretical lifespan than their current age, due to their GC-rich sequence property enabling stable ORFs with a lower chance of nonsense mutations. Interestingly, although the emergence and retention of these de novo genes are likely driven by neutral forces, a population genetics study in 67 human individuals and 82 macaque animals revealed signatures of purifying selection on these genes specifically in the human population, indicating that a proportion of these newly-originated proteins are already functional in humans. We thus propose a mechanism for the creation of functional de novo proteins from ancestral lncRNAs during primate evolution, which may contribute to human-specific genetic novelties by taking advantage of existing genomic contexts.


September 22, 2019

Complete genome sequence of Elizabethkingia sp. BM10, a symbiotic bacterium of the wood-feeding termite Reticulitermes speratus KMT1.

Elizabethkingia sp. BM10 was isolated from the hindgut of the wood-feeding termite Reticulitermes speratus KMT1. It had cellobiohydrolase and β-glucosidase activities but not endo-β-glucanase activity. The complete sequence of its genome, which has a total size of 4,242,519 bases, is reported here. The genomic analysis identified six β-glucosidase candidate genes and three β-glucanase candidate genes. Copyright © 2015 Lee et al.


September 22, 2019

Metagenomic and near full-length 16S rRNA sequence data in support of the phylogenetic analysis of the rumen bacterial community in steers.

Amplicon sequencing utilizing next-generation platforms has significantly transformed how research is conducted, specifically in microbial ecology. However, primer and sequencing platform biases can confound or change the way scientists interpret these data. The Pacific Biosciences RSII instrument may also preferentially load smaller fragments, which may also be a function of PCR product exhaustion during sequencing. To further examine these biases, data are provided from 16S rRNA rumen community analyses. Specifically, data from the relative phylum-level abundances for the ruminal bacterial community are provided to determine between-sample variability. Direct sequencing of metagenomic DNA was conducted to circumvent primer-associated biases in 16S rRNA reads, and rarefaction curves were generated to demonstrate adequate coverage of each amplicon. PCR products were also subjected to reduced amplification and pooling to reduce the likelihood of PCR product exhaustion during sequencing on the Pacific Biosciences platform. The taxonomic profiles for the relative phylum-level and genus-level abundance of rumen microbiota as a function of PCR pooling for sequencing on the Pacific Biosciences RSII platform are provided. For more information, see “Evaluation of 16S rRNA amplicon sequencing using two next-generation sequencing technologies for phylogenetic analysis of the rumen bacterial community in steers” P.R. Myer, M. Kim, H.C. Freetly, T.P.L. Smith (2016) [1].


September 22, 2019

Retention of seed trees fails to lifeboat ectomycorrhizal fungal diversity in harvested Scots pine forests.

Fennoscandian forestry has in the past decades changed from natural regeneration of forests towards replantation of clear-cuts, which negatively impacts ectomycorrhizal fungal (EMF) diversity. Retention of trees during harvesting enables EMF survival, and we therefore expected EMF communities to be more similar to those in old natural stands after forest regeneration using seed trees compared to full clear-cutting and replanting. We sequenced fungal internal transcribed spacer 2 (ITS2) amplicons to assess EMF communities in 10- to 60-year-old Scots pine stands regenerated either using seed trees or through replanting of clear-cuts with old natural stands as reference. We also investigated local EMF communities around retained old trees. We found that retention of seed trees failed to mitigate the impact of harvesting on EMF community composition and diversity. With increasing stand age, EMF communities became increasingly similar to those in old natural stands and permanently retained trees maintained EMF locally. From our observations, we conclude that EMF communities, at least common species, post-harvest are more influenced by environmental filtering, resulting from environmental changes induced by harvest, than by the continuity of trees. These results suggest that retention of intact forest patches is a more efficient way to conserve EMF diversity than retaining dispersed single trees. © FEMS 2017. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.


September 22, 2019

The microbiota of freshwater fish and freshwater niches contain omega-3 producing Shewanella species.

Approximately 30 years ago, it was discovered that free-living bacteria isolated from cold ocean depths could produce polyunsaturated fatty acids (PUFA) such as eicosapentaenoic acid (EPA) (20:5n-3) or docosahexaenoic acid (DHA) (22:6n-3), two PUFA essential for human health. Numerous laboratories have also discovered that EPA- and/or DHA-producing bacteria, many of them members of the Shewanella genus, could be isolated from the intestinal tracts of omega-3 fatty acid-rich marine fish. If bacteria contribute omega-3 fatty acids to the host fish in general or if they assist some bacterial species in adaptation to cold, then cold freshwater fish or habitats should also harbor these producers. Thus, we undertook a study to see if these niches also contained omega-3 fatty acid producers. We were successful in isolating and characterizing unique EPA-producing strains of Shewanella from three strictly freshwater native fish species, i.e., lake whitefish (Coregonus clupeaformis), lean lake trout (Salvelinus namaycush), and walleye (Sander vitreus), and from two other freshwater nonnative fish, i.e., coho salmon (Oncorhynchus kisutch) and seeforellen brown trout (Salmo trutta). We were also able to isolate four unique free-living strains of EPA-producing Shewanella from freshwater habitats. Phylogenetic and phenotypic analyses suggest that one producer is clearly a member of the Shewanella morhuae species and another is sister to members of the marine PUFA-producing Shewanella baltica species. However, the remaining isolates have more ambiguous relationships, sharing a common ancestor with non-PUFA-producing Shewanella putrefaciens isolates rather than marine S. baltica isolates despite having a phenotype more consistent with S. baltica strains. Copyright © 2015, American Society for Microbiology. All Rights Reserved.


September 22, 2019

Full-length transcriptome sequences and the identification of putative genes for flavonoid biosynthesis in safflower.

The flower of the safflower (Carthamus tinctorius L.) has been widely used in traditional Chinese medicine for its ability to improve cerebral blood flow. Flavonoids are the primary bioactive components in safflower, and their biosynthesis has attracted widespread interest. Previous studies mostly used second-generation sequencing platforms to survey the putative flavonoid biosynthesis genes. We carried out this study to better understand the transcription data and the putative genes involved in flavonoid biosynthesis in safflower. High-quality RNA was extracted from six types of safflower tissue. The RNAs from the different tissues were mixed in equal amounts and used to construct multiple size-fractionated libraries (1-2, 2-3 and 3-6 kb). Five cells were sequenced (2 cells each for the 1-2 and 2-3 kb libraries and 1 cell for the 3-6 kb library). In total, 10.43 Gb of clean data and 38,302 de-redundant sequences were obtained. Forty-four unique isoforms were annotated as encoding enzymes involved in flavonoid biosynthesis. The full-length flavonoid genes were characterized, and their evolutionary relationships and expression patterns were analyzed. They can be divided into eight families, with large differences in tissue expression. Temporal expression under MeJA treatment was also measured: 9 genes were significantly up-regulated and 2 genes were significantly down-regulated. SSRs and lncRNAs were also analyzed. Full-length transcriptome sequences were used to predict the genes involved in flavonoid synthesis in safflower. Combined with the determination of flavonoids, the results indicate that CtC4H2, CtCHS3, CtCHI3, CtF3H3 and CtF3H1 mainly participate in the MeJA-promoted synthesis of flavonoids. Our results also provide a valuable resource for further study on safflower.


September 22, 2019

Transcriptome-wide survey of pseudorabies virus using next- and third-generation sequencing platforms.

Pseudorabies virus (PRV) is an alphaherpesvirus of swine. PRV has a large double-stranded DNA genome and, as the latest investigations have revealed, a very complex transcriptome. Here, we present a large RNA-Seq dataset, derived from both short- and long-read sequencing. The dataset contains 1.3 million 100 bp paired-end reads that were obtained from the Illumina random-primed libraries, as well as 10 million 50 bp single-end reads generated by the Illumina polyA-seq. The Pacific Biosciences RSII non-amplified method yielded 57,021 reads of inserts (ROIs) aligned to the viral genome, the amplified method resulted in 158,396 PRV-specific ROIs, while we obtained 12,555 ROIs using the Sequel platform. The Oxford Nanopore MinION device generated 44,006 reads using the regular cDNA-sequencing method, whereas 29,832 and 120,394 reads were produced by using the direct RNA-sequencing and the Cap-selection protocols, respectively. The raw reads were aligned to the PRV reference genome (KJ717942.1). The dataset can be used to compare different sequencing approaches and library preparation methods, as well as for validating and testing bioinformatic pipelines.

