September 22, 2019

Long-read sequencing of chicken transcripts and identification of new transcript isoforms.

The chicken has long served as an important model organism in many fields, and continues to aid our understanding of animal development. Functional genomics studies aimed at probing the mechanisms that regulate development require high-quality genomes and transcript annotations. The quality of these resources has improved dramatically over the last several years, but many isoforms and genes have yet to be identified. We hope to contribute to the process of improving these resources with the data presented here: a set of long cDNA sequencing reads, and a curated set of new genes and transcript isoforms not represented in the most up-to-date genome annotation currently available to the community of researchers who rely on the chicken genome.


September 22, 2019

High-resolution characterization of the human microbiome.

The human microbiome plays an important and increasingly recognized role in human health. Studies of the microbiome typically use targeted sequencing of the 16S rRNA gene, whole metagenome shotgun sequencing, or other meta-omic technologies to characterize the microbiome’s composition, activity, and dynamics. Processing, analyzing, and interpreting these data involve numerous computational tools that aim to filter, cluster, annotate, and quantify the obtained data and ultimately provide an accurate and interpretable profile of the microbiome’s taxonomy, functional capacity, and behavior. These tools, however, are often limited in resolution and accuracy and may fail to capture many biologically and clinically relevant microbiome features, such as strain-level variation or nuanced functional response to perturbation. Over the past few years, extensive efforts have been invested toward addressing these challenges and developing novel computational methods for accurate and high-resolution characterization of microbiome data. These methods aim to quantify strain-level composition and variation, detect and characterize rare microbiome species, link specific genes to individual taxa, and more accurately characterize the functional capacity and dynamics of the microbiome. These methods and the ability to produce detailed and precise microbiome information are clearly essential for informing microbiome-based personalized therapies. In this review, we survey these methods, highlighting the challenges each method sets out to address and briefly describing methodological approaches. Copyright © 2016 Elsevier Inc. All rights reserved.


September 22, 2019

Alternative isoform analysis of Ttc8 expression in the rat pineal gland using a multi-platform sequencing approach reveals neural regulation.

Alternative isoform regulation (AIR) vastly increases transcriptome diversity and plays an important role in numerous biological processes and pathologies. However, the detection and analysis of isoform-level differential regulation is difficult, particularly in the face of complex and incompletely annotated transcriptomes. Here we have used Illumina short-read/high-throughput RNA-Seq to identify 55 genes that exhibit neurally regulated AIR in the pineal gland, and then used two other complementary experimental platforms to further study and characterize the Ttc8 gene, which is involved in Bardet-Biedl syndrome and non-syndromic retinitis pigmentosa. Use of the JunctionSeq analysis tool led to the detection of several novel exons and splice junctions in this gene, including two novel alternative transcription start sites which were found to display disproportionately strong neurally regulated differential expression in several independent experiments. These high-throughput sequencing results were validated and augmented via targeted qPCR and long-read Pacific Biosciences SMRT sequencing. We confirmed the existence of numerous novel splice junctions and the selective upregulation of the two novel start sites. In addition, we identified more than 20 novel isoforms of the Ttc8 gene that are co-expressed in this tissue. By using information from multiple independent platforms we not only greatly reduce the risk of errors, biases, and artifacts influencing our results but are also able to characterize the regulation and splicing of the Ttc8 gene more deeply and more precisely than would be possible via any single platform. The hybrid method outlined here represents a powerful strategy in the study of the transcriptome.


September 22, 2019

Enigmatic Diphyllatea eukaryotes: culturing and targeted PacBio RS amplicon sequencing reveals a higher order taxonomic diversity and global distribution.

The class Diphyllatea belongs to a group of enigmatic unicellular eukaryotes that play a key role in reconstructing the morphological innovation and diversification of early eukaryotic evolution. Despite its evolutionary significance, very little is known about the phylogeny and species diversity of Diphyllatea. Only three species have described morphology, being taxonomically divided by flagella number, two or four, and cell size. Currently, one 18S rRNA Diphyllatea sequence is available, with environmental sequencing surveys reporting only a single partial sequence from a Diphyllatea-like organism. Accordingly, geographical distribution of Diphyllatea based on molecular data is limited, despite morphological data suggesting the class has a global distribution. We here present a first attempt to understand species distribution, diversity and higher order structure of Diphyllatea. We cultured 11 new strains, characterised these morphologically and amplified their rRNA for a combined 18S-28S rRNA phylogeny. We sampled environmental DNA from multiple sites and designed new Diphyllatea-specific PCR primers for long-read PacBio RSII technology. Near full-length 18S rRNA sequences from environmental DNA, in addition to supplementary Diphyllatea sequence data mined from public databases, resolved the phylogeny into three deeply branching and distinct clades (Diphy I – III). Of these, the Diphy III clade is entirely novel, and in congruence with Diphy II, composed of species morphologically consistent with the earlier described Collodictyon triciliatum. The phylogenetic split between the Diphy I and Diphy II + III clades corresponds with a morphological division of Diphyllatea into bi- and quadriflagellate cell forms. This altered flagella composition must have occurred early in the diversification of Diphyllatea and may represent one of the earliest known morphological transitions among eukaryotes.
Further, the substantial increase in molecular data presented here confirms Diphyllatea has a global distribution, seemingly restricted to freshwater habitats. Altogether, the results reveal the advantage of combining a group-specific PCR approach and long-read high-throughput amplicon sequencing in surveying enigmatic eukaryote lineages. Lastly, our study shows the capacity of PacBio RS when targeting a protist class for increasing phylogenetic resolution.


September 22, 2019

Shannon: an information-optimal de novo RNA-Seq assembler

De novo assembly of short RNA-Seq reads into transcripts is challenging due to sequence similarities in transcriptomes arising from gene duplications and alternative splicing of transcripts. We present Shannon, an RNA-Seq assembler with an optimality guarantee derived from principles of information theory: Shannon reconstructs nearly all information-theoretically reconstructable transcripts. Shannon is based on a theory we develop for de novo RNA-Seq assembly that reveals differing abundances among transcripts to be the key, rather than the barrier, to effective assembly. The assembly problem is formulated as a sparsest-flow problem on a transcript graph, and the heart of Shannon is a novel iterative flow-decomposition algorithm. This algorithm provably solves the information-theoretically reconstructable instances in linear-time even though the general sparsest-flow problem is NP-hard. Shannon also incorporates several additional new algorithmic advances: a new error-correction algorithm based on successive cancelation, a multi-bridging algorithm that carefully utilizes read information in the k-mer de Bruijn graph, and an approximate graph partitioning algorithm to split the transcriptome de Bruijn graph into smaller components. In tests on large RNA-Seq datasets, Shannon obtains significant increases in sensitivity along with improvements in specificity in comparison to state-of-the-art assemblers.
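The abstract above frames assembly as decomposing a flow on a transcript graph into as few paths as possible. The sketch below is only a generic greedy flow decomposition on a toy graph, not Shannon's iterative algorithm (which the abstract does not spell out); the graph, node names, and abundances are hypothetical. It shows how differing abundances can separate two isoforms that share exons.

```python
from collections import defaultdict


def greedy_flow_decomposition(edges, source, sink):
    """Split a conserved flow on a DAG into weighted source-to-sink paths.

    edges maps (u, v) -> non-negative flow. Greedy heuristic (an assumption
    for illustration): always follow the heaviest remaining outgoing edge,
    then subtract the bottleneck flow along the path found.
    """
    residual = dict(edges)
    successors = defaultdict(list)
    for u, v in edges:
        successors[u].append(v)

    paths = []
    while True:
        path, node = [source], source
        while node != sink:
            candidates = [
                (residual[(node, v)], v)
                for v in successors[node]
                if residual[(node, v)] > 0
            ]
            if not candidates:
                break
            _, node = max(candidates)
            path.append(node)
        if node != sink:
            break  # no flow left to explain
        steps = list(zip(path, path[1:]))
        weight = min(residual[step] for step in steps)
        for step in steps:
            residual[step] -= weight
        paths.append((path, weight))
    return paths


# Toy splice graph: exons A and D are shared, B and C are alternative.
toy_graph = {
    ("S", "A"): 10,
    ("A", "B"): 7,
    ("A", "C"): 3,
    ("B", "D"): 7,
    ("C", "D"): 3,
    ("D", "T"): 10,
}
isoforms = greedy_flow_decomposition(toy_graph, "S", "T")
```

On this toy input the two recovered paths correspond to the two isoforms, with weights equal to their abundances. A greedy heuristic like this carries no optimality guarantee in general, which is the gap the sparsest-flow formulation addresses.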


September 22, 2019

Genomics and host specialization of honey bee and bumble bee gut symbionts.

Gilliamella apicola and Snodgrassella alvi are dominant members of the honey bee (Apis spp.) and bumble bee (Bombus spp.) gut microbiota. We generated complete genomes of the type strains G. apicola wkB1(T) and S. alvi wkB2(T) (isolated from Apis), as well as draft genomes for four other strains from Bombus. G. apicola and S. alvi were found to occupy very different metabolic niches: The former is a saccharolytic fermenter, whereas the latter is an oxidizer of carboxylic acids. Together, they may form a syntrophic network for partitioning of metabolic resources. Both species possessed numerous genes [type 6 secretion systems, repeats in toxin (RTX) toxins, RHS proteins, adhesins, and type IV pili] that likely mediate cell-cell interactions and gut colonization. Variation in these genes could account for the host fidelity of strains observed in previous phylogenetic studies. Here, we also show the first experimental evidence, to our knowledge, for this specificity in vivo: Strains of S. alvi were able to colonize their native bee host but not bees of another genus. Consistent with specific, long-term host association, comparative genomic analysis revealed a deep divergence and little or no gene flow between Apis and Bombus gut symbionts. However, within a host type (Apis or Bombus), we detected signs of horizontal gene transfer between G. apicola and S. alvi, demonstrating the importance of the broader gut community in shaping the evolution of any one member. Our results show that host specificity is likely driven by multiple factors, including direct host-microbe interactions, microbe-microbe interactions, and social transmission.


September 22, 2019

A manganese superoxide dismutase (MnSOD) from red lip mullet, Liza haematocheila: Evaluation of molecular structure, immune response, and antioxidant function.

Manganese superoxide dismutase (MnSOD) is a nuclear-encoded antioxidant metalloenzyme. The main function of this enzyme is to dismutate the toxic superoxide anion (O2-) into less toxic hydrogen peroxide (H2O2) and oxygen (O2). Structural analysis of mullet MnSOD (MuMnSOD) was performed using different bioinformatics tools. Pairwise alignment revealed that the protein sequence matched that derived from Larimichthys crocea with a 95.2% sequence identity. Phylogenetic tree analysis showed that the MuMnSOD was included in the category of teleosts. Multiple sequence alignment showed that a SOD Fe-N domain, SOD Fe-C domain, and Mn/Fe SOD signature were highly conserved among the other examined MnSOD orthologs. Quantitative real-time PCR showed that the highest MuMnSOD mRNA expression level was in blood cells. The highest expression level of MuMnSOD was observed in response to treatment with both Lactococcus garvieae and lipopolysaccharide (LPS) at 6 h post treatment in the head kidney and blood. Potential ROS-scavenging ability of the purified recombinant protein (rMuMnSOD) was examined by the xanthine oxidase assay (XOD assay). The optimum temperature and pH for XOD activity were found to be 25 °C and pH 7, respectively. Relative XOD activity was significantly increased with the dose of rMuMnSOD, revealing its dose dependency. Activity of rMuMnSOD was inhibited by potassium cyanide (KCN) and N,N'-diethyl-dithiocarbamate (DDC). Moreover, expression of MuMnSOD resulted in considerable growth retardation of both gram-positive and gram-negative bacteria. Results of the current study suggest that MuMnSOD acts as an antioxidant enzyme and participates in the immune response in mullet. Copyright © 2018 Elsevier Ltd. All rights reserved.
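The identity figure reported above comes from a pairwise alignment. As a rough illustration of how a percent identity can be derived from two already-aligned sequences, here is a minimal sketch. The sequences are made-up toy strings, not MuMnSOD or the L. crocea ortholog, and the choice of denominator (all alignment columns, including gapped ones) is an assumption; alignment tools differ on this point.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent of alignment columns in which both sequences share a residue.

    Both inputs must already be aligned (equal length, gaps as `gap`).
    Columns that are gaps in both sequences are ignored.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    columns = [
        (a, b)
        for a, b in zip(aligned_a, aligned_b)
        if not (a == gap and b == gap)
    ]
    if not columns:
        raise ValueError("alignment has no informative columns")
    matches = sum(1 for a, b in columns if a == b)
    return 100.0 * matches / len(columns)


# Hypothetical toy alignment: one substitution and one gap in nine columns.
identity = percent_identity("MLSRA-VCG", "MLCRAAVCG")
```

Because the result depends on how gapped columns are counted, a figure computed this way is only comparable to values obtained with the same convention.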


September 22, 2019

Emergence, retention and selection: A trilogy of origination for functional de novo proteins from ancestral lncRNAs in primates.

While some human-specific protein-coding genes have been proposed to originate from ancestral lncRNAs, the transition process remains poorly understood. Here we identified 64 hominoid-specific de novo genes and report a mechanism for the origination of functional de novo proteins from ancestral lncRNAs with precise splicing structures and specific tissue expression profiles. Whole-genome sequencing of dozens of rhesus macaque animals revealed that these lncRNAs are generally not more selectively constrained than other lncRNA loci. The existence of these newly originated de novo proteins is also not beyond anticipation under neutral expectation, as they generally have a longer theoretical lifespan than their current age, because their GC-rich sequences enable stable ORFs with a lower chance of nonsense mutations. Interestingly, although the emergence and retention of these de novo genes are likely driven by neutral forces, a population genetics study in 67 human individuals and 82 macaque animals revealed signatures of purifying selection on these genes specifically in the human population, indicating that a proportion of these newly originated proteins are already functional in humans. We thus propose a mechanism for the creation of functional de novo proteins from ancestral lncRNAs during primate evolution, which may contribute to human-specific genetic novelties by taking advantage of existing genomic contexts.
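The abstract above links GC-rich sequence to ORF stability through a lower chance of nonsense mutations. One way to see why this is plausible: the three stop codons of the standard nuclear genetic code (TAA, TAG, TGA) are AT-rich, so GC-rich codons tend to sit more than one substitution away from them. The sketch below counts, for a toy ORF, the fraction of all single-nucleotide substitutions that would create a stop codon. It is an illustration under the assumption of uniform substitution rates, not the lifespan model used in the study, and the example sequences are hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}  # standard nuclear genetic code
BASES = "ACGT"


def nonsense_fraction(orf):
    """Fraction of single-base substitutions in `orf` that yield a stop codon.

    `orf` is an in-frame DNA string (length a multiple of 3) without its own
    terminal stop codon. Every substitution is weighted equally.
    """
    if len(orf) == 0 or len(orf) % 3 != 0:
        raise ValueError("ORF length must be a positive multiple of 3")
    orf = orf.upper()
    nonsense = total = 0
    for start in range(0, len(orf), 3):
        codon = orf[start:start + 3]
        for position in range(3):
            for base in BASES:
                if base == codon[position]:
                    continue
                mutant = codon[:position] + base + codon[position + 1:]
                total += 1
                nonsense += mutant in STOP_CODONS
    return nonsense / total


# Hypothetical toy ORFs of equal length.
gc_rich = nonsense_fraction("GCCGGCCCGGCG")
at_rich = nonsense_fraction("AAATATTTAAGA")
```

In this toy comparison the GC-rich sequence has no codon within one substitution of a stop, while the AT-rich one has several, which matches the direction of the argument in the abstract.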


September 22, 2019

Full-length transcriptome sequences and the identification of putative genes for flavonoid biosynthesis in safflower.

The flower of the safflower (Carthamus tinctorius L.) has been widely used in traditional Chinese medicine for its ability to improve cerebral blood flow. Flavonoids are the primary bioactive components in safflower, and their biosynthesis has attracted widespread interest. Previous studies mostly used second-generation sequencing platforms to survey the putative flavonoid biosynthesis genes. We carried out this study to gain a better understanding of the transcription data and the putative genes involved in flavonoid biosynthesis in safflower. High-quality RNA was extracted from six types of safflower tissue. The RNAs from the different tissues were mixed equally and used to construct multiple size-fractionated libraries (1-2, 2-3 and 3-6 k). Five cells were run (2 cells each for the 1-2 and 2-3 k libraries and 1 cell for the 3-6 k library). In total, 10.43 Gb of clean data and 38,302 de-redundant sequences were obtained. Forty-four unique isoforms were annotated as encoding enzymes involved in flavonoid biosynthesis. The full-length flavonoid genes were characterized, and their evolutionary relationships and expression patterns were analyzed. They can be divided into eight families, with large differences in tissue expression. Temporal expression under MeJA treatment was also measured: 9 genes were significantly up-regulated and 2 genes were significantly down-regulated. In addition, SSRs and lncRNAs were analyzed. Full-length transcriptome sequences were used to predict the genes involved in flavonoid synthesis in safflower. Combined with the determination of flavonoids, the results indicate that CtC4H2, CtCHS3, CtCHI3, CtF3H3 and CtF3H1 mainly participate in the MeJA-promoted synthesis of flavonoids. Our results also provide a valuable resource for further study of safflower.


September 22, 2019

Nearly finished genomes produced using gel microdroplet culturing reveal substantial intraspecies genomic diversity within the human microbiome.

The majority of microbial genomic diversity remains unexplored. This is largely due to our inability to culture most microorganisms in isolation, which is a prerequisite for traditional genome sequencing. Single-cell sequencing has allowed researchers to circumvent this limitation. DNA is amplified directly from a single cell using the whole-genome amplification technique of multiple displacement amplification (MDA). However, MDA from a single chromosome copy suffers from amplification bias and a large loss of specificity from even very small amounts of DNA contamination, which makes assembling a genome difficult and completely finishing a genome impossible except in extraordinary circumstances. Gel microdrop cultivation allows culturing of a diverse microbial community and provides hundreds to thousands of genetically identical cells as input for an MDA reaction. We demonstrate the utility of this approach by comparing sequencing results of gel microdroplets and single cells following MDA. Bias is reduced in the MDA reaction and genome sequencing, and assembly is greatly improved when using gel microdroplets. We acquired multiple near-complete genomes for two bacterial species from human oral and stool microbiome samples. A significant amount of genome diversity, including single nucleotide polymorphisms and genome recombination, is discovered. Gel microdroplets offer a powerful and high-throughput technology for assembling whole genomes from complex samples and for probing the pan-genome of naturally occurring populations.


September 22, 2019

Two novel lncRNAs discovered in human mitochondrial DNA using PacBio full-length transcriptome data.

In this study, we established a general framework to use PacBio full-length transcriptome sequencing for the investigation of mitochondrial RNAs. As a result, we produced the first full-length human mitochondrial transcriptome using public PacBio data and characterized the human mitochondrial genome with more comprehensive and accurate information. Other results included determination of the H-strand primary transcript, identification of the ND5/ND6AS/tRNAGluAS transcript, discovery of palindrome small RNAs (psRNAs) and construction of the “mitochondrial cleavage” model, etc. These results, reported for the first time in this study, fundamentally changed annotations of the human mitochondrial genome and enriched knowledge in the field of animal mitochondrial studies. The most important finding was that two novel long non-coding RNAs (lncRNAs), MDL1 and MDL1AS, exist ubiquitously in animal mitochondrial genomes. Copyright © 2017. Published by Elsevier B.V.


September 22, 2019

Accurate characterization of the IFITM locus using MiSeq and PacBio sequencing shows genetic variation in Galliformes.

Interferon inducible transmembrane (IFITM) proteins are effectors of the immune system widely characterized for their role in restricting infection by diverse enveloped and non-enveloped viruses. The chicken IFITM (chIFITM) genes are clustered on chromosome 5 and to date four genes have been annotated, namely chIFITM1, chIFITM3, chIFITM5 and chIFITM10. However, due to poor assembly of this locus in the Gallus gallus v4 genome, accurate characterization has so far proven problematic. Recently, a new chicken reference genome assembly, Gallus gallus v5, was generated using Sanger, 454, Illumina and PacBio sequencing technologies, identifying considerable differences in the chIFITM locus over the previous genome releases. We re-sequenced the locus using both Illumina MiSeq and PacBio RS II sequencing technologies and we mapped RNA-seq data from the European Nucleotide Archive (ENA) to this finalized chIFITM locus. Using SureSelect capture probes designed to the finalized chIFITM locus, we sequenced the locus of a different chicken breed, namely a White Leghorn, and a turkey. We confirmed the Gallus gallus v5 consensus except for two insertions of 5 and 1 base pair within the chIFITM3 and B4GALNT4 genes, respectively, and a single base pair deletion within the B4GALNT4 gene. The pull down revealed a single amino acid substitution of A63V in the CIL domain of IFITM2 compared to Red Jungle fowl and 13, 13 and 11 differences between IFITM1, 2 and 3 of chickens and turkeys, respectively. RNA-seq shows chIFITM2 and chIFITM3 expression in numerous tissue types of different chicken breeds and avian cell lines, while the expression of the putative chIFITM1 is limited to the testis, caecum and ileum tissues. Locus resequencing using these capture probes and RNA-seq based expression analysis will allow the further characterization of genetic diversity within Galliformes.


September 22, 2019

RNAi-based treatment of chronically infected patients and chimpanzees reveals that integrated hepatitis B virus DNA is a source of HBsAg.

Chronic hepatitis B virus (HBV) infection is a major health concern worldwide, frequently leading to liver cirrhosis, liver failure, and hepatocellular carcinoma. Evidence suggests that high viral antigen load may play a role in chronicity. Production of viral proteins is thought to depend on transcription of viral covalently closed circular DNA (cccDNA). In a human clinical trial with an RNA interference (RNAi)-based therapeutic targeting HBV transcripts, ARC-520, HBV S antigen (HBsAg) was strongly reduced in treatment-naïve patients positive for HBV e antigen (HBeAg) but was reduced significantly less in patients who were HBeAg-negative or had received long-term therapy with nucleos(t)ide viral replication inhibitors (NUCs). HBeAg positivity is associated with greater disease risk that may be moderately reduced upon HBeAg loss. The molecular basis for this unexpected differential response was investigated in chimpanzees chronically infected with HBV. Several lines of evidence demonstrated that HBsAg was expressed not only from the episomal cccDNA minichromosome but also from transcripts arising from HBV DNA integrated into the host genome, which was the dominant source in HBeAg-negative chimpanzees. Many of the integrants detected in chimpanzees lacked target sites for the small interfering RNAs in ARC-520, explaining the reduced response in HBeAg-negative chimpanzees and, by extension, in HBeAg-negative patients. Our results uncover a heretofore underrecognized source of HBsAg that may represent a strategy adopted by HBV to maintain chronicity in the presence of host immunosurveillance. These results could alter trial design and endpoint expectations of new therapies for chronic HBV. Copyright © 2017 The Authors, some rights reserved; exclusive licensee American Association for the Advancement of Science. No claim to original U.S. Government Works.


September 22, 2019

De novo transcriptome assembly of the Chinese pearl barley, adlay, by full-length isoform and short-read RNA sequencing.

Adlay (Coix lacryma-jobi) is a tropical grass that has long been used in traditional Chinese medicine and is known for its nutritional benefits. Recent studies have shown that vitamin E compounds in adlay protect against chronic diseases such as cancer and heart disease. However, the molecular basis of adlay’s health benefits remains unknown. Here, we generated adlay gene sets by de novo transcriptome assembly using long-read isoform sequencing (Iso-Seq) and short-read RNA sequencing (RNA-Seq). The gene sets obtained from Iso-Seq and RNA-Seq contained 31,177 genes and 57,901 genes, respectively. We confirmed the validity of the assembled gene sets by experimentally analyzing the levels of prolamin and vitamin E biosynthesis-associated proteins in adlay plant tissues and seeds. We compared the screened adlay genes with known gene families from closely related plant species, such as rice, sorghum and maize. We also identified tissue-specific genes from the adlay leaf, root, and young and mature seed, and experimentally validated the differential expression of 12 randomly selected genes. Our study of the adlay transcriptome will provide a valuable resource for genetic studies that can enhance adlay breeding programs in the future.


September 22, 2019

Characterization of fusion genes and the significantly expressed fusion isoforms in breast cancer by hybrid sequencing.

We developed an innovative hybrid sequencing approach, IDP-fusion, to detect fusion genes, determine fusion sites and identify and quantify fusion isoforms. IDP-fusion is the first method to study gene fusion events by integrating Third Generation Sequencing long reads and Second Generation Sequencing short reads. We applied IDP-fusion to PacBio data and Illumina data from the MCF-7 breast cancer cells. Compared with the existing tools, IDP-fusion detects fusion genes at higher precision and a very low false positive rate. The results show that IDP-fusion will be useful for unraveling the complexity of multiple fusion splices and fusion isoforms within tumorigenesis-relevant fusion genes. © The Author(s) 2015. Published by Oxford University Press on behalf of Nucleic Acids Research.

