Menu
September 22, 2019

Extensive alternative splicing of KIR transcripts.

The killer-cell Ig-like receptors (KIR) form a multigene entity involved in modulating immune responses through interactions with MHC class I molecules. The complexity of the KIR cluster is reflected by, for instance, abundant levels of allelic polymorphism, gene copy number variation, and stochastic expression profiles. The current transcriptome study involving human and macaque families demonstrates that KIR family members are also subjected to differential levels of alternative splicing, and this seems to be gene dependent. Alternative splicing may result in the partial or complete skipping of exons, or the partial inclusion of introns, as documented at the transcription level. This post-transcriptional process can generate multiple isoforms from a single KIR gene, which diversifies the characteristics of the encoded proteins. For example, alternative splicing could modify ligand interactions, cellular localization, signaling properties, and the number of extracellular domains of the receptor. In humans, we observed abundant splicing for KIR2DL4, and to a lesser extent in the lineage III KIR genes. All experimentally documented splice events are substantiated by in silico splicing strength predictions. To a similar extent, alternative splicing is observed in rhesus macaques, a species that shares a close evolutionary relationship with humans. Splicing profiles of Mamu-KIR1D and Mamu-KIR2DL04 displayed a great diversity, whereas Mamu-KIR3DL20 (lineage V) is consistently spliced to generate a homolog of human KIR2DL5 (lineage I). The latter case represents an example of convergent evolution. Although just a single KIR splice event is shared between humans and macaques, the splicing mechanisms are similar, and the predicted consequences are comparable. In conclusion, alternative splicing adds an additional layer of complexity to the KIR gene system in primates, and results in a wide structural and functional variety of KIR receptors and its isoforms, which may play a role in health and disease.


September 22, 2019

Full-length transcriptome survey and expression analysis of Cassia obtusifolia to discover putative genes related to aurantio-obtusin biosynthesis, seed formation and development, and stress response.

The seed is the pharmaceutical and breeding organ of Cassia obtusifolia, a well-known medical herb containing aurantio-obtusin (a kind of anthraquinone), food, and landscape. In order to understand the molecular mechanism of the biosynthesis of aurantio-obtusin, seed formation and development, and stress response of C. obtusifolia, it is necessary to understand the genomics information. Although previous seed transcriptome of C. obtusifolia has been carried out by short-read next-generation sequencing (NGS) technology, the vast majority of the resulting unigenes did not represent full-length cDNA sequences and supply enough gene expression profile information of the various organs or tissues. In this study, fifteen cDNA libraries, which were constructed from the seed, root, stem, leaf, and flower (three repetitions with each organ) of C. obtusifolia, were sequenced using hybrid approach combining single-molecule real-time (SMRT) and NGS platform. More than 4,315,774 long reads with 9.66 Gb sequencing data and 361,427,021 short reads with 108.13 Gb sequencing data were generated by SMRT and NGS platform, respectively. 67,222 consensus isoforms were clustered from the reads and 81.73% (61,016) of which were longer than 1000 bp. Furthermore, the 67,222 consensus isoforms represented 58,106 nonredundant transcripts, 98.25% (57,092) of which were annotated and 25,573 of which were assigned to specific metabolic pathways by KEGG. CoDXS and CoDXR genes were directly used for functional characterization to validate the accuracy of sequences obtained from transcriptome. A total of 658 seed-specific transcripts indicated their special roles in physiological processes in seed. Analysis of transcripts which were involved in the early stage of anthraquinone biosynthesis suggested that the aurantio-obtusin in C. obtusifolia was mainly generated from isochorismate and Mevalonate/methylerythritol phosphate (MVA/MEP) pathway, and three reactions catalyzed by Menaquinone-specific isochorismate synthase (ICS), 1-deoxy-d-xylulose-5-phosphate synthase (DXS) and isopentenyl diphosphate (IPPS) might be the limited steps. Several seed-specific CYPs, SAM-dependent methyltransferase, and UDP-glycosyltransferase (UDPG) supplied promising candidate genes in the late stage of anthraquinone biosynthesis. In addition, four seed-specific transcriptional factors including three MYB Transcription Factor (MYB) and one MADS-box Transcription Factor (MADS) transcriptional factors) and alternative splicing might be involved with seed formation and development. Meanwhile, most members of Hsp20 genes showed high expression level in seed and flower; seven of which might have chaperon activities under various abiotic stresses. Finally, the expressional patterns of genes with particular interests showed similar trends in both transcriptome assay and qRT-PCR. In conclusion, this is the first full-length transcriptome sequencing reported in Caesalpiniaceae family, and thus providing a more complete insight into aurantio-obtusin biosynthesis, seed formation and development, and stress response as well in C. obtusifolia.


September 22, 2019

The habu genome reveals accelerated evolution of venom protein genes.

Evolution of novel traits is a challenging subject in biological research. Several snake lineages developed elaborate venom systems to deliver complex protein mixtures for prey capture. To understand mechanisms involved in snake venom evolution, we decoded here the ~1.4-Gb genome of a habu, Protobothrops flavoviridis. We identified 60 snake venom protein genes (SV) and 224 non-venom paralogs (NV), belonging to 18 gene families. Molecular phylogeny reveals early divergence of SV and NV genes, suggesting that one of the four copies generated through two rounds of whole-genome duplication was modified for use as a toxin. Among them, both SV and NV genes in four major components were extensively duplicated after their diversification, but accelerated evolution is evident exclusively in the SV genes. Both venom-related SV and NV genes are significantly enriched in microchromosomes. The present study thus provides a genetic background for evolution of snake venom composition.


September 22, 2019

JAFFA: High sensitivity transcriptome-focused fusion gene detection.

Genomic instability is a hallmark of cancer and, as such, structural alterations and fusion genes are common events in the cancer landscape. RNA sequencing (RNA-Seq) is a powerful method for profiling cancers, but current methods for identifying fusion genes are optimised for short reads. JAFFA (https://github.com/Oshlack/JAFFA/wiki) is a sensitive fusion detection method that outperforms other methods with reads of 100 bp or greater. JAFFA compares a cancer transcriptome to the reference transcriptome, rather than the genome, where the cancer transcriptome is inferred using long reads directly or by de novo assembling short reads.


September 22, 2019

Next-generation approaches to advancing eco-immunogenomic research in critically endangered primates.

High-throughput sequencing platforms are generating massive amounts of genomic data from nonmodel species, and these data sets are valuable resources that can be mined to advance a number of research areas. An example is the growing amount of transcriptome data that allow for examination of gene expression in nonmodel species. Here, we show how publicly available transcriptome data from nonmodel primates can be used to design novel research focused on immunogenomics. We mined transcriptome data from the world’s most endangered group of primates, the lemurs of Madagascar, for sequences corresponding to immunoglobulins. Our results confirmed homology between strepsirrhine and haplorrhine primate immunoglobulins and allowed for high-throughput sequencing of expressed antibodies (Ig-seq) in Coquerel’s sifaka (Propithecus coquereli). Using both Pacific Biosciences RS and Ion Torrent PGM sequencing, we performed Ig-seq on two individuals of Coquerel’s sifaka. We generated over 150 000 sequences of expressed antibodies, allowing for molecular characterization of the antigen-binding region. Our analyses suggest that similar VDJ expression patterns exist across all primates, with sequences closely related to the human VH 3 immunoglobulin family being heavily represented in sifaka antibodies. Moreover, the antigen-binding region of sifaka antibodies exhibited similar amino acid variation with respect to haplorrhine primates. Our study represents the first attempt to characterize sequence diversity of the expressed antibody repertoire in a species of lemur. We anticipate that methods similar to ours will provide the framework for investigating the adaptive immune response in wild populations of other nonmodel organisms and can be used to advance the burgeoning field of eco-immunology. © 2014 John Wiley & Sons Ltd.


September 22, 2019

Computational analysis of alternative splicing in plant genomes.

Computational analyses play crucial roles in characterizing splicing isoforms in plant genomes. In this review, we provide a survey of computational tools used in recently published, genome-scale splicing analyses in plants. We summarize the commonly used software and pipelines for read mapping, isoform reconstruction, isoform quantification, and differential expression analysis. We also discuss methods for analyzing long reads and the strategies to combine long and short reads in identifying splicing isoforms. We review several tools for characterizing local splicing events, splicing graphs, coding potential, and visualizing splicing isoforms. We further discuss the procedures for identifying conserved splicing isoforms across plant species. Finally, we discuss the outlook of integrating other genomic data with splicing analyses to identify regulatory mechanisms of AS on genome-wide scale. Copyright © 2018 Elsevier B.V. All rights reserved.


September 22, 2019

Sixteen diverse laboratory mouse reference genomes define strain-specific haplotypes and novel functional loci.

We report full-length draft de novo genome assemblies for 16 widely used inbred mouse strains and find extensive strain-specific haplotype variation. We identify and characterize 2,567 regions on the current mouse reference genome exhibiting the greatest sequence diversity. These regions are enriched for genes involved in pathogen defence and immunity and exhibit enrichment of transposable elements and signatures of recent retrotransposition events. Combinations of alleles and genes unique to an individual strain are commonly observed at these loci, reflecting distinct strain phenotypes. We used these genomes to improve the mouse reference genome, resulting in the completion of 10 new gene structures. Also, 62 new coding loci were added to the reference genome annotation. These genomes identified a large, previously unannotated, gene (Efcab3-like) encoding 5,874 amino acids. Mutant Efcab3-like mice display anomalies in multiple brain regions, suggesting a possible role for this gene in the regulation of brain development.


September 22, 2019

Genome and evolution of the shade-requiring medicinal herb Panax ginseng.

Panax ginseng C. A. Meyer, reputed as the king of medicinal herbs, has slow growth, long generation time, low seed production and complicated genome structure that hamper its study. Here, we unveil the genomic architecture of tetraploid P. ginseng by de novo genome assembly, representing 2.98 Gbp with 59 352 annotated genes. Resequencing data indicated that diploid Panax species diverged in association with global warming in Southern Asia, and two North American species evolved via two intercontinental migrations. Two whole genome duplications (WGD) occurred in the family Araliaceae (including Panax) after divergence with the Apiaceae, the more recent one contributing to the ability of P. ginseng to overwinter, enabling it to spread broadly through the Northern Hemisphere. Functional and evolutionary analyses suggest that production of pharmacologically important dammarane-type ginsenosides originated in Panax and are produced largely in shoot tissues and transported to roots; that newly evolved P. ginseng fatty acid desaturases increase freezing tolerance; and that unprecedented retention of chlorophyll a/b binding protein genes enables efficient photosynthesis under low light. A genome-scale metabolic network provides a holistic view of Panax ginsenoside biosynthesis. This study provides valuable resources for improving medicinal values of ginseng either through genomics-assisted breeding or metabolic engineering.© 2018 The Authors. Plant Biotechnology Journal published by Society for Experimental Biology and The Association of Applied Biologists and John Wiley & Sons Ltd.


September 22, 2019

Integrative analysis of three RNA sequencing methods identifies mutually exclusive exons of MADS-box isoforms during early bud development in Picea abies.

Recent efforts to sequence the genomes and transcriptomes of several gymnosperm species have revealed an increased complexity in certain gene families in gymnosperms as compared to angiosperms. One example of this is the gymnosperm sister clade to angiosperm TM3-like MADS-box genes, which at least in the conifer lineage has expanded in number of genes. We have previously identified a member of this sub-clade, the conifer gene DEFICIENS AGAMOUS LIKE 19 (DAL19), as being specifically upregulated in cone-setting shoots. Here, we show through Sanger sequencing of mRNA-derived cDNA and mapping to assembled conifer genomic sequences that DAL19 produces six mature mRNA splice variants in Picea abies. These splice variants use alternate first and last exons, while their four central exons constitute a core region present in all six transcripts. Thus, they are likely to be transcript isoforms. Quantitative Real-Time PCR revealed that two mutually exclusive first DAL19 exons are differentially expressed across meristems that will form either male or female cones, or vegetative shoots. Furthermore, mRNA in situ hybridization revealed that two mutually exclusive last DAL19 exons were expressed in a cell-specific pattern within bud meristems. Based on these findings in DAL19, we developed a sensitive approach to transcript isoform assembly from short-read sequencing of mRNA. We applied this method to 42 putative MADS-box core regions in P. abies, from which we assembled 1084 putative transcripts. We manually curated these transcripts to arrive at 933 assembled transcript isoforms of 38 putative MADS-box genes. 152 of these isoforms, which we assign to 28 putative MADS-box genes, were differentially expressed across eight female, male, and vegetative buds. We further provide evidence of the expression of 16 out of the 38 putative MADS-box genes by mapping PacBio Iso-Seq circular consensus reads derived from pooled sample sequencing to assembled transcripts. In summary, our analyses reveal the use of mutually exclusive exons of MADS-box gene isoforms during early bud development in P. abies, and we find that the large number of identified MADS-box transcripts in P. abies results not only from expansion of the gene family through gene duplication events but also from the generation of numerous splice variants.


September 22, 2019

Hybrid error correction and de novo assembly of single-molecule sequencing reads.

Single-molecule sequencing instruments can generate multikilobase sequences with the potential to greatly improve genome and transcriptome assembly. However, the error rates of single-molecule reads are high, which has limited their use thus far to resequencing bacteria. To address this limitation, we introduce a correction algorithm and assembly strategy that uses short, high-fidelity sequences to correct the error in single-molecule sequences. We demonstrate the utility of this approach on reads generated by a PacBio RS instrument from phage, prokaryotic and eukaryotic whole genomes, including the previously unsequenced genome of the parrot Melopsittacus undulatus, as well as for RNA-Seq reads of the corn (Zea mays) transcriptome. Our long-read correction achieves >99.9% base-call accuracy, leading to substantially better assemblies than current sequencing strategies: in the best example, the median contig size was quintupled relative to high-coverage, second-generation assemblies. Greater gains are predicted if read lengths continue to increase, including the prospect of single-contig bacterial chromosome assembly.


September 22, 2019

High-resolution comparative analysis of great ape genomes.

Genetic studies of human evolution require high-quality contiguous ape genome assemblies that are not guided by the human reference. We coupled long-read sequence assembly and full-length complementary DNA sequencing with a multiplatform scaffolding approach to produce ab initio chimpanzee and orangutan genome assemblies. By comparing these with two long-read de novo human genome assemblies and a gorilla genome assembly, we characterized lineage-specific and shared great ape genetic variation ranging from single- to mega-base pair-sized variants. We identified ~17,000 fixed human-specific structural variants identifying genic and putative regulatory changes that have emerged in humans since divergence from nonhuman apes. Interestingly, these variants are enriched near genes that are down-regulated in human compared to chimpanzee cerebral organoids, particularly in cells analogous to radial glial neural progenitors. Copyright © 2018 The Authors, some rights reserved; exclusive licensee American Association for the Advancement of Science. No claim to original U.S. Government Works.


September 22, 2019

High-resolution expression map of the Arabidopsis root reveals alternative splicing and lincRNA regulation.

The extent to which alternative splicing and long intergenic noncoding RNAs (lincRNAs) contribute to the specialized functions of cells within an organ is poorly understood. We generated a comprehensive dataset of gene expression from individual cell types of the Arabidopsis root. Comparisons across cell types revealed that alternative splicing tends to remove parts of coding regions from a longer, major isoform, providing evidence for a progressive mechanism of splicing. Cell-type-specific intron retention suggested a possible origin for this common form of alternative splicing. Coordinated alternative splicing across developmental stages pointed to a role in regulating differentiation. Consistent with this hypothesis, distinct isoforms of a transcription factor were shown to control developmental transitions. lincRNAs were generally lowly expressed at the level of individual cell types, but co-expression clusters provided clues as to their function. Our results highlight insights gained from analysis of expression at the level of individual cell types. Copyright © 2016 Elsevier Inc. All rights reserved.


September 22, 2019

Analysis of aquaporins from the euryhaline barnacle Balanus improvisus reveals differential expression in response to changes in salinity.

Barnacles are sessile macro-invertebrates, found along rocky shores in coastal areas worldwide. The euryhaline bay barnacle Balanus improvisus (Darwin, 1854) (= Amphibalanus improvisus) can tolerate a wide range of salinities, but the molecular mechanisms underlying the osmoregulatory capacity of this truly brackish species are not well understood. Aquaporins are pore-forming integral membrane proteins that facilitate transport of water, small solutes and ions through cellular membranes, and that have been shown to be important for osmoregulation in many organisms. The knowledge of the function of aquaporins in crustaceans is, however, limited and nothing is known about them in barnacles. We here present the repertoire of aquaporins from a thecostracan crustacean, the barnacle B. improvisus, based on genome and transcriptome sequencing. Our analyses reveal that B. improvisus contains eight genes for aquaporins. Phylogenetic analysis showed that they represented members of the classical water aquaporins (Aqp1, Aqp2), the aquaglyceroporins (Glp1, Glp2), the unorthodox aquaporin (Aqp12) and the arthropod-specific big brain aquaporin (Bib). Interestingly, we also found two big brain-like proteins (BibL1 and BibL2) constituting a new group of aquaporins not yet described in arthropods. In addition, we found that the two water-specific aquaporins were expressed as C-terminal splice variants. Heterologous expression of some of the aquaporins followed by functional characterization showed that Aqp1 transported water and Glp2 water and glycerol, agreeing with the predictions of substrate specificity based on 3D modeling and phylogeny. To investigate a possible role for the B. improvisus aquaporins in osmoregulation, mRNA expression changes in adult barnacles were analysed after long-term acclimation to different salinities. The most pronounced expression difference was seen for AQP1 with a substantial (>100-fold) decrease in the mantle tissue in low salinity (3 PSU) compared to high salinity (33 PSU). Our study provides a base for future mechanistic studies on the role of aquaporins in osmoregulation.


September 22, 2019

A survey of transcriptome complexity in Sus scrofa using single-molecule long-read sequencing.

Alternative splicing (AS) and fusion transcripts produce a vast expansion of transcriptomes and proteomes diversity. However, the reliability of these events and the extend of epigenetic mechanisms have not been adequately addressed due to its limitation of uncertainties about the complete structure of mRNA. Here we combined single-molecule real-time sequencing, Illumina RNA-seq and DNA methylation data to characterize the landscapes of DNA methylation on AS, fusion isoforms formation and lncRNA feature and further to unveil the transcriptome complexity of pig. Our analysis identified an unprecedented scale of high-quality full-length isoforms with over 28,127 novel isoforms from 26,881 novel genes. More than 92,000 novel AS events were detected and intron retention predominated in AS model, followed by exon skipping. Interestingly, we found that DNA methylation played an important role in generating various AS isoforms by regulating splicing sites, promoter regions and first exons. Furthermore, we identified a large of fusion transcripts and novel lncRNAs, and found that DNA methylation of the promoter and gene body could regulate lncRNA expression. Our results significantly improved existed gene models of pig and unveiled that pig AS and epigenetic modify were more complex than previously thought.


September 22, 2019

Comprehensive transcriptome analysis of Sarcophaga peregrina, a forensically important fly species.

Sarcophaga peregrina (flesh fly) is a frequently found fly species in Palaearctic, Oriental, and Australasian regions that can be used to estimate minimal postmortem intervals important for forensic investigations. Despite its forensic importance, the genome information of S. peregrina has not been fully described. Therefore, we generated a comprehensive gene expression dataset using RNA sequencing and carried out de novo assembly to characterize the S. peregrina transcriptome. We obtained precise sequence information for RNA transcripts using two different methods. Based on primary sequence information, we identified sets of assembled unigenes and predicted coding sequences. Functional annotation of the aligned unigenes was performed using the UniProt, Gene Ontology, and Kyoto Encyclopedia of Genes and Genomes databases. As a result, 26,580,352 and 83,221 raw reads were obtained using the Illumina MiSeq and Pacbio RS II Iso-Seq sequencing applications, respectively. From these reads, 55,730 contigs were successfully annotated. The present study provides the resulting genome information of S. peregrina, which is valuable for forensic applications.


Talk with an expert

If you have a question, need to check the status of an order, or are interested in purchasing an instrument, we're here to help.