April 21, 2020  |  

The landscape of SNCA transcripts across synucleinopathies: New insights from long reads sequencing analysis

Dysregulation of alpha-synuclein expression has been implicated in the pathogenesis of synucleinopathies, in particular Parkinson's disease (PD) and dementia with Lewy bodies (DLB). Previous studies have shown that the alternatively spliced isoforms of the SNCA gene are differentially expressed in different parts of the brain in PD and DLB patients. Similarly, SNCA isoforms with skipped exons can have a functional impact on the protein domains. The large intronic region of the SNCA gene was also shown to harbor structural variants that affect transcriptional levels. Here we present the first study using long-read sequencing with targeted capture of both the gDNA and cDNA of the SNCA gene in brain tissues of PD, DLB, and control samples using the PacBio Sequel system. The targeted full-length cDNA (Iso-Seq) data confirmed complex usage of known alternative start sites and variable 3' UTR lengths, as well as novel 5' starts and 3' ends not previously described. The targeted gDNA data allowed phasing of up to 81% of the ~114 kb SNCA region, with the longest phased block exceeding 54 kb. We demonstrate that long gDNA and cDNA reads have the potential to reveal long-range information not previously accessible using traditional sequencing methods. This approach has a potential impact in studying disease risk genes such as SNCA, providing new insights into the genetic etiologies, including perturbations to the landscape of the gene transcripts, of human complex diseases such as synucleinopathies.


April 21, 2020  |  

Schizophrenia risk variants influence multiple classes of transcripts of sorting nexin 19 (SNX19).

Genome-wide association studies (GWAS) have identified many genomic loci associated with risk for schizophrenia, but unambiguous identification of the relationship between disease-associated variants and specific genes, and in particular their effect on risk-conferring transcripts, has proven difficult. To better understand the specific molecular mechanism(s) at the schizophrenia locus in 11q25, we undertook cis expression quantitative trait loci (cis-eQTL) mapping for this 2 megabase genomic region using postmortem human brain samples. To comprehensively assess the effects of genetic risk upon local expression, we evaluated multiple transcript features (genes, exons, and exon-exon junctions) in multiple brain regions: dorsolateral prefrontal cortex (DLPFC), hippocampus, and caudate. Genetic risk variants were strongly associated with expression of SNX19 transcript features that tag multiple rare classes of SNX19 transcripts, whereas they only weakly affected expression of an exon-exon junction that tags the majority of abundant transcripts. The most prominent class of SNX19 risk-associated transcripts is predicted to be overexpressed, is defined by an exon-exon splice junction between exons 8 and 10 (junc8.10), and is predicted to encode proteins that lack the characteristic nexin C-terminal domain. Risk alleles were also associated with either increased or decreased expression of multiple additional classes of transcripts. With RACE, molecular cloning, and long-read sequencing, we found a number of novel SNX19 transcripts that further define the set of potential etiological transcripts. We explored epigenetic regulation of SNX19 expression and found that DNA methylation at CpG sites near the primary transcription start site and within exon 2 partially mediates the effects of risk variants on risk-associated expression. ATAC sequencing revealed that some of the most strongly risk-associated SNPs are located within a region of open chromatin, suggesting a nearby regulatory element is involved. These findings indicate a potentially complex molecular etiology, in which risk alleles for schizophrenia generate epigenetic alterations and dysregulation of multiple classes of SNX19 transcripts.


April 21, 2020  |  

Draft genome assembly and transcriptome sequencing of the golden algae Hydrurus foetidus (Chrysophyceae)

Hydrurus foetidus is a freshwater alga belonging to the phylum Heterokonta. It thrives in cold rivers in polar and high alpine regions. It has several morphological traits reminiscent of single-celled eukaryotes, but can also form macroscopic thalli. Despite its ability to produce polyunsaturated fatty acids, its life under cold conditions and its variable morphology, very little is known about its genome and transcriptome. Here, we present an extensive set of next-generation sequencing data, including genomic short reads from Illumina sequencing and long reads from Nanopore sequencing, as well as full-length cDNAs from PacBio IsoSeq sequencing and a small RNA dataset (smaller than 200 bp) sequenced with Illumina. We combined these data with, to our knowledge, the first draft genome assembly of a chrysophyte alga. The assembly consists of 5069 contigs with a total assembly size of 171 Mb and a 77% BUSCO completeness. The new data generated here may contribute to a better understanding of the evolution and ecological roles of chrysophyte algae, as well as help to resolve the branching patterns within the Heterokonta.


April 21, 2020  |  

TIN2 Functions with TPP1/POT1 To Stimulate Telomerase Processivity.

TIN2 is an important regulator of telomere length, and mutations in TINF2, the gene encoding TIN2, cause short-telomere syndromes. While the genetics underscore the importance of TIN2, the mechanism through which TIN2 regulates telomere length remains unclear. Here, we tested the effects of human TIN2 on telomerase activity. We identified a new isoform in human cells, TIN2M, that is expressed at levels similar to those of previously studied TIN2 isoforms. All three TIN2 isoforms localized to and maintained telomere integrity in vivo, and localization was not disrupted by telomere syndrome mutations. Using direct telomerase activity assays, we discovered that TIN2 stimulated telomerase processivity in vitro. All of the TIN2 isoforms stimulated telomerase to similar extents. Mutations in the TPP1 TEL patch abrogated this stimulation, suggesting that TIN2 functions with TPP1/POT1 to stimulate telomerase processivity. We conclude from our data and previously published work that TIN2/TPP1/POT1 is a functional shelterin subcomplex. Copyright © 2019 Pike et al.


April 21, 2020  |  

Hybrid Sequencing of Full-Length cDNA Transcripts of the Medicinal Plant Scutellaria baicalensis.

Scutellaria baicalensis is a well-known medicinal plant that produces biologically active flavonoids, such as baicalin, baicalein, and wogonin. Pharmacological studies have shown that these compounds have anti-inflammatory, anti-bacterial, and anti-cancer activities. Therefore, it is of great significance to investigate the genetic information of S. baicalensis, particularly the genes related to the biosynthetic pathways of these compounds. Here, we constructed the full-length transcriptome of S. baicalensis using a hybrid sequencing strategy and acquired 338,136 full-length sequences, accounting for 93.3% of the total reads. After the removal of redundancy and correction with Illumina short reads, 75,785 nonredundant transcripts were generated, among which approximately 98% were annotated with significant hits in the protein databases, and 11,135 sequences were classified as lncRNAs. Differentially expressed gene (DEG) analysis showed that most of the genes related to flavonoid biosynthesis were highly expressed in the roots, consistent with previous reports that the flavonoids were mainly synthesized and accumulated in the roots of S. baicalensis. By constructing unique transcription models, a total of 44,071 alternative splicing (AS) events were identified, with intron retention (IR) accounting for the highest proportion (44.5%). A total of 94 AS events were present in five key genes related to flavonoid biosynthesis, suggesting that AS may play important roles in the regulation of flavonoid biosynthesis in S. baicalensis. This study provided a large number of highly accurate full-length transcripts, which represents a valuable genetic resource for further research of the molecular biology of S. baicalensis, such as the development, breeding, and biosynthesis of active ingredients.


April 21, 2020  |  

Transcriptome Profiling Provides Insight into the Genes in Carotenoid Biosynthesis during the Mesocarp and Seed Developmental Stages of Avocado (Persea americana).

Avocado (Persea americana Mill.) is an economically important crop because of its high nutritional value. However, the absence of a sequenced avocado reference genome has hindered investigations of secondary metabolism. Using next-generation high-throughput transcriptome sequencing, we obtained 365,615,152 and 348,623,402 clean reads as well as 109.13 and 104.10 Gb of sequencing data for avocado mesocarp and seed, respectively, during five developmental stages. High-quality reads were assembled into 100,837 unigenes with an average length of 847.40 bp (N50 = 1725 bp). Additionally, 16,903 differentially expressed genes (DEGs) were detected, 17 of which were related to carotenoid biosynthesis. The expression levels of most of these 17 DEGs were higher in the mesocarp than in the seed during five developmental stages. In this study, the avocado mesocarp and seed transcriptomes were also sequenced using single-molecule long-read sequencing to acquire 25.79 and 17.67 Gb of clean data, respectively. We identified 233,014 and 238,219 consensus isoforms in avocado mesocarp and seed, respectively. Furthermore, 104 and 59 isoforms were found to correspond to the putative 11 carotenoid biosynthetic-related genes in the avocado mesocarp and seed, respectively. The isoform numbers of 10 out of the putative 11 genes involved in the carotenoid biosynthetic pathway were higher in the mesocarp than in the seed. In addition, alpha- and beta-carotene contents in the avocado mesocarp and seed during five developmental stages were measured, and they were higher in the mesocarp than in the seed, which validated the results of transcriptome profiling. Gene expression changes and the associated variations in gene dosage could influence carotenoid biosynthesis. These results will help to further elucidate carotenoid biosynthesis in avocado.


April 21, 2020  |  

Dynamic Changes in Metabolite Accumulation and the Transcriptome during Leaf Growth and Development in Eucommia ulmoides.

Eucommia ulmoides Oliver is widely distributed in China. This species has been used mainly in medicine due to the high concentration of chlorogenic acid (CGA), flavonoids, lignans, and other compounds in the leaves and bark. However, the categories of metabolites, dynamic changes in metabolite accumulation, and overall molecular mechanisms involved in metabolite biosynthesis during E. ulmoides leaf growth and development remain unknown. Here, a total of 515 analytes, including 127 flavonoids, 46 organic acids, 44 amino acid derivatives, 9 phenolamides, and 16 vitamins, were identified from four E. ulmoides samples using widely targeted ultraperformance liquid chromatography-mass spectrometry (UPLC-MS). The accumulation of most flavonoids peaked in growing leaves, followed by old leaves. UPLC-MS analysis indicated that CGA accumulation increased steadily to a high concentration during leaf growth and development, and rutin showed a high accumulation level in leaf buds and growing leaves. Based on single-molecule long-read sequencing technology, 69,020 transcripts and 2880 novel loci were identified in E. ulmoides. Expression analysis indicated that isoforms in the flavonoid biosynthetic pathway and flavonoid metabolic pathway were highly expressed in growing leaves and old leaves. Co-expression network analysis suggested a potential direct link between the flavonoid and phenylpropanoid biosynthetic pathways via the regulation of transcription factors, including MYB (v-myb avian myeloblastosis viral oncogene homolog) and bHLH (basic/helix-loop-helix). Our study predicts dynamic metabolic models during leaf growth and development and will support further molecular biological studies of metabolite biosynthesis in E. ulmoides. In addition, our results significantly improve the annotation of the E. ulmoides genome.


April 21, 2020  |  

Sequence and Evolutionary Features for the Alternatively Spliced Exons of Eukaryotic Genes.

Alternative splicing of pre-mRNAs is a crucial mechanism for maintaining protein diversity in eukaryotes without requiring a considerable increase in the number of genes. Due to rapid advances in high-throughput sequencing technologies and computational algorithms, it is anticipated that alternative splicing events will be more intensively studied to address different kinds of biological questions. The occurrence of alternative splicing means that all exons can be classified as either constitutively or alternatively spliced, depending on whether they are included in virtually all mature mRNAs. From an evolutionary point of view, therefore, the alternatively spliced exons would have been associated with distinctive biological characteristics in comparison with constitutively spliced exons. In this paper, we first outline the representative types of alternative splicing events and exon classification, and then review sequence and evolutionary features of the alternatively spliced exons. The main purpose is to facilitate understanding of the biological implications of alternative splicing in eukaryotes. This knowledge is also helpful for establishing computational approaches for predicting the splicing pattern of exons.


April 21, 2020  |  

Single-Cell Virus Sequencing of Influenza Infections That Trigger Innate Immunity.

Influenza virus-infected cells vary widely in their expression of viral genes and only occasionally activate innate immunity. Here, we develop a new method to assess how the genetic variation in viral populations contributes to this heterogeneity. We do this by determining the transcriptome and full-length sequences of all viral genes in single cells infected with a nominally “pure” stock of influenza virus. Most cells are infected by virions with defects, some of which increase the frequency of innate-immune activation. These immunostimulatory defects are diverse and include mutations that perturb the function of the viral polymerase protein PB1, large internal deletions in viral genes, and failure to express the virus’s interferon antagonist NS1. However, immune activation remains stochastic in cells infected by virions with these defects and occasionally is triggered even by virions that express unmutated copies of all genes. Our work shows that the diverse spectrum of defects in influenza virus populations contributes to, but does not completely explain, the heterogeneity in viral gene expression and immune activation in single infected cells. IMPORTANCE: Because influenza virus has a high mutation rate, many cells are infected by mutated virions. But so far, it has been impossible to fully characterize the sequence of the virion infecting any given cell, since conventional techniques such as flow cytometry and single-cell transcriptome sequencing (scRNA-seq) only detect if a protein or transcript is present, not its sequence. Here we develop a new approach that uses long-read PacBio sequencing to determine the sequences of virions infecting single cells. We show that viral genetic variation explains some but not all of the cell-to-cell variability in viral gene expression and innate immune induction. Overall, our study provides the first complete picture of how viral mutations affect the course of infection in single cells. Copyright © 2019 Russell et al.


April 21, 2020  |  

A chromosomal-level genome assembly for the insect vector for Chagas disease, Triatoma rubrofasciata.

Triatoma rubrofasciata is a widespread pathogen vector for Chagas disease, an illness that affects approximately 7 million people worldwide. Despite its importance to human health, its evolutionary origin has not been conclusively determined. A reference genome for T. rubrofasciata is not yet available. We have sequenced the genome of a female individual of T. rubrofasciata using a single-molecule DNA sequencing technology (i.e., the PacBio Sequel platform) and have successfully reconstructed a whole-genome (680-Mb) assembly that covers 90% of the nuclear genome (757 Mb). Through Hi-C analysis, we have reconstructed full-length chromosomes of this female individual that has 13 unique chromosomes (2n = 24 = 22 + X1 + X2) with a contig N50 of 2.72 Mb and a scaffold N50 of 50.7 Mb. This genome has achieved a high base-level accuracy of 99.99%. This platinum-grade genome assembly has 12,691 annotated protein-coding genes. More than 95.1% of BUSCO genes were single-copy completed, indicating a high level of completeness of the genome. The platinum-grade genome assembly and its annotation provide valuable information for future in-depth comparative genomics studies, including sexual determination analysis in T. rubrofasciata and the pathogenesis of Chagas disease. © The Author(s) 2019. Published by Oxford University Press.


April 21, 2020  |  

Large Scale Profiling of Protein Isoforms Using Label-Free Quantitative Proteomics Revealed the Regulation of Nonsense-Mediated Decay in Moso Bamboo (Phyllostachys edulis).

Moso bamboo is an important forest species with a variety of ecological, economic, and cultural values. However, the gene annotation of moso bamboo is based only on transcriptome sequencing and lacks proteome evidence. The lignification and fiber in moso bamboo make it difficult to extract protein using conventional methods, which seriously hinders research on the proteomics of moso bamboo. The purpose of this study is to establish efficient methods for extracting the total proteins from moso bamboo for subsequent mass spectrometry-based quantitative proteome identification. Here, we have successfully established a set of efficient methods for extracting total proteins of moso bamboo followed by mass spectrometry-based label-free quantitative proteome identification, which further improved the protein annotation of moso bamboo genes. In this study, 10,376 predicted coding genes were confirmed by quantitative proteomics, accounting for 35.8% of all annotated protein-coding genes. Proteome analysis also revealed the protein-coding potential of 1015 predicted long noncoding RNAs (lncRNAs), accounting for 51.03% of annotated lncRNAs. Thus, mass spectrometry-based proteomics provides a reliable method for gene annotation. In particular, quantitative proteomics revealed the translation patterns of proteins in moso bamboo. In addition, the 3284 transcript isoforms from 2663 genes identified by Pacific Biosciences (PacBio) single-molecule real-time long-read isoform sequencing (Iso-Seq) were confirmed at the protein level by mass spectrometry. Furthermore, domain analysis of mass spectrometry-identified proteins encoded in the same genomic locus revealed variations in domain composition pointing towards a functional diversification of protein isoforms. Finally, we found that some transcripts targeted by nonsense-mediated mRNA decay (NMD) could also be translated into proteins. In summary, proteomic analysis in this study improves the proteomics-assisted genome annotation of moso bamboo and is valuable to the large-scale research of functional genomics in moso bamboo. This study also provides a theoretical basis and technical support for directional gene function analysis at the proteomics level in moso bamboo.


April 21, 2020  |  

Survey of the Bradysia odoriphaga Transcriptome Using PacBio Single-Molecule Long-Read Sequencing.

The damage caused by Bradysia odoriphaga is the main factor threatening the production of vegetables in the Liliaceae family. However, few genetic studies of B. odoriphaga have been conducted because of a lack of genomic resources. Many long-read sequencing technologies have been developed in the last decade; therefore, in this study, the transcriptome covering all developmental stages of B. odoriphaga was sequenced for the first time by PacBio single-molecule long-read sequencing. Here, 39,129 isoforms were generated, and 35,645 were found to have annotation results when checked against sequences available in different databases. Overall, 18,473 isoforms were distributed among 25 Clusters of Orthologous Groups, and 11,880 isoforms were categorized into 60 functional groups that belonged to the three main Gene Ontology classifications. Moreover, 30,610 isoforms were assigned to 44 functional categories belonging to six main Kyoto Encyclopedia of Genes and Genomes functional categories. Coding DNA sequence (CDS) prediction showed that 36,419 out of 39,129 isoforms were predicted to have a CDS, and 4319 simple sequence repeats were detected in total. Finally, 266 insecticide resistance and metabolism-related isoforms were identified as candidate genes for further investigation of insecticide resistance and metabolism in B. odoriphaga.


April 21, 2020  |  

Transcriptome Analysis Reveals the Accumulation Mechanism of Anthocyanins in Buckwheat (Fagopyrum esculentum Moench) Cotyledons and Flowers.

Buckwheat (Fagopyrum esculentum) is a valuable crop that produces multiple secondary metabolites beneficial to humans, for example, the anthocyanins in sprouts and flowers. However, although anthocyanins are the predominant group of visible polyphenols in pigmentation, little is known about the molecular mechanisms underlying anthocyanin biosynthesis in buckwheat. In this study, a comparative transcriptome analysis of green and red common buckwheat cultivars was carried out through RNA sequencing. Overall, 3727 and 5323 differentially expressed genes (DEGs) were identified in flowers and cotyledons, respectively. Through GO and KEGG analysis, we revealed that DEGs in flowers and cotyledons are predominantly involved in the biosynthesis of anthocyanin. A total of 42 unigenes encoding 11 structural enzymes of anthocyanin biosynthesis were identified as DEGs. We also identified some transcription factor families involved in the regulation of anthocyanin biosynthesis. Real-time qPCR validation of candidate genes was performed in flowers and cotyledons, and the results suggested that the high expression level of structural genes involved in the anthocyanin biosynthetic pathway promotes anthocyanin accumulation. Our results provide insight into the coloration of red common buckwheat.


April 21, 2020  |  

A draft nuclear-genome assembly of the acoel flatworm Praesagittifera naikaiensis.

Acoels are primitive bilaterians with very simple soft bodies, in which many organs, including the gut, are not developed. They provide platforms for studying molecular and developmental mechanisms involved in the formation of the basic bilaterian body plan, whole-body regeneration, and symbiosis with photosynthetic microalgae. Because genomic information is essential for future research on acoel biology, we sequenced and assembled the nuclear genome of an acoel, Praesagittifera naikaiensis. To avoid sequence contamination derived from symbiotic microalgae, DNA was extracted from embryos that were free of algae. More than 290x sequencing coverage was achieved using a combination of Illumina (paired-end and mate-pair libraries) and PacBio sequencing. RNA sequencing and Iso-Seq data from embryos, larvae, and adults were also obtained. First, a preliminary ~17-kilobase pair (kb) mitochondrial genome was assembled, which was deleted from the nuclear sequence assembly. As a result, the draft nuclear genome assembly was ~656 Mb in length, with a scaffold N50 of 117 kb and a contig N50 of 57 kb. Although ~70% of the assembled sequences were likely composed of repetitive sequences that include DNA transposons and retrotransposons, the draft genome was estimated to contain 22,143 protein-coding genes, ~99% of which were substantiated by corresponding transcripts. We could not find horizontally transferred microalgal genes in the acoel genome. Benchmarking Universal Single-Copy Orthologs analyses indicated that 77% of the conserved single-copy genes were complete. Pfam domain analyses provided a basic set of gene families for transcription factors and signaling molecules. Our present sequencing and assembly of the P. naikaiensis nuclear genome are comparable to those of other metazoan genomes, providing basic information for future studies of genic and genomic attributes of this animal group. Such studies may shed light on the origins and evolution of simple bilaterians.
© The Author(s) 2019. Published by Oxford University Press.


April 21, 2020  |  

Single-Molecule Real-Time (SMRT) Full-Length RNA-Sequencing Reveals Novel and Distinct mRNA Isoforms in Human Bone Marrow Cell Subpopulations.

Hematopoietic cells are continuously replenished from progenitor cells that reside in the bone marrow. To evaluate molecular changes during this process, we analyzed the transcriptomes of freshly harvested human bone marrow progenitor (lineage-negative) and differentiated (lineage-positive) cells by single-molecule real-time (SMRT) full-length RNA-sequencing. This analysis revealed a ~5-fold higher number of transcript isoforms than previously detected and showed a distinct composition of individual transcript isoforms characteristic of bone marrow subpopulations. A detailed analysis of messenger RNA (mRNA) isoforms transcribed from the ANXA1 and EEF1A1 loci confirmed their distinct composition. The expression of proteins predicted from the transcriptome analysis was evaluated by mass spectrometry, which validated previously unknown protein isoforms predicted, e.g., for EEF1A1. These protein isoforms distinguished the lineage-negative cell population from the lineage-positive cell population. Finally, transcript isoforms expressed from paralogous gene loci (e.g., CFD, GATA2, HLA-A, B, and C) also distinguished cell subpopulations but were only detectable by full-length RNA sequencing. Thus, qualitatively distinct transcript isoforms from individual genomic loci separate bone marrow cell subpopulations, indicating complex transcriptional regulation and protein isoform generation during hematopoiesis.

