The utility of a genome assembly depends not only on the quality of the assembled genome sequence but also on the quality of the gene annotations. The Pacific Biosciences Iso-Seq technology is a powerful support for accurate eukaryotic gene model annotation, as it allows direct readout of full-length cDNA sequences without the need for noisy short-read-based transcript assembly. We propose applying the TeloPrime Full Length cDNA Amplification kit to the Pacific Biosciences Iso-Seq technology in order to enrich for genuine full-length transcripts in the cDNA libraries. We provide evidence that TeloPrime outperforms the commonly used SMARTer…
Lupins (Lupinus spp.) are nitrogen-fixing legumes that accumulate toxic alkaloids in their protein-rich beans. These anti-nutritional compounds belong to the family of quinolizidine alkaloids (QAs), which are of interest to the pharmaceutical and chemical industries. To unleash the potential of lupins as protein crops and as sources of QAs, a thorough understanding of the QA pathway is needed. However, only the first enzyme in the pathway, lysine decarboxylase (LDC), is known. Here, we report the transcriptome of a high-QA variety of narrow-leafed lupin (L. angustifolius), obtained using eight different tissues and two different sequencing technologies. In addition, we present a…
Panax ginseng C. A. Meyer, reputed as the king of medicinal herbs, has slow growth, long generation time, low seed production and complicated genome structure that hamper its study. Here, we unveil the genomic architecture of tetraploid P. ginseng by de novo genome assembly, representing 2.98 Gbp with 59 352 annotated genes. Resequencing data indicated that diploid Panax species diverged in association with global warming in Southern Asia, and two North American species evolved via two intercontinental migrations. Two whole genome duplications (WGD) occurred in the family Araliaceae (including Panax) after divergence with the Apiaceae, the more recent one contributing to the ability…
Despite the economic importance of sugarcane in sugar and bioenergy production, there is not yet a reference genome available. Most sugarcane transcriptomic studies have been based on Saccharum officinarum gene indices (SoGI), expressed sequence tags (ESTs) and de novo assembled transcript contigs from short reads; hence, knowledge of the sugarcane transcriptome is limited in relation to transcript length and number of transcript isoforms. The sugarcane transcriptome was sequenced using PacBio isoform sequencing (Iso-Seq) of a pooled RNA sample derived from leaf, internode and root tissues at different developmental stages from 22 varieties, to explore the potential for capturing full-length transcript…
Circular RNA (circRNA) discovery, expression patterns and experimental validation in developing tea leaves indicate its correlation with circRNA-parental genes and potential roles in the ceRNA interaction network. Circular RNAs (circRNAs) have recently emerged as a novel class of abundant, stable endogenous RNAs that are produced by circularization and have regulatory potential. However, identification of circRNAs in plants, especially in non-model plants with large genomes, is challenging. In this study, we undertook a systematic identification of circRNAs from tissues at different stages of tea plant (Camellia sinensis) leaf development using rRNA-depleted circular RNA-seq. By combining two state-of-the-art detection tools, we characterized 3174 circRNAs, of which 342…
Single-molecule, real-time sequencing developed by Pacific Biosciences offers longer read lengths than second-generation sequencing (SGS) technologies, making it well suited to unsolved problems in genome, transcriptome, and epigenetics research. Highly contiguous de novo assemblies built from PacBio sequencing can close gaps in current reference assemblies and characterize structural variation (SV) in personal genomes. With longer reads, we can sequence through extended repetitive regions and detect mutations, many of which are associated with diseases. Moreover, PacBio transcriptome sequencing is advantageous for the identification of gene isoforms and facilitates reliable discoveries of novel genes and novel isoforms of annotated genes, due to its…
The flower of the safflower (Carthamus tinctorius L.) has been widely used in traditional Chinese medicine for its ability to improve cerebral blood flow. Flavonoids are the primary bioactive components in safflower, and their biosynthesis has attracted widespread interest. Previous studies mostly used second-generation sequencing platforms to survey the putative flavonoid biosynthesis genes. We carried out this study to gain a better understanding of the transcription data and the putative genes involved in flavonoid biosynthesis in safflower. High-quality RNA was extracted from six types of safflower tissue. The RNAs of different tissues were mixed equally and used for multiple size-fractionated libraries (1-2, 2-3 and…
Sugarcane biomass has been used for sugar, bioenergy and biomaterial production. The majority of the sugarcane biomass comes from the culm, which makes it important to understand the genetic control of biomass production in this part of the plant. A meta-transcriptome of the culm was obtained in an earlier study by using about one billion paired-end (150 bp) reads of deep RNA sequencing of samples from 20 diverse sugarcane genotypes and combining de novo assemblies from different assemblers and different settings. Although many genes could be recovered, this resulted in a large combined assembly which created the need for clustering…
Adlay (Coix lacryma-jobi) is a tropical grass that has long been used in traditional Chinese medicine and is known for its nutritional benefits. Recent studies have shown that vitamin E compounds in adlay protect against chronic diseases such as cancer and heart disease. However, the molecular basis of adlay’s health benefits remains unknown. Here, we generated adlay gene sets by de novo transcriptome assembly using long-read isoform sequencing (Iso-Seq) and short-read RNA sequencing (RNA-Seq). The gene sets obtained from Iso-Seq and RNA-Seq contained 31,177 genes and 57,901 genes, respectively. We confirmed the validity of the assembled gene sets by experimentally analyzing…
Salvianolic acids are among the main bioactive components in Salvia miltiorrhiza, and their biosynthesis has attracted widespread interest. However, previous studies on the biosynthesis of phenolic acids using next-generation sequencing platforms are limited with regard to the assembly of full-length transcripts. Based on hybrid-seq (next-generation and single-molecule real-time sequencing) of the S. miltiorrhiza root transcriptome, we experimentally identified 15 full-length transcripts and four alternative splicing events of enzyme-coding genes involved in the biosynthesis of rosmarinic acid. Moreover, we demonstrate that lithospermic acid B accumulates in the phloem and xylem of roots, in agreement with the expression patterns of…
Soybean was domesticated in China and has become one of the most important oilseed crops. Due to bottlenecks in their introduction and dissemination, soybeans from different geographic areas exhibit extensive genetic diversity. Asia is the largest soybean market; therefore, a high-quality soybean reference genome from this area is critical for soybean research and breeding. Here, we report the de novo assembly and sequence analysis of a Chinese soybean genome for “Zhonghuang 13” by a combination of SMRT, Hi-C and optical mapping data. The assembled genome size is 1.025 Gb with a contig N50 of 3.46 Mb and a scaffold N50…
Solanum sisymbriifolium, also known as “Litchi Tomato” or “Sticky Nightshade,” is an undomesticated and poorly researched plant related to potato and tomato. Unlike the latter species, S. sisymbriifolium induces eggs of the cyst nematode, Globodera pallida, to hatch and migrate into its roots, but then arrests further nematode maturation. To provide researchers with a partial blueprint of its genetic make-up so that the mechanism of this response might be identified, we used single-molecule real-time (SMRT) sequencing to compile a high-quality de novo transcriptome of 41,189 unigenes drawn from individually sequenced bud, root, stem, and leaf…
The Russian dandelion Taraxacum kok-saghyz Rodin (TKS), a member of the Compositae family and a potential alternative source of natural rubber (NR) and inulin, is an ideal model system for studying rubber biosynthesis. Here we present the draft genome of TKS, the first assembled NR-producing weed plant. The draft TKS genome assembly has a length of 1.29 Gb, containing 46,731 predicted protein-coding genes and 68.56% repeats, in which the LTR-RT elements predominantly contribute to the genome enlargement. We analyzed the heterozygous regions/genes, suggesting their possible involvement in inbreeding depression. Through comparative studies between rubber-producing and non-rubber-producing plants, we found that…
A number of plant groups have been proposed as ideal systems to explore plastid inheritance, plastome evolution and plastome-nuclear genome coevolution. Quick generation times and a compact nuclear genome in Arabidopsis thaliana, the relative ease of plastid isolation from Spinacia oleracea and the tractability of plastid transformation in Nicotiana tabacum are all desirable attributes in a model system; however, these and most other groups all lack novelty in terms of plastome structure and nucleotide sequence evolution. Contemporary sequencing and assembly technologies have facilitated analyses of atypical plastomes and, as predicted by early investigations, Geraniaceae plastomes have experienced unprecedented rearrangements relative…
Green algae represent a key segment of the global species capable of photoautotrophic-driven biological carbon fixation. Algae partition fixed-carbon into chemical compounds required for biomass, while diverting excess carbon into internal storage compounds such as starch and lipids or, in certain cases, into targeted extracellular compounds. Two green algae were selected to probe for critical components associated with sugar production and release in a model alga. Chlorella sorokiniana UTEX 1602 – which does not release significant quantities of sugars to the extracellular space – was selected as a control to compare with the maltose-releasing Micractinium conductrix SAG 241.80 – which…