April 21, 2020

A reference-grade wild soybean genome.

Efficient crop improvement depends on the application of accurate genetic information contained in diverse germplasm resources. Here we report a reference-grade genome of wild soybean accession W05, with a final assembled genome size of 1013.2 Mb and a contig N50 of 3.3 Mb. The analytical power of the W05 genome is demonstrated by several examples. First, we identify an inversion at the locus determining seed coat color during domestication. Second, a translocation event between chromosomes 11 and 13 of some genotypes is shown to interfere with the assignment of QTLs. Third, we find a region containing copy number variations of the Kunitz trypsin inhibitor (KTI) genes. Such findings illustrate the power of this assembly in the analysis of large structural variations in soybean germplasm collections. The wild soybean genome assembly has wide applications in comparative genomic and evolutionary studies, as well as in crop breeding and improvement programs.
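
The contig N50 quoted above is a contiguity statistic: the length of the contig at which the running total, taken from the longest contig downward, first reaches half of the assembled bases. The following is a minimal sketch of that calculation, not the authors' pipeline; the function name `n50` and the toy lengths are assumptions made purely for illustration and do not come from the W05 assembly.

```python
def n50(lengths):
    """Return the N50 of a collection of contig lengths.

    N50 is the length L such that contigs of length >= L together
    contain at least half of all assembled bases.
    """
    if not lengths:
        raise ValueError("need at least one contig length")
    ordered = sorted(lengths, reverse=True)  # longest contig first
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


# Hypothetical toy contig lengths (not taken from any real assembly).
toy_contigs = [2, 2, 3, 4, 5, 8, 10]
print(n50(toy_contigs))  # total is 34, half is 17; 10 + 8 = 18 reaches it, so N50 is 8
```

Because the statistic depends only on the sorted lengths, it can be computed directly from a list of contig sizes without loading any sequence data.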


April 21, 2020

Study of the whole genome, methylome and transcriptome of Cordyceps militaris.

The complete genome of Cordyceps militaris was sequenced using single-molecule real-time (SMRT) sequencing technology at a coverage over 300×. The genome size was 32.57 Mb, and 14 contigs ranging from 0.35 to 4.58 Mb with an N50 of 2.86 Mb were assembled, including 4 contigs with telomeric sequences on both ends and an additional 8 contigs with telomeric sequences on either the 5′ or 3′ end. A methylome database of the genome was constructed using SMRT and m4C and m6A methylated nucleotides, and many unknown modification types were identified. The major m6A methylation motifs are GA and GGAG, and the major m4C methylation motif is GC or CG/GC. In the C. militaris genomic DNA, there were four types of methylated nucleotides that we confirmed using high-resolution LCMS-IT-TOF. Using PacBio Iso-Seq, a total of 31,133 complete cDNA sequences were obtained in the fruiting body. The conserved domains of the nontranscribed regions of the genome include TATA boxes, which are the initial regions of genome replication. There were 406 structural variants between the HN and CM01 strains, and there were 1,114 structural variants between the HN and ATCC strains.


April 21, 2020

Getting the Entire Message: Progress in Isoform Sequencing

The advent of second-generation sequencing and its application to RNA sequencing has revolutionized the field of genomics by allowing the quantification of expression of entire genes as well as single TSS, exons and splice sites, RNA-editing sites as well as polyA-sites. However, due to the sequencing of fragments of cDNAs these methods have not given a reliable picture of complete RNA isoforms. Third-generation sequencing has filled this gap and allows end-to-end sequencing of entire RNA/cDNA molecules. This approach to transcriptomics has been a ‘niche’ technology for a couple of years but now is becoming mainstream with many different applications. Here, we review the background and progress made to date in this rapidly growing field. We start by reviewing the progressive realization that alternative splicing is omnipresent. We then focus on long-non-coding RNA isoforms and the distinct combination patterns of exons in non-coding and coding genes. We consider the implications of the recent technologies of direct RNA sequencing and single-cell isoform RNA sequencing. Finally, we discuss the parameters that define the success of long-read RNA sequencing experiments and strategies commonly used to make the most of such data.


April 21, 2020

Combining next-generation sequencing and single-molecule sequencing to explore brown plant hopper responses to contrasting genotypes of japonica rice.

The brown plant hopper (BPH), Nilaparvata lugens, is one of the major pests of rice (Oryza sativa). Plant defenses against insect herbivores have been extensively studied, but our understanding of insect responses to host plants’ resistance mechanisms is still limited. The purpose of this study is to characterize transcripts of BPH and reveal the responses of BPH insects to resistant rice at the transcriptional level using two advanced molecular techniques, next-generation sequencing (NGS) and single-molecule, real-time (SMRT) sequencing. The current study obtained 24,891 collapsed isoforms of full-length transcripts, and 20,662 were mapped to known annotated genes, including 17,175 novel transcripts. The current study also identified 915 fusion genes, 1794 novel genes, 2435 long non-coding RNAs (lncRNAs), and 20,356 alternative splicing events. Moreover, analysis of differentially expressed genes (DEGs) revealed that genes involved in metabolic and cell proliferation processes were significantly enriched in up-regulated and down-regulated sets, respectively, in BPH fed on resistant rice relative to BPH fed on susceptible wild type rice. Furthermore, the FoxO signaling pathway was involved and genes related to BPH starvation response (Nlbmm), apoptosis and autophagy (caspase 8, ATG13, BNIP3 and IAP), active oxygen elimination (catalase, MSR, ferritin) and detoxification (GST, CarE) were up-regulated in BPH responses to resistant rice. The current study provides the first demonstration of the full diversity and complexity of the BPH transcriptome, and indicates that BPH responses to rice resistance might be related to starvation stress responses, nutrient transformation, oxidative decomposition, and detoxification. The current findings will facilitate further exploration of molecular mechanisms of interaction between BPH insects and host rice.


April 21, 2020

A genomic resource derived from the integration of genome sequences, expressed transcripts and genetic markers in ramie.

The redundancy of genomic resources, including transcripts and molecular markers, and their uncertain position in the genome have dramatically hindered the study of traits in ramie, an important natural fiber crop. We obtained a high-quality transcriptome consisting of 30,591 non-redundant transcripts using single-molecule long-read sequencing and proposed it as a universal ramie transcriptome. Additionally, 55,882 single nucleotide polymorphisms (SNPs) were identified and a high-density genetic map was developed. Based on this genetic map, 181.7 Mb of ramie genome sequences were assembled into 14 chromosomes. For the convenient use of these resources, 29,286 (~95.7%) of the transcripts and all 55,882 SNPs, along with 1827 previously reported simple sequence repeat markers (SSRs), were mapped to the ramie genome, and 22,343 (~73.0%) transcripts, 50,154 (~89.7%) SNPs, and 1466 (~80.3%) SSRs were assigned to a specific location in the corresponding chromosome. This is the first study to characterize the ramie transcriptome by long-read sequencing, and the substantial number of transcripts of significant length obtained will accelerate our understanding of ramie growth and development. This integration of genome sequences, expressed transcripts, and genetic markers will provide an extremely useful resource for genetic, molecular, and breeding studies of ramie.


April 21, 2020

Reviving the Transcriptome Studies: An Insight into the Emergence of Single-molecule Transcriptome Sequencing

Advances in transcriptomics have provided an exceptional opportunity to study functional implications of the genetic variability. Technologies such as RNA-Seq have emerged as state-of-the-art techniques for transcriptome analysis that take advantage of high-throughput next-generation sequencing. However, similar to their predecessors, these approaches continue to impose major challenges on full-length transcript structure identification, primarily due to inherent limitations of read length. With the development of single-molecule sequencing (SMS) from PacBio, a growing number of studies on the transcriptome of different organisms have been reported. SMS has emerged as advantageous for comprehensive genome annotation including identification of novel genes/isoforms, long non-coding RNAs and fusion transcripts. This approach can be used across a broad spectrum of species to better interpret the coding information of the genome, and facilitate the biological function study. We provide an overview of SMS platform and its diverse applications in various biological studies, and our perspective on the challenges associated with the transcriptome studies.


April 21, 2020

Comparative transcriptome analyses of genes involved in sulforaphane metabolism at different treatment in Chinese kale using full-length transcriptome sequencing.

Sulforaphane is a natural isothiocyanate available from cruciferous vegetables with multiple properties, including antioxidant, antitumor and anti-inflammatory effects. Single-molecule real-time (SMRT) sequencing has been used for long-read de novo assembly of plant genomes. Here, we investigated the molecular mechanism related to glucosinolate biosynthesis in Chinese kale using combined NGS and SMRT sequencing. SMRT sequencing produced 185,134 unigenes, more than the 129,325 produced by next-generation sequencing (NGS). NaCl (75 mM), methyl jasmonate (MeJA, 40 µM), selenate (Se, sodium selenite 100 µM), and brassinolide (BR, 1.5 µM) treatment induced 6893, 13,287, 13,659 and 11,041 differentially expressed genes (DEGs), respectively, in Chinese kale seedlings compared with the control. These genes were associated with pathways of glucosinolate biosynthesis, including phenylalanine, tyrosine and tryptophan biosynthesis, cysteine and methionine metabolism, and glucosinolate biosynthesis. We found NaCl decreased sulforaphane and glucosinolate (indolic and aliphatic) contents and downregulated expression of cytochrome P450 83B1 (CYP83b1), S-alkyl-thiohydroximate lyase or carbon-sulfur lyase (SUR1) and UDP-glycosyltransferase 74B1 (UGT74b1). MeJA increased sulforaphane and glucosinolate contents and upregulated the expression of CYP83b1, SUR1 and UGT74b1; Se increased sulforaphane; BR increased expression of CYP83b1, SUR1 and UGT74b1, and increased glucosinolate contents. The desulfoglucosinolate sulfotransferases ST5a_b_c were decreased by all treatments. We confirmed that NaCl inhibited the biosynthesis of both indolic and aliphatic glucosinolates, while MeJA and BR increased them. MeJA and BR treatments promoted the biosynthesis of glucosinolates, and Se and MeJA contributed to sulforaphane in Chinese kale via regulating the expression of CYP83b1, SUR1 and UGT74b1.


April 21, 2020

Reconstruction of the full-length transcriptome atlas using PacBio Iso-Seq provides insight into the alternative splicing in Gossypium australe.

Gossypium australe F. Mueller (2n = 2x = 26, G2 genome) possesses valuable characteristics. For example, the delayed gland morphogenesis trait causes cottonseed protein and oil to be edible while retaining resistance to biotic stress. However, gene sequences and their alternative splicing (AS) in G. australe remain unclear, hindering the exploration of species-specific biological morphogenesis. Here, we report the first sequencing of the full-length transcriptome of the Australian wild cotton species, G. australe, using Pacific Biosciences single-molecule long-read isoform sequencing (Iso-Seq) from the pooled cDNA of ten tissues to identify transcript loci and splice isoforms. We reconstructed the G. australe full-length transcriptome and identified 25,246 genes, 86 pre-miRNAs and 1468 lncRNAs. Most genes (12,832, 50.83%) exhibited two or more isoforms, suggesting a high degree of transcriptome complexity in G. australe. A total of 31,448 AS events in five major types were found among the 9944 gene loci. Among these five major types, intron retention was the most frequent, accounting for 68.85% of AS events. A total of 29,718 polyadenylation sites were detected from 14,536 genes, 7900 of which have alternative polyadenylation sites (APA). In addition, based on our AS event annotations, RNA-Seq short reads from germinating seeds showed that differential expression of these events occurred during seed germination. Ten randomly selected AS events were further confirmed by RT-PCR amplification in leaf and germinating seeds. The reconstructed gene sequences and their AS in G. australe will provide information for exploring beneficial characteristics in G. australe.


April 21, 2020

Improved annotation of the domestic pig genome through integration of Iso-Seq and RNA-seq data.

Our understanding of the pig transcriptome is limited. RNA transcript diversity among nine tissues was assessed using poly(A) selected single-molecule long-read isoform sequencing (Iso-seq) and Illumina RNA sequencing (RNA-seq) from a single White cross-bred pig. Across tissues, a total of 67,746 unique transcripts were observed, including 60.5% predicted protein-coding, 36.2% long non-coding RNA and 3.3% nonsense-mediated decay transcripts. On average, 90% of the splice junctions were supported by RNA-seq within tissue. A large proportion (80%) represented novel transcripts, mostly produced by known protein-coding genes (70%), while 17% corresponded to novel genes. On average, four transcripts per known gene (tpg) were identified, an increase over current EBI (1.9 tpg) and NCBI (2.9 tpg) annotations and closer to the number reported in the human genome (4.2 tpg). Our new pig genome annotation extended more than 6000 known gene borders (5′ end extension, 3′ end extension, or both) compared to EBI or NCBI annotations. We validated a large proportion of these extensions by independent pig poly(A) selected 3′-RNA-seq data, or human FANTOM5 Cap Analysis of Gene Expression data. Further, we detected 10,465 novel genes (81% non-coding) not reported in current pig genome annotations. More than 80% of these novel genes had transcripts detected in > 1 tissue. In addition, more than 80% of novel intergenic genes with at least one transcript detected in liver tissue had H3K4me3 or H3K36me3 peaks mapping to their promoter and gene body, respectively, in independent liver chromatin immunoprecipitation data. These validated results show significant improvement over current pig genome annotations.


April 21, 2020

Hybrid-Transcriptome Sequencing and Associated Metabolite Analysis Reveal Putative Genes Involved in Flower Color Difference in Rose Mutants.

Gene mutation is a common phenomenon in nature that often leads to phenotype differences, such as the variations in flower color that frequently occur in roses. With the aim of revealing the genomic information and inner mechanisms, the differences in the levels of both transcription and secondary metabolism between a pair of natural rose mutants were investigated by using hybrid RNA-sequencing and metabolite analysis. Metabolite analysis showed that glycosylated derivatives of pelargonidin, e.g., pelargonidin 3,5-diglucoside and pelargonidin 3-glucoside, which were not detected in white flowers (Rosa ‘White Margo Koster’), constituted the major pigments in pink flowers. Conversely, the contents of petal flavonols, such as kaempferol 3-glucoside, quercetin 3-glucoside, and rutin, were higher in white flowers. Hybrid RNA-sequencing obtained a total of 107,280 full-length transcripts in rose petals, which were annotated in major databases. Differentially expressed gene (DEG) analysis showed that the expression of genes involved in the flavonoid biosynthesis pathway, e.g., CHS, FLS, DFR, and LDOX, was significantly different, which was verified by qRT-PCR during flowering. Additionally, two MYB transcription factors were found and named RmMYBAN2 and RmMYBPA1, and their expression patterns during flowering were also analyzed. These findings indicate that these genes may be involved in the flower color difference in the rose mutants, and competition between anthocyanin and flavonol biosynthesis is a primary cause of flower color variation, with its regulation reflected by transcriptional and secondary metabolite levels.


April 21, 2020

Analysis of Transcriptome and Epitranscriptome in Plants Using PacBio Iso-Seq and Nanopore-Based Direct RNA Sequencing.

Nanopore sequencing from Oxford Nanopore Technologies (ONT) and Pacific Biosciences (PacBio) single-molecule real-time (SMRT) long-read isoform sequencing (Iso-Seq) are revolutionizing the way transcriptomes are analyzed. These methods offer many advantages over the most widely used high-throughput short-read RNA sequencing (RNA-Seq) approaches and allow a comprehensive analysis of transcriptomes in identifying full-length splice isoforms and several other post-transcriptional events. In addition, direct RNA-Seq provides valuable information about RNA modifications, which are lost during the PCR amplification step in other methods. Here, we present a comprehensive summary of important applications of these technologies in plants, including identification of complex alternative splicing (AS), full-length splice variants, fusion transcripts, and alternative polyadenylation (APA) events. Furthermore, we discuss the impact of the newly developed nanopore direct RNA-Seq in advancing epitranscriptome research in plants. Additionally, we summarize computational tools for identifying and quantifying full-length isoforms and other co/post-transcriptional events and discuss some of the limitations of these methods. Sequencing of transcriptomes using these new single-molecule long-read methods will unravel many aspects of transcriptome complexity in unprecedented ways as compared to previous short-read sequencing approaches. Analysis of plant transcriptomes with these new powerful methods that require minimum sample processing is likely to become the norm and is expected to uncover novel co/post-transcriptional gene regulatory mechanisms that control biological outcomes during plant development and in response to various stresses.


April 21, 2020

Iso-Seq analysis of the Taxus cuspidata transcriptome reveals the complexity of Taxol biosynthesis.

Taxus cuspidata is well known worldwide for its ability to produce Taxol, one of the top-selling natural anticancer drugs. However, current Taxol production cannot match the increasing needs of the market, and novel strategies should be considered to increase the supply of Taxol. Since the biosynthetic mechanism of Taxol remains largely unknown, elucidating this pathway in detail will be very helpful in exploring alternative methods for Taxol production. Here, we sequenced Taxus cuspidata transcriptomes with next-generation sequencing (NGS) and third-generation sequencing (TGS) platforms. After correction with Illumina reads and removal of redundant reads, more than 180,000 nonredundant transcripts were generated from the raw Iso-Seq data. Using Cogent software and an alignment-based method, we identified a total of 139 cytochrome P450s (CYP450s), 31 BAHD acyltransferases (ACTs) and 1940 transcription factors (TFs). Based on phylogenetic and coexpression analysis, we identified 9 CYP450s and 7 BAHD ACTs as potential lead candidates for Taxol biosynthesis and 6 TFs that are possibly involved in the regulation of this process. Using coexpression analysis of genes known to be involved in Taxol biosynthesis, we elucidated the stem biosynthetic pathway. In addition, we analyzed the expression patterns of 12 characterized genes in the Taxol pathway and speculated that the isoprene precursors for Taxol biosynthesis were mainly synthesized via the MEP pathway. In addition, we found and confirmed that the alternative splicing patterns of some genes varied in different tissues, which may be an important tissue-specific method of posttranscriptional regulation. A strategy was developed to generate corrected full-length or nearly full-length transcripts without assembly to ensure sequence accuracy, thus greatly improving the reliability of coexpression and phylogenetic analysis and greatly facilitating gene cloning and characterization. This strategy was successfully utilized to elucidate the Taxol biosynthetic pathway, which will greatly contribute to the goals of improving the Taxol content in Taxus spp. using molecular breeding or plant management strategies and synthesizing Taxol in microorganisms using synthetic biological technology.


April 21, 2020

De novo transcriptome assembly of the cubomedusa Tripedalia cystophora, including the analysis of a set of genes involved in peptidergic neurotransmission.

The phyla Cnidaria, Placozoa, Ctenophora, and Porifera emerged before the split of proto- and deuterostome animals, about 600 million years ago. These early metazoans are interesting, because they can give us important information on the evolution of various tissues and organs, such as eyes and the nervous system. Generally, cnidarians have simple nervous systems, which use neuropeptides for their neurotransmission, but some cnidarian medusae belonging to the class Cubozoa (box jellyfishes) have advanced image-forming eyes, probably associated with a complex innervation. Here, we describe a new transcriptome database from the cubomedusa Tripedalia cystophora. Based on the combined use of the Illumina and PacBio sequencing technologies, we produced a highly contiguous transcriptome database from T. cystophora. We then developed a software program to discover neuropeptide preprohormones in this database. This script enabled us to annotate seven novel T. cystophora neuropeptide preprohormone cDNAs: one coding for 19 copies of a peptide with the structure pQWLRGRFamide; one coding for six copies of a different RFamide peptide; one coding for six copies of pQPPGVWamide; one coding for eight different neuropeptide copies with the C-terminal LWamide sequence; one coding for thirteen copies of a peptide with the RPRAamide C-terminus; one coding for four copies of a peptide with the C-terminal GRYamide sequence; and one coding for seven copies of a cyclic peptide, of which the most frequent one has the sequence CTGQMCWFRamide. We could also identify orthologs of these seven preprohormones in the cubozoans Alatina alata, Carybdea xaymacana, Chironex fleckeri, and Chiropsalmus quadrumanus. Furthermore, using TBLASTN screening, we could annotate four bursicon-like glycoprotein hormone subunits, five opsins, and 52 other family-A G protein-coupled receptors (GPCRs), which also included two leucine-rich repeats containing G protein-coupled receptors (LGRs) in T. cystophora. The two LGRs are potential receptors for the glycoprotein hormones, while the other GPCRs are candidate receptors for the above-mentioned neuropeptides. By combining Illumina and PacBio sequencing technologies, we have produced a new high-quality de novo transcriptome assembly from T. cystophora that should be a valuable resource for identifying the neuronal components that are involved in vision and other behaviors in cubomedusae.


April 21, 2020

Comparative transcriptome analysis identified candidate genes involved in mycelium browning in Lentinula edodes.

Lentinula edodes is one of the most popular edible mushroom species in the world and contains useful medicinal components, such as lentinan. The light-induced formation of brown film on the vegetative mycelial tissues of L. edodes is an important process for ensuring the quantity and quality of this edible mushroom. To understand the molecular mechanisms underlying this critical developmental process in L. edodes, we characterized the morphological phenotypic changes in a strain, Chamaram, associated with abnormal brown film formation and compared its genome-wide transcriptional features. In the present study, we performed genome-wide transcriptome analyses of different vegetative mycelium growth phenotypes, namely, early white, normal brown, and defective dark yellow partial brown film phenotypes, which were exposed to different light conditions. The analysis identified clusters of genes specific to the light-induced brown film phenotypes. These genes were significantly associated with light sensing via photoreceptors such as FMN- and FAD-bindings, signal transduction by kinases and GPCRs, melanogenesis via activation of tyrosinases, and cell wall degradation by glucanases, chitinases, and laccases, which suggests these processes are involved in the formation of mycelial browning in L. edodes. Interestingly, hydrophobin genes such as SC1 and SC3 exhibited divergent expression levels in the normal and abnormal brown mycelial films, indicating the ability of these genes to act in fruiting body initiation and formation of dikaryotic mycelia. Furthermore, we identified the up-regulation of glycoside hydrolase domain-containing genes in the normal brown film but not in the abnormal film phenotype, suggesting that cell wall degradation in the normal brown film phenotype is crucial in the developmental processes related to the initiation and formation of fruiting bodies. This study systematically analysed the expression patterns of light-induced browning-related genes in L. edodes. Our findings provide information for further investigations of browning formation mechanisms in L. edodes and a foundation for future L. edodes breeding.


April 21, 2020

The genome of the soybean cyst nematode (Heterodera glycines) reveals complex patterns of duplications involved in the evolution of parasitism genes.

Heterodera glycines, commonly referred to as the soybean cyst nematode (SCN), is an obligatory and sedentary plant parasite that causes over a billion-dollar yield loss to soybean production annually. Although there are genetic determinants that render soybean plants resistant to certain nematode genotypes, resistant soybean cultivars are increasingly ineffective because their multi-year usage has selected for virulent H. glycines populations. The parasitic success of H. glycines relies on the comprehensive re-engineering of an infection site into a syncytium, as well as the long-term suppression of host defense to ensure syncytial viability. At the forefront of these complex molecular interactions are effectors, the proteins secreted by H. glycines into host root tissues. The mechanisms of effector acquisition, diversification, and selection need to be understood before effective control strategies can be developed, but the lack of an annotated genome has been a major roadblock. Here, we use PacBio long-read technology to assemble a H. glycines genome of 738 contigs into 123 Mb with annotations for 29,769 genes. The genome contains significant numbers of repeats (34%), tandem duplicates (18.7 Mb), and horizontal gene transfer events (151 genes). A large number of putative effectors (431 genes) were identified in the genome, many of which were found in transposons. This advance provides a glimpse into the host and parasite interplay by revealing a diversity of mechanisms that give rise to virulence genes in the soybean cyst nematode, including: tandem duplications containing over a fifth of the total gene count, virulence genes hitchhiking in transposons, and 107 horizontal gene transfers not reported in other plant parasitic nematodes thus far. Through extensive characterization of the H. glycines genome, we provide new insights into H. glycines biology and shed light onto the mystery underlying complex host-parasite interactions. This genome sequence is an important prerequisite to enable work towards generating new resistance or control measures against H. glycines.

