September 22, 2019  |  

Enigmatic Diphyllatea eukaryotes: culturing and targeted PacBio RS amplicon sequencing reveals a higher order taxonomic diversity and global distribution.

The class Diphyllatea belongs to a group of enigmatic unicellular eukaryotes that play a key role in reconstructing the morphological innovation and diversification of early eukaryotic evolution. Despite its evolutionary significance, very little is known about the phylogeny and species diversity of Diphyllatea. Only three species have described morphology, being taxonomically divided by flagella number, two or four, and cell size. Currently, one 18S rRNA Diphyllatea sequence is available, with environmental sequencing surveys reporting only a single partial sequence from a Diphyllatea-like organism. Accordingly, geographical distribution of Diphyllatea based on molecular data is limited, despite morphological data suggesting the class has a global distribution. We here present a first attempt to understand species distribution, diversity and higher order structure of Diphyllatea. We cultured 11 new strains, characterised these morphologically and amplified their rRNA for a combined 18S-28S rRNA phylogeny. We sampled environmental DNA from multiple sites and designed new Diphyllatea-specific PCR primers for long-read PacBio RSII technology. Near full-length 18S rRNA sequences from environmental DNA, in addition to supplementary Diphyllatea sequence data mined from public databases, resolved the phylogeny into three deeply branching and distinct clades (Diphy I – III). Of these, the Diphy III clade is entirely novel, and in congruence with Diphy II, composed of species morphologically consistent with the earlier described Collodictyon triciliatum. The phylogenetic split between the Diphy I and Diphy II + III clades corresponds with a morphological division of Diphyllatea into bi- and quadriflagellate cell forms. This altered flagella composition must have occurred early in the diversification of Diphyllatea and may represent one of the earliest known morphological transitions among eukaryotes. Further, the substantial increase in molecular data presented here confirms Diphyllatea has a global distribution, seemingly restricted to freshwater habitats. Altogether, the results reveal the advantage of combining a group-specific PCR approach and long-read high-throughput amplicon sequencing in surveying enigmatic eukaryote lineages. Lastly, our study shows the capacity of PacBio RS when targeting a protist class for increasing phylogenetic resolution.


September 22, 2019  |  

Identification and analysis of glutathione S-transferase gene family in sweet potato reveal divergent GST-mediated networks in aboveground and underground tissues in response to abiotic stresses.

Sweet potato, a hexaploid species lacking a reference genome, is one of the most important crops in many developing countries, where abiotic stresses are a primary cause of reduction of crop yield. Glutathione S-transferases (GSTs) are multifunctional enzymes that play important roles in oxidative stress tolerance and cellular detoxification. A total of 42 putative full-length GST genes were identified from two local transcriptome databases and validated by molecular cloning and Sanger sequencing. Sequence and intraspecific phylogenetic analyses revealed extensive differentiation in their coding sequences and divided them into eight subfamilies. Interspecific phylogenetic and comparative analyses indicated that most examined GST paralogs might originate and diverge before the speciation of sweet potato. Results from large-scale RNA-seq and quantitative real-time PCR experiments exhibited extensive variation in gene-expression profiles across different tissues and varieties, which implied strong evolutionary divergence in their gene-expression regulation. Moreover, we performed five manipulated stress experiments and uncovered highly divergent stress-response patterns of sweet potato GST genes in aboveground and underground tissues. Our study identified a large number of sweet potato GST genes, systematically investigated their evolutionary diversification, and provides new insights into the GST-mediated stress-response mechanisms in this worldwide crop.


September 22, 2019  |  

De novo clustering of long-read transcriptome data using a greedy, quality-value based algorithm

Long-read sequencing of transcripts with PacBio Iso-Seq and Oxford Nanopore Technologies has proven to be central to the study of complex isoform landscapes in many organisms. However, current de novo transcript reconstruction algorithms from long-read data are limited, leaving the potential of these technologies unfulfilled. A common bottleneck is the dearth of scalable and accurate algorithms for clustering long reads according to their gene family of origin. To address this challenge, we develop isONclust, a clustering algorithm that is greedy (in order to scale) and makes use of quality values (in order to handle variable error rates). We test isONclust on three simulated and five biological datasets, across a breadth of organisms, technologies, and read depths. Our results demonstrate that isONclust is a substantial improvement over previous approaches, both in terms of overall accuracy and/or scalability to large datasets. Our tool is available at https://github.com/ksahlin/isONclust.


September 22, 2019  |  

Deciphering highly similar multigene family transcripts from Iso-Seq data with IsoCon

A significant portion of genes in vertebrate genomes belongs to multigene families, with each family containing several gene copies whose presence/absence, as well as isoform structure, can be highly variable across individuals. Existing de novo techniques for assaying the sequences of such highly-similar gene families fall short of reconstructing end-to-end transcripts with nucleotide-level precision or assigning alternatively spliced transcripts to their respective gene copies. We present IsoCon, a high-precision method using long PacBio Iso-Seq reads to tackle this challenge. We apply IsoCon to nine Y chromosome ampliconic gene families and show that it outperforms existing methods on both experimental and simulated data. IsoCon has allowed us to detect an unprecedented number of novel isoforms and has opened the door for unraveling the structure of many multigene families and gaining a deeper understanding of genome evolution and human diseases.


September 22, 2019  |  

De novo assembly and characterizing of the culm-derived meta-transcriptome from the polyploid sugarcane genome based on coding transcripts

Sugarcane biomass has been used for sugar, bioenergy and biomaterial production. The majority of the sugarcane biomass comes from the culm, which makes it important to understand the genetic control of biomass production in this part of the plant. A meta-transcriptome of the culm was obtained in an earlier study by using about one billion paired-end (150 bp) reads of deep RNA sequencing of samples from 20 diverse sugarcane genotypes and combining de novo assemblies from different assemblers and different settings. Although many genes could be recovered, this resulted in a large combined assembly which created the need for clustering to reduce transcript redundancy while maintaining gene content. Here, we present a comprehensive analysis of the effect of different assembly settings and clustering methods on de novo assembly, annotation and transcript profiling focusing especially on the coding transcripts from the highly polyploid sugarcane genome. The new coding sequence-based transcript clustering resulted in a better representation of transcripts compared to the earlier approach, having 121,987 contigs, which included 78,052 main and 43,935 alternative transcripts. About 73%, 67%, 61% and 10% of the transcriptome was annotated against the NCBI NR protein database, GO terms, orthologous groups and KEGG orthologies, respectively. Using this set for a differential gene expression analysis between the young and mature sugarcane culm tissues, a total of 822 transcripts were found to be differentially expressed, including key transcripts involved in sugar/fiber accumulation in sugarcane. In the context of the lack of a whole genome sequence for sugarcane, the availability of a well annotated culm-derived meta-transcriptome through deep sequencing provides useful information on coding genes specific to the sugarcane culm and will certainly contribute to understanding the process of carbon partitioning, and biomass accumulation in the sugarcane culm.


September 22, 2019  |  

High-quality RNA isolation from wheat immature grains

Grain quality is one of the most important targets in wheat breeding. Transcriptome analyses of wheat developing grains and endosperm have been performed using the microarray and RNA sequencing (RNA-seq) approaches (Wan et al. 2008, 2009; Nemeth et al. 2010; Pellny et al. 2012; Dong et al. 2015). For the RNA-seq analysis of the grain transcriptome and precise quantification of each transcript in developing grain and endosperm, the high-quality RNA is essential. For the microarray analysis, ≥7.3 RIN (RNA integrity number) value for the RNA sample quality is required according to the Agilent microarray protocol. In the previous report for the transcriptome of wheat developing grains, the total RNA samples with ≥8.0 RIN values were used for the RNA-seq analysis based on the PacBio and Illumina platforms (Dong et al. 2015). Some RNA extraction buffers containing SDS, CTAB, or TRIzol® reagent (Thermo Fisher Scientific, Waltham, Massachusetts) and several commercial kits for RNA isolation have been used to isolate total RNA from wheat grain and endosperm (Kawakami et al. 1992; Wan et al. 2008; Kang et al. 2013). However, total RNA samples from the wheat developing and immature grains are often damaged due to high content of polysaccharides and high stickiness of the solution homogenized with the RNA extraction buffer, and thus extraction of the high-quality RNA with high RIN value is quite difficult. Here, we report a protocol for the wheat grain RNA extraction using Maxwell RSC Plant RNA Kit (Promega, Madison, Wisconsin).


September 22, 2019  |  

Single-molecule real-time transcript sequencing facilitates common wheat genome annotation and grain transcriptome research.

The large and complex hexaploid genome has greatly hindered genomics studies of common wheat (Triticum aestivum, AABBDD). Here, we investigated transcripts in common wheat developing caryopses using the emerging single-molecule real-time (SMRT) sequencing technology PacBio RSII, and assessed the resultant data for improving common wheat genome annotation and grain transcriptome research. We obtained 197,709 full-length non-chimeric (FLNC) reads, 74.6% of which were estimated to carry complete open reading frame. A total of 91,881 high-quality FLNC reads were identified and mapped to 16,188 chromosomal loci, corresponding to 13,162 known genes and 3026 new genes not annotated previously. Although some FLNC reads could not be unambiguously mapped to the current draft genome sequence, many of them are likely useful for studying highly similar homoeologous or paralogous loci or for improving chromosomal contig assembly in further research. The 91,881 high-quality FLNC reads represented 22,768 unique transcripts, 9591 of which were newly discovered. We found 180 transcripts each spanning two or three previously annotated adjacent loci, suggesting that they should be merged to form correct gene models. Finally, our data facilitated the identification of 6030 genes differentially regulated during caryopsis development, and full-length transcripts for 72 transcribed gluten gene members that are important for the end-use quality control of common wheat. Our work demonstrated the value of PacBio transcript sequencing for improving common wheat genome annotation through uncovering the loci and full-length transcripts not discovered previously. The resource obtained may aid further structural genomics and grain transcriptome studies of common wheat.


September 22, 2019  |  

Long reads: their purpose and place.

In recent years long-read technologies have moved from being a niche and specialist field to a point of relative maturity likely to feature frequently in the genomic landscape. Analogous to next generation sequencing, the cost of sequencing using long-read technologies has materially dropped whilst the instrument throughput continues to increase. Together these changes present the prospect of sequencing large numbers of individuals with the aim of fully characterizing genomes at high resolution. In this article, we will endeavour to present an introduction to long-read technologies showing: what long reads are; how they are distinct from short reads; why long reads are useful and how they are being used. We will highlight the recent developments in this field, and the applications and potential of these technologies in medical research, and clinical diagnostics and therapeutics.


September 22, 2019  |  

Full-length RNA sequencing reveals unique transcriptome composition in bermudagrass.

Bermudagrass [Cynodon dactylon (L.) Pers.] is an important perennial warm-season turfgrass species with great economic value. However, the reference genome and transcriptome information are still deficient in bermudagrass, which severely impedes functional and molecular breeding studies. In this study, through analyzing a mixture sample of leaves, stolons, shoots, roots and flowers with single-molecule long-read sequencing technology from Pacific Biosciences (PacBio), we reported the first full-length transcriptome dataset of bermudagrass (C. dactylon cultivar Yangjiang) comprising 78,192 unigenes. Among the unigenes, 66,409 were functionally annotated, whereas 27,946 were found to have two or more isoforms. The annotated full-length unigenes provided many new insights into gene sequence characteristics and systematic phylogeny of bermudagrass. By comparison with transcriptome dataset in nine grass species, KEGG pathway analyses further revealed that C4 photosynthesis-related genes, notably the phosphoenolpyruvate carboxylase and pyruvate, phosphate dikinase genes, are specifically enriched in bermudagrass. These results not only explained the possible reason why bermudagrass flourishes in warm areas but also provided a solid basis for future studies in this important turfgrass species. Copyright © 2018 Elsevier Masson SAS. All rights reserved.


September 22, 2019  |  

Assessment of an organ-specific de novo transcriptome of the nematode trap-crop, Solanum sisymbriifolium

Solanum sisymbriifolium, also known as “Litchi Tomato” or “Sticky Nightshade,” is an undomesticated and poorly researched plant related to potato and tomato. Unlike the latter species, S. sisymbriifolium induces eggs of the cyst nematode, Globodera pallida, to hatch and migrate into its roots, but then arrests further nematode maturation. In order to provide researchers with a partial blueprint of its genetic make-up so that the mechanism of this response might be identified, we used single molecule real time (SMRT) sequencing to compile a high quality de novo transcriptome of 41,189 unigenes drawn from individually sequenced bud, root, stem, and leaf RNA populations. Functional annotation and BUSCO analysis showed that this transcriptome was surprisingly complete, even though it represented genes expressed at a single time point. By sequencing the 4 organ libraries separately, we found we could get a reliable snapshot of transcript distributions in each organ. A divergent site analysis of the merged transcriptome indicated that this species might have undergone a recent genome duplication and re-diploidization. Further analysis indicated that the plant then retained a disproportionate number of genes associated with photosynthesis and amino acid metabolism in comparison to genes with characteristics of R-proteins or involved in secondary metabolism. The former processes may have given S. sisymbriifolium a bigger competitive advantage than the latter did. Copyright © 2018 Wixom et al.


September 22, 2019  |  

Genome and secretome analysis of Pochonia chlamydosporia provide new insight into egg-parasitic mechanisms.

Pochonia chlamydosporia infects eggs and females of economically important plant-parasitic nematodes. The fungal isolates parasitizing different nematodes are genetically distinct. To understand their intraspecific genetic differentiation, parasitic mechanisms, and adaptive evolution, we assembled seven putative chromosomes of P. chlamydosporia strain 170 isolated from root-knot nematode eggs (~44 Mb, including 7.19% of transposable elements) and compared them with the genome of the strain 123 (~41 Mb) isolated from cereal cyst nematode. We focus on secretomes of the fungus, which play important roles in pathogenicity and fungus-host/environment interactions, and identified 1,750 secreted proteins, with a high proportion of carboxypeptidases, subtilisins, and chitinases. We analyzed the phylogenies of these genes and predicted new pathogenic molecules. By comparative transcriptome analysis, we found that secreted proteins involved in responses to nutrient stress are mainly comprised of proteases and glycoside hydrolases. Moreover, 32 secreted proteins undergoing positive selection and 71 duplicated gene pairs encoding secreted proteins are identified. Two duplicated pairs encoding secreted glycosyl hydrolases (GH30), which may be related to fungal endophytic process and lost in many insect-pathogenic fungi but exist in nematophagous fungi, are putatively acquired from bacteria by horizontal gene transfer. The results help understanding genetic origins and evolution of parasitism-related genes.


September 22, 2019  |  

Analysis of the hybrid genomes of two field isolates of the soil-borne fungal species Verticillium longisporum.

Brassica plant species are attacked by a number of pathogens; among them, the ones with a soil-borne lifestyle have become increasingly important. Verticillium stem stripe caused by Verticillium longisporum is one example. This fungal species is thought to be of a hybrid origin, having a genome composed of combinations of lineages denominated A and D. In this study we report the draft genomes of 2 V. longisporum field isolates sequenced using the Illumina technology. Genomic characterization and lineage composition analyses, followed by selected gene analysis to facilitate the comprehension of its genomic features and potential effector categories, were performed. The draft genomes of 2 Verticillium longisporum single spore isolates (VL1 and VL2) have an estimated ungapped size of about 70 Mb. The total number of protein encoding genes identified in VL1 was 20,793, whereas 21,072 gene models were predicted in VL2. The predicted genome size and gene contents, including the gene families coding for carbohydrate active enzymes, were almost double the numbers found in V. dahliae and V. albo-atrum. Single nucleotide polymorphisms (SNPs) were frequently distributed in the two genomes but the distribution of heterozygosity and depth was not independent. Further analysis of potential parental lineages suggests that the V. longisporum genome is composed of two parts, A1 and D1, where A1 is more ancient than the parental lineage genome D1, the latter being more closely related to V. dahliae. Presence of the mating-type genes MAT1-1-1 and MAT1-2-1 in the V. longisporum genomes was confirmed. However, the MAT genes in V. dahliae, V. albo-atrum and V. longisporum have experienced extensive nucleotide changes, at least partly explaining the present asexual nature of these fungal species. The established draft genome of V. longisporum is large compared to other studied ascomycete fungi. Consequently, high numbers of genes were predicted in the two V. longisporum genomes, among them many secreted proteins and carbohydrate active enzyme (CAZy) encoding genes. The genome is composed of two parts, where one lineage is more ancient than the part being more closely related to V. dahliae. Dissimilar mating-type sequences were identified, indicating possible ancient hybridization events.


September 22, 2019  |  

Comparative mapping of the ASTRINGENCY locus controlling fruit astringency in hexaploid persimmon (Diospyros kaki Thunb.) with the diploid D. lotus reference genome

Persimmon (Diospyros kaki) is a tree crop species that originated in East Asia and consists mainly of hexaploid individuals (2n = 6x = 90) with some nonaploid individuals. One of the unique characteristics of persimmon is the continuous accumulation of proanthocyanidins (PAs) in its fruit until the middle of fruit development, resulting in a strong astringent taste even at commercial fruit maturity. Among persimmon cultivars, pollination-constant and non-astringent (PCNA) types cease PA accumulation in early fruit development and become non-astringent at commercial maturity. PCNA is an allelic trait to non-PCNA and is controlled by a single locus called the ASTRINGENCY (AST) locus. Previous segregation analyses indicated that the AST locus shows hexasomic inheritance; a recessive allele, ast, at this locus confers PCNA. Here, we report a shuttle mapping approach to delimit the AST locus region in the hexaploid persimmon genome by using D. lotus, a diploid relative of D. kaki, as a reference. A D. lotus F1 population of 333 individuals and 296 D. kaki siblings segregating for the PCNA trait were used to map the AST region using haplotype-specific markers covering the AST region. This indicated that the AST locus is syntenic to an approximately 915-kb region of the D. lotus genome. In this 915-kb region, we found several candidates for AST that were revealed from the fruit transcriptome of a population segregating for the PCNA trait. These results could provide important clues for the isolation of AST in hexaploid persimmon.


September 22, 2019  |  

Aberration or analogy? The atypical plastomes of Geraniaceae

A number of plant groups have been proposed as ideal systems to explore plastid inheritance, plastome evolution and plastome-nuclear genome coevolution. Quick generation times and a compact nuclear genome in Arabidopsis thaliana, the relative ease of plastid isolation from Spinacia oleracea and the tractability of plastid transformation in Nicotiana tabacum are all desirable attributes in a model system; however, these and most other groups all lack novelty in terms of plastome structure and nucleotide sequence evolution. Contemporary sequencing and assembly technologies have facilitated analyses of atypical plastomes and, as predicted by early investigations, Geraniaceae plastomes have experienced unprecedented rearrangements relative to the canonical structure and exhibit remarkably high rates of synonymous and nonsynonymous nucleotide substitutions. While not the only lineage with unusual plastome features, likely no other group represents the array of aberrant phenomena recorded for the family. In this chapter, Geraniaceae plastomes will be discussed and, where possible, compared with other taxa.


September 22, 2019  |  

Genomes of 13 domesticated and wild rice relatives highlight genetic conservation, turnover and innovation across the genus Oryza.

The genus Oryza is a model system for the study of molecular evolution over time scales ranging from a few thousand to 15 million years. Using 13 reference genomes spanning the Oryza species tree, we show that despite few large-scale chromosomal rearrangements rapid species diversification is mirrored by lineage-specific emergence and turnover of many novel elements, including transposons, and potential new coding and noncoding genes. Our study resolves controversial areas of the Oryza phylogeny, showing a complex history of introgression among different chromosomes in the young ‘AA’ subclade containing the two domesticated species. This study highlights the prevalence of functionally coupled disease resistance genes and identifies many new haplotypes of potential use for future crop protection. Finally, this study marks a milestone in modern rice research with the release of a complete long-read assembly of IR 8 ‘Miracle Rice’, which relieved famine and drove the Green Revolution in Asia 50 years ago.

