September 22, 2019

Enigmatic Diphyllatea eukaryotes: culturing and targeted PacBio RS amplicon sequencing reveals a higher order taxonomic diversity and global distribution.

The class Diphyllatea belongs to a group of enigmatic unicellular eukaryotes that play a key role in reconstructing the morphological innovation and diversification of early eukaryotic evolution. Despite its evolutionary significance, very little is known about the phylogeny and species diversity of Diphyllatea. Only three species have described morphology, being taxonomically divided by flagella number, two or four, and cell size. Currently, one 18S rRNA Diphyllatea sequence is available, with environmental sequencing surveys reporting only a single partial sequence from a Diphyllatea-like organism. Accordingly, geographical distribution of Diphyllatea based on molecular data is limited, despite morphological data suggesting the class has a global distribution. We here present a first attempt to understand species distribution, diversity and higher order structure of Diphyllatea. We cultured 11 new strains, characterised these morphologically and amplified their rRNA for a combined 18S-28S rRNA phylogeny. We sampled environmental DNA from multiple sites and designed new Diphyllatea-specific PCR primers for long-read PacBio RSII technology. Near full-length 18S rRNA sequences from environmental DNA, in addition to supplementary Diphyllatea sequence data mined from public databases, resolved the phylogeny into three deeply branching and distinct clades (Diphy I – III). Of these, the Diphy III clade is entirely novel, and in congruence with Diphy II, composed of species morphologically consistent with the earlier described Collodictyon triciliatum. The phylogenetic split between the Diphy I and Diphy II + III clades corresponds with a morphological division of Diphyllatea into bi- and quadriflagellate cell forms. This altered flagella composition must have occurred early in the diversification of Diphyllatea and may represent one of the earliest known morphological transitions among eukaryotes. Further, the substantial increase in molecular data presented here confirms Diphyllatea has a global distribution, seemingly restricted to freshwater habitats. Altogether, the results reveal the advantage of combining a group-specific PCR approach and long-read high-throughput amplicon sequencing in surveying enigmatic eukaryote lineages. Lastly, our study shows the capacity of PacBio RS when targeting a protist class for increasing phylogenetic resolution.


September 22, 2019

Transcriptome-wide investigation of circular RNAs in rice.

Various stable circular RNAs (circRNAs) have recently been identified as an abundant class of noncoding RNAs in Archaea, Caenorhabditis elegans, mice, and humans through high-throughput deep sequencing coupled with analysis of massive transcriptional data. CircRNAs play important roles in miRNA function and transcriptional control by acting as competing endogenous RNAs or positive regulators on their parent coding genes. However, little is known regarding circRNAs in plants. Here, we report 2354 rice circRNAs that were identified through deep sequencing and computational analysis of ssRNA-seq data. Among them, 1356 are exonic circRNAs. Some circRNAs exhibit tissue-specific expression. Rice circRNAs have a considerable number of isoforms, including alternative backsplicing and alternative splicing circularization patterns. Parental genes with multiple exons are preferentially circularized. Only 484 circRNAs have backsplices derived from known splice sites. In addition, only 92 circRNAs were found to be enriched for miniature inverted-repeat transposable elements (MITEs) in flanking sequences or to be complementary to at least 18-bp flanking intronic sequences, indicating that there are some other production mechanisms in addition to direct backsplicing in rice. Rice circRNAs have no significant enrichment for miRNA target sites. A transgenic study showed that overexpression of a circRNA construct could reduce the expression level of its parental gene in transgenic plants compared with empty-vector control plants. This suggested that circRNA and its linear form might act as a negative regulator of its parental gene. Overall, these analyses reveal the prevalence of circRNAs in rice and provide new biological insights into rice circRNAs. © 2015 Lu et al.; Published by Cold Spring Harbor Laboratory Press for the RNA Society.


September 22, 2019

The expressed portion of the barley genome

In this chapter, we refer to the expressed portion of the barley genome as the relatively small fraction of the total cellular DNA that either contains the genes that ultimately produce proteins, or that directly/indirectly controls the level, location and/or timing of when these genes are expressed and proteins are produced. We start by describing the dynamics of tissue and time-dependent gene expression and how common patterns across multiple samples can provide clues about gene networks involved in common biological processes. We then describe some of the complexities of how a single mRNA template can be differentially processed by alternative splicing to generate multiple different proteins or provide a mechanism to regulate the amount of functional gene product in a cell at a given point in time. We extend our analysis, using a number of biological examples, to address how diverse families of small non-coding microRNAs specifically regulate gene expression, and complete our appraisal by looking at the physical/molecular environment around genes that can result in either the promotion or repression of gene expression. We conclude by assessing some of the issues that remain around our ability to fully exploit the depth and power of current approaches for analysing gene expression and propose improvements that could be made using new but available sequencing and bioinformatics technologies.


September 22, 2019

Identification and analysis of glutathione S-transferase gene family in sweet potato reveal divergent GST-mediated networks in aboveground and underground tissues in response to abiotic stresses.

Sweet potato, a hexaploid species lacking a reference genome, is one of the most important crops in many developing countries, where abiotic stresses are a primary cause of reduction of crop yield. Glutathione S-transferases (GSTs) are multifunctional enzymes that play important roles in oxidative stress tolerance and cellular detoxification. A total of 42 putative full-length GST genes were identified from two local transcriptome databases and validated by molecular cloning and Sanger sequencing. Sequence and intraspecific phylogenetic analyses revealed extensive differentiation in their coding sequences and divided them into eight subfamilies. Interspecific phylogenetic and comparative analyses indicated that most examined GST paralogs might originate and diverge before the speciation of sweet potato. Results from large-scale RNA-seq and quantitative real-time PCR experiments exhibited extensive variation in gene-expression profiles across different tissues and varieties, which implied strong evolutionary divergence in their gene-expression regulation. Moreover, we performed five manipulated stress experiments and uncovered highly divergent stress-response patterns of sweet potato GST genes in aboveground and underground tissues. Our study identified a large number of sweet potato GST genes, systematically investigated their evolutionary diversification, and provides new insights into the GST-mediated stress-response mechanisms in this worldwide crop.


September 22, 2019

LSCplus: a fast solution for improving long read accuracy by short read alignment.

The single molecule, real time (SMRT) sequencing technology of Pacific Biosciences enables the acquisition of transcripts from end to end due to its ability to produce extraordinarily long reads (>10 kb). This new method of transcriptome sequencing has been applied to several projects on humans and model organisms. However, the raw data from SMRT sequencing are of relatively low quality, with a random error rate of approximately 15%, for which error correction using next-generation sequencing (NGS) short reads is typically necessary. Few tools have been designed that apply a hybrid sequencing approach that combines NGS and SMRT data, and the most popular existing tool for error correction, LSC, has computing resource requirements that are too intensive for most laboratory and research groups. These shortcomings severely limit the application of SMRT long reads for transcriptome analysis. Here, we report an improved tool (LSCplus) for error correction with the LSC program as a reference. LSCplus overcomes the disadvantage of LSC’s time consumption and improves quality. Only 1/3-1/4 of the time and 1/20-1/25 of the error correction time is required using LSCplus compared with that required for using LSC. LSCplus is freely available at http://www.herbbol.org:8001/lscplus/ . Sample calculations are provided illustrating the precision and efficiency of this method regarding error correction and isoform detection.


September 22, 2019

A single-cell genome for Thiovulum sp.

We determined a significant fraction of the genome sequence of a representative of Thiovulum, the uncultivated genus of colorless sulfur Epsilonproteobacteria, by analyzing the genome sequences of four individual cells collected from phototrophic mats from Elkhorn Slough, California. These cells were isolated utilizing a microfluidic laser-tweezing system, and their genomes were amplified by multiple-displacement amplification prior to sequencing. Thiovulum is a gradient bacterium found at oxic-anoxic marine interfaces and noted for its distinctive morphology and rapid swimming motility. The genomic sequences of the four individual cells were assembled into a composite genome consisting of 221 contigs covering 2.083 Mb including 2,162 genes. This single-cell genome represents a genomic view of the physiological capabilities of isolated Thiovulum cells. Thiovulum is the second-fastest bacterium ever observed, swimming at 615 µm/s, and this genome shows that this rapid swimming motility is a result of a standard flagellar machinery that has been extensively characterized in other bacteria. This suggests that standard flagella are capable of propelling bacterial cells at speeds much faster than typically thought. Analysis of the genome suggests that naturally occurring Thiovulum populations are more diverse than previously recognized and that studies performed in the past probably address a wide range of unrecognized genotypic and phenotypic diversities of Thiovulum. The genome presented in this article provides a basis for future isolation-independent studies of Thiovulum, where single-cell and metagenomic tools can be used to differentiate between different Thiovulum genotypes.


September 22, 2019

Direct chromosome-length haplotyping by single-cell sequencing.

Haplotypes are fundamental to fully characterize the diploid genome of an individual, yet methods to directly chart the unique genetic makeup of each parental chromosome are lacking. Here we introduce single-cell DNA template strand sequencing (Strand-seq) as a novel approach to phasing diploid genomes along the entire length of all chromosomes. We demonstrate this by building a complete haplotype for a HapMap individual (NA12878) at high accuracy (concordance 99.3%), without using generational information or statistical inference. By use of this approach, we mapped all meiotic recombination events in a family trio with high resolution (median range ~14 kb) and phased larger structural variants like deletions, indels, and balanced rearrangements like inversions. Lastly, the single-cell resolution of Strand-seq allowed us to observe loss of heterozygosity regions in a small number of cells, a significant advantage for studies of heterogeneous cell populations, such as cancer cells. We conclude that Strand-seq is a unique and powerful approach to completely phase individual genomes and map inheritance patterns in families, while preserving haplotype differences between single cells. © 2016 Porubský et al.; Published by Cold Spring Harbor Laboratory Press.


September 22, 2019

The Florida manatee (Trichechus manatus latirostris) immunoglobulin heavy chain suggests the importance of clan III variable segments in repertoire diversity.

Manatees are a vulnerable, charismatic sentinel species from the evolutionarily divergent Afrotheria. Manatee health and resistance to infectious disease is of great concern to conservation groups, but little is known about their immune system. To develop manatee-specific tools for monitoring health, we first must have a general knowledge of how the immunoglobulin heavy (IgH) chain locus is organized and transcriptionally expressed. Using the genomic scaffolds of the Florida manatee (Trichechus manatus latirostris), we characterized the potential IgH segmental diversity and constant region isotypic diversity and performed the first Afrotherian repertoire analysis. The Florida manatee has low V(D)J combinatorial diversity (3744 potential combinations) and few constant region isotypes. They also lack clan III V segments, which may have caused reduced VH segment numbers. However, we found productive somatic hypermutation concentrated in the complementarity determining regions. In conclusion, manatees have limited IGHV clan and combinatorial diversity. This suggests that clan III V segments are essential for maintaining IgH locus diversity. Copyright © 2017 Elsevier Ltd. All rights reserved.
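
The "V(D)J combinatorial diversity" figure in the abstract above is a count of possible segment combinations. A minimal sketch of that arithmetic follows, assuming the usual definition as the product of the numbers of functional V, D and J segments. The function name and the segment counts in the example are hypothetical illustrations and are not the manatee values, which the abstract does not break down.

```python
def combinatorial_diversity(n_v: int, n_d: int, n_j: int) -> int:
    """Number of distinct V-D-J combinations for the given segment counts.

    Assumes each rearrangement picks exactly one V, one D and one J segment
    independently, so the total is a simple product.
    """
    for name, value in (("n_v", n_v), ("n_d", n_d), ("n_j", n_j)):
        if value < 0:
            raise ValueError(f"{name} must be non-negative")
    return n_v * n_d * n_j


# Hypothetical counts chosen only to show the calculation.
print(combinatorial_diversity(n_v=10, n_d=4, n_j=3))
```

Because the result is a product, losing an entire clan of V segments lowers the total in proportion to the number of V segments lost.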


September 22, 2019

Tracking alternatively spliced isoforms from long reads by SpliceHunter.

Alternative splicing increases the functional complexity of a genome by generating multiple isoforms and potentially proteins from the same gene. Vast amounts of alternative splicing events are routinely detected by short read deep sequencing technologies but their functional interpretation is hampered by an uncertain transcript context. Emerging long-read sequencing technologies provide a more complete picture of full-length transcript sequences. We introduce SpliceHunter, a tool for the computational interpretation of long reads generated by, for example, Pacific Biosciences instruments. SpliceHunter defines and tracks isoforms and novel transcription units across time points, compares their splicing pattern to a reference annotation, and translates them into potential protein sequences.


September 22, 2019

PacBio sequencing and its applications.

Single-molecule, real-time sequencing developed by Pacific Biosciences offers longer read lengths than the second-generation sequencing (SGS) technologies, making it well-suited for unsolved problems in genome, transcriptome, and epigenetics research. The highly-contiguous de novo assemblies using PacBio sequencing can close gaps in current reference assemblies and characterize structural variation (SV) in personal genomes. With longer reads, we can sequence through extended repetitive regions and detect mutations, many of which are associated with diseases. Moreover, PacBio transcriptome sequencing is advantageous for the identification of gene isoforms and facilitates reliable discoveries of novel genes and novel isoforms of annotated genes, due to its ability to sequence full-length transcripts or fragments with significant lengths. Additionally, PacBio’s sequencing technique provides information that is useful for the direct detection of base modifications, such as methylation. In addition to using PacBio sequencing alone, many hybrid sequencing strategies have been developed to make use of more accurate short reads in conjunction with PacBio long reads. In general, hybrid sequencing strategies are more affordable and scalable, especially for small-size laboratories, than using PacBio sequencing alone. The advent of PacBio sequencing has made available much information that could not be obtained via SGS alone. Copyright © 2015 The Authors. Production and hosting by Elsevier Ltd. All rights reserved.


September 22, 2019

Revealing the transcriptomic complexity of switchgrass by PacBio long-read sequencing.

Switchgrass (Panicum virgatum L.) is an important bioenergy crop widely used for lignocellulosic research. While extensive transcriptomic analyses have been conducted on this species using short read-based sequencing techniques, very little has been reliably derived regarding alternatively spliced (AS) transcripts. We present an analysis of transcriptomes of six switchgrass tissue types pooled together, sequenced using Pacific Biosciences (PacBio) single-molecule long-read technology. Our analysis identified 105,419 unique transcripts covering 43,570 known genes and 8795 previously unknown genes. 45,168 are novel transcripts of known genes. A total of 60,096 AS transcripts are identified, 45,628 being novel. We have also predicted 1549 transcripts of genes involved in cell wall construction and remodeling, 639 being novel transcripts of known cell wall genes. Most of the predicted transcripts are validated against Illumina-based short reads. Specifically, 96% of the splice junction sites in all the unique transcripts are validated by at least five Illumina reads. Comparisons between genes derived from our identified transcripts and the current genome annotation revealed that among the gene set predicted by both analyses, 16,640 have different exon-intron structures. Overall, a substantial amount of new information is derived from the PacBio RNA data regarding both the transcriptome and the genome of switchgrass.
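
The validation statistic quoted above (the share of splice junctions supported by at least five Illumina reads) can be sketched as a simple threshold count. This is an illustration only, not the authors' pipeline: the function name, the junction keys and the read counts below are hypothetical, and the real analysis would derive the counts from short-read alignments.

```python
def fraction_supported(junction_read_counts, min_reads=5):
    """Fraction of junctions whose short-read support meets the threshold.

    junction_read_counts maps a junction identifier, here a hypothetical
    (chromosome, donor, acceptor) tuple, to its number of supporting reads.
    """
    if not junction_read_counts:
        raise ValueError("no junctions supplied")
    supported = sum(
        1 for count in junction_read_counts.values() if count >= min_reads
    )
    return supported / len(junction_read_counts)


# Toy data, invented for the example.
toy_counts = {
    ("chr1", 100, 200): 12,
    ("chr1", 300, 450): 5,
    ("chr2", 50, 90): 2,
    ("chr2", 120, 400): 0,
}
print(fraction_supported(toy_counts))  # 0.5
```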


September 22, 2019

Analysis of transcripts and splice isoforms in red clover (Trifolium pratense L.) by single-molecule long-read sequencing.

Red clover (Trifolium pratense L.) is an important cool-season legume and the most widely planted forage legume after alfalfa. Although a draft genome sequence has already been published, the sequences and complete structure of mRNA transcripts remain unclear, which limits further exploration of red clover. In this study, the red clover transcriptome was sequenced using single-molecule long-read sequencing to identify full-length splice isoforms, and 29,730 novel isoforms from known genes and 2194 novel isoforms from novel genes were identified. A total of 5492 alternative splicing events were identified, and the majority of alternative splicing events in red clover were classified as intron retention. In addition, of the 15,229 genes detected by SMRT, 8719 including 186,517 transcripts have at least one poly(A) site. Furthermore, we identified 4333 long non-coding RNAs and 3762 fusion transcripts. We analyzed the full-length transcriptome of red clover with PacBio SMRT. These new findings provide important information for improving red clover draft genome annotation and for full characterization of the red clover transcriptome.


September 22, 2019

Order of removal of conventional and nonconventional introns from nuclear transcripts of Euglena gracilis.

Nuclear genes of euglenids and marine diplonemids harbor atypical, nonconventional introns which are not observed in the genomes of other eukaryotes. Nonconventional introns do not have the conserved borders characteristic for spliceosomal introns or the sequence complementary to U1 snRNA at the 5′ end. They form a stable secondary structure bringing together both exon/intron junctions; nevertheless, this conformation does not resemble the form of self-splicing or tRNA introns. In the genes studied so far, frequent nonconventional intron insertions at new positions have been observed, whereas conventional introns have been either found at the conserved positions, or simply lost. In this work, we examined the order of intron removal from Euglena gracilis transcripts of the tubA and gapC genes, which contain two types of introns: nonconventional and spliceosomal. The relative order of intron excision was compared for pairs of introns belonging to different types. Furthermore, intermediate products of splicing were analyzed using the PacBio Next Generation Sequencing system. The analysis led to the main conclusion that nonconventional introns are removed in a rapid way but later than spliceosomal introns. Moreover, the observed accumulation of transcripts with conventional introns removed and nonconventional present may suggest the existence of a time gap between the two types of splicing.


September 22, 2019

Long-read sequencing and de novo assembly of a Chinese genome.

Short-read sequencing has enabled the de novo assembly of several individual human genomes, but with inherent limitations in characterizing repeat elements. Here we sequence a Chinese individual HX1 by single-molecule real-time (SMRT) long-read sequencing, construct a physical map by NanoChannel arrays and generate a de novo assembly of 2.93 Gb (contig N50: 8.3 Mb, scaffold N50: 22.0 Mb, including 39.3 Mb N-bases), together with 206 Mb of alternative haplotypes. The assembly fully or partially fills 274 (28.4%) N-gaps in the reference genome GRCh38. Comparison to GRCh38 reveals 12.8 Mb of HX1-specific sequences, including 4.1 Mb that are not present in previously reported Asian genomes. Furthermore, long-read sequencing of the transcriptome reveals novel spliced genes that are not annotated in GENCODE and are missed by short-read RNA-Seq. Our results imply that improved characterization of genome functional variation may require the use of a range of genomic technologies on diverse human populations.
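
The contig and scaffold N50 values quoted above are a standard contiguity summary: the length L such that sequences of length L or longer contain at least half of the assembled bases. A minimal sketch of that calculation follows, under the common definition; the function name and the toy lengths are illustrative assumptions and are unrelated to the HX1 assembly.

```python
def n50(lengths):
    """Return the N50 of a collection of contig (or scaffold) lengths.

    Sort lengths from longest to shortest, accumulate them, and report the
    length at which the running total first reaches half of the grand total.
    """
    if not lengths:
        raise ValueError("need at least one length")
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


# Toy lengths: total is 32, and the two longest contigs (8 + 8) reach 16.
toy_contigs = [8, 8, 4, 3, 3, 2, 2, 2]
print(n50(toy_contigs))  # 8
```

A higher N50 for the same total size means the assembly is broken into fewer, longer pieces, which is why the abstract reports it alongside the assembly size.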


September 22, 2019

Fungal community shifts underpin declining mycelial production and turnover across a Pinus sylvestris chronosequence

Fungi play critical roles in ecosystem processes such as decomposition and nutrient cycling, but have also been highlighted as significant contributors to organic matter build-up in boreal forest soils. Ectomycorrhizal (ECM) mycelial biomass and necromass dynamics have recently been highlighted as essential for regulating build-up of soil organic matter. Understanding the extent to which shifts in mycelial community or growth trait composition cause changes in mycelial production and turnover over ecological gradients would aid a mechanistic understanding of these important processes at an ecosystem scale. Here, we test the hypotheses that shifting species and mycelial trait (exploration type) composition within the mycelial community underpin changes in biomass turnover with increasing forest age. We quantified mycelial turnover and assessed fungal community composition in a chronosequence of eight, 12- to 158-year-old, managed Pinus sylvestris forests. Turnover was estimated by determining mycelial biomass (ergosterol) in a sequence of ingrowth mesh bags and applying mathematical models. Fungal communities in the bags were identified using Pacific Biosciences sequencing of fungal ITS2 amplicons. To evaluate the accuracy of this method to represent all ECM fungi, community composition in bags was followed over time and compared with communities in soil. Mycelial communities changed with stand age, but we found no evidence that there were concurrent shifts in mycelial exploration types. Forest age and turnover were significantly correlated with ECM mycelial community composition and collectively explained 39.4% of total variation. The similarity between fungal communities in mesh bags and in soil was strongly forest age dependent, with communities in mesh bags diverging from soil communities in stands older than 60 years. However, in all stands, when bag incubation time exceeded 75 days, communities became more similar to soil communities. Synthesis. Our results support the idea that shifts in fungal community composition underpin the forest age-related decrease in mycelial turnover; however, since ingrowth mesh bags exclude some mycorrhizal species in older forests, it remains a possibility that turnover estimates were not reflecting the entire community. While we found no evidence that mycelial exploration types of fungi changed systematically with forest age, we suggest that other traits that relate to biomass turnover and necromass degradation require further study, as they may explain the extent to which ectomycorrhizal fungi regulate and contribute to soil organic matter accumulation.

