April 21, 2020  |  

Genome and transcriptome sequencing of the astaxanthin-producing green microalga, Haematococcus pluvialis.

Haematococcus pluvialis is a freshwater species of Chlorophyta, family Haematococcaceae. It is well known for its capacity to synthesize high amounts of astaxanthin, which is a strong antioxidant that has been utilized in aquaculture and cosmetics. To improve astaxanthin yield and to establish genetic resources for H. pluvialis, we performed whole-genome sequencing, assembly, and annotation of this green microalga. A total of 83.1 Gb of raw reads were sequenced. After filtering the raw reads, we subsequently generated a draft assembly with a genome size of 669.0 Mb, a scaffold N50 of 288.6 kb, and predicted 18,545 genes. We also established a robust phylogenetic tree from 14 representative algae species. With additional transcriptome data, we revealed some novel potential genes that are involved in the synthesis, accumulation, and regulation of astaxanthin production. In addition, we generated an isoform-level reference transcriptome set of 18,483 transcripts with high confidence. Alternative splicing analysis demonstrated that intron retention is the most frequent mode. In summary, we report the first draft genome of H. pluvialis. These genomic resources along with transcriptomic data provide a solid foundation for the discovery of the genetic basis for theoretical and commercial astaxanthin enrichment.
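
The scaffold N50 quoted in this abstract is a standard assembly-contiguity statistic: the length of the shortest scaffold in the smallest set of longest scaffolds that together cover at least half of the assembly. A minimal sketch of that calculation follows; the function name `n50` and the toy lengths are illustrative assumptions, not data or code from the study.

```python
def n50(lengths):
    """Return the N50 of a collection of scaffold (or contig) lengths."""
    if not lengths:
        raise ValueError("need at least one scaffold length")
    # Walk the scaffolds from longest to shortest, accumulating length.
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        # The first scaffold that brings the running sum to half of the
        # assembly size defines the N50.
        if running >= half_total:
            return length


# Hypothetical toy assembly (arbitrary units), not values from the paper.
toy_lengths = [80, 70, 50, 40, 30, 20, 10]
print(n50(toy_lengths))
```

Because N50 depends only on the sorted length distribution, a few very long scaffolds can raise it sharply, so it is usually read alongside total assembly size and gene-completeness measures.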


April 21, 2020  |  

A reference genome for pea provides insight into legume genome evolution.

We report the first annotated chromosome-level reference genome assembly for pea, Gregor Mendel’s original genetic model. Phylogenetics and paleogenomics show genomic rearrangements across legumes and suggest a major role for repetitive elements in pea genome evolution. Compared to other sequenced Leguminosae genomes, the pea genome shows intense gene dynamics, most likely associated with genome size expansion when the Fabeae diverged from its sister tribes. During Pisum evolution, translocation and transposition differentially occurred across lineages. This reference sequence will accelerate our understanding of the molecular basis of agronomically important traits and support crop improvement.


April 21, 2020  |  

A chromosome-level genome assembly of Cydia pomonella provides insights into chemical ecology and insecticide resistance.

The codling moth Cydia pomonella, a major invasive pest of pome fruit, has spread around the globe in the last half century. We generated a chromosome-level scaffold assembly including the Z chromosome and a portion of the W chromosome. This assembly reveals the duplication of an olfactory receptor gene (OR3), which we demonstrate enhances the ability of C. pomonella to exploit kairomones and pheromones in locating both host plants and mates. Genome-wide association studies contrasting insecticide-resistant and susceptible strains identify hundreds of single nucleotide polymorphisms (SNPs) potentially associated with insecticide resistance, including three SNPs found in the promoter of CYP6B2. RNAi knockdown of CYP6B2 increases C. pomonella sensitivity to two insecticides, deltamethrin and azinphos methyl. The high-quality genome assembly of C. pomonella informs the genetic basis of its invasiveness, suggesting the codling moth has distinctive capabilities and adaptive potential that may explain its worldwide expansion.


April 21, 2020  |  

Metatranscriptomic evidence for classical and RuBisCO-mediated CO2 reduction to methane facilitated by direct interspecies electron transfer in a methanogenic system.

In a staged anaerobic fluidized-bed ceramic membrane bioreactor, metagenomic and metatranscriptomic analyses were performed to decipher the microbial interactions on the granular activated carbon (GAC). Metagenome bins representing the predominant microbes in the bioreactor (syntrophic propionate-oxidizing bacteria (SPOB), acetoclastic Methanothrix concilii, and exoelectrogenic Geobacter lovleyi) were successfully recovered for the reconstruction and analysis of metabolic pathways involved in the transformation of fatty acids to methane. In particular, SPOB degraded propionate into acetate, which was further converted into methane and CO2 by M. concilii via acetoclastic methanogenesis. Concurrently, G. lovleyi oxidized acetate into CO2, releasing electrons into the extracellular environment. By accepting these electrons through direct interspecies electron transfer (DIET), M. concilii was capable of performing CO2 reduction for further methane formation. Most notably, an alternative RuBisCO-mediated CO2 reduction (the reductive hexulose-phosphate (RHP) pathway) is transcriptionally active in M. concilii. This RHP pathway enables M. concilii dominance and energy gain by carbon fixation and methanogenesis, respectively, via a methyl-H4MPT intermediate, constituting the third methanogenesis route. The complete acetate reduction (2 mol methane formed per 1 mol acetate consumed), which couples acetoclastic methanogenesis with two CO2 reduction pathways, is thermodynamically favorable even under very low substrate conditions (down to the 10⁻⁵ M level). Such tight interactions via both mediated and direct interspecies electron transfer (MIET and DIET), induced by the conductive GAC, promote the overall efficiency of bioenergy processes.
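
The ratio of 2 mol methane per 1 mol acetate quoted in this abstract can be checked against textbook half-reactions. The balance below is an illustrative sketch rather than a derivation taken from the paper; it assumes that the eight electrons needed for CO2 reduction are supplied from outside the cell (for example via DIET) and writes acetate in its protonated form for simplicity.

```latex
\begin{align*}
\text{Acetoclastic step:}\quad & \mathrm{CH_3COOH \rightarrow CH_4 + CO_2} \\
\text{CO}_2\text{ reduction:}\quad & \mathrm{CO_2 + 8\,H^+ + 8\,e^- \rightarrow CH_4 + 2\,H_2O} \\
\text{Net:}\quad & \mathrm{CH_3COOH + 8\,H^+ + 8\,e^- \rightarrow 2\,CH_4 + 2\,H_2O}
\end{align*}
```

Under these assumptions both carbons of acetate end up as methane, which matches the 2:1 stoichiometry stated above.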


April 21, 2020  |  

Genome-wide mutational biases fuel transcriptional diversity in the Mycobacterium tuberculosis complex.

The Mycobacterium tuberculosis complex (MTBC) members display different host-specificities and virulence phenotypes. Here, we have performed a comprehensive RNAseq and methylome analysis of the main clades of the MTBC and discovered unique transcriptional profiles. The majority of genes differentially expressed between the clades encode proteins involved in host interaction and metabolic functions. A significant fraction of changes in gene expression can be explained by positive selection on single mutations that either create or disrupt transcriptional start sites (TSS). Furthermore, we show that clinical strains have different methyltransferases inactivated and thus different methylation patterns. Under the tested conditions, differential methylation has a minor direct role in transcriptomic differences between strains. However, disruption of a methyltransferase in one clinical strain revealed important expression differences, suggesting indirect mechanisms of expression regulation. Our study demonstrates that variation in transcriptional profiles is mainly due to TSS mutations and has likely evolved due to differences in host characteristics.


April 21, 2020  |  

Extensive intraspecific gene order and gene structural variations in upland cotton cultivars.

Multiple cotton genomes (diploid and tetraploid) have been assembled. However, genomic variations between cultivars of allotetraploid upland cotton (Gossypium hirsutum L.), the most widely planted cotton species in the world, remain unexplored. Here, we use single-molecule long read and Hi-C sequencing technologies to assemble genomes of the two upland cotton cultivars TM-1 and zhongmiansuo24 (ZM24). Comparisons among TM-1 and ZM24 assemblies and the genomes of the diploid ancestors reveal a large number of genetic variations. Among them, the three longest structural variations are located on chromosome A08 of the tetraploid upland cotton and account for ~30% of the total length of this chromosome. Haplotype analyses of the mapping population derived from these two cultivars and the germplasm panel show suppressed recombination rates in this region. This study provides additional genomic resources for the community, and the identified genetic variations, especially the reduced meiotic recombination on chromosome A08, will help future breeding.


April 21, 2020  |  

Urinary tract colonization is enhanced by a plasmid that regulates uropathogenic Acinetobacter baumannii chromosomal genes.

Multidrug resistant (MDR) Acinetobacter baumannii poses a growing threat to global health. Research on Acinetobacter pathogenesis has primarily focused on pneumonia and bloodstream infections, even though one in five A. baumannii strains are isolated from urinary sites. In this study, we highlight the role of A. baumannii as a uropathogen. We develop the first A. baumannii catheter-associated urinary tract infection (CAUTI) murine model using UPAB1, a recent MDR urinary isolate. UPAB1 carries the plasmid pAB5, a member of the family of large conjugative plasmids that represses the type VI secretion system (T6SS) in multiple Acinetobacter strains. pAB5 confers niche specificity, as its carriage improves UPAB1 survival in a CAUTI model and decreases virulence in a pneumonia model. Comparative proteomic and transcriptomic analyses show that pAB5 regulates the expression of multiple chromosomally-encoded virulence factors besides T6SS. Our results demonstrate that plasmids can impact bacterial infections by controlling the expression of chromosomal genes.


April 21, 2020  |  

Getting the Entire Message: Progress in Isoform Sequencing

The advent of second-generation sequencing and its application to RNA sequencing has revolutionized the field of genomics by allowing the quantification of expression of entire genes as well as single TSS, exons and splice sites, RNA-editing sites as well as polyA-sites. However, due to the sequencing of fragments of cDNAs these methods have not given a reliable picture of complete RNA isoforms. Third-generation sequencing has filled this gap and allows end-to-end sequencing of entire RNA/cDNA molecules. This approach to transcriptomics has been a ‘niche’ technology for a couple of years but now is becoming mainstream with many different applications. Here, we review the background and progress made to date in this rapidly growing field. We start by reviewing the progressive realization that alternative splicing is omnipresent. We then focus on long-non-coding RNA isoforms and the distinct combination patterns of exons in non-coding and coding genes. We consider the implications of the recent technologies of direct RNA sequencing and single-cell isoform RNA sequencing. Finally, we discuss the parameters that define the success of long-read RNA sequencing experiments and strategies commonly used to make the most of such data.


April 21, 2020  |  

Complete genome sequence analysis of the thermoacidophilic verrucomicrobial methanotroph “Candidatus Methylacidiphilum kamchatkense” strain Kam1 and comparison with its closest relatives.

The candidate genus “Methylacidiphilum” comprises thermoacidophilic aerobic methane oxidizers belonging to the Verrucomicrobia phylum. These are the first described non-proteobacterial aerobic methane oxidizers. The genes pmoCAB, encoding the particulate methane monooxygenase, do not originate from horizontal gene transfer from proteobacteria. Instead, “Ca. Methylacidiphilum” and the sister genus “Ca. Methylacidimicrobium” represent a novel and hitherto understudied evolutionary lineage of aerobic methane oxidizers. Obtaining and comparing the full genome sequences is an important step towards understanding the evolution and physiology of this novel group of organisms. Here we present the closed genome of “Ca. Methylacidiphilum kamchatkense” strain Kam1 and a comparison with the genomes of its two closest relatives “Ca. Methylacidiphilum fumariolicum” strain SolV and “Ca. Methylacidiphilum infernorum” strain V4. The genome consists of a single 2.2 Mbp chromosome with 2119 predicted protein coding sequences. Genome analysis showed that the majority of the genes connected with metabolic traits described for one member of “Ca. Methylacidiphilum” are conserved between all three genomes. All three strains encode class I CRISPR-cas systems. The average nucleotide identity between “Ca. M. kamchatkense” strain Kam1 and strains SolV and V4 is ≤95%, showing that they should be regarded as separate species. Whole genome comparison revealed a high degree of synteny between the genomes of strains Kam1 and SolV. In contrast, comparison of the genomes of strains Kam1 and V4 revealed a number of rearrangements. There are large differences in the numbers of transposable elements found in the genomes of the three strains, with 12, 37 and 80 transposable elements in the genomes of strains Kam1, V4 and SolV, respectively. Genomic rearrangements and the activity of transposable elements explain much of the genomic differences between strains.
For example, a type 1h uptake hydrogenase is conserved between strains Kam1 and SolV but seems to have been lost from strain V4 due to genomic rearrangements. Comparing three closed genomes of “Ca. Methylacidiphilum” spp. has given new insights into the evolution of these organisms and revealed large differences in numbers of transposable elements between strains; the activity of these elements explains much of the genomic differences between strains.


April 21, 2020  |  

Prediction of Host-Specific Genes by Pan-Genome Analyses of the Korean Ralstonia solanacearum Species Complex.

The soil-borne pathogenic Ralstonia solanacearum species complex (RSSC) is a group of plant pathogens that is economically destructive worldwide and has a broad host range, including various Solanaceae plants, banana, ginger, sesame, and clove. Previously, Korean RSSC strains isolated from samples of potato bacterial wilt were grouped into four pathotypes based on virulence tests against potato, tomato, eggplant, and pepper. In this study, we sequenced the genomes of 25 Korean RSSC strains selected based on these pathotypes. The newly sequenced genomes were analyzed to determine the phylogenetic relationships between the strains with average nucleotide identity values, and structurally compared via multiple genome alignment using Mauve software. To identify candidate genes responsible for the host specificity of the pathotypes, functional genome comparisons were conducted by analyzing pan-genome orthologous groups (POGs) and type III secretion system effectors (T3es). POG analyses revealed that a total of 128 genes were shared only in tomato-non-pathogenic strains, 8 genes in tomato-pathogenic strains, 5 genes in eggplant-non-pathogenic strains, 7 genes in eggplant-pathogenic strains, 1 gene in pepper-non-pathogenic strains, and 34 genes in pepper-pathogenic strains. When we analyzed T3es, three host-specific effectors were predicted: RipS3 (SKWP3) and RipH3 (HLK3) were found only in tomato-pathogenic strains, and RipAC (PopC) was found only in eggplant-pathogenic strains. Overall, we identified, in silico, host-specific genes and effectors that may be responsible for virulence functions in RSSC. The expected characters of those genes suggest that the host range of RSSC is determined by the comprehensive actions of various virulence factors, including effectors, secretion systems, and metabolic enzymes.


April 21, 2020  |  

Closing the Yield Gap for Cannabis: A Meta-Analysis of Factors Determining Cannabis Yield.

Until recently, the commercial production of Cannabis sativa was restricted to varieties that yielded high-quality fiber while producing low levels of the psychoactive cannabinoid tetrahydrocannabinol (THC). In the last few years, a number of jurisdictions have legalized the production of medical and/or recreational cannabis with higher levels of THC, and other jurisdictions seem poised to follow suit. Consequently, demand for industrial-scale production of high yield cannabis with consistent cannabinoid profiles is expected to increase. In this paper we highlight that currently, projected annual production of cannabis is based largely on facility size, not yield per square meter. This meta-analysis of cannabis yields reported in scientific literature aimed to identify the main factors contributing to cannabis yield per plant, per square meter, and per W of lighting electricity. In line with previous research we found that variety, plant density, light intensity and fertilization influence cannabis yield and cannabinoid content; we also identified pot size, light type and duration of the flowering period as predictors of yield and THC accumulation. We provide insight into the critical role of light intensity, quality, and photoperiod in determining cannabis yields, with particular focus on the potential for light-emitting diodes (LEDs) to improve growth and reduce energy requirements. We propose that the vast amount of genomics data currently available for cannabis can be used to better understand the effect of genotype on yield. Finally, we describe diversification that is likely to emerge in cannabis growing systems and examine the potential role of plant-growth promoting rhizobacteria (PGPR) for growth promotion, regulation of cannabinoid biosynthesis, and biocontrol.


April 21, 2020  |  

Long-Read Sequencing Emerging in Medical Genetics

The wide implementation of next-generation sequencing (NGS) technologies has revolutionized the field of medical genetics. However, the short read lengths of currently used sequencing approaches pose a limitation for identification of structural variants, sequencing repetitive regions, phasing alleles and distinguishing highly homologous genomic regions. These limitations may significantly contribute to the diagnostic gap in patients with genetic disorders who have undergone standard NGS, like whole exome or even genome sequencing. Now, the emerging long-read sequencing (LRS) technologies may offer improvements in the characterization of genetic variation and regions that are difficult to assess with the currently prevailing NGS approaches. LRS has so far mainly been used to investigate genetic disorders with previously known or strongly suspected disease loci. While these targeted approaches already show the potential of LRS, it remains to be seen whether LRS technologies can soon enable true whole genome sequencing routinely. Ultimately, this could allow the de novo assembly of individual whole genomes used as a generic test for genetic disorders. In this article, we summarize the current LRS-based research on human genetic disorders and discuss the potential of these technologies to facilitate the next major advancements in medical genetics.


April 21, 2020  |  

A new reference genome for Sorghum bicolor reveals high levels of sequence similarity between sweet and grain genotypes: implications for the genetics of sugar metabolism.

The process of crop domestication often consists of two stages: initial domestication, where the wild species is first cultivated by humans, followed by diversification, when the domesticated species are subsequently adapted to more environments and specialized uses. Selective pressure to increase sugar accumulation in certain varieties of the cereal crop Sorghum bicolor is an excellent example of the latter; this has resulted in pronounced phenotypic divergence between sweet and grain-type sorghums, but the genetic mechanisms underlying these differences remain poorly understood. Here we present a new reference genome based on an archetypal sweet sorghum line and compare it to the current grain sorghum reference, revealing a high rate of nonsynonymous and potential loss of function mutations, but few changes in gene content or overall genome structure. We also use comparative transcriptomics to highlight changes in gene expression correlated with high stalk sugar content and show that changes in the activity and possibly localization of transporters, along with the timing of sugar metabolism, play a critical role in the sweet phenotype. The high level of genomic similarity between sweet and grain sorghum reflects their historical relatedness, rather than their current phenotypic differences, but we find key changes in signaling molecules and transcriptional regulators that represent new candidates for understanding and improving sugar metabolism in this important crop.


April 21, 2020  |  

Reviving the Transcriptome Studies: An Insight into the Emergence of Single-molecule Transcriptome Sequencing

Advances in transcriptomics have provided an exceptional opportunity to study functional implications of the genetic variability. Technologies such as RNA-Seq have emerged as state-of-the-art techniques for transcriptome analysis that take advantage of high-throughput next-generation sequencing. However, similar to their predecessors, these approaches continue to impose major challenges on full-length transcript structure identification, primarily due to inherent limitations of read length. With the development of single-molecule sequencing (SMS) from PacBio, a growing number of studies on the transcriptome of different organisms have been reported. SMS has emerged as advantageous for comprehensive genome annotation including identification of novel genes/isoforms, long non-coding RNAs and fusion transcripts. This approach can be used across a broad spectrum of species to better interpret the coding information of the genome, and facilitate the biological function study. We provide an overview of SMS platform and its diverse applications in various biological studies, and our perspective on the challenges associated with the transcriptome studies.


April 21, 2020  |  

Improved annotation of the domestic pig genome through integration of Iso-Seq and RNA-seq data.

Our understanding of the pig transcriptome is limited. RNA transcript diversity among nine tissues was assessed using poly(A) selected single-molecule long-read isoform sequencing (Iso-seq) and Illumina RNA sequencing (RNA-seq) from a single White cross-bred pig. Across tissues, a total of 67,746 unique transcripts were observed, including 60.5% predicted protein-coding, 36.2% long non-coding RNA and 3.3% nonsense-mediated decay transcripts. On average, 90% of the splice junctions were supported by RNA-seq within tissue. A large proportion (80%) represented novel transcripts, mostly produced by known protein-coding genes (70%), while 17% corresponded to novel genes. On average, four transcripts per known gene (tpg) were identified; an increase over current EBI (1.9 tpg) and NCBI (2.9 tpg) annotations and closer to the number reported in human genome (4.2 tpg). Our new pig genome annotation extended more than 6000 known gene borders (5′ end extension, 3′ end extension, or both) compared to EBI or NCBI annotations. We validated a large proportion of these extensions by independent pig poly(A) selected 3′-RNA-seq data, or human FANTOM5 Cap Analysis of Gene Expression data. Further, we detected 10,465 novel genes (81% non-coding) not reported in current pig genome annotations. More than 80% of these novel genes had transcripts detected in >1 tissue. In addition, more than 80% of novel intergenic genes with at least one transcript detected in liver tissue had H3K4me3 or H3K36me3 peaks mapping to their promoter and gene body, respectively, in independent liver chromatin immunoprecipitation data. These validated results show significant improvement over current pig genome annotations.

