Menu
September 22, 2019

A manually annotated Actinidia chinensis var. chinensis (kiwifruit) genome highlights the challenges associated with draft genomes and gene prediction in plants.

Most published genome sequences are drafts, and most are dominated by computational gene prediction. Draft genomes typically incorporate considerable sequence data that are not assigned to chromosomes, and predicted genes without quality confidence measures. The current Actinidia chinensis (kiwifruit) ‘Hongyang’ draft genome has 164 Mb of sequences unassigned to pseudo-chromosomes, and omissions have been identified in the gene models.A second genome of an A. chinensis (genotype Red5) was fully sequenced. This new sequence resulted in a 554.0 Mb assembly with all but 6 Mb assigned to pseudo-chromosomes. Pseudo-chromosomal comparisons showed a considerable number of translocation events have occurred following a whole genome duplication (WGD) event some consistent with centromeric Robertsonian-like translocations. RNA sequencing data from 12 tissues and ab initio analysis informed a genome-wide manual annotation, using the WebApollo tool. In total, 33,044 gene loci represented by 33,123 isoforms were identified, named and tagged for quality of evidential support. Of these 3114 (9.4%) were identical to a protein within ‘Hongyang’ The Kiwifruit Information Resource (KIR v2). Some proportion of the differences will be varietal polymorphisms. However, as most computationally predicted Red5 models required manual re-annotation this proportion is expected to be small. The quality of the new gene models was tested by fully sequencing 550 cloned ‘Hort16A’ cDNAs and comparing with the predicted protein models for Red5 and both the original ‘Hongyang’ assembly and the revised annotation from KIR v2. Only 48.9% and 63.5% of the cDNAs had a match with 90% identity or better to the original and revised ‘Hongyang’ annotation, respectively, compared with 90.9% to the Red5 models.Our study highlights the need to take a cautious approach to draft genomes and computationally predicted genes. Our use of the manual annotation tool WebApollo facilitated manual checking and correction of gene models enabling improvement of computational prediction. This utility was especially relevant for certain types of gene families such as the EXPANSIN like genes. Finally, this high quality gene set will supply the kiwifruit and general plant community with a new tool for genomics and other comparative analysis.


September 22, 2019

Ginseng Genome Database: an open-access platform for genomics of Panax ginseng.

The ginseng (Panax ginseng C.A. Meyer) is a perennial herbaceous plant that has been used in traditional oriental medicine for thousands of years. Ginsenosides, which have significant pharmacological effects on human health, are the foremost bioactive constituents in this plant. Having realized the importance of this plant to humans, an integrated omics resource becomes indispensable to facilitate genomic research, molecular breeding and pharmacological study of this herb.The first draft genome sequences of P. ginseng cultivar “Chunpoong” were reported recently. Here, using the draft genome, transcriptome, and functional annotation datasets of P. ginseng, we have constructed the Ginseng Genome Database http://ginsengdb.snu.ac.kr /, the first open-access platform to provide comprehensive genomic resources of P. ginseng. The current version of this database provides the most up-to-date draft genome sequence (of approximately 3000 Mbp of scaffold sequences) along with the structural and functional annotations for 59,352 genes and digital expression of genes based on transcriptome data from different tissues, growth stages and treatments. In addition, tools for visualization and the genomic data from various analyses are provided. All data in the database were manually curated and integrated within a user-friendly query page.This database provides valuable resources for a range of research fields related to P. ginseng and other species belonging to the Apiales order as well as for plant research communities in general. Ginseng genome database can be accessed at http://ginsengdb.snu.ac.kr /.


September 22, 2019

The genome of Ectocarpus subulatus highlights unique mechanisms for stress tolerance in brown algae

Brown algae are multicellular photosynthetic organisms belonging to the stramenopile lineage. They are successful colonizers of marine rocky shores world-wide. The genus Ectocarpus, and especially strain Ec32, has been established as a genetic and genomic model for brown algae. A related species, Ectocarpus subulatus Kuetzing, is characterized by its high tolerance of abiotic stress. Here we present the genome and metabolic network of a haploid male strain of E. subulatus, establishing it as a comparative model to study the genomic bases of stress tolerance in Ectocarpus. Our analyses indicate that E. subulatus has separated from Ectocarpus sp. Ec32 via allopatric speciation. Since this event, its genome has been shaped by the activity of viruses and large retrotransposons, which in the case of chlorophyll-binding proteins, may be related to the expansion of this gene family. We have identified a number of further genes that we suspect to contribute to stress tolerance in E. subulatus, including an expanded family of heat shock proteins, the reduction of genes involved in the production of halogenated defense compounds, and the presence of fewer cell wall polysaccharide-modifying enzymes. However, 96% of genes that differed between the two examined Ectocarpus species, as well as 90% of genes under positive selection, were found to be lineage-specific and encode proteins of unknown function. This underlines the uniqueness of brown algae with respect to their stress tolerance mechanisms as well as the significance of establishing E. subulatus as a comparative model for future functional studies.


September 22, 2019

Genome-wide identification of simple sequence repeats and development of polymorphic SSR markers for genetic studies in tea plant (Camellia sinensis)

The tea plant (Camellia sinensis (L.) O. Kuntze) is one of the most popular non-alcoholic beverage crops worldwide. The availability of complete genome sequences for the Camellia sinensis var. ‘Shuchazao’ has provided the opportunity to identify all types of simple sequence repeat (SSR) markers by genome-wide scan. In this study, a total of 667,980 SSRs were identified in the ~?3.08 Gb genome, with an overall density of 216.88 SSRs/Mb. Dinucleotide repeats were predominant among microsatellites (72.25%), followed by trinucleotide repeats (15.35%), while the remaining SSRs accounted for less than 13%. The motif AG/CT (49.96%) and AT/TA (40.14%) were the most and the second most abundant among all identified SSR motifs, respectively; meanwhile, AAT/ATT (41.29%) and AAAT/ATTT (67.47%) were the most common among trinucleotides and tetranucleotides, respectively. A total of 300 primer pairs were designed to screen six tea cultivars for polymorphisms of SSR markers using the five selected repeat types of microsatellite sequences. The resulting 96 SSR markers that yielded polymorphic and unambiguous bands were further deployed on 47 tea cultivars for genetic diversity assessment, demonstrating high polymorphism of these SSR markers. Remarkably, the dendrogram revealed that the phylogenetic relationships among these tea cultivars are highly consistent with their genetic backgrounds or places of origin. The identified genome-wide SSRs and newly developed SSR markers will provide a powerful means for genetic researches in tea plant, including genetic diversity and evolutionary origin analysis, fingerprinting, QTL mapping, and marker-assisted selection for breeding.


September 22, 2019

The complete chloroplast genome of Chrysanthemum boreale (Asteraceae)

Chrysanthemum boreale is a perennial plant in the Asteraceae family that is native to eastern Asia and has both ornamental and herbal uses. Here, we determined the complete chloroplast genome sequence for C. boreale using long-read sequencing. The chloroplast genome was 151,012?bp and consisted of a large single copy (LSC) region (82,817?bp), a small single copy (SSC) region (18,281?bp) and two inverted repeats (IRs) (24,957?bp). It was predicted to contain 131 genes, including 87 protein-coding genes, eight rRNAs and 46 tRNAs. Phylogenetic analysis of chloroplast genomes clustered C. boreale with other Chrysanthemum and Asteraceae species.


September 22, 2019

DNA N6-adenine methylation in Arabidopsis thaliana.

DNA methylation on N6-adenine (6mA) has recently been found to be a potentially epigenetic mark in several unicellular and multicellular eukaryotes. However, its distribution patterns and potential functions in land plants, which are primary producers for most ecosystems, remain largely unknown. Here we report global profiling of 6mA sites at single-nucleotide resolution in the genome of Arabidopsis thaliana at different developmental stages using single-molecule real-time sequencing. 6mA sites are widely distributed across the Arabidopsis genome and enriched over the pericentromeric heterochromatin regions. 6mA occurs more frequently in gene bodies than intergenic regions. Analysis of 6mA methylomes and RNA sequencing data demonstrates that 6mA frequency positively correlates with the gene expression level and the transition from vegetative to reproductive growth in Arabidopsis. Our results uncover 6mA as a DNA mark associated with actively expressed genes in Arabidopsis, suggesting that 6mA serves as a hitherto unknown epigenetic mark in land plants. Copyright © 2018 Elsevier Inc. All rights reserved.


September 22, 2019

Draft genome sequence of Camellia sinensis var. sinensis provides insights into the evolution of the tea genome and tea quality.

Tea, one of the world’s most important beverage crops, provides numerous secondary metabolites that account for its rich taste and health benefits. Here we present a high-quality sequence of the genome of tea, Camellia sinensis var. sinensis (CSS), using both Illumina and PacBio sequencing technologies. At least 64% of the 3.1-Gb genome assembly consists of repetitive sequences, and the rest yields 33,932 high-confidence predictions of encoded proteins. Divergence between two major lineages, CSS and Camellia sinensis var. assamica (CSA), is calculated to ~0.38 to 1.54 million years ago (Mya). Analysis of genic collinearity reveals that the tea genome is the product of two rounds of whole-genome duplications (WGDs) that occurred ~30 to 40 and ~90 to 100 Mya. We provide evidence that these WGD events, and subsequent paralogous duplications, had major impacts on the copy numbers of secondary metabolite genes, particularly genes critical to producing three key quality compounds: catechins, theanine, and caffeine. Analyses of transcriptome and phytochemistry data show that amplification and transcriptional divergence of genes encoding a large acyltransferase family and leucoanthocyanidin reductases are associated with the characteristic young leaf accumulation of monomeric galloylated catechins in tea, while functional divergence of a single member of the glutamine synthetase gene family yielded theanine synthetase. This genome sequence will facilitate understanding of tea genome evolution and tea metabolite pathways, and will promote germplasm utilization for breeding improved tea varieties. Copyright © 2018 the Author(s). Published by PNAS.


September 22, 2019

De novo genome assembly of the red silk cotton tree (Bombax ceiba).

Bombax ceiba L. (the red silk cotton tree) is a large deciduous tree that is distributed in tropical and sub-tropical Asia as well as northern Australia. It has great economic and ecological importance, with several applications in industry and traditional medicine in many Asian countries. To facilitate further utilization of this plant resource, we present here the draft genome sequence for B. ceiba.We assembled a relatively intact genome of B. ceiba by using PacBio single-molecule sequencing and BioNano optical mapping technologies. The final draft genome is approximately 895 Mb long, with contig and scaffold N50 sizes of 1.0 Mb and 2.06 Mb, respectively.The high-quality draft genome assembly of B. ceiba will be a valuable resource enabling further genetic improvement and more effective use of this tree species.


September 22, 2019

Phenotypic diversification by enhanced genome restructuring after induction of multiple DNA double-strand breaks.

DNA double-strand break (DSB)-mediated genome rearrangements are assumed to provide diverse raw genetic materials enabling accelerated adaptive evolution; however, it remains unclear about the consequences of massive simultaneous DSB formation in cells and their resulting phenotypic impact. Here, we establish an artificial genome-restructuring technology by conditionally introducing multiple genomic DSBs in vivo using a temperature-dependent endonuclease TaqI. Application in yeast and Arabidopsis thaliana generates strains with phenotypes, including improved ethanol production from xylose at higher temperature and increased plant biomass, that are stably inherited to offspring after multiple passages. High-throughput genome resequencing revealed that these strains harbor diverse rearrangements, including copy number variations, translocations in retrotransposons, and direct end-joinings at TaqI-cleavage sites. Furthermore, large-scale rearrangements occur frequently in diploid yeasts (28.1%) and tetraploid plants (46.3%), whereas haploid yeasts and diploid plants undergo minimal rearrangement. This genome-restructuring system (TAQing system) will enable rapid genome breeding and aid genome-evolution studies.


September 22, 2019

Genomic analyses of unique carbohydrate and phytohormone metabolism in the macroalga Gracilariopsis lemaneiformis (Rhodophyta).

Red algae are economically valuable for food and in industry. However, their genomic information is limited, and the genomic data of only a few species of red algae have been sequenced and deposited recently. In this study, we annotated a draft genome of the macroalga Gracilariopsis lemaneiformis (Gracilariales, Rhodophyta).The entire 88.98 Mb genome of Gp. lemaneiformis 981 was generated from 13,825 scaffolds (=500 bp) with an N50 length of 30,590 bp, accounting for approximately 91% of this algal genome. A total of 38.73 Mb of scaffold sequences were repetitive, and 9281 protein-coding genes were predicted. A phylogenomic analysis of 20 genomes revealed the relationship among the Chromalveolata, Rhodophyta, Chlorophyta and higher plants. Homology analysis indicated phylogenetic proximity between Gp. lemaneiformis and Chondrus crispus. The number of enzymes related to the metabolism of carbohydrates, including agar, glycoside hydrolases, glycosyltransferases, was abundant. In addition, signaling pathways associated with phytohormones such as auxin, salicylic acid and jasmonates are reported for the first time for this alga.We sequenced and analyzed a draft genome of the red alga Gp. lemaneiformis, and revealed its carbohydrate metabolism and phytohormone signaling characteristics. This work will be helpful in research on the functional and comparative genomics of the order Gracilariales and will enrich the genomic information on marine algae.


September 22, 2019

Genome-wide analysis of the NAC transcription factor family and their expression during the development and ripening of the Fragaria × ananassa fruits.

NAC proteins are a family of transcription factors which have a variety of important regulatory roles in plants. They present a very well conserved group of NAC subdomains in the N-terminal region and a highly variable domain at the C-terminus. Currently, knowledge concerning NAC family in the strawberry plant remains very limited. In this work, we analyzed the NAC family of Fragaria vesca, and a total of 112 NAC proteins were identified after we curated the annotations from the version 4.0.a1 genome. They were placed into the ligation groups (pseudo-chromosomes) and described its physicochemical and genetic features. A microarray transcriptomic analysis showed six of them expressed during the development and ripening of the Fragaria x ananassa fruit. Their expression patterns were studied in fruit (receptacle and achenes) in different stages of development and in vegetative tissues. Also, the expression level under different hormonal treatments (auxins, ABA) and drought stress was investigated. In addition, they were clustered with other NAC transcription factor with known function related to growth and development, senescence, fruit ripening, stress response, and secondary cell wall and vascular development. Our results indicate that these six strawberry NAC proteins could play different important regulatory roles in the process of development and ripening of the fruit, providing the basis for further functional studies and the selection for NAC candidates suitable for biotechnological applications.


September 22, 2019

Inpactor, integrated and parallel analyzer and classifier of LTR retrotransposons and its application for pineapple LTR retrotransposons diversity and dynamics.

One particular class of Transposable Elements (TEs), called Long Terminal Repeats (LTRs), retrotransposons, comprises the most abundant mobile elements in plant genomes. Their copy number can vary from several hundreds to up to a few million copies per genome, deeply affecting genome organization and function. The detailed classification of LTR retrotransposons is an essential step to precisely understand their effect at the genome level, but remains challenging in large-sized genomes, requiring the use of optimized bioinformatics tools that can take advantage of supercomputers. Here, we propose a new tool: Inpactor, a parallel and scalable pipeline designed to classify LTR retrotransposons, to identify autonomous and non-autonomous elements, to perform RT-based phylogenetic trees and to analyze their insertion times using High Performance Computing (HPC) techniques. Inpactor was tested on the classification and annotation of LTR retrotransposons in pineapple, a recently-sequenced genome. The pineapple genome assembly comprises 44% of transposable elements, of which 23% were classified as LTR retrotransposons. Exceptionally, 16.4% of the pineapple genome assembly corresponded to only one lineage of the Gypsy superfamily: Del, suggesting that this particular lineage has undergone a significant increase in its copy numbers. As demonstrated for the pineapple genome, Inpactor provides comprehensive data of LTR retrotransposons’ classification and dynamics, allowing a fine understanding of their contribution to genome structure and evolution. Inpactor is available at https://github.com/simonorozcoarias/Inpactor.


September 22, 2019

A transposable element annotation pipeline and expression analysis reveal potentially active elements in the microalga Tisochrysis lutea.

Transposable elements (TEs) are mobile DNA sequences known as drivers of genome evolution. Their impacts have been widely studied in animals, plants and insects, but little is known about them in microalgae. In a previous study, we compared the genetic polymorphisms between strains of the haptophyte microalga Tisochrysis lutea and suggested the involvement of active autonomous TEs in their genome evolution.To identify potentially autonomous TEs, we designed a pipeline named PiRATE (Pipeline to Retrieve and Annotate Transposable Elements, download: https://doi.org/10.17882/51795 ), and conducted an accurate TE annotation on a new genome assembly of T. lutea. PiRATE is composed of detection, classification and annotation steps. Its detection step combines multiple, existing analysis packages representing all major approaches for TE detection and its classification step was optimized for microalgal genomes. The efficiency of the detection and classification steps was evaluated with data on the model species Arabidopsis thaliana. PiRATE detected 81% of the TE families of A. thaliana and correctly classified 75% of them. We applied PiRATE to T. lutea genomic data and established that its genome contains 15.89% Class I and 4.95% Class II TEs. In these, 3.79 and 17.05% correspond to potentially autonomous and non-autonomous TEs, respectively. Annotation data was combined with transcriptomic and proteomic data to identify potentially active autonomous TEs. We identified 17 expressed TE families and, among these, a TIR/Mariner and a TIR/hAT family were able to synthesize their transposase. Both these TE families were among the three highest expressed genes in a previous transcriptomic study and are composed of highly similar copies throughout the genome of T. lutea. This sum of evidence reveals that both these TE families could be capable of transposing or triggering the transposition of potential related MITE elements.This manuscript provides an example of a de novo transposable element annotation of a non-model organism characterized by a fragmented genome assembly and belonging to a poorly studied phylum at genomic level. Integration of multi-omics data enabled the discovery of potential mobile TEs and opens the way for new discoveries on the role of these repeated elements in genomic evolution of microalgae.


September 22, 2019

Nucleotide-binding resistance gene signatures in sugar beet, insights from a new reference genome.

Nucleotide-binding (NB-ARC), leucine-rich-repeat genes (NLRs) account for 60.8% of resistance (R) genes molecularly characterized from plants. NLRs exist as large gene families prone to tandem duplication and transposition, with high sequence diversity among crops and their wild relatives. This diversity can be a source of new disease resistance, but difficulty in distinguishing specific sequences from homologous gene family members hinders characterization of resistance for improving crop varieties. Current genome sequencing and assembly technologies, especially those using long-read sequencing, are improving resolution of repeat-rich genomic regions and clarifying locations of duplicated genes, such as NLRs. Using the conserved NB-ARC domain as a model, 231 tentative NB-ARC loci were identified in a highly contiguous genome assembly of sugar beet, revealing diverged and truncated NB-ARC signatures as well as full-length sequences. The NB-ARC-associated proteins contained NLR resistance gene domains, including TIR, CC, and LRR, as well as other integrated domains. Phylogenetic relationships of partial and complete domains were determined, and patterns of physical clustering in the genome were evaluated. Comparison of sugar beet NB-ARC domains to validated R genes from monocots and eudicots suggested extensive B. vulgaris-specific subfamily expansions. The NLR landscape in the rhizomania resistance conferring Rz region of Chromosome 3 was characterized, identifying 26 NLR-like sequences spanning 20 MB. This work presents the first detailed view of NLR family composition in a member of the Caryophyllales, builds a foundation for additional disease resistance work in B. vulgaris, and demonstrates an additional nucleic-acid-based method for NLR prediction in non-model plant species. This article is protected by copyright. All rights reserved.This article is protected by copyright. All rights reserved.


September 22, 2019

Size and content of the sex-determining region of the Y chromosome in dioecious Mercurialis annua, a plant with homomorphic sex chromosomes.

Dioecious plants vary in whether their sex chromosomes are heteromorphic or homomorphic, but even homomorphic sex chromosomes may show divergence between homologues in the non-recombining, sex-determining region (SDR). Very little is known about the SDR of these species, which might represent particularly early stages of sex-chromosome evolution. Here, we assess the size and content of the SDR of the diploid dioecious herb Mercurialis annua, a species with homomorphic sex chromosomes and mild Y-chromosome degeneration. We used RNA sequencing (RNAseq) to identify new Y-linked markers for M. annua. Twelve of 24 transcripts showing male-specific expression in a previous experiment could be amplified by polymerase chain reaction (PCR) only from males, and are thus likely to be Y-linked. Analysis of genome-capture data from multiple populations of M. annua pointed to an additional six male-limited (and thus Y-linked) sequences. We used these markers to identify and sequence 17 sex-linked bacterial artificial chromosomes (BACs), which form 11 groups of non-overlapping sequences, covering a total sequence length of about 1.5 Mb. Content analysis of this region suggests that it is enriched for repeats, has low gene density, and contains few candidate sex-determining genes. The BACs map to a subset of the sex-linked region of the genetic map, which we estimate to be at least 14.5 Mb. This is substantially larger than estimates for other dioecious plants with homomorphic sex chromosomes, both in absolute terms and relative to their genome sizes. Our data provide a rare, high-resolution view of the homomorphic Y chromosome of a dioecious plant.


Talk with an expert

If you have a question, need to check the status of an order, or are interested in purchasing an instrument, we're here to help.