April 21, 2020  |  

Tandem repeats lead to sequence assembly errors and impose multi-level challenges for genome and protein databases.

The widespread occurrence of repetitive stretches of DNA in genomes of organisms across the tree of life imposes fundamental challenges for sequencing, genome assembly, and automated annotation of genes and proteins. This multi-level problem can lead to errors in genome and protein databases that are often not recognized or acknowledged. As a consequence, end users working with sequences with repetitive regions are faced with ‘ready-to-use’ deposited data whose trustworthiness is difficult to determine, let alone to quantify. Here, we provide a review of the problems associated with tandem repeat sequences that originate from different stages during the sequencing-assembly-annotation-deposition workflow, and that may proliferate in public database repositories affecting all downstream analyses. As a case study, we provide examples of the Atlantic cod genome, whose sequencing and assembly were hindered by a particularly high prevalence of tandem repeats. We complement this case study with examples from other species, where mis-annotations and sequencing errors have propagated into protein databases. With this review, we aim to raise the awareness level within the community of database users, and alert scientists working in the underlying workflow of database creation that the data they omit or improperly assemble may well contain important biological information valuable to others. © The Author(s) 2019. Published by Oxford University Press on behalf of Nucleic Acids Research.


April 21, 2020  |  

Chlorella vulgaris genome assembly and annotation reveals the molecular basis for metabolic acclimation to high light conditions.

Chlorella vulgaris is a fast-growing fresh-water microalga cultivated at the industrial scale for applications ranging from food to biofuel production. To advance our understanding of its biology and to establish genetics tools for biotechnological manipulation, we sequenced the nuclear and organelle genomes of Chlorella vulgaris 211/11P by combining next generation sequencing and optical mapping of isolated DNA molecules. This hybrid approach allowed us to assemble the nuclear genome in 14 pseudo-molecules with an N50 of 2.8 Mb and 98.9% of scaffolded genome. The integration of RNA-seq data obtained at two different irradiances of growth (high light, HL, versus low light, LL) enabled the identification of 10,724 nuclear genes, coding for 11,082 transcripts. Moreover, 121 and 48 genes were found in the chloroplast and mitochondrial genomes, respectively. Functional annotation and expression analysis of nuclear, chloroplast and mitochondrial genome sequences revealed peculiar features of Chlorella vulgaris. Evidence of horizontal gene transfers from chloroplast to mitochondrial genome was observed. Furthermore, comparative transcriptomic analyses of LL vs HL provide insights into the molecular basis for metabolic rearrangement in HL vs. LL conditions leading to enhanced de novo fatty acid biosynthesis and triacylglycerol accumulation. The occurrence of a cytosolic fatty acid biosynthetic pathway can be predicted and its upregulation upon HL exposure is observed, consistent with increased lipid amount under HL. These data provide a rich genetic resource for future genome editing studies, and potential targets for biotechnological manipulation of Chlorella vulgaris or other microalgae species to improve biomass and lipid productivity. This article is protected by copyright. All rights reserved.


April 21, 2020  |  

A draft nuclear-genome assembly of the acoel flatworm Praesagittifera naikaiensis.

Acoels are primitive bilaterians with very simple soft bodies, in which many organs, including the gut, are not developed. They provide platforms for studying molecular and developmental mechanisms involved in the formation of the basic bilaterian body plan, whole-body regeneration, and symbiosis with photosynthetic microalgae. Because genomic information is essential for future research on acoel biology, we sequenced and assembled the nuclear genome of an acoel, Praesagittifera naikaiensis. To avoid sequence contamination derived from symbiotic microalgae, DNA was extracted from embryos that were free of algae. More than 290x sequencing coverage was achieved using a combination of Illumina (paired-end and mate-pair libraries) and PacBio sequencing. RNA sequencing and Iso-Seq data from embryos, larvae, and adults were also obtained. First, a preliminary ~17-kilobase pair (kb) mitochondrial genome was assembled, which was deleted from the nuclear sequence assembly. As a result, the draft nuclear genome assembly was ~656 Mb in length, with a scaffold N50 of 117 kb and a contig N50 of 57 kb. Although ~70% of the assembled sequences were likely composed of repetitive sequences that include DNA transposons and retrotransposons, the draft genome was estimated to contain 22,143 protein-coding genes, ~99% of which were substantiated by corresponding transcripts. We could not find horizontally transferred microalgal genes in the acoel genome. Benchmarking Universal Single-Copy Orthologs analyses indicated that 77% of the conserved single-copy genes were complete. Pfam domain analyses provided a basic set of gene families for transcription factors and signaling molecules. Our present sequencing and assembly of the P. naikaiensis nuclear genome are comparable to those of other metazoan genomes, providing basic information for future studies of genic and genomic attributes of this animal group. Such studies may shed light on the origins and evolution of simple bilaterians.
© The Author(s) 2019. Published by Oxford University Press.


April 21, 2020  |  

Chromulinavorax destructans, a pathogen of microzooplankton that provides a window into the enigmatic candidate phylum Dependentiae.

Members of the major candidate phylum Dependentiae (a.k.a. TM6) are widespread across diverse environments from showerheads to peat bogs; yet, with the exception of two isolates infecting amoebae, they are only known from metagenomic data. The limited knowledge of their biology indicates that they have a long evolutionary history of parasitism. Here, we present Chromulinavorax destructans (Strain SeV1), the first isolate of this phylum to infect a representative from a widespread and ecologically significant group of heterotrophic flagellates, the microzooplankter Spumella elongata (Strain CCAP 955/1). Chromulinavorax destructans has a reduced 1.2 Mb genome that is so specialized for infection that it shows no evidence of complete metabolic pathways, but encodes an extensive transporter system for importing nutrients and energy in the form of ATP from the host. Its replication causes extensive reorganization and expansion of the mitochondrion, effectively surrounding the pathogen, consistent with its dependency on the host for energy. Nearly half (44%) of the inferred proteins contain signal sequences for secretion, including many without recognizable similarity to proteins of known function, as well as 98 copies of proteins with an ankyrin-repeat domain; ankyrin-repeats are known effectors of host modulation, suggesting the presence of an extensive host-manipulation apparatus. These observations help to cement members of this phylum as widespread and diverse parasites infecting a broad range of eukaryotic microbes.


April 21, 2020  |  

Polysaccharide utilization loci of North Sea Flavobacteriia as basis for using SusC/D-protein expression for predicting major phytoplankton glycans.

Marine algae convert a substantial fraction of fixed carbon dioxide into various polysaccharides. Flavobacteriia that are specialized on algal polysaccharide degradation feature genomic clusters termed polysaccharide utilization loci (PULs). As knowledge on extant PUL diversity is sparse, we sequenced the genomes of 53 North Sea Flavobacteriia and obtained 400 PULs. Bioinformatic PUL annotations suggest usage of a large array of polysaccharides, including laminarin, α-glucans, and alginate as well as mannose-, fucose-, and xylose-rich substrates. Many of the PULs exhibit new genetic architectures and suggest substrates rarely described for marine environments. The isolates’ PUL repertoires often differed considerably within genera, corroborating ecological niche-associated glycan partitioning. Polysaccharide uptake in Flavobacteriia is mediated by SusCD-like transporter complexes. Respective protein trees revealed clustering according to polysaccharide specificities predicted by PUL annotations. Using the trees, we analyzed expression of SusC/D homologs in multiyear phytoplankton bloom-associated metaproteomes and found indications for profound changes in microbial utilization of laminarin, α-glucans, β-mannan, and sulfated xylan. We hence suggest the suitability of SusC/D-like transporter protein expression within heterotrophic bacteria as a proxy for the temporal utilization of discrete polysaccharides.


April 21, 2020  |  

Mitochondrial genome characterization of Melipona bicolor: Insights from the control region and gene expression data.

The stingless bee Melipona bicolor is the only bee in which true polygyny occurs. Its mitochondrial genome was first sequenced in 2008, but it was incomplete and no information about its transcription was known. We combined short and long reads of M. bicolor DNA with RNASeq data to obtain insights about mitochondrial evolution and gene expression in bees. The complete genome has 15,001 bp, including a control region of 255 bp that contains all conserved structures described in honeybees with the highest AT content reported so far for bees (98.1%), displaying a compact but functional region. Gene expression control is similar to that of other insects; however, unusual patterns of expression may suggest the existence of different isoforms for the mitochondrially encoded 12S rRNA. Results reveal unique and shared features of the mitochondrial genome in terms of sequence evolution and gene expression, making M. bicolor an interesting model to study mitochondrial genomic evolution. Copyright © 2019 Elsevier B.V. All rights reserved.


April 21, 2020  |  

The conservation of polyol transporter proteins and their involvement in lichenized Ascomycota.

In lichen symbiosis, polyol transfer from green algae is important for acquiring the fungal carbon source. However, the existence of polyol transporter genes and their correlation with lichenization remain unclear. Here, we report candidate polyol transporter genes selected from the genome of the lichen-forming fungus (LFF) Ramalina conduplicans. A phylogenetic analysis using characterized polyol and monosaccharide transporter proteins and hypothetical polyol transporter proteins of R. conduplicans and various ascomycetous fungi suggested that the characterized yeast polyol transporters form multiple clades with the polyol transporter-like proteins selected from the diverse ascomycetous taxa. Thus, polyol transporter genes are widely conserved among Ascomycota, regardless of lichen-forming status. In addition, the phylogenetic clusters suggested that LFFs belonging to Lecanoromycetes have duplicated proteins in each cluster. Consequently, the number of sequences similar to characterized yeast polyol transporters was evaluated using the genomes of 472 species or strains of Ascomycota. Among these, LFFs belonging to Lecanoromycetes had greater numbers of deduced polyol transporter proteins. Thus, various polyol transporters are conserved in Ascomycota and polyol transporter genes appear to have expanded during the evolution of Lecanoromycetes. Copyright © 2019 British Mycological Society. Published by Elsevier Ltd. All rights reserved.


April 21, 2020  |  

Carbohydrate catabolic capability of a Flavobacteriia bacterium isolated from hadal water.

Flavobacteriia are abundant in many marine environments including hadal waters, as demonstrated recently. However, it is unclear how this flavobacterial population adapts to hadal conditions. In this study, extensive comparative genomic analyses were performed for the flavobacterial strain Euzebyella marina RN62 isolated from the Mariana Trench hadal water in low abundance. The complete genome of RN62 possessed a considerable number of carbohydrate-active enzymes with a different composition. There was a predominance of GH family 13 proteins compared to closely related relatives, suggesting that RN62 has preserved a certain capacity for carbohydrate utilization and that the hadal ocean may hold an organic matter reservoir distinct from the surface ocean. Additionally, RN62 possessed potential intracellular cycling of the glycogen/starch pathway, which may serve as a strategy for carbon storage and consumption in response to nutrient pulse and starvation. Moreover, the discovery of higher glycoside hydrolase dissimilarities among Flavobacteriia, compared to peptidases and transporters, suggested variation in polysaccharide utilization related traits as an important ecophysiological factor in response to environmental alterations, such as decreased labile organic carbon in hadal waters. The presence of abundant toxin exporting, transcription and signal transduction related genes in RN62 may further help it to survive in hadal conditions, including high pressure/low temperature. Copyright © 2019 Elsevier GmbH. All rights reserved.


April 21, 2020  |  

A siphonous macroalgal genome suggests convergent functions of homeobox genes in algae and land plants.

Genome evolution and development of unicellular, multinucleate macroalgae (siphonous algae) are poorly known, although various multicellular organisms have been studied extensively. To understand macroalgal developmental evolution, we assembled the ~26 Mb genome of a siphonous green alga, Caulerpa lentillifera, with high contiguity, containing 9,311 protein-coding genes. Molecular phylogeny using 107 nuclear genes indicates that the diversification of the class Ulvophyceae, including C. lentillifera, occurred before the split of the Chlorophyceae and Trebouxiophyceae. Compared with other green algae, the TALE superclass of homeobox genes, which expanded in land plants, shows a series of lineage-specific duplications in this siphonous macroalga. Plant hormone signalling components were also expanded in a lineage-specific manner. Expanded transport regulators, which show spatially different expression, suggest that the structural patterning strategy of a multinucleate cell depends on diversification of nuclear pore proteins. These results not only imply functional convergence of duplicated genes among green plants, but also provide insight into evolutionary roots of green plants. Based on the present results, we propose cellular and molecular mechanisms involved in the structural differentiation in the siphonous alga. © The Author(s) 2019. Published by Oxford University Press on behalf of Kazusa DNA Research Institute.


April 21, 2020  |  

Plastid genomes from diverse glaucophyte genera reveal a largely conserved gene content and limited architectural diversity.

Plastid genome (ptDNA) data of Glaucophyta have been limited for many years to the genus Cyanophora. Here, we sequenced the ptDNAs of Gloeochaete wittrockiana, Cyanoptyche gloeocystis, Glaucocystis incrassata, and Glaucocystis sp. BBH. The reported sequences are the first genome-scale plastid data available for these three poorly studied glaucophyte genera. Although the Glaucophyta plastids appear morphologically “ancestral,” they actually bear derived genomes not radically different from those of red algae or viridiplants. The glaucophyte plastid coding capacity is highly conserved (112 genes shared) and the architecture of the plastid chromosomes is relatively simple. Phylogenomic analyses recovered Glaucophyta as the earliest diverging Archaeplastida lineage, but the position of viridiplants as the first branching group was not rejected by the approximately unbiased test. Pairwise distances estimated from 19 different plastid genes revealed that the highest sequence divergence between glaucophyte genera is frequently higher than distances between species of different classes within red algae or viridiplants. Gene synteny and sequence similarity in the ptDNAs of the two Glaucocystis species analyzed are conserved. However, the ptDNA of Gla. incrassata contains a 7.9-kb insertion not detected in Glaucocystis sp. BBH. The insertion contains ten open reading frames that include four coding regions similar to bacterial serine recombinases (two open reading frames), DNA primases, and peptidoglycan aminohydrolases. These three enzymes, often encoded in bacterial plasmids and bacteriophage genomes, are known to participate in the mobilization and replication of DNA mobile elements. It is therefore plausible that the insertion in Gla. incrassata ptDNA is derived from a DNA mobile element.


April 21, 2020  |  

Genome and transcriptome sequencing of the astaxanthin-producing green microalga, Haematococcus pluvialis.

Haematococcus pluvialis is a freshwater species of Chlorophyta, family Haematococcaceae. It is well known for its capacity to synthesize high amounts of astaxanthin, which is a strong antioxidant that has been utilized in aquaculture and cosmetics. To improve astaxanthin yield and to establish genetic resources for H. pluvialis, we performed whole-genome sequencing, assembly, and annotation of this green microalga. A total of 83.1 Gb of raw reads were sequenced. After filtering the raw reads, we subsequently generated a draft assembly with a genome size of 669.0 Mb, a scaffold N50 of 288.6 kb, and predicted 18,545 genes. We also established a robust phylogenetic tree from 14 representative algae species. With additional transcriptome data, we revealed some novel potential genes that are involved in the synthesis, accumulation, and regulation of astaxanthin production. In addition, we generated an isoform-level reference transcriptome set of 18,483 transcripts with high confidence. Alternative splicing analysis demonstrated that intron retention is the most frequent mode. In summary, we report the first draft genome of H. pluvialis. These genomic resources along with transcriptomic data provide a solid foundation for the discovery of the genetic basis for theoretical and commercial astaxanthin enrichment.


April 21, 2020  |  

Genetic basis for the establishment of endosymbiosis in Paramecium.

The single-celled ciliate Paramecium bursaria is an indispensable model for investigating endosymbiosis between protists and green-algal symbionts. To elucidate the mechanism of this type of endosymbiosis, we combined PacBio and Illumina sequencing to assemble a high-quality and near-complete macronuclear genome of P. bursaria. The genomic characteristics and phylogenetic analyses indicate that P. bursaria is the basal clade of the Paramecium genus. Through comparative genomic analyses with its close relatives, we found that P. bursaria encodes more genes related to nitrogen metabolism and mineral absorption, but encodes fewer genes involved in oxygen binding and N-glycan biosynthesis. A comparison of the transcriptomic profiles between P. bursaria with and without endosymbiotic Chlorella showed differential expression of a wide range of metabolic genes. We selected the 32 most differentially expressed genes for RNA interference experiments in P. bursaria, and found that P. bursaria can regulate the abundance of its symbionts through glutamine supply. This study provides novel insights into Paramecium evolution and will extend our knowledge of the molecular mechanism for the induction of endosymbiosis between P. bursaria and green algae.


April 21, 2020  |  

Genome analysis and genetic transformation of a water surface-floating microalga Chlorococcum sp. FFG039.

Microalgal harvesting and dewatering are the main bottlenecks that need to be overcome to tap the potential of microalgae for production of valuable compounds. Water surface-floating microalgae form robust biofilms, float on the water surface along with gas bubbles entrapped under the biofilms, and have great potential to overcome these bottlenecks. However, little is known about the molecular mechanisms involved in the water surface-floating phenotype. In the present study, we analysed the genome sequence of a water surface-floating microalga Chlorococcum sp. FFG039, with a next generation sequencing technique to elucidate the underlying mechanisms. Comparative genomics study with Chlorococcum sp. FFG039 and other non-floating green microalgae revealed some of the unique gene families belonging to this floating microalga, which may be involved in biofilm formation. Furthermore, genetic transformation of this microalga was achieved with an electroporation method. The genome information and transformation techniques presented in this study will be useful to obtain molecular insights into the water surface-floating phenotype of Chlorococcum sp. FFG039.

