April 21, 2020

Tandem repeats lead to sequence assembly errors and impose multi-level challenges for genome and protein databases.

The widespread occurrence of repetitive stretches of DNA in genomes of organisms across the tree of life imposes fundamental challenges for sequencing, genome assembly, and automated annotation of genes and proteins. This multi-level problem can lead to errors in genome and protein databases that are often not recognized or acknowledged. As a consequence, end users working with sequences with repetitive regions are faced with ‘ready-to-use’ deposited data whose trustworthiness is difficult to determine, let alone to quantify. Here, we provide a review of the problems associated with tandem repeat sequences that originate from different stages during the sequencing-assembly-annotation-deposition workflow, and that may proliferate in public database repositories affecting all downstream analyses. As a case study, we provide examples of the Atlantic cod genome, whose sequencing and assembly were hindered by a particularly high prevalence of tandem repeats. We complement this case study with examples from other species, where mis-annotations and sequencing errors have propagated into protein databases. With this review, we aim to raise the awareness level within the community of database users, and alert scientists working in the underlying workflow of database creation that the data they omit or improperly assemble may well contain important biological information valuable to others. © The Author(s) 2019. Published by Oxford University Press on behalf of Nucleic Acids Research.


April 21, 2020

Complete genome sequence of Salinigranum rubrum GX10T, an extremely halophilic archaeon isolated from a marine solar saltern

Since the first genome of a halophilic archaeon was sequenced in 2000, microbes inhabiting hypersaline environments have been investigated largely based on genomic characteristics. Salinigranum rubrum GX10T, the type species of the genus Salinigranum belonging to the euryarchaeal family Haloferacaceae, was isolated from the brine of Gangxi marine solar saltern near Weihai, China. Like most members of the class Halobacteria, S. rubrum GX10T is an extreme halophile requiring at least 1.5 M NaCl for growth and 3.1 M NaCl for optimum growth. We sequenced and annotated the complete genome of S. rubrum GX10T, which was found to be 4,973,118 bp and to comprise one chromosome and five plasmids. A total of 4966 protein-coding genes, 47 tRNA genes and 6 rRNA genes were obtained. The isoelectric point distribution of the predicted proteins showed an acidic peak, which reflected the adaptation of S. rubrum GX10T to the halophilic environment. Genes related to potassium uptake, sodium efflux, and compatible-solute biosynthesis and transport were identified, which were responsible for the resistance to osmotic stress. Genes related to heavy metal resistance, the CRISPR-Cas system and the light transform system were also detected. This study reports the first genome in the genus Salinigranum and provides a basis for understanding resistance strategies to harsh environments at the genomic level.


April 21, 2020

Transcriptome analysis based on a combination of sequencing platforms provides insights into leaf pigmentation in Acer rubrum.

Red maple (Acer rubrum L.) is one of the most common and widespread trees with colorful leaves. We found a mutant with red, yellow, and green leaf phenotypes in different branches, which provided ideal materials with the same genetic background, and little interference from the environment, for the study of the complex metabolic networks that underlie variations in leaf coloration. We applied a combination of NGS and SMRT sequencing to various red maple tissues. A total of 125,448 unigenes were obtained, of which 46 and 69 were thought to be related to the synthesis of anthocyanins and carotenoids, respectively. In addition, 88 unigenes were presumed to be involved in the chlorophyll metabolic pathway. Based on a comprehensive analysis of the pigment gene expression network, the mechanisms of leaf color were investigated. The massive accumulation of Cy led to its higher content and proportion than other pigments, which caused the redness of leaves. Yellow coloration was the result of the complete decomposition of chlorophyll pigments, the unmasking of carotenoid pigments, and a slight accumulation of Cy. This study provides a systematic analysis of color variations in the red maple. Moreover, mass sequence data obtained by deep sequencing will provide references for the controlled breeding of red maple.

