July 7, 2019

Resequencing and annotation of the Nostoc punctiforme ATCC 29133 genome: facilitating biofuel and high-value chemical production.

Cyanobacteria have the potential to produce bulk and fine chemicals, and members of Nostoc sp. have received particular attention due to their relatively fast growth rate and the relative ease with which they can be harvested. Nostoc punctiforme is an aerobic, motile, Gram-negative, filamentous cyanobacterium that has been studied intensively to enhance our understanding of microbial carbon and nitrogen fixation. The genome of the type strain N. punctiforme ATCC 29133 was sequenced in 2001 and the scientific community has used these genome data extensively since then. Advances in bioinformatics tools for sequence annotation and the importance of this organism prompted us to resequence and reanalyze its genome and to make both the initial and the improved annotation available to the scientific community. The new draft genome has a total size of 9.1 Mbp and consists of 65 contiguous pieces of DNA with a GC content of 41.38% and 7664 protein-coding genes. Furthermore, the resequenced genome is slightly (5152 bp) larger and contains 987 more genes with functional prediction when compared to the previously published version. We deposited the annotation of both genomes in the Department of Energy’s IMG database to facilitate easy genome exploration by the scientific community without the need for in-depth bioinformatics skills. We expect that facilitated access to the N. punctiforme ATCC 29133 genome, and the ability to search it for genes of interest, will significantly facilitate metabolic engineering and genome prospecting efforts and ultimately the synthesis of biofuels and natural products from this keystone organism and closely related cyanobacteria.


July 7, 2019

Identification of symmetrical RNA editing events in the mitochondria of Salvia miltiorrhiza by strand-specific RNA sequencing.

Salvia miltiorrhiza is one of the most widely used medicinal plants. Here, we systematically analyzed the RNA editing events in its mitochondria. We developed a pipeline using REDItools to predict RNA editing events from strand-specific RNA-Seq data. The predictions were validated using reverse transcription, RT-PCR amplification and Sanger sequencing experiments. Putative sequence motifs were characterized. Comparative analyses were carried out between S. miltiorrhiza, Arabidopsis thaliana and Oryza sativa. We discovered 1123 editing sites, including 225 “C to U” sites in the protein-coding regions. Fourteen of sixteen (87.5%) sites were validated. Three putative DNA motifs were identified around the predicted sites. The nucleotides on both strands at 115 of the 225 sites had undergone RNA editing, which we called symmetrical RNA editing (SRE). Four of six of these SRE sites (66.7%) were experimentally confirmed. Re-examination of strand-specific RNA-Seq data from A. thaliana and O. sativa identified 327 and 369 SRE sites respectively. 78, 20 and 13 SRE sites were found to be conserved among A. thaliana, O. sativa and S. miltiorrhiza respectively. This study provides a comprehensive picture of RNA editing events in the mitochondrial genome of S. miltiorrhiza. We identified SREs for the first time, which may represent a universal phenomenon.


July 7, 2019

A vast genomic deletion in the C57BL/6 genome affects different genes within the Ifi200 cluster on chromosome 1 and mediates obesity and insulin resistance.

Obesity, the excessive accumulation of body fat, is a highly heritable and genetically heterogeneous disorder. The complex, polygenic basis for the disease, consisting of a network of different gene variants, is still not completely known. In the current study we generated a BAC library of the obese-prone NZO strain to clarify the genomic alteration within the gene cluster Ifi200 on chr.1 including Ifi202b, an obesity gene that, in contrast to NZO, is not expressed in the lean B6 mouse. With the PacBio sequencing data of NZO BAC clones we identified a deletion spanning approximately 261.8 kb in the B6 reference genome. The deletion affects different members of the Ifi200 gene family, including the original first exon and 5′-regulatory parts of the Ifi202b gene, and appears to be the relevant cause of its expression deficiency in B6. In addition, the generation and characterization of congenic mice carrying the critical fragment on the B6 background demonstrate its crucial role for obesity and insulin resistance. Our data reveal the reconstruction of a complex genomic region on mouse chr.1 resulting from deletions and duplications of Ifi200 genes and suggest that this region is relevant for the development of obesity. The results further demonstrate the complexity of the disease and highlight the importance of studying rare genetic variants, as they can be causal for large effects.


July 7, 2019

Cytosine methylation at CpCpG sites triggers accumulation of non-CpG methylation in gene bodies.

Methylation of cytosine is an epigenetic mark involved in the regulation of transcription, usually associated with transcriptional repression. In mammals, methylated cytosines are found predominantly in CpGs but in plants non-CpG methylation (in the CpHpG or CpHpH contexts, where H is A, C or T) is also present and is associated with the transcriptional silencing of transposable elements. In addition, CpG methylation is found in coding regions of active genes. In the absence of the demethylase of lysine 9 of histone 3 (IBM1), a subset of body-methylated genes acquires non-CpG methylation. This was shown to alter their expression and affect plant development. It is not clear why only certain body-methylated genes gain non-CpG methylation in the absence of IBM1 and others do not. Here we describe a link between CpG methylation and the establishment of methylation in the CpHpG context that explains the two classes of body-methylated genes. We provide evidence that external cytosines of CpCpG sites can only be methylated when internal cytosines are methylated. CpCpG sites methylated in both cytosines promote spreading of methylation in the CpHpG context in genes protected by IBM1. In contrast, CpCpG sites remain unmethylated in IBM1-independent genes and do not promote spread of CpHpG methylation. © The Author(s) 2017. Published by Oxford University Press on behalf of Nucleic Acids Research.
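The sequence contexts named in this abstract can be illustrated with a short sketch. It is not taken from the paper: the function name `cytosine_context` is hypothetical, and it only inspects the forward strand of a plain DNA string. It shows why a CpCpG site is special: its external cytosine sits in a CpHpG context (H = C) while its internal cytosine sits in a CpG context.

```python
from typing import Optional


def cytosine_context(seq: str, i: int) -> Optional[str]:
    """Classify the cytosine at index i as 'CpG', 'CpHpG' or 'CpHpH'.

    Hypothetical helper for illustration; forward strand only.
    Returns None if seq[i] is not C or the downstream context is incomplete.
    """
    seq = seq.upper()
    if i < 0 or i >= len(seq) or seq[i] != "C":
        return None
    downstream = seq[i + 1 : i + 3]  # up to two bases after the cytosine
    if any(base not in "ACGT" for base in downstream):
        return None  # ambiguous base such as N
    if len(downstream) >= 1 and downstream[0] == "G":
        return "CpG"
    if len(downstream) < 2:
        return None  # need two downstream bases for non-CpG contexts
    # downstream[0] is A, C or T here, i.e. an "H"
    return "CpHpG" if downstream[1] == "G" else "CpHpH"


# In a CpCpG site the two cytosines fall into different contexts.
site = "CCG"
print(cytosine_context(site, 0), cytosine_context(site, 1))
```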


July 7, 2019

Comparative genomics and transcriptome analysis of Aspergillus niger and metabolic engineering for citrate production.

Despite a long and successful history of citrate production in Aspergillus niger, the molecular mechanism of citrate accumulation is only partially understood. In this study, we used comparative genomics and transcriptome analysis of citrate-producing strains, namely A. niger H915-1 (citrate titer: 157 g L⁻¹), A1 (117 g L⁻¹), and L2 (76 g L⁻¹), to gain a genome-wide view of the mechanism of citrate accumulation. Compared with A. niger A1 and L2, A. niger H915-1 contained 92 mutated genes, including a succinate-semialdehyde dehydrogenase in the γ-aminobutyric acid shunt pathway and an aconitase family protein involved in citrate synthesis. Furthermore, transcriptome analysis of A. niger H915-1 revealed that the transcription levels of 479 genes changed between the cell growth stage (6 h) and the citrate synthesis stage (12 h, 24 h, 36 h, and 48 h). In the glycolysis pathway, triosephosphate isomerase was up-regulated, whereas pyruvate kinase was down-regulated. Two cytosol ATP-citrate lyases, which take part in the cycle of citrate synthesis, were up-regulated, and may coordinate with the alternative oxidases in the alternative respiratory pathway for energy balance. Finally, deletion of the oxaloacetate acetylhydrolase gene in H915-1 eliminated oxalate formation, but neither an influence on pH decrease nor a difference in citrate production was observed.


July 7, 2019

Review of the algal biology program within the National Alliance for Advanced Biofuels and Bioproducts

In 2010, when the National Alliance for Advanced Biofuels and Bioproducts (NAABB) consortium began, little was known about the molecular basis of algal biomass or oil production. Very few algal genome sequences were available and efforts to identify the best-producing wild species through bioprospecting approaches had largely stalled after the U.S. Department of Energy’s Aquatic Species Program. This lack of knowledge included how reduced carbon was partitioned into storage products like triglycerides or starch and the role played by metabolite remodeling in the accumulation of energy-dense storage products. Furthermore, genetic transformation and metabolic engineering approaches to improve algal biomass and oil yields were in their infancy. Genome sequencing and transcriptional profiling were becoming less expensive, however, and the tools to annotate gene expression profiles under various growth and engineered conditions were just starting to be developed for algae. It was in this context that an integrated algal biology program was introduced in the NAABB to address the greatest constraints limiting algal biomass yield. This review describes the NAABB algal biology program, including hypotheses, research objectives, and strategies to move algal biology research into the twenty-first century and to realize the greatest potential of algae biomass systems to produce biofuels.


July 7, 2019

The unique genomic landscape surrounding the EPSPS gene in glyphosate resistant Amaranthus palmeri: a repetitive path to resistance.

The expanding number and global distributions of herbicide resistant weedy species threaten food, fuel, fiber and bioproduct sustainability and agroecosystem longevity. Amongst the most competitive weeds, Amaranthus palmeri S. Wats has rapidly evolved resistance to glyphosate primarily through massive amplification and insertion of the 5-enolpyruvylshikimate-3-phosphate synthase (EPSPS) gene across the genome. Increased EPSPS gene copy numbers result in higher titers of the EPSPS enzyme, the target of glyphosate, and confer resistance to glyphosate treatment. To understand the genomic unit and mechanism of EPSPS gene copy number proliferation, we developed and used a bacterial artificial chromosome (BAC) library from a highly resistant biotype to sequence the local genomic landscape flanking the EPSPS gene. By sequencing overlapping BACs, a 297 kb sequence was generated, hereafter referred to as the “EPSPS cassette.” This region included several putative genes, dense clusters of tandem and inverted repeats, putative helitron and autonomous replication sequences, and regulatory elements. Whole genome shotgun sequencing (WGS) of two biotypes exhibiting high and no resistance to glyphosate was performed to compare genomic representation across the EPSPS cassette. Mapping of sequences for both biotypes to the reference EPSPS cassette revealed significant differences in upstream and downstream sequences relative to EPSPS with regard to both repetitive units and coding content between these biotypes. The differences in sequence may have resulted from a compounded-building mechanism such as repetitive transpositional events. The association of putative helitron sequences with the cassette suggests a possible amplification and distribution mechanism. Flow cytometry revealed that the EPSPS cassette added measurable genomic content. The adoption of glyphosate resistant cropping systems in major crops such as corn, soybean, cotton and canola coupled with excessive use of glyphosate herbicide has led to evolved glyphosate resistance in several important weeds. In Amaranthus palmeri, the amplification of the EPSPS cassette, characterized by a complex array of repetitive elements and putative helitron sequences, suggests an adaptive structural genomic mechanism that drives amplification and distribution around the genome. The added genomic content not found in glyphosate sensitive plants may be driving evolution through genome expansion.


July 7, 2019

Analysis of the complete genome sequence of Nocardia seriolae UTF1, the causative agent of fish nocardiosis: The first reference genome sequence of the fish pathogenic Nocardia species.

Nocardiosis caused by Nocardia seriolae is one of the major threats in the aquaculture of Seriola species (yellowtail, S. quinqueradiata; amberjack, S. dumerili; and kingfish, S. lalandi) in Japan. Here, we report the complete nucleotide genome sequence of N. seriolae UTF1, isolated from a cultured yellowtail. The genome is a circular chromosome of 8,121,733 bp with a G+C content of 68.1% that encodes 7,697 predicted proteins. In the N. seriolae UTF1 predicted genes, we found orthologs of virulence factors of pathogenic mycobacteria and human clinical Nocardia isolates involved in host cell invasion, modulation of phagocyte function and survival inside the macrophages. The virulence factor candidates provide an essential basis for understanding their pathogenic mechanisms at the molecular level by the fish nocardiosis research community in future studies. We also found many potential antibiotic resistance genes on the N. seriolae UTF1 chromosome. Comparative analysis with the four existing complete genomes, N. farcinica IFM 10152, N. brasiliensis HUJEG-1, N. cyriacigeorgica GUH-2 and N. nova SH22a, revealed that 2,745 orthologous genes were present in all five Nocardia genomes (core genes) and 1,982 genes were unique to N. seriolae UTF1. In particular, the N. seriolae UTF1 genome contains a greater number of mobile elements and genes of unknown function that comprise the differences in structure and gene content from the other Nocardia genomes. In addition, many of the N. seriolae UTF1-specific genes were assigned to the ABC transport system. Because of limited resources in ocean environments, these N. seriolae UTF1-specific ABC transporters might facilitate adaptation strategies essential for marine environment survival. Thus, the availability of the complete N. seriolae UTF1 genome sequence will provide a valuable resource for comparative genomic studies of N. seriolae isolates, as well as provide new insights into the ecological and functional diversity of the genus Nocardia.


July 7, 2019

Hybrid sequencing and map finding (HySeMaFi): optional strategies for extensively deciphering gene splicing and expression in organisms without reference genome.

Using second-generation sequencing (SGS) RNA-Seq strategies, extensive alternative splicing prediction is impractical and high variability of isoform expression quantification is inevitable in organisms without a true reference dataset. We report the development of a novel analysis method, termed hybrid sequencing and map finding (HySeMaFi), which combines the specific strengths of third-generation sequencing (TGS) (PacBio SMRT sequencing) and SGS (Illumina Hi-Seq/MiSeq sequencing) to effectively decipher gene splicing and to reliably estimate isoform abundance. Error-corrected long reads from TGS are capable of capturing full-length transcripts or large partial transcript fragments. Both true and false isoforms from a particular gene, as well as those containing all possible exons, could be generated by employing different assembly methods in SGS. We first developed an effective method which can establish the mapping relationship between the error-corrected long reads and the longest assembled contig in every corresponding gene. According to the mapping data, the true splicing pattern of the genes was reliably detected, and quantification of the isoforms was also effectively determined. HySeMaFi is also the optimal strategy by which to decipher the full exon expression of a specific gene when the longest mapped contigs were chosen as the reference set.


July 7, 2019

ThermoAlign: a genome-aware primer design tool for tiled amplicon resequencing.

Isolating and sequencing specific regions in a genome is a cornerstone of molecular biology. This has been facilitated by computationally encoding the thermodynamics of DNA hybridization for automated design of hybridization and priming oligonucleotides. However, the repetitive composition of genomes challenges the identification of target-specific oligonucleotides, which limits genetics and genomics research on many species. Here, a tool called ThermoAlign was developed that ensures the design of target-specific primer pairs for DNA amplification. This is achieved by evaluating the thermodynamics of hybridization for full-length oligonucleotide-template alignments – thermoalignments – across the genome to identify primers predicted to bind specifically to the target site. For amplification-based resequencing of regions that cannot be amplified by a single primer pair, a directed graph analysis method is used to identify minimum amplicon tiling paths. Laboratory validation by standard and long-range polymerase chain reaction and amplicon resequencing with maize, one of the most repetitive genomes sequenced to date (~85% repeat content), demonstrated the specificity-by-design functionality of ThermoAlign. ThermoAlign is released under an open source license and bundled in a dependency-free container for wide distribution. It is anticipated that this tool will facilitate multiple applications in genetics and genomics and be useful in the workflow of high-throughput targeted resequencing studies.
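The idea of a minimum amplicon tiling path over a directed graph can be sketched as follows. This is an illustrative simplification and not ThermoAlign's implementation: the function `min_tiling_path`, the half-open `(start, end)` coordinate convention and the `min_overlap` parameter are all assumptions made for the example. Candidate amplicons are nodes, an edge connects an amplicon to any amplicon that overlaps it and extends further to the right, and a breadth-first search returns a path with the fewest amplicons.

```python
from collections import deque


def min_tiling_path(amplicons, target_start, target_end, min_overlap=0):
    """Return the fewest overlapping amplicons covering [target_start, target_end).

    Hypothetical sketch: amplicons are hashable (start, end) tuples with
    half-open coordinates. Returns a list of amplicons, or None if the
    target cannot be tiled.
    """
    # Start nodes: amplicons that cover the first base of the target.
    sources = [a for a in amplicons if a[0] <= target_start < a[1]]
    queue = deque((a, [a]) for a in sources)
    seen = set(sources)
    while queue:
        node, path = queue.popleft()
        if node[1] >= target_end:
            return path  # BFS reaches the target end with the fewest nodes first
        for b in amplicons:
            if b in seen:
                continue
            # Edge node -> b: b extends coverage and overlaps node enough.
            if b[1] > node[1] and b[0] <= node[1] - min_overlap:
                seen.add(b)
                queue.append((b, path + [b]))
    return None


candidates = [(0, 500), (400, 900), (450, 1200), (800, 1300), (1100, 1600)]
print(min_tiling_path(candidates, 0, 1600))
```

Breadth-first search suffices here because every edge costs one amplicon; a weighted shortest-path search would be the natural substitute if amplicons carried different costs.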


July 7, 2019

Genetic and genomic tools for Cannabis sativa

The Cannabis industry is currently one of the fastest growing industries in the United States. Given the changing legal status of the plant, and the rapidly advancing research, updated information on the advancement of Cannabis genomics is needed. This versatile plant is used as medicine and for food, fiber, and bioremediation. Insights from modern, high-throughput genomic technology are revolutionizing our understanding of the plant and are providing new tools to further improve our knowledge and utilization of this unique species. This review quantifies and evaluates the currently available genomic resources for Cannabis research, including six whole-genome assemblies, two transcriptomes, and 393 other substantial genomic resources, as well as other smaller publicly available genetic and genomic resources. The open-source approaches followed by many leading scientists in the field promote collaboration and facilitate these rapid advances.


July 7, 2019

Sequencing and de novo assembly of a near complete indica rice genome.

A high-quality reference genome is critical for understanding genome structure, genetic variation and evolution of an organism. Here we report the de novo assembly of an indica rice genome Shuhui498 (R498) through the integration of single-molecule sequencing and mapping data, genetic map and fosmid sequence tags. The 390.3 Mb assembly is estimated to cover more than 99% of the R498 genome and is more continuous than the current reference genomes of japonica rice Nipponbare (MSU7) and Arabidopsis thaliana (TAIR10). We annotate high-quality protein-coding genes in R498 and identify genetic variations between R498 and Nipponbare and presence/absence variations by comparing them to 17 draft genomes in cultivated rice and its closest wild relatives. Our results demonstrate how to de novo assemble a highly contiguous and near-complete plant genome through an integrative strategy. The R498 genome will serve as a reference for the discovery of genes and structural variations in rice.


July 7, 2019

Genome sequencing supports a multi-vertex model for Brassiceae species.

The economically important Brassica genus is a good system for studying the evolution of polyploids. Brassica genomes have undergone whole genome triplication (WGT). Subgenome dominance phenomena such as biased gene fractionation and dominant gene expression were observed in tripled genomes of Brassica. The genome of radish (Raphanus sativus), another important crop of tribe Brassiceae, was derived from the same WGT event and shows similar subgenome dominance. These findings and molecular dating indicate that radish has an evolutionary origin similar to that of Brassica species. Here, we extended the Brassica “triangle of U” to a multi-vertex model. This model describes the relationships or the potential of using more Brassiceae mesohexaploids in the creation of new allotetraploid oil or vegetable crop species. Copyright © 2017 Elsevier Ltd. All rights reserved.


July 7, 2019

The complete chloroplast genome sequence of tung tree (Vernicia fordii): Organization and phylogenetic relationships with other angiosperms.

Tung tree (Vernicia fordii) is an economically important tree widely cultivated for industrial oil production in China. To better understand the molecular basis of tung tree chloroplasts, we sequenced and characterized its genome using PacBio RS II sequencing platforms. The chloroplast genome is 161,528 bp in length and is composed of one pair of inverted repeats (IRs) of 26,819 bp, which are separated by one small single copy (SSC; 18,758 bp) and one large single copy (LSC; 89,132 bp) region. The genome contains 114 genes, coding for 81 proteins, four ribosomal RNAs and 29 transfer RNAs. An expansion with integration of an additional rps19 gene in the IR regions was identified. Compared to the chloroplast genome of Jatropha curcas, a species from the same family, the tung tree chloroplast genome is distinct, with 85 single nucleotide polymorphisms (SNPs) and 82 indels. Phylogenetic analysis suggests that V. fordii is a sister species of J. curcas within the Eurosids I. The nucleotide sequence provides vital molecular information for understanding the biology of this important oil tree.
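The reported region lengths and gene counts are internally consistent, which a few lines of arithmetic can confirm. The sketch below only uses figures stated in the abstract; the variable names are chosen for illustration, and it assumes the usual quadripartite layout in which the genome length is the sum of both IR copies, the SSC and the LSC.

```python
# Region lengths (bp) as reported in the abstract.
ir_length = 26_819   # one inverted repeat; the genome carries two copies
ssc_length = 18_758  # small single copy region
lsc_length = 89_132  # large single copy region

# Assumed quadripartite layout: LSC + IR + SSC + IR.
genome_length = 2 * ir_length + ssc_length + lsc_length

# Gene counts as reported: protein-coding, rRNA and tRNA genes.
total_genes = 81 + 4 + 29

print(genome_length, total_genes)
```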


July 7, 2019

Genome sequencing and population genomic analyses provide insights into the adaptive landscape of silver birch.

Silver birch (Betula pendula) is a pioneer boreal tree that can be induced to flower within 1 year. Its rapid life cycle, small (440-Mb) genome, and advanced germplasm resources make birch an attractive model for forest biotechnology. We assembled and chromosomally anchored the nuclear genome of an inbred B. pendula individual. Gene duplicates from the paleohexaploid event were enriched for transcriptional regulation, whereas tandem duplicates were overrepresented by environmental responses. Population resequencing of 80 individuals showed effective population size crashes at major points of climatic upheaval. Selective sweeps were enriched among polyploid duplicates encoding key developmental and physiological triggering functions, suggesting that local adaptation has tuned the timing of and cross-talk between fundamental plant processes. Variation around the tightly-linked light response genes PHYC and FRS10 correlated with latitude and longitude and temperature, and with precipitation for PHYC. Similar associations characterized the growth-promoting cytokinin response regulator ARR1, and the wood development genes KAK and MED5A.

