Scaffolding Archives

September 22, 2019 |

A complete Cannabis chromosome assembly and adaptive admixture for elevated cannabidiol (CBD) content

Cannabis has been cultivated for millennia with distinct cultivars providing either fiber and grain or tetrahydrocannabinol. Recent demand for cannabidiol rather than tetrahydrocannabinol has favored the breeding of admixed cultivars with extremely high cannabidiol content. Despite several draft Cannabis genomes, the genomic structure of cannabinoid synthase loci has remained elusive. A genetic map derived from a tetrahydrocannabinol/cannabidiol segregating population and a complete chromosome assembly from a high-cannabidiol cultivar together resolve the linkage of cannabidiolic and tetrahydrocannabinolic acid synthase gene clusters which are associated with transposable elements. High-cannabidiol cultivars appear to have been generated by integrating hemp-type cannabidiolic acid synthase gene clusters into a background of marijuana-type cannabis. Quantitative trait locus mapping suggests that overall drug potency, however, is associated with other genomic regions needing additional study.

September 22, 2019 |

Update on Tetracycline Susceptibility of Pediococcus acidilactici Based on Strains Isolated from Swiss Cheese and Whey.

Bacterial strains used as starter cultures in the production of fermented foods may act as reservoirs for antibiotic resistance (AbR) genes. To avoid the introduction of such genes into the food chain, the presence of acquired AbR in bacterial strains added to food must be tested. Standard protocols and microbiological cut-off values have been defined to provide practitioners with a basis for evaluating whether their bacterial isolates harbor an acquired resistance to a given antibiotic. Here, we tested the AbR of 24 strains of Pediococcus acidilactici by using the standard protocol and microbiological cut-off values recommended by the European Food Safety Authority. Phenotypic data were complemented by searching for known AbR genes using an in silico analysis of whole genomes. The majority (54.2%) of the strains were able to grow at a tetracycline concentration above the defined cut-off, even though only one strain carried a known tetracycline resistance gene, tetM. The same strain also carried the AbR gene of an erythromycin resistance methylase, ermA, and displayed resistance toward clindamycin and erythromycin. Our results bolster the scarce data on the sensitivity of P. acidilactici to tetracycline and suggest that the microbiological cut-off recommended by the European Food Safety Authority for this antibiotic should be amended.

September 22, 2019 |

How complete are “complete” genome assemblies?-An avian perspective.

The genomics revolution has led to the sequencing of a large variety of nonmodel organisms often referred to as “whole” or “complete” genome assemblies. But how complete are these, really? Here, we use birds as an example for nonmodel vertebrates and find that, although suitable in principle for genomic studies, the current standard of short-read assemblies misses a significant proportion of the expected genome size (7% to 42%; mean 20 ± 9%). In particular, regions with strongly deviating nucleotide composition (e.g., guanine-cytosine-[GC]-rich) and regions highly enriched in repetitive DNA (e.g., transposable elements and satellite DNA) are usually underrepresented in assemblies. However, long-read sequencing technologies successfully characterize many of these underrepresented GC-rich or repeat-rich regions in several bird genomes. For instance, only ~2% of the expected total base pairs are missing in the last chicken reference (galGal5). These assemblies still contain thousands of gaps (i.e., fragmented sequences) because some chromosomal structures (e.g., centromeres) likely contain arrays of repetitive DNA that are too long to bridge with currently available technologies. We discuss how to minimize the number of assembly gaps by combining the latest available technologies with complementary strengths. At last, we emphasize the importance of knowing the location, size and potential content of assembly gaps when making population genetic inferences about adjacent genomic regions. © 2018 The Authors. Molecular Ecology Resources Published by John Wiley & Sons Ltd.

September 22, 2019 |

Genomic characterization reveals significant divergence within Chlorella sorokiniana (Chlorellales, Trebouxiophyceae)

Selection of highly productive algal strains is crucial for establishing economically viable biomass and bioproduct cultivation systems. Characterization of algal genomes, including understanding strain-specific differences in genome content and architecture, is a critical step in this process. Using genomic analyses, we demonstrate significant differences between three strains of Chlorella sorokiniana (strain 1228, UTEX 1230, and DOE1412). We found that unique, strain-specific genes comprise a substantial proportion of each genome, and genomic regions with >80% local nucleotide identity constitute <15% of each genome among the strains, indicating substantial strain-specific evolution. Furthermore, cataloging of meiosis and other sex-related genes in C. sorokiniana strains suggests strategic breeding could be utilized to improve biomass and bioproduct yields if a sexual cycle can be characterized. Finally, preliminary investigation of epigenetic machinery suggests the presence of potentially unique transcriptional regulation in each strain. Our data demonstrate that these three C. sorokiniana strains represent significantly different genomic content. Based on these findings, we propose individualized assessment of each strain for potential performance in cultivation systems.

September 22, 2019 |

The genomic architecture and molecular evolution of ant odorant receptors.

The massive expansions of odorant receptor (OR) genes in ant genomes are notable examples of rapid genome evolution and adaptive gene duplication. However, the molecular mechanisms leading to gene family expansion remain poorly understood, partly because available ant genomes are fragmentary. Here, we present a highly contiguous, chromosome-level assembly of the clonal raider ant genome, revealing the largest known OR repertoire in an insect. While most ant ORs originate via local tandem duplication, we also observe several cases of dispersed duplication followed by tandem duplication in the most rapidly evolving OR clades. We found that areas of unusually high transposable element density (TE islands) were depauperate in ORs in the clonal raider ant, and found no evidence for retrotransposition of ORs. However, OR loci were enriched for transposons relative to the genome as a whole, potentially facilitating tandem duplication by unequal crossing over. We also found that ant OR genes are highly AT-rich compared to other genes. In contrast, in flies, OR genes are dispersed and largely isolated within the genome, and we find that fly ORs are not AT-rich. The genomic architecture and composition of ant ORs thus show convergence with the unrelated vertebrate ORs rather than the related fly ORs. This might be related to the greater gene numbers and/or potential similarities in gene regulation between ants and vertebrates as compared to flies. © 2018 McKenzie and Kronauer; Published by Cold Spring Harbor Laboratory Press.

September 22, 2019 |

Genomic discovery of the hypsin gene and biosynthetic pathways for terpenoids in Hypsizygus marmoreus.

Hypsizygus marmoreus (Beech mushroom) is a popular ingredient in Asian cuisine. The medicinal effects of its bioactive compounds such as hypsin and hypsiziprenol have been reported, but the genetic basis or biosynthesis of these components is unknown. In this study, we sequenced a reference strain of H. marmoreus (Haemi 51,987-8). We evaluated various assembly strategies, and Allpaths combined with PBJelly produced the best assembly. The resulting genome was 42.7 Mbp in length and annotated with 16,627 gene models. A putative gene (Hypma_04324) encoding the antifungal and antiproliferative hypsin protein with 75% sequence identity with the previously known N-terminal sequence was identified. Carbohydrate active enzyme analysis displayed the typical feature of white-rot fungi where auxiliary activity and carbohydrate-binding modules were enriched. The genome annotation revealed four terpene synthase genes responsible for terpenoid biosynthesis. From the gene tree analysis, we identified that terpene synthase genes can be classified into six clades. The four terpene synthase genes of H. marmoreus belonged to four different groups, which implies they may be involved in the synthesis of different structures of terpenes. A terpene synthase gene cluster was well-conserved in Agaricomycetes genomes, which contained known biosynthesis and regulatory genes. Genome sequence analysis of this mushroom led to the discovery of the hypsin gene. Comparative genome analysis revealed the conserved gene cluster for terpenoid biosynthesis in the genome. These discoveries will further our understanding of the biosynthesis of medicinal bioactive molecules in this edible mushroom.

September 22, 2019 |

Whole-genome sequencing of Chinese yellow catfish provides a valuable genetic resource for high-throughput identification of toxin genes.

Naturally derived toxins from animals are good raw materials for drug development. As a representative venomous teleost, Chinese yellow catfish (Pelteobagrus fulvidraco) can provide valuable resources for studies on toxin genes. Its venom glands are located in the pectoral and dorsal fins. Despite these interesting biological traits and its great economic value, Chinese yellow catfish still lacks a sequenced genome. Here, we report a high-quality genome assembly of Chinese yellow catfish using a combination of next-generation Illumina and third-generation PacBio sequencing platforms. The final assembly reached 714 Mb, with a contig N50 of 970 kb and a scaffold N50 of 3.65 Mb. We also annotated 21,562 protein-coding genes, of which 97.59% were assigned at least one functional annotation. Based on the genome sequence, we analyzed toxin genes in Chinese yellow catfish. Finally, we identified 207 toxin genes and classified them into three major groups. Interestingly, we also expanded a previously reported sex-related region (to ~6 Mb) in the achieved genome assembly, and localized two important toxin genes within this region. In summary, we assembled a high-quality genome of Chinese yellow catfish and performed high-throughput identification of toxin genes from a genomic view. Therefore, the limited number of toxin sequences in public databases will be remarkably improved once we integrate multi-omics data from more and more sequenced species.
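
For readers unfamiliar with the contig and scaffold N50 figures quoted in several of these abstracts, the following is a minimal sketch of how an N50 is usually computed from a list of sequence lengths. The function name and the example lengths are hypothetical and are not taken from any of the papers.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths.

    The N50 is the length L such that sequences of length >= L
    together cover at least half of the total assembly size.
    Returns 0 for an empty input.
    """
    total = sum(lengths)
    running = 0
    # Walk from the longest sequence down, accumulating covered bases.
    for length in sorted(lengths, reverse=True):
        running += length
        if running * 2 >= total:
            return length
    return 0


# Hypothetical contig lengths (arbitrary units), total = 300.
example_contigs = [80, 70, 50, 40, 30, 20, 10]
print(n50(example_contigs))  # 80 + 70 = 150 reaches half of 300, so this prints 70
```

The same function applies to contigs or scaffolds; only the input lengths differ, which is why a scaffold N50 is typically larger than the contig N50 of the same assembly.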

September 22, 2019 |

The chromosome-level quality genome provides insights into the evolution of the biosynthesis genes for aroma compounds of Osmanthus fragrans.

Sweet osmanthus (Osmanthus fragrans) is a very popular ornamental tree species throughout Southeast Asia and the USA, particularly for its extremely fragrant aroma. We constructed a chromosome-level reference genome of O. fragrans to assist in studies of the evolution, genetic diversity, and molecular mechanism of aroma development. A total of over 118 Gb of polished reads was produced from HiSeq (45.1 Gb) and PacBio Sequel (73.35 Gb), giving 100× depth coverage for long reads. The combination of Illumina short reads, PacBio long reads, and Hi-C data produced the final chromosome quality genome of O. fragrans with a genome size of 727 Mb and a heterozygosity of 1.45%. The genome was annotated using de novo and homology comparison and further refined with transcriptome data. The genome of O. fragrans was predicted to have 45,542 genes, of which 95.68% were functionally annotated. Genome annotation found 49.35% as the repetitive sequences, with long terminal repeats (LTR) being the richest (28.94%). Genome evolution analysis indicated the evidence of whole-genome duplication 15 million years ago, which contributed to the current content of 45,242 genes. Metabolic analysis revealed that linalool, a monoterpene, is the main aroma compound. Based on the genome and transcriptome, we further demonstrated the direct connection between terpene synthases (TPSs) and the rich aromatic molecules in O. fragrans. We identified three new flower-specific TPS genes, of which the expression coincided with the production of linalool. Our results suggest that the high number of TPS genes and the flower tissue- and stage-specific TPS genes expressions might drive the strong unique aroma production of O. fragrans.

September 22, 2019 |

Correcting palindromes in long reads after whole-genome amplification.

Next-generation sequencing requires sufficient DNA to be available. If limited, whole-genome amplification is applied to generate additional amounts of DNA. Such amplification often results in many chimeric DNA fragments, in particular artificial palindromic sequences, which limit the usefulness of long sequencing reads. Here, we present Pacasus, a tool for correcting such errors. Two datasets show that it markedly improves read mapping and de novo assembly, yielding results similar to those that would be obtained with non-amplified DNA. With Pacasus, long-read technologies become available for sequencing targets with very small amounts of DNA, such as single cells or even single chromosomes.
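
To illustrate what an artificial palindromic read looks like, here is a toy check that compares a read with its own reverse complement. It is only a sketch under simplifying assumptions (exact position-by-position comparison, no indels or sequencing errors) and is not a description of how Pacasus works; the function names are hypothetical.

```python
# Translation table mapping each base to its complement.
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def self_complementarity(read):
    """Fraction of positions at which a read equals its own reverse complement.

    A chimeric read made of a fragment followed by that fragment's
    reverse complement scores 1.0; most ordinary reads score much lower.
    """
    if not read:
        return 0.0
    rc = reverse_complement(read)
    matches = sum(a == b for a, b in zip(read, rc))
    return matches / len(read)


fragment = "ACGGTTCA"
chimera = fragment + reverse_complement(fragment)  # toy palindromic chimera
print(self_complementarity(chimera))     # prints 1.0
print(self_complementarity("AAAAAAAA"))  # prints 0.0
```

Real long reads carry insertions and deletions, so a practical detector would need an alignment between the read and its reverse complement rather than this exact comparison.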

September 22, 2019 |

Growth factor gene IGF1 is associated with bill size in the black-bellied seedcracker Pyrenestes ostrinus.

Pyrenestes finches are unique among birds in showing a non-sex-determined polymorphism in bill size and are considered a textbook example of disruptive selection. Morphs breed randomly with respect to bill size, and differ in diet and feeding performance relative to seed hardness. Previous breeding experiments are consistent with the polymorphism being controlled by a single genetic factor. Here, we use genome-wide pooled sequencing to explore the underlying genetic basis of bill morphology and identify a single candidate region. Targeted resequencing reveals extensive linkage disequilibrium across a 300 Kb region containing the insulin-like growth factor 1 (IGF1) gene, with a single 5-million-year-old haplotype associating with phenotypic dominance of the large-billed morph. We find no genetic similarities controlling bill size in the well-studied Darwin’s finches (Geospiza). Our results show how a single genetic factor may control bill size and provide a foundation for future studies to examine this phenomenon within and among avian species.

September 22, 2019 |

Cryptocurrencies and Zero Mode Waveguides: An unclouded path to a more contiguous Cannabis sativa L. genome assembly

We describe the use of a Decentralized Autonomous Organization (DAO) to crypto-fund the single molecule sequencing and publication of a Type II Cannabis plant. This resulted in the construction of the most contiguous Cannabis genome assembly to date. The combined use of the Dash cryptocurrency, DAOs, and Pacific Biosciences sequencing delivered a 1.03 Gb genome with an N50 of 665 Kb in 77 days from funding to public upload. This represents a 230-fold improvement in the contiguity of the first cannabis assemblies in 2011 and a 4-fold improvement over all cannabis assemblies to date. 34 Gb of additional sequencing pushed the assembly to an N50 of 3.8 Mb. Hi-C data from Phase Genomics further scaffolded the assembly to 35 contigs at an N50 of 74 Mb but requires additional curation. The genome is partially phased and larger than previously reported (2N: 1.33 Gb). The CBCA, THCA and CBDA synthase gene clusters have been phased onto respective contigs demonstrating tandem repeat expansions.

September 22, 2019 |

The impact of genome evolution on the allotetraploid Nicotiana rustica – an intriguing story of enhanced alkaloid production.

Nicotiana rustica (Aztec tobacco), like common tobacco (Nicotiana tabacum), is an allotetraploid formed through a recent hybridization event; however, it originated from completely different progenitor species. Here, we report the comparative genome analysis of wild type N. rustica (5 Gb; 2n = 4x = 48) with its three putative diploid progenitors (2.3-3 Gb; 2n = 2x = 24), Nicotiana undulata, Nicotiana paniculata and Nicotiana knightiana. In total, 41% of N. rustica genome originated from the paternal donor (N. undulata), while 59% originated from the maternal donor (N. paniculata/N. knightiana). Chloroplast genome and gene analyses indicated that N. knightiana is more closely related to N. rustica than N. paniculata. Gene clustering revealed 14,623 ortholog groups common to other Nicotiana species and 207 unique to N. rustica. Genome sequence analysis indicated that N. knightiana is more closely related to N. rustica than N. paniculata, and that the higher nicotine content of N. rustica leaves is the result of the combination of the progenitor genomes and of a more active transport of nicotine to the shoot. The availability of four new Nicotiana genome sequences provides insights into how speciation impacts plant metabolism, and in particular alkaloid transport and accumulation, and will contribute to better understanding the evolution of Nicotiana species.

September 22, 2019 |

Hybrid correction of highly noisy long reads using a variable-order de Bruijn graph.

The recent rise of long read sequencing technologies such as Pacific Biosciences and Oxford Nanopore makes it possible to solve assembly problems for larger and more complex genomes than short-read technologies allowed. However, these long reads are very noisy, reaching an error rate of around 10-15% for Pacific Biosciences, and up to 30% for Oxford Nanopore. The error correction problem has been tackled by either self-correcting the long reads, or using complementary short reads in a hybrid approach. However, even though sequencing technologies promise to lower the error rate of the long reads below 10%, it is still higher in practice, and correcting such noisy long reads remains an issue. We present HG-CoLoR, a hybrid error correction method that focuses on a seed-and-extend approach based on the alignment of the short reads to the long reads, followed by the traversal of a variable-order de Bruijn graph, built from the short reads. Our experiments show that HG-CoLoR manages to efficiently correct highly noisy long reads that display an error rate as high as 44%. When compared to other state-of-the-art long read error correction methods, our experiments also show that HG-CoLoR provides the best trade-off between runtime and quality of the results, and is the only method able to efficiently scale to eukaryotic genomes. HG-CoLoR is implemented in C++, supported on Linux platforms and freely available at https://github.com/morispi/HG-CoLoR. Supplementary data are available at Bioinformatics online.
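
As a rough illustration of the graph structure mentioned above, the sketch below builds a de Bruijn graph from short reads and extends a seed along unambiguous edges. It assumes a single fixed k, whereas HG-CoLoR uses a variable-order graph, so it is a simplification and not the tool's implementation; all names and example reads are hypothetical.

```python
from collections import defaultdict


def build_de_bruijn(reads, k):
    """Build a fixed-order de Bruijn graph: (k-1)-mer -> set of successor (k-1)-mers."""
    graph = defaultdict(set)
    for read in reads:
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            # Each k-mer contributes one edge from its prefix to its suffix.
            graph[kmer[:-1]].add(kmer[1:])
    return graph


def extend(graph, seed, max_steps):
    """Extend a seed (k-1)-mer while exactly one successor exists.

    Stops at a dead end, at a branch, or after max_steps (a guard against cycles).
    """
    path = seed
    node = seed
    for _ in range(max_steps):
        successors = graph.get(node, set())
        if len(successors) != 1:
            break
        node = next(iter(successors))
        path += node[-1]
    return path


# Two hypothetical overlapping short reads.
short_reads = ["ATGGCG", "GGCGTT"]
graph = build_de_bruijn(short_reads, k=4)
print(extend(graph, "ATG", max_steps=10))  # prints ATGGCGTT
```

In a hybrid correction setting, the seeds would come from short reads aligned to a long read, and the traversal would bridge the noisy stretch between two seeds; choosing among branches is the hard part that this sketch leaves out.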

September 22, 2019 |

Genome-scale analysis of Acetobacterium bakii reveals the cold adaptation of psychrotolerant acetogens by post-transcriptional regulation.

Acetogens synthesize acetyl-CoA via CO2 or CO fixation, producing organic compounds. Despite their ecological and industrial importance, their transcriptional and post-transcriptional regulation has not been systematically studied. With completion of the genome sequence of Acetobacterium bakii (4.28-Mb), we measured changes in the transcriptome of this psychrotolerant acetogen in response to temperature variations under autotrophic and heterotrophic growth conditions. Unexpectedly, acetogenesis genes were highly up-regulated at low temperatures under heterotrophic, as well as autotrophic, growth conditions. To mechanistically understand the transcriptional regulation of acetogenesis genes via changes in RNA secondary structures of 5′-untranslated regions (5′-UTR), the primary transcriptome was experimentally determined, and 1379 transcription start sites (TSS) and 1100 5′-UTR were found. Interestingly, acetogenesis genes contained longer 5′-UTR with lower RNA-folding free energy than other genes, revealing that the 5′-UTRs control the RNA abundance of the acetogenesis genes under low temperature conditions. Our findings suggest that post-transcriptional regulation via RNA conformational changes of 5′-UTRs is necessary for cold-adaptive acetogenesis. © 2018 Shin et al.; Published by Cold Spring Harbor Laboratory Press for the RNA Society.

September 22, 2019 |

Whole-genome landscape of Medicago truncatula symbiotic genes.

Advances in deciphering the functional architecture of eukaryotic genomes have been facilitated by recent breakthroughs in sequencing technologies, enabling a more comprehensive representation of genes and repeat elements in genome sequence assemblies, as well as more sensitive and tissue-specific analyses of gene expression. Here we show that PacBio sequencing has led to a substantially improved genome assembly of Medicago truncatula A17, a legume model species notable for endosymbiosis studies, and has enabled the identification of genome rearrangements between genotypes at a near-base-pair resolution. Annotation of the new M. truncatula genome sequence has allowed for a thorough analysis of transposable elements and their dynamics, as well as the identification of new players involved in symbiotic nodule development, in particular 1,037 upregulated long non-coding RNAs (lncRNAs). We have also discovered that a substantial proportion (~35% and 38%, respectively) of the genes upregulated in nodules or expressed in the nodule differentiation zone colocalize in genomic clusters (270 and 211, respectively), here termed symbiotic islands. These islands contain numerous expressed lncRNA genes and display differentially both DNA methylation and histone marks. Epigenetic regulations and lncRNAs are therefore attractive candidate elements for the orchestration of symbiotic gene expression in the M. truncatula genome.
