Large genome Archives - Page 34 of 69

September 22, 2019

Genome-wide researches and applications on Dendrobium.

This review summarizes current knowledge of chromosome characterization, genetic mapping, genomic sequencing, quality formation, floral transition, propagation, and identification in Dendrobium. The widely distributed genus Dendrobium has long been studied because of its economic value as both a medicinal and an ornamental plant. In recent years, genomic sequences have been reported for some Dendrobium species and other orchids using next-generation sequencing technology, and the chloroplast genomes of many Dendrobium species have also been published. The chromosomes of most Dendrobium species are mini-chromosomes, with 2n = 38. Only a few genetic studies have been reported in Dendrobium. With genomic sequences available, transcriptomics, proteomics and metabolomics can be applied to Dendrobium more readily. Other molecular biological techniques, such as gene cloning, gene editing, genetic transformation and molecular marker development, have also been applied successively in basic research on Dendrobium. For the medicinal species, insight into the biosynthesis of medicinal components is most important; for the ornamental species, it is the regulation of flower-related characteristics. In addition, knowledge of growth and development, environmental interaction, evolutionary analysis, breeding of new cultivars, propagation, and identification of species and herbs is required for commercial use. All of these studies have been improved by genomic sequences and related technologies. To answer key scientific questions in Dendrobium, future research should focus on quality formation, flowering, self-incompatibility and seed germination, and genome-related technologies and studies will be helpful.

September 22, 2019

Haematococcus lacustris: the makings of a giant-sized chloroplast genome.

Recent work on the chlamydomonadalean green alga Haematococcus lacustris uncovered the largest plastid genome on record: a whopping 1.35 Mb with >90 % non-coding DNA. A 500-word description of this genome was published in the journal Genome Announcements. But such a short report for such a large genome leaves many unanswered questions. For instance, the H. lacustris plastome was found to encode only 12 tRNAs, less than half that of a typical plastome, it appears to have a non-standard genetic code, and is one of only a few known plastid DNAs (ptDNAs), out of thousands of available sequences, not biased in adenine and thymine. Here, I take a closer look at the H. lacustris plastome, comparing its size, content and architecture to other large organelle DNAs, including those from close relatives in the Chlamydomonadales. I show that the H. lacustris plastid coding repertoire is not as unusual as initially thought, representing a standard set of rRNAs, tRNAs and protein-coding genes, where the canonical stop codon UGA appears to sometimes signify tryptophan. The intergenic spacers are dense with repeats, and it is within these regions where potential answers to the source of such extreme genomic expansion lie. By comparing ptDNA sequences of two closely related strains of H. lacustris, I argue that the mutation rate of the non-coding DNA is high and contributing to plastome inflation. Finally, by exploring publicly available RNA-sequencing data, I find that most of the intergenic ptDNA is transcriptionally active.
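Two of the figures quoted above can be turned into a small calculation. The sketch below is only illustrative and is not the author's analysis: it derives the upper bound on coding DNA implied by "1.35 Mb with >90 % non-coding DNA", and shows one way to measure the adenine-plus-thymine fraction of a sequence. The function name `at_fraction` and the toy sequence are assumptions made for this example.

```python
def at_fraction(sequence):
    """Fraction of A and T among the unambiguous bases (A, C, G, T) of a sequence."""
    counted = [base for base in sequence.upper() if base in "ACGT"]
    if not counted:
        raise ValueError("sequence contains no A, C, G or T")
    at_bases = sum(1 for base in counted if base in "AT")
    return at_bases / len(counted)


# Figures taken from the abstract: 1.35 Mb plastome, more than 90 % non-coding.
PLASTOME_BP = 1_350_000
MIN_NONCODING_PERCENT = 90

# If more than 90 % is non-coding, coding DNA is below this many base pairs.
max_coding_bp = PLASTOME_BP * (100 - MIN_NONCODING_PERCENT) // 100

# Hypothetical toy sequence; ambiguous bases (N) are ignored.
toy_sequence = "ATGCGCATNN"

print(max_coding_bp)
print(at_fraction(toy_sequence))
```

A value near 0.5 from `at_fraction` would indicate no AT bias, which is the property the abstract describes as rare among plastid DNAs.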

September 22, 2019

Recurrent loss of HMGCS2 shows that ketogenesis is not essential for the evolution of large mammalian brains.

Apart from glucose, fatty acid-derived ketone bodies provide metabolic energy for the brain during fasting and neonatal development. We investigated the evolution of HMGCS2, the key enzyme required for ketone body biosynthesis (ketogenesis). Unexpectedly, we found that three mammalian lineages, comprising cetaceans (dolphins and whales), elephants and mastodons, and Old World fruit bats have lost this gene. Remarkably, many of these species have exceptionally large brains and signs of intelligent behavior. While fruit bats are sensitive to starvation, cetaceans and elephants can still withstand periods of fasting. This suggests that alternative strategies to fuel large brains during fasting evolved repeatedly and reveals flexibility in mammalian energy metabolism. Furthermore, we show that HMGCS2 loss preceded brain size expansion in toothed whales and elephants. Thus, while ketogenesis was likely important for brain size expansion in modern humans, ketogenesis is not a universal precondition for the evolution of large mammalian brains. © 2018, Jebb et al.

September 22, 2019

The complete chloroplast genome sequence of Coix lacryma-jobi L. (Poaceae), a cereal and medicinal crop

Coix lacryma-jobi is a cereal and medicinal crop belonging to the Poaceae family. This study characterized the complete chloroplast genome sequence of the Korean cultivar Johyun of C. lacryma-jobi var. ma-yuen through de novo hybrid assembly with Illumina and PacBio genomic reads. The chloroplast genome is 140,863 bp long and is composed of a large single copy region (82,827 bp), a small single copy region (12,522 bp), and a pair of inverted repeats (each 22,757 bp). A total of 123 genes, including 87 protein-coding genes, 32 tRNA genes, and four rRNA genes, were predicted in the genome. Phylogenetic analysis confirmed a close relationship of C. lacryma-jobi with species in the Panicoideae subfamily of the Poaceae family.
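The region lengths reported above can be checked against the stated total, since a quadripartite chloroplast genome consists of the large single copy region, the small single copy region, and two copies of the inverted repeat. The following is a minimal sketch of that arithmetic using only the numbers in the abstract; the variable and function names are chosen for this example.

```python
# Region lengths in bp, as reported in the abstract.
LSC_BP = 82_827   # large single copy
SSC_BP = 12_522   # small single copy
IR_BP = 22_757    # one inverted repeat (the genome carries two)


def quadripartite_length(lsc, ssc, ir):
    """Total length of a genome made of LSC + SSC + two inverted repeats."""
    return lsc + ssc + 2 * ir


total_bp = quadripartite_length(LSC_BP, SSC_BP, IR_BP)

# Gene counts from the abstract: protein-coding, tRNA, rRNA.
gene_total = 87 + 32 + 4

print(total_bp)    # 140863
print(gene_total)  # 123
```

Both results agree with the totals given in the abstract (140,863 bp and 123 genes).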

September 22, 2019

Recovery of novel association loci in Arabidopsis thaliana and Drosophila melanogaster through leveraging INDELs association and integrated burden test.

Short insertions, deletions (INDELs) and larger structural variants have been increasingly employed in genetic association studies, but few improvements over SNP-based association have been reported. In order to understand why this might be the case, we analysed two publicly available datasets and observed that 63% of INDELs called in A. thaliana and 64% in D. melanogaster populations are misrepresented as multiple alleles with different functional annotations, i.e. where the same underlying variant is represented by inconsistent alignments leading to different variant calls. To address this issue, we have developed the software Irisas to reclassify and re-annotate these variants, which we then used for single-locus tests of association. We also integrated them to predict the functional impact of SNPs, INDELs, and structural variants for burden testing. Using both approaches, we re-analysed the genetic architecture of complex traits in A. thaliana and D. melanogaster. Heritability analysis using SNPs alone explained on average 27% and 19% of phenotypic variance for A. thaliana and D. melanogaster respectively. Our method explained an additional 11% and 3%, respectively. We also identified novel trait loci that previous SNP-based association studies failed to map, and which contain established candidate genes. Our study shows the value of the association test with INDELs and integrating multiple types of variants in association studies in plants and animals.

September 22, 2019

The genomic basis of color pattern polymorphism in the Harlequin ladybird.

Many animal species comprise discrete phenotypic forms. A common example in natural populations of insects is the occurrence of different color patterns, which has motivated a rich body of ecological and genetic research [1-6]. The occurrence of dark, i.e., melanic, forms displaying discrete color patterns is found across multiple taxa, but the underlying genomic basis remains poorly characterized. In numerous ladybird species (Coccinellidae), the spatial arrangement of black and red patches on adult elytra varies wildly within species, forming strikingly different complex color patterns [7, 8]. In the harlequin ladybird, Harmonia axyridis, more than 200 distinct color forms have been described, which classic genetic studies suggest result from allelic variation at a single, unknown, locus [9, 10]. Here, we combined whole-genome sequencing, population-based genome-wide association studies, gene expression, and functional analyses to establish that the transcription factor Pannier controls melanic pattern polymorphism in H. axyridis. We show that pannier is necessary for the formation of melanic elements on the elytra. Allelic variation in pannier leads to protein expression in distinct domains on the elytra and thus determines the distinct color patterns in H. axyridis. Recombination between pannier alleles may be reduced by a highly divergent sequence of ~170 kb in the cis-regulatory regions of pannier, with a 50 kb inversion between color forms. This most likely helps maintain the distinct alleles found in natural populations. Thus, we propose that highly variable discrete color forms can arise in natural populations through cis-regulatory allelic variation of a single gene. Copyright © 2018 The Authors. Published by Elsevier Ltd. All rights reserved.

September 22, 2019

How long are long tandem repeats? A challenge for current methods of whole-genome sequence assembly: The case of satellites in Caenorhabditis elegans.

Repetitive genome regions have been difficult to sequence, mainly because of the comparatively small size of the fragments used in assembly. Satellites or tandem repeats are very abundant in nematodes and offer an excellent playground to evaluate different assembly methods. Here, we compare the structure of satellites found in three different assemblies of the Caenorhabditis elegans genome: the original sequence obtained by Sanger sequencing, an assembly based on PacBio technology, and an assembly using Nanopore sequencing reads. In general, satellites were found in equivalent genomic regions, but the new long-read methods (PacBio and Nanopore) tended to result in longer assembled satellites. Important differences exist between the assemblies resulting from the two long-read technologies, such as the sizes of long satellites. Our results also suggest that the lengths of some annotated genes with internal repeats which were assembled using Sanger sequencing are likely to be incorrect.

September 22, 2019

A complete Cannabis chromosome assembly and adaptive admixture for elevated cannabidiol (CBD) content

Cannabis has been cultivated for millennia with distinct cultivars providing either fiber and grain or tetrahydrocannabinol. Recent demand for cannabidiol rather than tetrahydrocannabinol has favored the breeding of admixed cultivars with extremely high cannabidiol content. Despite several draft Cannabis genomes, the genomic structure of cannabinoid synthase loci has remained elusive. A genetic map derived from a tetrahydrocannabinol/cannabidiol segregating population and a complete chromosome assembly from a high-cannabidiol cultivar together resolve the linkage of cannabidiolic and tetrahydrocannabinolic acid synthase gene clusters which are associated with transposable elements. High-cannabidiol cultivars appear to have been generated by integrating hemp-type cannabidiolic acid synthase gene clusters into a background of marijuana-type cannabis. Quantitative trait locus mapping suggests that overall drug potency, however, is associated with other genomic regions needing additional study.

September 22, 2019

Bacterial virulence against an oceanic bloom-forming phytoplankter is mediated by algal DMSP

Emiliania huxleyi is a bloom-forming microalga that affects the global sulfur cycle by producing large amounts of dimethylsulfoniopropionate (DMSP) and its volatile metabolic product dimethyl sulfide. Top-down regulation of E. huxleyi blooms has been attributed to viruses and grazers; however, the possible involvement of algicidal bacteria in bloom demise has remained elusive. We demonstrate that a Roseobacter strain, Sulfitobacter D7, that we isolated from a North Atlantic E. huxleyi bloom, exhibited algicidal effects against E. huxleyi upon coculturing. Both the alga and the bacterium were found to co-occur during a natural E. huxleyi bloom, therefore establishing this host-pathogen system as an attractive, ecologically relevant model for studying algal-bacterial interactions in the oceans. During interaction, Sulfitobacter D7 consumed and metabolized algal DMSP to produce high amounts of methanethiol, an alternative product of DMSP catabolism. We revealed a unique strain-specific response, in which E. huxleyi strains that exuded higher amounts of DMSP were more susceptible to Sulfitobacter D7 infection. Intriguingly, exogenous application of DMSP enhanced bacterial virulence and induced susceptibility in an algal strain typically resistant to the bacterial pathogen. This enhanced virulence was highly specific to DMSP compared to addition of propionate and glycerol which had no effect on bacterial virulence. We propose a novel function for DMSP, in addition to its central role in mutualistic interactions among marine organisms, as a mediator of bacterial virulence that may regulate E. huxleyi blooms.

September 22, 2019

Targeted genotyping of variable number tandem repeats with adVNTR.

Whole-genome sequencing is increasingly used to identify Mendelian variants in clinical pipelines. These pipelines focus on single-nucleotide variants (SNVs) and also structural variants, while ignoring more complex repeat sequence variants. Here, we consider the problem of genotyping Variable Number Tandem Repeats (VNTRs), composed of inexact tandem duplications of short (6-100 bp) repeating units. VNTRs span 3% of the human genome, are frequently present in coding regions, and have been implicated in multiple Mendelian disorders. Although existing tools recognize VNTR-carrying sequence, genotyping VNTRs (determining repeat unit count and sequence variation) from whole-genome sequencing reads remains challenging. We describe a method, adVNTR, that uses hidden Markov models to model each VNTR, count repeat units, and detect sequence variation. adVNTR models can be developed for short-read (Illumina) and single-molecule (Pacific Biosciences [PacBio]) whole-genome and whole-exome sequencing, and show good results on multiple simulated and real data sets. © 2018 Bakhtiari et al.; Published by Cold Spring Harbor Laboratory Press.
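To make "repeat unit count" concrete, here is a deliberately naive sketch that counts exact back-to-back copies of a known unit in a sequence. It is not adVNTR and does not use hidden Markov models; it handles only perfect copies, whereas real VNTRs are inexact, which is the difficulty the abstract describes. The function name, the 6 bp unit, and the toy read are all hypothetical.

```python
def count_tandem_units(sequence, unit):
    """Return the longest run of exact, adjacent copies of `unit` in `sequence`."""
    if not unit:
        raise ValueError("unit must be a non-empty string")
    best = 0
    for start in range(len(sequence)):
        copies = 0
        position = start
        # Extend the run for as long as another exact copy follows directly.
        while sequence.startswith(unit, position):
            copies += 1
            position += len(unit)
        best = max(best, copies)
    return best


# Hypothetical toy read: flanking bases around four copies of a 6 bp unit.
toy_unit = "ACGTAG"
toy_read = "TTT" + toy_unit * 4 + "GGG"

print(count_tandem_units(toy_read, toy_unit))  # 4
```

A single mismatch inside one copy would split the run here, which illustrates why a probabilistic model is a more suitable choice for inexact repeats.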

September 22, 2019

A continuous genome assembly of the corkwing wrasse (Symphodus melops).

The wrasses (Labridae) are one of the most successful and species-rich families of the Perciformes order of teleost fish. Its members display great morphological diversity, and occupy distinct trophic levels in coastal waters and coral reefs. The cleaning behaviour displayed by some wrasses, such as corkwing wrasse (Symphodus melops), is of particular interest for the salmon aquaculture industry to combat and control sea lice infestation as an alternative to chemicals and pharmaceuticals. There are still few genome assemblies available within this fish family for comparative and functional studies, despite the rapid increase in genome resources generated during the past years. Here, we present a highly continuous genome assembly of the corkwing wrasse using PacBio SMRT sequencing (x28.8) followed by error correction with paired-end Illumina data (x132.9). The present genome assembly consists of 5040 contigs (N50 = 461,652 bp) and a total size of 614 Mbp, of which 8.5% of the genome sequence encode known repeated elements. The genome assembly covers 94.21% of highly conserved genes across ray-finned fish species. We find evidence for increased copy numbers specific for corkwing wrasse possibly highlighting diversification and adaptive processes in gene families including N-linked glycosylation (ST8SIA6) and stress response kinases (HIPK1). By comparative analyses, we discover that de novo repeats, often not properly investigated during genome annotation, encode hundreds of immune-related genes. This new genomic resource, together with the ballan wrasse (Labrus bergylta), will allow for in-depth comparative genomics as well as population genetic analyses for the understudied wrasses. Copyright © 2018 Elsevier Inc. All rights reserved.
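The contig N50 quoted above is a standard continuity statistic: the contig length at which, when contigs are sorted from longest to shortest, the running total first reaches half of the assembly size. The sketch below shows that calculation on made-up contig lengths; it is not the authors' pipeline, and the toy values are assumptions for illustration.

```python
def n50(lengths):
    """Length L such that contigs of length >= L cover at least half the assembly."""
    if not lengths:
        raise ValueError("need at least one contig length")
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


# Hypothetical contig lengths (total 300, so the halfway point is 150).
toy_contigs = [80, 70, 50, 40, 30, 20, 10]

print(n50(toy_contigs))  # 70
```

In the toy data the two longest contigs (80 + 70) reach the halfway point, so the N50 is 70.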

September 22, 2019

Constant conflict between Gypsy LTR retrotransposons and CHH methylation within a stress-adapted mangrove genome.

The evolutionary dynamics of the conflict between transposable elements (TEs) and their host genome remain elusive. This conflict will be intense in stress-adapted plants as stress can often reactivate TEs. Mangroves reduce TE load convergently in their adaptation to intertidal environments and thus provide a unique opportunity to address the host-TE conflict and its interaction with stress adaptation. Using the mangrove Rhizophora apiculata as a model, we investigated methylation and short interfering RNA (siRNA) targeting patterns in relation to the abundance and age of long terminal repeat (LTR) retrotransposons. We also examined the distance of LTR retrotransposons to genes, the impact on neighboring gene expression and population frequencies. We found differential accumulation amongst classes of LTR retrotransposons despite high overall methylation levels. This can be attributed to 24-nucleotide siRNA-mediated CHH methylation preferentially targeting Gypsy elements, particularly in their LTR regions. Old Gypsy elements possess unusually abundant siRNAs which show cross-mapping to young copies. Gypsy elements appear to be closer to genes and under stronger purifying selection than other classes. Our results suggest a continuous host-TE battle masked by the TE load reduction in R. apiculata. This conflict may enable mangroves, such as R. apiculata, to maintain genetic diversity and thus evolutionary potential during stress adaptation. © 2018 The Authors. New Phytologist © 2018 New Phytologist Trust.

September 22, 2019

Combining probabilistic alignments with read pair information improves accuracy of split-alignments.

Split-alignments provide base-pair-resolution evidence of genomic rearrangements. In practice, they are found by first computing high-scoring local alignments, parts of which are then combined into a split-alignment. This approach is challenging when aligning a short read to a large and repetitive reference, as it tends to produce many spurious local alignments leading to ambiguities in identifying the correct split-alignment. This problem is further exacerbated by the fact that rearrangements tend to occur in repeat-rich regions. We propose a split-alignment technique that combats the issue of ambiguous alignments by combining information from probabilistic alignment with positional information from paired-end reads. We demonstrate that our method finds accurate split-alignments, and that this translates into improved performance of variant-calling tools that rely on split-alignments. An open-source implementation is freely available at: https://bitbucket.org/splitpairedend/last-split-pe. Supplementary data are available at Bioinformatics online.

September 22, 2019

Understanding explosive diversification through cichlid fish genomics.

Owing to their taxonomic, phenotypic, ecological and behavioural diversity and propensity for explosive diversification, the assemblages of cichlid fish in the East African Great Lakes Victoria, Malawi and Tanganyika are important role models in evolutionary biology. With the release of five reference genomes and many additional genomic resources, as well as the establishment of functional genomic tools, the cichlid system has fully entered the genomic era. The in-depth genomic exploration of the East African cichlid fauna – in combination with the examination of their ecology, morphology and behaviour – permits novel insights into the way organisms diversify.

September 22, 2019

How complete are “complete” genome assemblies? An avian perspective.

The genomics revolution has led to the sequencing of a large variety of nonmodel organisms often referred to as “whole” or “complete” genome assemblies. But how complete are these, really? Here, we use birds as an example for nonmodel vertebrates and find that, although suitable in principle for genomic studies, the current standard of short-read assemblies misses a significant proportion of the expected genome size (7% to 42%; mean 20 ± 9%). In particular, regions with strongly deviating nucleotide composition (e.g., guanine-cytosine-[GC]-rich) and regions highly enriched in repetitive DNA (e.g., transposable elements and satellite DNA) are usually underrepresented in assemblies. However, long-read sequencing technologies successfully characterize many of these underrepresented GC-rich or repeat-rich regions in several bird genomes. For instance, only ~2% of the expected total base pairs are missing in the last chicken reference (galGal5). These assemblies still contain thousands of gaps (i.e., fragmented sequences) because some chromosomal structures (e.g., centromeres) likely contain arrays of repetitive DNA that are too long to bridge with currently available technologies. We discuss how to minimize the number of assembly gaps by combining the latest available technologies with complementary strengths. Finally, we emphasize the importance of knowing the location, size and potential content of assembly gaps when making population genetic inferences about adjacent genomic regions. © 2018 The Authors. Molecular Ecology Resources Published by John Wiley & Sons Ltd.
