Large genome Archives

April 21, 2020

Structural variants in 3000 rice genomes.

Investigation of large structural variants (SVs) is a challenging yet important task in understanding trait differences in highly repetitive genomes. Combining different bioinformatic approaches for SV detection, we analyzed whole-genome sequencing data from 3000 rice genomes and identified 63 million individual SV calls that grouped into 1.5 million allelic variants. We found enrichment of long SVs in promoters and an excess of shorter variants in 5′ UTRs. Across the rice genomes, we identified regions of high SV frequency enriched in stress response genes. We demonstrated how SVs may help in finding causative variants in genome-wide association analysis. These new insights into rice genome biology are valuable for understanding the effects SVs have on gene function, with the prospect of identifying novel agronomically important alleles that can be utilized to improve cultivated rice. © 2019 Fuentes et al.; Published by Cold Spring Harbor Laboratory Press.

April 21, 2020

Comprehensive evaluation of non-hybrid genome assembly tools for third-generation PacBio long-read sequence data.

Long reads obtained from third-generation sequencing platforms can help overcome the long-standing challenge of the de novo assembly of sequences for the genomic analysis of non-model eukaryotic organisms. Numerous long-read-aided de novo assemblies have been published recently, which exhibited superior quality of the assembled genomes in comparison with those achieved using earlier second-generation sequencing technologies. Evaluating assemblies is important in guiding the appropriate choice for specific research needs. In this study, we evaluated 10 long-read assemblers using a variety of metrics on Pacific Biosciences (PacBio) data sets from different taxonomic categories with considerable differences in genome size. The results allowed us to narrow down the list to a few assemblers that can be effectively applied to eukaryotic assembly projects. Moreover, we highlight how best to use limited genomic resources for effectively evaluating the genome assemblies of non-model organisms. © The Author 2017. Published by Oxford University Press.

April 21, 2020

The genome of cultivated peanut provides insight into legume karyotypes, polyploid evolution and crop domestication.

High oil and protein content make tetraploid peanut a leading oil and food legume. Here we report a high-quality peanut genome sequence, comprising 2.54 Gb with 20 pseudomolecules and 83,709 protein-coding gene models. We characterize gene functional groups implicated in seed size evolution, seed oil content, disease resistance and symbiotic nitrogen fixation. The peanut B subgenome has more genes and general expression dominance, temporally associated with long-terminal-repeat expansion in the A subgenome that also raises questions about the A-genome progenitor. The polyploid genome provided insights into the evolution of Arachis hypogaea and other legume chromosomes. Resequencing of 52 accessions suggests that independent domestications formed peanut ecotypes. Whereas 0.42-0.47 million years ago (Ma) polyploidy constrained genetic variation, the peanut genome sequence aids mapping and candidate-gene discovery for traits such as seed size and color, foliar disease resistance and others, also providing a cornerstone for functional genomics and peanut improvement.

April 21, 2020

Interspecies association mapping links reduced CG to TG substitution rates to the loss of gene-body methylation.

Comparative genomics can unravel the genetic basis of species differences; however, successful reports on quantitative traits are still scarce. Here we present genome assemblies of 31 so-far unassembled Brassicaceae plant species and combine them with 16 previously published assemblies to establish the Brassicaceae Diversity Panel. Using a new interspecies association strategy for quantitative traits, we found a so-far unknown association between the unexpectedly high variation in CG to TG substitution rates in genes and the absence of CHROMOMETHYLASE3 (CMT3) orthologues. Low substitution rates were associated with the loss of CMT3, while species with conserved CMT3 orthologues showed high substitution rates. Species without CMT3 also lacked gene-body methylation (gbM), suggesting an evolutionary trade-off between the unknown function of gbM and low substitution rates in Brassicaceae, possibly due to low mutability of non-methylated cytosines.

April 21, 2020

The genome of the medicinal plant Andrographis paniculata provides insight into the biosynthesis of the bioactive diterpenoid neoandrographolide.

Andrographis paniculata is a herbaceous dicot plant widely used for its anti-inflammatory and anti-viral properties across its distribution in China, India and other Southeast Asian countries. A. paniculata was used as a crucial therapeutic treatment during the influenza epidemic of 1919 in India, and is still used for the treatment of infectious disease in China. A. paniculata produces large quantities of the anti-inflammatory diterpenoid lactones andrographolide and neoandrographolide, and their analogs, which are touted to be the next generation of natural anti-inflammatory medicines for lung diseases, hepatitis, neurodegenerative disorders, autoimmune disorders and inflammatory skin diseases. Here, we report a chromosome-scale A. paniculata genome sequence of 269 Mb that was assembled from Illumina short reads, PacBio long reads and high-throughput chromosome conformation capture (Hi-C) data. Gene annotation predicted 25 428 protein-coding genes. In order to decipher the genetic underpinning of diterpenoid biosynthesis, transcriptome data from seedlings elicited with methyl jasmonate were also obtained, which enabled the identification of genes encoding diterpenoid synthases, cytochrome P450 monooxygenases, 2-oxoglutarate-dependent dioxygenases and UDP-dependent glycosyltransferases potentially involved in diterpenoid lactone biosynthesis. We further carried out functional characterization of pairs of class-I and -II diterpene synthases, revealing the ability to produce diversified labdane-related diterpene scaffolds. In addition, a glycosyltransferase able to catalyze O-linked glucosylation of andrograpanin, yielding the major active product neoandrographolide, was also identified. Thus, our results demonstrate the utility of the combined genomic and transcriptomic data set generated here for the investigation of the production of the bioactive diterpenoid lactone constituents of the important medicinal herb A. paniculata. © 2018 The Authors The Plant Journal © 2018 John Wiley & Sons Ltd.

April 21, 2020

Assembly of allele-aware, chromosomal-scale autopolyploid genomes based on Hi-C data.

Construction of chromosome-level assembly is a vital step in achieving the goal of a ‘Platinum’ genome, but it remains a major challenge to assemble and anchor sequences to chromosomes in autopolyploid or highly heterozygous genomes. High-throughput chromosome conformation capture (Hi-C) technology serves as a robust tool to dramatically advance chromosome scaffolding; however, existing approaches are mostly designed for diploid genomes and often with the aim of reconstructing a haploid representation, thereby having limited power to reconstruct chromosomes for autopolyploid genomes. We developed a novel algorithm (ALLHiC) that is capable of building allele-aware, chromosomal-scale assembly for autopolyploid genomes using Hi-C paired-end reads with innovative ‘prune’ and ‘optimize’ steps. Application on simulated data showed that ALLHiC can phase allelic contigs and substantially improve ordering and orientation when compared to other mainstream Hi-C assemblers. We applied ALLHiC on an autotetraploid and an autooctoploid sugar-cane genome and successfully constructed the phased chromosomal-level assemblies, revealing allelic variations present in these two genomes. The ALLHiC pipeline enables de novo chromosome-level assembly of autopolyploid genomes, separating each allele. Haplotype chromosome-level assembly of allopolyploid and heterozygous diploid genomes can be achieved using ALLHiC, overcoming obstacles in assembling complex genomes.

April 21, 2020

A coupled role for CsMYB75 and CsGSTF1 in anthocyanin hyperaccumulation in purple tea.

Cultivars of purple tea (Camellia sinensis) that accumulate anthocyanins in place of catechins are currently attracting global interest in their use as functional health beverages. RNA-seq of normal (LJ43) and purple Zijuan (ZJ) cultivars identified the transcription factor CsMYB75 and phi (F) class glutathione transferase CsGSTF1 as being associated with anthocyanin hyperaccumulation. Both genes mapped as a quantitative trait locus (QTL) to the purple bud leaf color (BLC) trait in F1 populations, with CsMYB75 promoting the expression of CsGSTF1 in transgenic tobacco (Nicotiana tabacum). Although CsMYB75 elevates the biosynthesis of both catechins and anthocyanins, only anthocyanins accumulate in purple tea, indicating selective downstream regulation. As glutathione transferases in other plants are known to act as transporters (ligandins) of flavonoids, directing them for vacuolar deposition, the role of CsGSTF1 in selective anthocyanin accumulation was investigated. In tea, anthocyanins accumulate in multiple vesicles, with the expression of CsGSTF1 correlated with BLC, but not with catechin content, in diverse germplasm. Complementation of the Arabidopsis tt19-8 mutant, which is unable to express the orthologous ligandin AtGSTF12, restored anthocyanin accumulation, but did not rescue the transparent testa phenotype, confirming that CsGSTF1 did not function in catechin accumulation. Consistent with a ligandin function, transient expression of CsGSTF1 in Nicotiana occurred in the nucleus, cytoplasm and membrane. Furthermore, RNA-Seq of the complemented mutants exposed to 2% sucrose as a stress treatment showed unexpected roles for anthocyanin accumulation in affecting the expression of genes involved in redox responses, phosphate homeostasis and the biogenesis of photosynthetic components, as compared with non-complemented plants. © 2018 The Authors The Plant Journal © 2018 John Wiley & Sons Ltd.

April 21, 2020

Morphology and genome of a snailfish from the Mariana Trench provide insights into deep-sea adaptation.

It is largely unknown how living organisms, especially vertebrates, survive and thrive in the coldness, darkness and high pressures of the hadal zone. Here, we describe the unique morphology and genome of Pseudoliparis swirei, a recently described snailfish species living below a depth of 6,000 m in the Mariana Trench. Unlike closely related shallow sea species, P. swirei has transparent, unpigmented skin and scales, thin and incompletely ossified bones, an inflated stomach and a non-closed skull. Phylogenetic analyses show that P. swirei diverged from a close relative living near the sea surface about 20 million years ago and has abundant genetic diversity. Genomic analyses reveal that: (1) the bone Gla protein (bglap) gene has a frameshift mutation that may cause early termination of cartilage calcification; (2) cell membrane fluidity and transport protein activity in P. swirei may have been enhanced by changes in protein sequences and gene expansion; and (3) the stability of its proteins may have been increased by critical mutations in the trimethylamine N-oxide-synthesizing enzyme and hsp90 chaperone protein. Our results provide insights into the morphological, physiological and molecular evolution of hadal vertebrates.

April 21, 2020

Morphotypes of the common beadlet anemone Actinia equina (L.) are genetically distinct

Anemones of the genus Actinia are ecologically important and familiar organisms on many rocky shores. However, this genus is taxonomically problematic and prior evidence suggests that the North Atlantic beadlet anemone, Actinia equina, may actually consist of a number of cryptic species. Previous genetic work has been largely limited to allozyme electrophoresis and there remains a dearth of genetic resources with which to study this genus. Mitochondrial DNA sequencing may help to clarify the taxonomy of Actinia. Here, the complete mitochondrial genome of the beadlet anemone Actinia equina (Cnidaria: Anthozoa: Actiniaria: Actiniidae) is shown to be 20,690 bp in length and to contain the standard complement of Cnidarian features including 13 protein-coding genes, two rRNA genes, two tRNAs and two Group I introns, one with an in-frame truncated homing endonuclease gene open reading frame. However, amplification and sequencing of the standard mtDNA barcoding region of the cytochrome oxidase I gene revealed only two haplotypes, differing by a single base pair, in widely geographically separated A. equina and its congener A. prasina. COI barcoding shows that whilst A. equina and A. prasina share the common mtDNA haplotype, haplotype frequency differed significantly between A. equina with red/orange pedal discs and those with green pedal discs, consistent with the hypothesis that these morphotypes represent incipient species.

April 21, 2020

Musa balbisiana genome reveals subgenome evolution and functional divergence.

Banana cultivars (Musa spp.) are diploid, triploid and tetraploid hybrids derived from Musa acuminata and Musa balbisiana. We present a high-quality draft genome assembly of M. balbisiana with 430 Mb (87%) assembled into 11 chromosomes. We identified that the recent divergence of M. acuminata (A-genome) and M. balbisiana (B-genome) occurred after lineage-specific whole-genome duplication, and that the B-genome may be more sensitive to the fractionation process compared to the A-genome. Homoeologous exchanges occurred frequently between A- and B-subgenomes in allopolyploids. Genomic variation within progenitors resulted in functional divergence of subgenomes. Global homoeologue expression dominance occurred between subgenomes of the allotriploid. Gene families related to ethylene biosynthesis and starch metabolism exhibited significant expansion at the pathway level and wide homoeologue expression dominance in the B-subgenome of the allotriploid. The independent origin of 1-aminocyclopropane-1-carboxylic acid oxidase (ACO) homoeologue gene pairs and tandem duplication-driven expansion of ACO genes in the B-subgenome contributed to rapid and major ethylene production post-harvest in allotriploid banana fruits. The findings of this study provide greater context for understanding fruit biology, and aid the development of tools for breeding optimal banana cultivars.

April 21, 2020

De novo genome assembly of the stress tolerant forest species Casuarina equisetifolia provides insight into secondary growth.

Casuarina equisetifolia (C. equisetifolia), a conifer-like angiosperm with resistance to typhoons and tolerance of stress, is mainly cultivated in the coastal areas of Australasia. These traits make C. equisetifolia a valuable model for studying secondary growth associated genes and stress-tolerance traits. However, the genome sequence has been unavailable, and therefore wood-associated growth rate and stress resistance at the molecular level are largely unexplored. We therefore constructed a high-quality draft genome sequence of C. equisetifolia by a combination of Illumina second-generation sequencing reads and Pacific Biosciences single-molecule real-time (SMRT) long reads to advance the investigation of this species. Here, we report the genome assembly, which contains approximately 300 megabases (Mb) and has a scaffold N50 of 1.06 Mb. Additionally, gene annotation, assisted by a combination of prediction and RNA-seq data, generated 29 827 annotated protein-coding genes and 1983 non-coding genes. Furthermore, we found that repetitive sequences account for one-third of the genome assembly. We also constructed a genome-wide map of DNA modifications, including two novel forms, N6-methyladenine (6mA) and N4-methylcytosine (4mC), at single-nucleotide resolution using SMRT sequencing. Interestingly, we found that 17% of 6mA modification genes and 15% of 4mC modification genes also included alternative splicing events. Finally, we investigated cellulose, hemicellulose, and lignin-related genes, which were associated with secondary growth and contained different DNA modifications. The high-quality genome sequence and annotation of C. equisetifolia in this study provide a valuable resource to strengthen our understanding of the diverse traits of trees. © 2018 The Authors The Plant Journal © 2018 John Wiley & Sons Ltd.
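
The scaffold N50 quoted in this abstract is a standard contiguity statistic: the length of the shortest scaffold in the smallest set of longest scaffolds that together cover at least half of the assembly. A minimal sketch of the calculation follows; the helper name `n50` and the toy lengths are illustrative assumptions, not data or code from the C. equisetifolia study.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths.

    Hypothetical helper for illustration: sort lengths from longest to
    shortest and return the length at which the running total first
    reaches half of the summed length.
    """
    if not lengths:
        raise ValueError("need at least one length")
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


# Toy scaffold lengths (arbitrary units), not real assembly data.
toy_scaffolds = [80, 70, 50, 40, 30, 20, 10]
print(n50(toy_scaffolds))
```

With these toy values the total is 300, so the running sum reaches the halfway point of 150 at the second-longest scaffold and the function reports its length.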

April 21, 2020

From markers to genome-based breeding in wheat.

Recent technological advances in wheat genomics provide new opportunities to uncover genetic variation in traits of breeding interest and enable genome-based breeding to deliver wheat cultivars for the projected food requirements for 2050. There has been tremendous progress in development of whole-genome sequencing resources in wheat and its progenitor species during the last 5 years. High-throughput genotyping is now possible in wheat not only for routine gene introgression but also for high-density genome-wide genotyping. This is a major transition phase to enable genome-based breeding to achieve progressive genetic gains in parallel with projected wheat production demands. These advances have prompted wheat researchers to adopt analytical approaches that were rarely pursued before because of the short history of genome sequence availability. Such approaches have been successful in gene discovery and breeding applications in other crops and animals for which genome sequences have been available for much longer. These strategies include (i) environmental genome-wide association studies in wheat genetic resources stored in genbanks to identify genes for local adaptation by using agroclimatic traits as phenotypes, (ii) haplotype-based analyses to improve the statistical power and resolution of genomic selection and gene mapping experiments, (iii) new breeding strategies for genome-based prediction of heterosis patterns in wheat, and (iv) ultimate use of genomics information to develop more efficient and robust genome-wide genotyping platforms to predict higher yield potential and stability with greater precision. Genome-based breeding has potential to achieve the ultimate objective of ensuring sustainable wheat production through developing high yielding, climate-resilient wheat cultivars with high nutritional quality.

April 21, 2020

Single-Molecule Sequencing: Towards Clinical Applications.

In the past several years, single-molecule sequencing platforms, such as those by Pacific Biosciences and Oxford Nanopore Technologies, have become available to researchers and are currently being tested for clinical applications. They offer exceptionally long reads that permit direct sequencing through regions of the genome inaccessible or difficult to analyze by short-read platforms. This includes disease-causing long repetitive elements, extreme GC content regions, and complex gene loci. Similarly, these platforms enable structural variation characterization at previously unparalleled resolution and direct detection of epigenetic marks in native DNA. Here, we review how these technologies are opening up new clinical avenues that are being applied to pathogenic microorganisms and viruses, constitutional disorders, pharmacogenomics, cancer, and more. Copyright © 2018 Elsevier Ltd. All rights reserved.

April 21, 2020

The Genome of Armadillidium vulgare (Crustacea, Isopoda) Provides Insights into Sex Chromosome Evolution in the Context of Cytoplasmic Sex Determination.

The terrestrial isopod Armadillidium vulgare is an original model to study the evolution of sex determination and symbiosis in animals. Its sex can be determined by ZW sex chromosomes, or by feminizing Wolbachia bacterial endosymbionts. Here, we report the sequence and analysis of the ZW female genome of A. vulgare. A distinguishing feature of the 1.72 gigabase assembly is the abundance of repeats (68% of the genome). We show that the Z and W sex chromosomes are essentially undifferentiated at the molecular level and the W-specific region is extremely small (at most several hundreds of kilobases). Our results suggest that recombination suppression has not spread very far from the sex-determining locus, if at all. This is consistent with A. vulgare possessing evolutionarily young sex chromosomes. We characterized multiple Wolbachia nuclear inserts in the A. vulgare genome, none of which is associated with the W-specific region. We also identified several candidate genes that may be involved in the sex determination or sexual differentiation pathways. The A. vulgare genome serves as a resource for studying the biology and evolution of crustaceans, one of the most speciose and emblematic metazoan groups. © The Author(s) 2019. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.

April 21, 2020

Genome of Crucihimalaya himalaica, a close relative of Arabidopsis, shows ecological adaptation to high altitude.

Crucihimalaya himalaica, a close relative of Arabidopsis and Capsella, grows on the Qinghai-Tibet Plateau (QTP) about 4,000 m above sea level and represents an attractive model system for studying speciation and ecological adaptation in extreme environments. We assembled a draft genome sequence of 234.72 Mb encoding 27,019 genes and investigated its origin and adaptive evolutionary mechanisms. Phylogenomic analyses based on 4,586 single-copy genes revealed that C. himalaica is most closely related to Capsella (estimated divergence 8.8 to 12.2 Mya), whereas both species form a sister clade to Arabidopsis thaliana and Arabidopsis lyrata, from which they diverged between 12.7 and 17.2 Mya. LTR retrotransposons in C. himalaica proliferated shortly after the dramatic uplift and climatic change of the Himalayas from the Late Pliocene to Pleistocene. Compared with closely related species, C. himalaica showed significant contraction and pseudogenization in gene families associated with disease resistance and also significant expansion in gene families associated with ubiquitin-mediated proteolysis and DNA repair. We identified hundreds of genes involved in DNA repair, ubiquitin-mediated proteolysis, and reproductive processes with signs of positive selection. Gene families showing dramatic changes in size and genes showing signs of positive selection are likely candidates for C. himalaica’s adaptation to intense radiation, low temperature, and pathogen-depauperate environments in the QTP. Loss of function at the S-locus, the reason for the transition to self-fertilization of C. himalaica, might have enabled its QTP occupation. Overall, the genome sequence of C. himalaica provides insights into the mechanisms of plant adaptation to extreme environments. Copyright © 2019 the Author(s). Published by PNAS.
