April 21, 2020  |  

Rapid evolution of α-gliadin gene family revealed by analyzing Gli-2 locus regions of wild emmer wheat.

α-Gliadins are a major group of gluten proteins in wheat flour that contribute to the end-use properties for food processing and contain major immunogenic epitopes that can cause serious health-related issues including celiac disease (CD). α-Gliadins are also the youngest group of gluten proteins and are encoded by a large gene family. The majority of the gene family members evolved independently in the A, B, and D genomes of different wheat species after their separation from a common ancestral species. To gain insights into the origin and evolution of these complex genes, the genomic regions of the Gli-2 loci encoding α-gliadins were characterized from the tetraploid wild emmer, a progenitor of hexaploid bread wheat that contributed the AABB genomes. Genomic sequences of Gli-2 locus regions for the wild emmer A and B genomes were first reconstructed using the genome sequence scaffolds along with optical genome maps. A total of 24 and 16 α-gliadin genes were identified for the A and B genome regions, respectively. α-Gliadin pseudogene frequencies of 86% for the A genome and 69% for the B genome were primarily caused by C to T substitutions in the highly abundant glutamine codons, resulting in the generation of premature stop codons. Comparison with the homologous regions from the hexaploid wheat cv. Chinese Spring indicated considerable sequence divergence of the two A genomes at the genomic level. In comparison, conserved regions between the two B genomes were identified that included α-gliadin pseudogenes containing shared nested TE insertions. Analyses of the genomic organization and phylogenetic tree reconstruction indicate that although orthologous gene pairs derived from speciation were present, large portions of α-gliadin genes were likely derived from differential gene duplications or deletions after the separation of the homologous wheat genomes ~0.5 MYA. The higher number of full-length intact α-gliadin genes in hexaploid wheat than that in wild emmer suggests that human selection through domestication might have an impact on α-gliadin evolution. Our study provides insights into the rapid and dynamic evolution of genomic regions harboring the α-gliadin genes in wheat.
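The pseudogenization mechanism described in this abstract can be illustrated with a small sketch. In the standard genetic code, both glutamine codons (CAA, CAG) become stop codons (TAA, TAG) when the first C is replaced by T. The function name and the toy sequence below are hypothetical and are not taken from the study.

```python
# Standard genetic code: CAA and CAG encode glutamine; TAA, TAG and TGA are stops.
GLN_CODONS = {"CAA", "CAG"}
STOP_CODONS = {"TAA", "TAG", "TGA"}


def c_to_t_stop_sites(cds):
    """Return 0-based codon indices where a first-position C->T change
    in a glutamine codon would create a premature stop codon."""
    hits = []
    usable = len(cds) - len(cds) % 3  # ignore a trailing partial codon
    for i in range(0, usable, 3):
        codon = cds[i:i + 3].upper()
        if codon in GLN_CODONS:
            mutated = "T" + codon[1:]  # CAA -> TAA, CAG -> TAG
            if mutated in STOP_CODONS:
                hits.append(i // 3)
    return hits


# Hypothetical toy coding sequence: ATG CAA CAG CCA CAA TAA
toy_cds = "ATGCAACAGCCACAATAA"
vulnerable = c_to_t_stop_sites(toy_cds)
```

Because glutamine codons are highly abundant in these genes, many positions are one substitution away from a stop, which is consistent with the high pseudogene frequencies reported.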


April 21, 2020  |  

Chromosome-scale assemblies reveal the structural evolution of African cichlid genomes.

African cichlid fishes are well known for their rapid radiations and are a model system for studying evolutionary processes. Here we compare multiple, high-quality, chromosome-scale genome assemblies to elucidate the genetic mechanisms underlying cichlid diversification and study how genome structure evolves in rapidly radiating lineages. We re-anchored our recent assembly of the Nile tilapia (Oreochromis niloticus) genome using a new high-density genetic map. We also developed a new de novo genome assembly of the Lake Malawi cichlid, Metriaclima zebra, using high-coverage Pacific Biosciences sequencing, and anchored contigs to linkage groups (LGs) using 4 different genetic maps. These new anchored assemblies allow the first chromosome-scale comparisons of African cichlid genomes. Large intra-chromosomal structural differences (~2-28 megabase pairs) among species are common, while inter-chromosomal differences are rare (<10 megabase pairs total). Placement of the centromeres within the chromosome-scale assemblies identifies large structural differences that explain many of the karyotype differences among species. Structural differences are also associated with unique patterns of recombination on sex chromosomes. Structural differences on LG9, LG11, and LG20 are associated with reduced recombination, indicative of inversions between the rock- and sand-dwelling clades of Lake Malawi cichlids. M. zebra has a larger number of recent transposable element insertions compared with O. niloticus, suggesting that several transposable element families have a higher rate of insertion in the haplochromine cichlid lineage. This study identifies novel structural variation among East African cichlid genomes and provides a new set of genomic resources to support research on the mechanisms driving cichlid adaptation and speciation. © The Author(s) 2019. Published by Oxford University Press.


April 21, 2020  |  

Assembly of allele-aware, chromosomal-scale autopolyploid genomes based on Hi-C data.

Construction of chromosome-level assembly is a vital step in achieving the goal of a ‘Platinum’ genome, but it remains a major challenge to assemble and anchor sequences to chromosomes in autopolyploid or highly heterozygous genomes. High-throughput chromosome conformation capture (Hi-C) technology serves as a robust tool to dramatically advance chromosome scaffolding; however, existing approaches are mostly designed for diploid genomes and often with the aim of reconstructing a haploid representation, thereby having limited power to reconstruct chromosomes for autopolyploid genomes. We developed a novel algorithm (ALLHiC) that is capable of building allele-aware, chromosomal-scale assembly for autopolyploid genomes using Hi-C paired-end reads with innovative ‘prune’ and ‘optimize’ steps. Application on simulated data showed that ALLHiC can phase allelic contigs and substantially improve ordering and orientation when compared to other mainstream Hi-C assemblers. We applied ALLHiC on an autotetraploid and an autooctoploid sugar-cane genome and successfully constructed the phased chromosomal-level assemblies, revealing allelic variations present in these two genomes. The ALLHiC pipeline enables de novo chromosome-level assembly of autopolyploid genomes, separating each allele. Haplotype chromosome-level assembly of allopolyploid and heterozygous diploid genomes can be achieved using ALLHiC, overcoming obstacles in assembling complex genomes.


April 21, 2020  |  

Tools and Strategies for Long-Read Sequencing and De Novo Assembly of Plant Genomes.

The commercial release of third-generation sequencing technologies (TGSTs), giving long and ultra-long sequencing reads, has stimulated the development of new tools for assembling highly contiguous genome sequences with unprecedented accuracy across complex repeat regions. We survey here a wide range of emerging sequencing platforms and analytical tools for de novo assembly, provide background information for each of their steps, and discuss the spectrum of available options. Our decision tree recommends workflows for the generation of a high-quality genome assembly when used in combination with the specific needs and resources of a project. Copyright © 2019 Elsevier Ltd. All rights reserved.


April 21, 2020  |  

Computational aspects underlying genome to phenome analysis in plants.

Recent advances in genomics technologies have greatly accelerated the progress in both fundamental plant science and applied breeding research. Concurrently, high-throughput plant phenotyping is becoming widely adopted in the plant community, promising to alleviate the phenotypic bottleneck. While these technological breakthroughs are significantly accelerating quantitative trait locus (QTL) and causal gene identification, challenges to enable even more sophisticated analyses remain. In particular, care needs to be taken to standardize, describe and conduct experiments robustly while relying on plant physiology expertise. In this article, we review the state of the art regarding genome assembly and the future potential of pangenomics in plant research. We also describe the necessity of standardizing and describing phenotypic studies using the Minimum Information About a Plant Phenotyping Experiment (MIAPPE) standard to enable the reuse and integration of phenotypic data. In addition, we show how deep phenotypic data might yield novel trait-trait correlations and review how to link phenotypic data to genomic data. Finally, we provide perspectives on the golden future of machine learning and its potential in linking phenotypes to genomic features. © 2018 The Authors The Plant Journal published by John Wiley & Sons Ltd and Society for Experimental Biology.


April 21, 2020  |  

The role of genomic structural variation in the genetic improvement of polyploid crops

Many of our major crop species are polyploids, containing more than one genome or set of chromosomes. Polyploid crops present unique challenges, including difficulties in genome assembly, in discriminating between multiple gene and sequence copies, and in genetic mapping, hindering use of genomic data for genetics and breeding. Polyploid genomes may also be more prone to containing structural variation, such as loss of gene copies or sequences (presence–absence variation) and the presence of genes or sequences in multiple copies (copy-number variation). Although the two main types of genomic structural variation commonly identified are presence–absence variation and copy-number variation, we propose that homeologous exchanges constitute a third major form of genomic structural variation in polyploids. Homeologous exchanges involve the replacement of one genomic segment by a similar copy from another genome or ancestrally duplicated region, and are known to be extremely common in polyploids. Detecting all kinds of genomic structural variation is challenging, but recent advances such as optical mapping and long-read sequencing offer potential strategies to help identify structural variants even in complex polyploid genomes. All three major types of genomic structural variation (presence–absence, copy-number, and homeologous exchange) are now known to influence phenotypes in crop plants, with examples of flowering time, frost tolerance, and adaptive and agronomic traits. In this review, we summarize the challenges of genome analysis in polyploid crops, describe the various types of genomic structural variation and the genomics technologies and data that can be used to detect them, and collate information produced to date related to the impact of genomic structural variation on crop phenotypes. We highlight the importance of genomic structural variation for the future genetic improvement of polyploid crops.


April 21, 2020  |  

A hybrid de novo genome assembly of the honeybee, Apis mellifera, with chromosome-length scaffolds.

The ability to generate long sequencing reads and access long-range linkage information is revolutionizing the quality and completeness of genome assemblies. Here we use a hybrid approach that combines data from four genome sequencing and mapping technologies to generate a new genome assembly of the honeybee Apis mellifera. We first generated contigs based on PacBio sequencing libraries, which were then merged with linked-read 10x Chromium data followed by scaffolding using a BioNano optical genome map and a Hi-C chromatin interaction map, complemented by a genetic linkage map. Each of the assembly steps reduced the number of gaps and incorporated a substantial amount of additional sequence into scaffolds. The new assembly (Amel_HAv3) is significantly more contiguous and complete than the previous one (Amel_4.5), based mainly on Sanger sequencing reads. N50 of contigs is 120-fold higher (5.381 Mbp compared to 0.053 Mbp) and we anchor >98% of the sequence to chromosomes. All of the 16 chromosomes are represented as single scaffolds with an average of three sequence gaps per chromosome. The improvements are largely due to the inclusion of repetitive sequence that was unplaced in previous assemblies. In particular, our assembly is highly contiguous across centromeres and telomeres and includes hundreds of AvaI and AluI repeats associated with these features. The improved assembly will be of utility for refining gene models, studying genome function, mapping functional genetic variation, identification of structural variants, and comparative genomics.
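Several abstracts in this listing report contig N50 as a contiguity measure. As a reminder of what the statistic means, here is a minimal sketch of the usual definition: the length L such that contigs of length L or longer together contain at least half of all assembled bases. The function name and the toy contig lengths are hypothetical and unrelated to the assemblies described here.

```python
def n50(lengths):
    """Return the N50 of a collection of contig (or scaffold) lengths."""
    total = sum(lengths)
    running = 0
    # Walk from the longest contig down, accumulating bases until
    # at least half of the assembly is covered.
    for length in sorted(lengths, reverse=True):
        running += length
        if running * 2 >= total:
            return length
    raise ValueError("no contig lengths given")


# Hypothetical toy assembly: 20 bases total, so the 6 and 5 base contigs
# together (11 bases) are the first to cover half of it.
toy_n50 = n50([2, 3, 4, 5, 6])
```

Note that N50 says nothing about correctness: joining contigs wrongly also raises it, which is why the studies above validate assemblies against optical or genetic maps.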


July 7, 2019  |  

The genus Brachypodium as a model for perenniality and polyploidy

The genus Brachypodium contains annual and perennial species with both diploid and polyploid genomes. Like the annual species B. distachyon, some of the perennial and polyploid species have traits compatible with use as a model system (e.g. small genomes, rapid generation time, self-fertile and easy to grow). Thus, there is an opportunity to leverage the resources and knowledge developed for B. distachyon to use other Brachypodium species as models for perenniality and the regulation and evolution of polyploid genomes. There are two factors driving an increased interest in perenniality. First, several perennial grasses are being developed as biomass crops for the sustainable production of biofuel and it would be useful to have a perennial model system to rapidly test biotechnological crop improvement strategies for undesirable impacts on perenniality and winter hardiness. In addition, a deeper understanding of the molecular mechanisms underlying perenniality could be used to design strategies for improving energy crops, for example, by changing resource allocation during growth or by altering the onset of dormancy. The second factor driving increased interest in perenniality is the potential environmental benefits of developing perennial grain crops. B. sylvaticum is a perennial with attributes suitable for use as a perennial model system. A high efficiency transformation system has been developed and a genome sequencing project is underway. Since many important crops, including emerging biomass crops, are polyploid, there is a pressing need to understand the rules governing the evolution and regulation of polyploid genomes. Unfortunately, it is difficult to study polyploid crop genomes because of their size and the difficulty of manipulating those plants in the laboratory. By contrast, B. hybridum has a small polyploid genome and is easy to work with in the laboratory. In addition, analysis of the B. hybridum genome will be greatly aided by the genome sequences of the two extant diploid species (B. distachyon and B. stacei) that apparently gave rise to B. hybridum. Availability of high quality reference genomes for these three species will be a powerful resource for the study of polyploidy.


July 7, 2019  |  

The Brachypodium distachyon reference genome

Grasses provide the bulk of human calories but improvement in grass yields is hindered by the characteristically large and complex genomes of these species; the genomes of wheat, maize, and sugar cane are 17,000, 2300, and 10,000 Mb, respectively. Brachypodium distachyon has one of the smallest genomes of all grasses at 272 Mb, and a number of key traits that make it a good model grass. Brachypodium was the fourth sequenced grass genome, after rice, Sorghum, and maize, and was the first sequenced in the Pooideae subfamily, a diverse group that includes wheat, barley, oat, and rye. The Brachypodium genome was sequenced using a whole genome shotgun approach with Sanger sequencing and is nearly complete with 99.6% of the sequences anchored to five chromosomes. Sequencing of Brachypodium enabled comparative genomic analysis of grass genomes and shed light on processes involved in chromosome fusions and maintenance of a small genome. The high-quality Brachypodium genome sequence provides a framework for gene expression atlases, resequencing, quantitative trait loci (QTL) mapping, GWAS, and ENCODE datasets. The wealth of Brachypodium genomic resources has cemented its utility as a model organism and will facilitate translational work for improving the grasses that feed the world.


July 7, 2019  |  

De novo hybrid assembly of the rubber tree genome reveals evidence of paleotetraploidy in Hevea species.

Para rubber tree (Hevea brasiliensis) is an important economic species as it is the sole commercial producer of high-quality natural rubber. Here, we report a de novo hybrid assembly of BPM24 accession, which exhibits resistance to major fungal pathogens in Southeast Asia. Deep-coverage 454/Illumina short-read and Pacific Biosciences (PacBio) long-read sequence data were acquired to generate a preliminary draft, which was subsequently scaffolded using a long-range “Chicago” technique to obtain a final assembly of 1.26 Gb (N50 = 96.8 kb). The assembled genome contains 69.2% repetitive sequences and has a GC content of 34.31%. Using a high-density SNP-based genetic map, we were able to anchor 28.9% of the genome assembly (363 Mb) associated with over two thirds of the predicted protein-coding genes into rubber tree’s 18 linkage groups. These genetically anchored sequences allowed comparative analyses of the intragenomic homeologous synteny, providing the first concrete evidence to demonstrate the presence of paleotetraploidy in Hevea species. Additionally, the degree of macrosynteny conservation observed between rubber tree and cassava strongly supports the hypothesis that the paleotetraploidization event took place prior to the divergence of the Hevea and Manihot species.


July 7, 2019  |  

Genomic innovation for crop improvement.

Crop production needs to increase to secure future food supplies, while reducing its impact on ecosystems. Detailed characterization of plant genomes and genetic diversity is crucial for meeting these challenges. Advances in genome sequencing and assembly are being used to access the large and complex genomes of crops and their wild relatives. These have helped to identify a wide spectrum of genetic variation and permitted the association of genetic diversity with diverse agronomic phenotypes. In combination with improved and automated phenotyping assays and functional genomic studies, genomics is providing new foundations for crop-breeding systems.


July 7, 2019  |  

An improved genome assembly uncovers prolific tandem repeats in Atlantic cod.

The first Atlantic cod (Gadus morhua) genome assembly published in 2011 was one of the early genome assemblies exclusively based on high-throughput 454 pyrosequencing. Since then, rapid advances in sequencing technologies have led to a multitude of assemblies generated for complex genomes, although many of these are of a fragmented nature with a significant fraction of bases in gaps. The development of long-read sequencing and improved software now enable the generation of more contiguous genome assemblies. By combining data from Illumina, 454 and the longer PacBio sequencing technologies, as well as integrating the results of multiple assembly programs, we have created a substantially improved version of the Atlantic cod genome assembly. The sequence contiguity of this assembly is increased fifty-fold and the proportion of gap-bases has been reduced fifteen-fold. Compared to other vertebrates, the assembly contains an unusually high density of tandem repeats (TRs). Indeed, retrospective analyses reveal that gaps in the first genome assembly were largely associated with these TRs. We show that 21% of the TRs across the assembly, 19% in the promoter regions and 12% in the coding sequences are heterozygous in the sequenced individual. The inclusion of PacBio reads combined with the use of multiple assembly programs drastically improved the Atlantic cod genome assembly by successfully resolving long TRs. The high frequency of heterozygous TRs within or in the vicinity of genes in the genome indicates a considerable standing genomic variation in Atlantic cod populations, which is likely of evolutionary importance.


July 7, 2019  |  

Hybrid assembly of the large and highly repetitive genome of Aegilops tauschii, a progenitor of bread wheat, with the MaSuRCA mega-reads algorithm.

Long sequencing reads generated by single-molecule sequencing technology offer the possibility of dramatically improving the contiguity of genome assemblies. The biggest challenge today is that long reads have relatively high error rates, currently around 15%. The high error rates make it difficult to use this data alone, particularly with highly repetitive plant genomes. Errors in the raw data can lead to insertion or deletion errors (indels) in the consensus genome sequence, which in turn create significant problems for downstream analysis; for example, a single indel may shift the reading frame and incorrectly truncate a protein sequence. Here, we describe an algorithm that solves the high error rate problem by combining long, high-error reads with shorter but much more accurate Illumina sequencing reads, whose error rates average <1%. Our hybrid assembly algorithm combines these two types of reads to construct mega-reads, which are both long and accurate, and then assembles the mega-reads using the CABOG assembler, which was designed for long reads. We apply this technique to a large data set of Illumina and PacBio sequences from the species Aegilops tauschii, a large and extremely repetitive plant genome that has resisted previous attempts at assembly. We show that the resulting assembled contigs are far larger than in any previous assembly, with an N50 contig size of 486,807 nucleotides. We compare the contigs to independently produced optical maps to evaluate their large-scale accuracy, and to a set of high-quality bacterial artificial chromosome (BAC)-based assemblies to evaluate base-level accuracy. © 2017 Zimin et al.; Published by Cold Spring Harbor Laboratory Press.
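The abstract's point that a single indel can shift the reading frame and truncate a protein can be shown with a short sketch. It uses the standard genetic code; the function name and the toy sequences are hypothetical and are not taken from the Aegilops tauschii data.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO)
}


def translate(cds):
    """Translate a coding sequence from its first base, stopping at the
    first stop codon and ignoring any trailing partial codon."""
    protein = []
    for i in range(0, len(cds) - 2, 3):
        aa = CODON_TABLE[cds[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


# Hypothetical toy gene: ATG GCT GAA TGG TTA TAA
reference = "ATGGCTGAATGGTTATAA"
# A spurious one-base insertion (an extra A after the start codon), as a
# consensus error might introduce: ATG AGC TGA ...
with_indel = reference[:3] + "A" + reference[3:]

ref_protein = translate(reference)
indel_protein = translate(with_indel)
```

In this toy case the insertion brings a TGA that was out of frame in the reference into frame, so the predicted protein is cut from five residues to two. That is the kind of downstream error that polishing long reads with accurate short reads is meant to prevent.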


July 7, 2019  |  

Divergent and convergent modes of interaction between wheat and Puccinia graminis f. sp. tritici isolates revealed by the comparative gene co-expression network and genome analyses.

Two opposing evolutionary constraints exert pressure on plant pathogens: one to diversify virulence factors in order to evade plant defenses, and the other to retain virulence factors critical for maintaining a compatible interaction with the plant host. To better understand how the diversified arsenals of fungal genes promote interaction with the same compatible wheat line, we performed a comparative genomic analysis of two North American isolates of Puccinia graminis f. sp. tritici (Pgt). The patterns of inter-isolate divergence in the secreted candidate effector genes were compared with the levels of conservation and divergence of plant-pathogen gene co-expression networks (GCN) developed for each isolate. Comparative genomic analyses revealed a substantial level of inter-isolate divergence in effector gene complement and sequence divergence. Gene Ontology (GO) analyses of the conserved and unique parts of the isolate-specific GCNs identified a number of conserved host pathways targeted by both isolates. Interestingly, the degree of inter-isolate sub-network conservation varied widely for the different host pathways and was positively associated with the proportion of conserved effector candidates associated with each sub-network. While different Pgt isolates tended to exploit similar wheat pathways for infection, the mode of plant-pathogen interaction varied for different pathways with some pathways being associated with the conserved set of effectors and others being linked with the diverged or isolate-specific effectors. Our data suggest that at the intra-species level pathogen populations likely maintain divergent sets of effectors capable of targeting the same plant host pathways. This functional redundancy may play an important role in the dynamics of the “arms-race” between host and pathogen, serving as the basis for diverse virulence strategies and creating conditions where mutations in certain effector groups will not have a major effect on the pathogen’s ability to infect the host.


July 7, 2019  |  

Complete genome sequencing and targeted mutagenesis reveal virulence contributions of Tal2 and Tal4b of Xanthomonas translucens pv. undulosa ICMP11055 in bacterial leaf streak of wheat

Bacterial leaf streak caused by Xanthomonas translucens pv. undulosa (Xtu) is an important disease of wheat (Triticum aestivum) and barley (Hordeum vulgare) worldwide. Transcription activator-like effectors (TALEs) play determinative roles in many of the plant diseases caused by the different species and pathovars of Xanthomonas, but their role in this disease has not been characterized. ICMP11055 is a highly virulent Xtu strain from Iran. The aim of this study was to better understand genetic diversity of Xtu and to assess the role of TALEs in bacterial leaf streak of wheat by comparing the genome of this strain to the recently completely sequenced genome of a U.S. Xtu strain, and to several other draft X. translucens genomes, and by carrying out mutational analyses of the TALE (tal) genes the Iranian strain might harbor. The ICMP11055 genome, including its repeat-rich tal genes, was completely sequenced using single molecule, real-time technology (Pacific Biosciences). It consists of a single circular chromosome of 4,561,583 bp, containing 3,953 genes. Whole genome alignment with the genome of the United States Xtu strain XT4699 showed two major re-arrangements, nine genomic regions unique to ICMP11055, and one region unique to XT4699. ICMP11055 harbors 26 non-TALE type III effector genes and seven tal genes, compared to 25 and eight for XT4699. The tal genes occur singly or in pairs across five scattered loci. Four are identical to tal genes in XT4699. In addition to common repeat-variable diresidues (RVDs), the tal genes of ICMP11055, like those of XT4699, encode several RVDs rarely observed in Xanthomonas, including KG, NF, Y*, YD, and YK. Insertion and deletion mutagenesis of ICMP11055 tal genes followed by genetic complementation analysis in wheat cv. Chinese Spring revealed that Tal2 and Tal4b of ICMP11055 each contribute individually to the extent of disease caused by this strain. A largely conserved ortholog of tal2 is present in XT4699, but for tal4b, only a gene with partial, fragmented RVD sequence similarity can be found. Our results lay the foundation for identification of important host genes activated by Xtu TALEs as targets for the development of disease resistant varieties.

