Scaffolding Archives - Page 16 of 21

September 22, 2019

The genome of Rhizophagus clarus HR1 reveals a common genetic basis for auxotrophy among arbuscular mycorrhizal fungi.

Mycorrhizal symbiosis is one of the most fundamental types of mutualistic plant-microbe interaction. Among the many classes of mycorrhizae, the arbuscular mycorrhizae have the most general symbiotic style and the longest history. However, the genomes of arbuscular mycorrhizal (AM) fungi are not well characterized due to difficulties in cultivation and genetic analysis. In this study, we sequenced the genome of the AM fungus Rhizophagus clarus HR1, compared the sequence with the genome sequence of the model species R. irregularis, and checked for missing genes that encode enzymes in metabolic pathways related to their obligate biotrophy. In the genome of R. clarus, we confirmed the absence of cytosolic fatty acid synthase (FAS), whereas all mitochondrial FAS components were present. A KEGG pathway map identified the absence of genes encoding enzymes for several other metabolic pathways in the two AM fungi, including thiamine biosynthesis and the conversion of vitamin B6 derivatives. We also found that a large proportion of the genes encoding glucose-producing polysaccharide hydrolases, which are present even in ectomycorrhizal fungi, also appear to be absent in AM fungi. In this study, we found several new genes that are absent from the genomes of AM fungi in addition to the genes previously identified as missing. Missing genes for enzymes in primary metabolic pathways imply that AM fungi may have a higher dependency on host plants than other biotrophic fungi. These missing metabolic pathways provide a genetic basis to explore the physiological characteristics and auxotrophy of AM fungi.

September 22, 2019

A reference genome of the European beech (Fagus sylvatica L.).

The European beech is arguably the most important climax broad-leaved tree species in Central Europe, widely planted for its valuable wood. Here, we report the 542 Mb draft genome sequence of an up to 300-year-old individual (Bhaga) from an undisturbed stand in the Kellerwald-Edersee National Park in central Germany. Using a hybrid assembly approach, Illumina reads with short- and long-insert libraries, coupled with long Pacific Biosciences reads, we obtained an assembled genome size of 542 Mb, in line with flow cytometric genome size estimation. The largest scaffold was 1.15 Mb, the N50 length was 145 kb, and the L50 count was 983. The assembly contained 0.12% of Ns. A Benchmarking with Universal Single-Copy Orthologs (BUSCO) analysis retrieved 94% complete BUSCO genes, well in the range of other high-quality draft genomes of trees. A total of 62,012 protein-coding genes were predicted, assisted by transcriptome sequencing. In addition, we are reporting an efficient method for extracting high-molecular-weight DNA from dormant buds, by which contamination by environmental bacteria and fungi was kept at a minimum. The assembled genome will be a valuable resource and reference for future population genomics studies on the evolution and past climate change adaptation of beech and will be helpful for identifying genes, e.g., involved in drought tolerance, in order to select and breed individuals to adapt forestry to climate change in Europe. A continuously updated genome browser and download page can be accessed from beechgenome.net, which will include future genome versions of the reference individual Bhaga, as new sequencing approaches develop.
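The N50 length and L50 count quoted in this abstract are standard contiguity statistics. As a minimal sketch (the function name `n50_l50` and the toy lengths are illustrative, not taken from the paper): sort the scaffold lengths from longest to shortest and accumulate them until at least half of the total assembly length is covered. The length of the scaffold that crosses the halfway mark is the N50, and the number of scaffolds needed to get there is the L50.

```python
def n50_l50(lengths):
    """Return (N50, L50) for a list of scaffold or contig lengths."""
    if not lengths:
        raise ValueError("need at least one sequence length")
    ordered = sorted(lengths, reverse=True)   # longest first
    half_total = sum(ordered) / 2
    running = 0
    for count, length in enumerate(ordered, start=1):
        running += length
        if running >= half_total:
            # 'length' is the N50; 'count' is the L50
            return length, count


# Toy example (made-up lengths, not the beech assembly):
# total = 30, half = 15; 10 + 8 = 18 reaches the halfway mark
print(n50_l50([10, 8, 6, 4, 2]))  # (8, 2)
```

Read this way, the reported values say that the 983 longest scaffolds, each at least 145 kb long, together cover half of the 542 Mb assembly.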

September 22, 2019

Genome of an allotetraploid wild peanut Arachis monticola: a de novo assembly.

Arachis monticola (2n = 4x = 40) is the only allotetraploid wild peanut within the Arachis genus and section, with an AABB-type genome of ~2.7 Gb in size. The AA-type subgenome is derived from diploid wild peanut Arachis duranensis, and the BB-type subgenome is derived from diploid wild peanut Arachis ipaensis. A. monticola is regarded either as the direct progenitor of the cultivated peanut or as an introgressive derivative between the cultivated peanut and wild species. The large polyploid genome structure and enormous nearly identical regions of the genome make the assembly of chromosomal pseudomolecules very challenging. Here we report the first reference-quality assembly of the A. monticola genome, using a series of advanced technologies. The final whole genome of A. monticola is ~2.62 Gb and has a contig N50 and scaffold N50 of 106.66 Kb and 124.92 Mb, respectively. The vast majority (91.83%) of the assembled sequence was anchored onto the 20 pseudo-chromosomes, and 96.07% of assemblies were accurately separated into AA- and BB-subgenomes. We demonstrated the efficiency of the current strategy for de novo assembly of the highly complex allotetraploid species, wild peanut (A. monticola), based on whole-genome shotgun sequencing, single molecule real-time sequencing, high-throughput chromosome conformation capture technology, and BioNano optical genome maps. These combined technologies produced a reference-quality genome of the allotetraploid wild peanut, which is valuable for understanding peanut domestication and evolution within the Arachis genus and among legume crops.

September 22, 2019

Footprints of parasitism in the genome of the parasitic flowering plant Cuscuta campestris.

A parasitic lifestyle, where plants procure some or all of their nutrients from other living plants, has evolved independently in many dicotyledonous plant families and is a major threat for agriculture globally. Nevertheless, no genome sequence of a parasitic plant has been reported to date. Here we describe the genome sequence of the parasitic field dodder, Cuscuta campestris. The genome contains signatures of a fairly recent whole-genome duplication and lacks genes for pathways superfluous to a parasitic lifestyle. Specifically, genes needed for high photosynthetic activity are lost, explaining the low photosynthesis rates displayed by the parasite. Moreover, several genes involved in nutrient uptake processes from the soil are lost. On the other hand, evidence for horizontal gene transfer by way of genomic DNA integration from the parasite’s hosts is found. We conclude that the parasitic lifestyle has left characteristic footprints in the C. campestris genome.

September 22, 2019

Sea cucumber genome provides insights into saponin biosynthesis and aestivation regulation.

Echinoderms exhibit several fascinating evolutionary innovations that are rarely seen in the animal kingdom, but how these animals attained such features is not well understood. Here we report the sequencing and analysis of the genome and extensive transcriptomes of the sea cucumber Apostichopus japonicus, a species from a special echinoderm group with extraordinary potential for saponin synthesis, aestivation and organ regeneration. The sea cucumber does not possess a reorganized Hox cluster as previously assumed for all echinoderms, and the spatial expression of Hox7 and Hox11/13b potentially guides the embryo-to-larva axial transformation. Contrary to the typical production of lanosterol in animal cholesterol synthesis, the oxidosqualene cyclase of sea cucumber produces parkeol for saponin synthesis and has “plant-like” motifs suggestive of convergent evolution. The transcriptional factors Klf2 and Egr1 are identified as key regulators of aestivation, probably exerting their effects through a clock gene-controlled process. Intestinal hypometabolism during aestivation is driven by the DNA hypermethylation of various metabolic gene pathways, whereas the transcriptional network of intestine regeneration involves diverse signaling pathways, including Wnt, Hippo and FGF. Decoding the sea cucumber genome provides a new avenue for an in-depth understanding of the extraordinary features of sea cucumbers and other echinoderms.

September 22, 2019

Draft genome sequence of Annulohypoxylon stygium, Aspergillus mulundensis, Berkeleyomyces basicola (syn. Thielaviopsis basicola), Ceratocystis smalleyi, two Cercospora beticola strains, Coleophoma cylindrospora, Fusarium fracticaudum, Phialophora cf. hyalina, and Morchella septimelata.

Draft genomes of the species Annulohypoxylon stygium, Aspergillus mulundensis, Berkeleyomyces basicola (syn. Thielaviopsis basicola), Ceratocystis smalleyi, two Cercospora beticola strains, Coleophoma cylindrospora, Fusarium fracticaudum, Phialophora cf. hyalina and Morchella septimelata are presented. Both mating types (MAT1-1 and MAT1-2) of Cercospora beticola are included. Two strains of Coleophoma cylindrospora that produce sulfated homotyrosine echinocandin variants, FR209602, FR220897 and FR220899, are presented. The sequencing of Aspergillus mulundensis, Coleophoma cylindrospora and Phialophora cf. hyalina has enabled mapping of the gene clusters encoding the chemical diversity from the echinocandin pathways, providing data that reveal the complexity of secondary metabolism in these different species. Overall, these genomes provide a valuable resource for understanding the molecular processes underlying pathogenicity (in some cases), biology and toxin production of these economically important fungi.

September 22, 2019

Assembly of chromosome-scale contigs by efficiently resolving repetitive sequences with long reads

Due to the large number of repetitive sequences in complex eukaryotic genomes, fragmented and incompletely assembled genomes lose value as reference sequences, often due to short contigs that cannot be anchored or are mispositioned on chromosomes. Here we report a novel method, Highly Efficient Repeat Assembly (HERA), which includes a new concept called a connection graph as well as algorithms for constructing the graph. HERA resolves repeats at high efficiency with single-molecule sequencing data, and enables the assembly of chromosome-scale contigs by further integrating genome maps and Hi-C data. We tested HERA with the genomes of rice R498, maize B73, human HX1 and Tartary buckwheat Pinku1. HERA can correctly assemble most of the tandemly repetitive sequences in rice using single-molecule sequencing data only. Using the same maize and human sequencing data published by Jiao et al. (2017) and Shi et al. (2016), respectively, we dramatically improved on the sequence contiguity compared with the published assemblies, increasing the contig N50 from 1.3 Mb to 61.2 Mb in the maize B73 assembly and from 8.3 Mb to 54.4 Mb in the human HX1 assembly with HERA. We provided a high-quality maize reference genome with 96.9% of the gaps filled (only 76 gaps left) and several incorrectly positioned sequences fixed compared with the B73 RefGen_v4 assembly. Comparisons between the HERA assembly of HX1 and the human GRCh38 reference genome showed that many gaps in GRCh38 could be filled, and that GRCh38 contained some potential errors that could be fixed. We assembled the Pinku1 genome into 12 scaffolds with a contig N50 size of 27.85 Mb. HERA serves as a new genome assembly/phasing method to generate high-quality sequences for complex genomes and as a curation tool to improve the contiguity and completeness of existing reference genomes, including the correction of assembly errors in repetitive regions.

September 22, 2019

Genome Assembly.

Genome assembly uses sequence similarity to go from sequencing reads to longer contiguous sequences (contigs). Scaffolds are contigs linked together by gaps where the order and orientation of the contigs is known but the exact sequence connecting two contigs is unknown, represented by Ns which estimate the gap length. Here we describe recommendations for genome assembly for different sequencing technologies, describe organelle assembly, and review how to perform assembly quality control.
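The contig/scaffold relationship described here can be illustrated with a short sketch. Assuming a scaffold is held as a plain string in which runs of `N` mark gaps (the function name `split_scaffold` and the `min_gap` parameter are hypothetical, not from the chapter), splitting at those runs recovers the ordered contigs together with the estimated gap lengths.

```python
import re


def split_scaffold(scaffold, min_gap=1):
    """Split a scaffold into contigs at runs of N and report gap lengths.

    min_gap is an illustrative threshold: runs of N shorter than this are
    treated as ambiguous bases inside a contig rather than as gaps.
    """
    gap_pattern = re.compile(r"[Nn]{%d,}" % min_gap)
    # Contigs are the non-empty pieces between gap runs, in scaffold order
    contigs = [piece for piece in gap_pattern.split(scaffold) if piece]
    # Each run length is the assembler's estimate of the gap size
    gaps = [m.end() - m.start() for m in gap_pattern.finditer(scaffold)]
    return contigs, gaps


contigs, gaps = split_scaffold("ACGTNNNNNGGCCNNTTAA")
print(contigs)  # ['ACGT', 'GGCC', 'TTAA']
print(gaps)     # [5, 2]
```

The order of the returned contigs mirrors their order in the scaffold, which is the information scaffolding adds over an unordered set of contigs.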

September 22, 2019

The mutation rate and the age of the sex chromosomes in Silene latifolia.

Many aspects of sex chromosome evolution are common to both plants and animals [1], but the process of Y chromosome degeneration, where genes on the Y become non-functional over time, may be much slower in plants due to purifying selection against deleterious mutations in the haploid gametophyte [2, 3]. Testing for differences in Y degeneration between the kingdoms has been hindered by the absence of accurate age estimates for plant sex chromosomes. Here, we used genome resequencing to estimate the spontaneous mutation rate and the age of the sex chromosomes in white campion (Silene latifolia). Screening of single nucleotide polymorphisms (SNPs) in parents and 10 F1 progeny identified 39 de novo mutations and yielded a rate of 7.31 × 10⁻⁹ (95% confidence interval: 5.20 × 10⁻⁹ to 8.00 × 10⁻⁹) mutations per site per haploid genome per generation. Applying this mutation rate to the synonymous divergence between homologous X- and Y-linked genes (gametologs) gave age estimates of 11.00 and 6.32 million years for the old and young strata, respectively. Based on SNP segregation patterns, we inferred which genes were Y-linked and found that at least 47% are already dysfunctional. Applying our new estimates for the age of the sex chromosomes indicates that the rate of Y degeneration in S. latifolia is nearly 2-fold slower when compared to animal sex chromosomes of a similar age. Our revised estimates support Y degeneration taking place more slowly in plants, a discrepancy that may be explained by differences in the life cycles of animals and plants. Copyright © 2018 Elsevier Ltd. All rights reserved.
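The two calculations behind this abstract can be sketched under textbook assumptions; the abstract does not give the exact procedure, so this is an illustration rather than the authors' method. A per-site, per-haploid-genome rate is commonly obtained by dividing the number of de novo mutations by twice the number of offspring times the number of callable sites, and a divergence time is commonly obtained as synonymous divergence divided by twice the mutation rate. The function names, the callable-site count, the dS value, and the one-year generation time below are all hypothetical placeholders.

```python
def mutation_rate(n_mutations, n_offspring, callable_sites):
    """Mutations per site per haploid genome per generation.

    Each diploid offspring carries two haploid genomes, hence the factor 2.
    """
    return n_mutations / (2 * n_offspring * callable_sites)


def divergence_time(ds, mu, generation_time_years=1.0):
    """Time since two lineages split, assuming both accumulate
    synonymous substitutions at the same rate mu per generation."""
    return ds / (2 * mu) * generation_time_years


# Hypothetical callable-site count (NOT from the paper), for illustration only
print(mutation_rate(39, 10, 1e8))

# Reported rate from the abstract with a hypothetical dS of 0.1 (NOT from the paper)
age = divergence_time(0.1, 7.31e-9)
print(f"{age / 1e6:.2f} million years")  # 6.84 million years
```

Because age scales inversely with the mutation rate, the width of the reported confidence interval on the rate translates directly into uncertainty on the stratum ages.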

September 22, 2019

Characterization and high-quality draft genome sequence of Herbivorax saccincola A7, an anaerobic, alkaliphilic, thermophilic, cellulolytic, and xylanolytic bacterium.

An anaerobic, cellulolytic-xylanolytic bacterium, designated strain A7, was isolated from a cellulose-degrading bacterial community inhabiting bovine manure compost on Ishigaki Island, Japan, by enrichment culture using unpretreated corn stover as the sole carbon source. The strain was Gram-positive, non-endospore forming, non-motile, and formed orange colonies on solid medium. Strain A7 was identified as Herbivorax saccincola by DNA-DNA hybridization, and phylogenetic analysis based on 16S rRNA gene sequences showed that it was closely related to H. saccincola GGR1 (= DSM 101079T). H. saccincola A7 (= JCM 31827=DSM 104321) had quite similar phenotypic characteristics to those of strain GGR1. However, the optimum growth of A7 was at alkaline pH (9.0) and 55°C, compared to pH 7.0 at 60°C for GGR1, and the fatty acid profile of A7 contained 1.7-times more C17:0 iso than GGR1. The draft genome sequence revealed that H. saccincola A7 possessed a cellulosome-like extracellular macromolecular complex, which has also been found for Clostridium thermocellum and C. clariflavum. H. saccincola A7 contained more glycoside hydrolases (GHs) belonging to GH families-11 and -2, and more diversity of xylanolytic enzymes, than C. thermocellum and C. clariflavum. H. saccincola A7 could grow on xylan because it encoded essential genes for xylose metabolism, such as a xylose transporter, xylose isomerase, xylulokinase, and ribulose-phosphate 3-epimerase, which are absent from C. thermocellum. These results indicated that H. saccincola A7 has great potential as a microorganism that can effectively degrade lignocellulosic biomass. Copyright © 2018 Elsevier GmbH. All rights reserved.

September 22, 2019

Investigating the central metabolism of Clostridium thermosuccinogenes.

Clostridium thermosuccinogenes is a thermophilic anaerobic bacterium able to convert various carbohydrates to succinate and acetate as main fermentation products. Genomes of the four publicly available strains have been sequenced, and the genome of the type strain has been closed. The annotated genomes were used to reconstruct the central metabolism, and enzyme assays were used to validate annotations and to determine cofactor specificity. The genes were identified for the pathways to all fermentation products, as well as for the Embden-Meyerhof-Parnas pathway and the pentose phosphate pathway. Notably, a candidate transaldolase was lacking, and transcriptomics during growth on glucose versus that on xylose did not provide any leads to potential transaldolase genes or alternative pathways connecting the C5 with the C3/C6 metabolism. Enzyme assays showed xylulokinase to prefer GTP over ATP, which could be of importance for engineering xylose utilization in related thermophilic species of industrial relevance. Furthermore, the gene responsible for malate dehydrogenase was identified via heterologous expression in Escherichia coli and subsequent assays with the cell extract, which has proven to be a simple and powerful method for the basal characterization of thermophilic enzymes. IMPORTANCE: Running industrial fermentation processes at elevated temperatures has several advantages, including reduced cooling requirements, increased reaction rates and solubilities, and a possibility to perform simultaneous saccharification and fermentation of a pretreated biomass. Most studies with thermophiles so far have focused on bioethanol production. Clostridium thermosuccinogenes seems an attractive production organism for organic acids, succinic acid in particular, from lignocellulosic biomass-derived sugars. This study provides valuable insights into its central metabolism and GTP and PPi cofactor utilization. Copyright © 2018 American Society for Microbiology.

September 22, 2019

A pathogenesis-related 10 protein catalyzes the final step in thebaine biosynthesis.

The ultimate step in the formation of thebaine, a pentacyclic opiate alkaloid readily converted to the narcotic analgesics codeine and morphine in the opium poppy, has long been presumed to be a spontaneous reaction. We have detected and purified a novel enzyme from opium poppy latex that is capable of the efficient formation of thebaine from (7S)-salutaridinol 7-O-acetate at the expense of labile hydroxylated byproducts, which are preferentially produced by spontaneous allylic elimination. Remarkably, thebaine synthase (THS), a member of the pathogenesis-related 10 protein (PR10) superfamily, is encoded within a novel gene cluster in the opium poppy genome that also includes genes encoding the four biosynthetic enzymes immediately upstream. THS is a missing component that is crucial to the development of fermentation-based opiate production and dramatically improves thebaine yield in engineered yeast.

September 22, 2019

De novo assembly of a young Drosophila Y chromosome using single-molecule sequencing and chromatin conformation capture.

While short-read sequencing technology has resulted in a sharp increase in the number of species with genome assemblies, these assemblies are typically highly fragmented. Repeats pose the largest challenge for reference genome assembly, and pericentromeric regions and the repeat-rich Y chromosome are typically omitted from sequencing projects. Here, we assemble the genome of Drosophila miranda using long reads for contig formation, chromatin interaction maps for scaffolding and short reads, and optical mapping and bacterial artificial chromosome (BAC) clone sequencing for consensus validation. Our assembly recovers entire chromosomes and contains large fractions of repetitive DNA, including about 41.5 Mb of pericentromeric and telomeric regions, and >100 Mb of the recently formed highly repetitive neo-Y chromosome. While Y chromosome evolution is typically characterized by global sequence loss and shrinkage, the neo-Y increased in size by almost 3-fold because of the accumulation of repetitive sequences. Our high-quality assembly allows us to reconstruct the chromosomal events that have led to the unusual sex chromosome karyotype in D. miranda, including the independent de novo formation of a pair of sex chromosomes at two distinct time points, or the reversion of a former Y chromosome to an autosome.

September 22, 2019

High-quality assembly of the reference genome for scarlet sage, Salvia splendens, an economically important ornamental plant.

Salvia splendens Ker-Gawler, scarlet or tropical sage, is a tender herbaceous perennial widely introduced and seen in public gardens all over the world. With few molecular resources, breeding is still restricted to traditional phenotypic selection, and the genetic mechanisms underlying phenotypic variation remain unknown. Hence, a high-quality reference genome will be very valuable for marker-assisted breeding, genome editing, and molecular genetics. We generated 66 Gb and 37 Gb of raw DNA sequences, respectively, from whole-genome sequencing of a largely homozygous scarlet sage inbred line using Pacific Biosciences (PacBio) single-molecule real-time and Illumina HiSeq sequencing platforms. The PacBio de novo assembly yielded a final genome with a scaffold N50 size of 3.12 Mb and a total length of 808 Mb. The repetitive sequences identified accounted for 57.52% of the genome sequence, and ~54,008 protein-coding genes were predicted collectively with ab initio and homology-based gene prediction from the masked genome. The divergence time between S. splendens and Salvia miltiorrhiza was estimated at 28.21 million years ago (Mya). Moreover, 3,797 species-specific genes and 1,187 expanded gene families were identified for the scarlet sage genome. We provide the first genome sequence and gene annotation for the scarlet sage. The availability of these resources will be of great importance for further breeding strategies, genome editing, and comparative genomics among related species.

September 22, 2019

A graph-based approach to diploid genome assembly.

Constructing high-quality haplotype-resolved de novo assemblies of diploid genomes is important for revealing the full extent of structural variation and its role in health and disease. Current assembly approaches often collapse the two sequences into one haploid consensus sequence and, therefore, fail to capture the diploid nature of the organism under study. Thus, building an assembler capable of producing accurate and complete diploid assemblies, while being resource-efficient with respect to sequencing costs, is a key challenge to be addressed by the bioinformatics community. We present a novel graph-based approach to diploid assembly, which combines accurate Illumina data and long-read Pacific Biosciences (PacBio) data. We demonstrate the effectiveness of our method on a pseudo-diploid yeast genome and show that we require as little as 50× coverage Illumina data and 10× PacBio data to generate accurate and complete assemblies. Additionally, we show that our approach has the ability to detect and phase structural variants. https://github.com/whatshap/whatshap. Supplementary data are available at Bioinformatics online.

