Reference genome Archives - Page 48 of 64

September 22, 2019

A reference genome of the European beech (Fagus sylvatica L.).

The European beech is arguably the most important climax broad-leaved tree species in Central Europe, widely planted for its valuable wood. Here, we report the 542 Mb draft genome sequence of an up to 300-year-old individual (Bhaga) from an undisturbed stand in the Kellerwald-Edersee National Park in central Germany. Using a hybrid assembly approach that combined Illumina reads from short- and long-insert libraries with long Pacific Biosciences reads, we obtained an assembled genome size of 542 Mb, in line with flow cytometric genome size estimation. The largest scaffold was 1.15 Mb, the N50 length was 145 kb, and the L50 count was 983. The assembly contained 0.12% of Ns. A Benchmarking with Universal Single-Copy Orthologs (BUSCO) analysis retrieved 94% complete BUSCO genes, well in the range of other high-quality draft genomes of trees. A total of 62,012 protein-coding genes were predicted, assisted by transcriptome sequencing. In addition, we report an efficient method for extracting high-molecular-weight DNA from dormant buds, by which contamination by environmental bacteria and fungi was kept to a minimum. The assembled genome will be a valuable resource and reference for future population genomics studies on the evolution and past climate change adaptation of beech and will be helpful for identifying genes, e.g., those involved in drought tolerance, in order to select and breed individuals to adapt forestry to climate change in Europe. A continuously updated genome browser and download page can be accessed from beechgenome.net, which will include future genome versions of the reference individual Bhaga, as new sequencing approaches develop.
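The N50 and L50 figures quoted above are standard contiguity statistics. As a minimal sketch (the function name and the toy scaffold lengths are illustrative assumptions, not taken from the beech assembly), they can be computed from a list of scaffold lengths like this:

```python
def n50_l50(lengths):
    """Return (N50, L50) for a list of scaffold or contig lengths.

    N50 is the length of the shortest sequence in the smallest set of
    longest sequences that together cover at least half of the assembly;
    L50 is the number of sequences in that set.
    """
    if not lengths:
        raise ValueError("lengths must not be empty")
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for count, length in enumerate(ordered, start=1):
        running += length
        if running >= half_total:
            return length, count


# Toy scaffold lengths (hypothetical, for illustration only).
toy_lengths = [80, 70, 50, 40, 30, 20, 10]
n50, l50 = n50_l50(toy_lengths)
print(f"N50 = {n50}, L50 = {l50}")  # prints "N50 = 70, L50 = 2"
```

Read this way, an N50 of 145 kb with an L50 of 983 means that the 983 longest scaffolds, each at least 145 kb long, together account for at least half of the assembled sequence.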

September 22, 2019

Genome of an allotetraploid wild peanut Arachis monticola: a de novo assembly.

Arachis monticola (2n = 4x = 40) is the only allotetraploid wild peanut within the Arachis genus and section, with an AABB-type genome of ~2.7 Gb in size. The AA-type subgenome is derived from diploid wild peanut Arachis duranensis, and the BB-type subgenome is derived from diploid wild peanut Arachis ipaensis. A. monticola is regarded either as the direct progenitor of the cultivated peanut or as an introgressive derivative between the cultivated peanut and wild species. The large polyploid genome and its extensive, nearly identical regions make the assembly of chromosomal pseudomolecules very challenging. Here we report the first reference-quality assembly of the A. monticola genome, using a series of advanced technologies. The final whole genome of A. monticola is ~2.62 Gb and has a contig N50 and scaffold N50 of 106.66 Kb and 124.92 Mb, respectively. The vast majority (91.83%) of the assembled sequence was anchored onto the 20 pseudo-chromosomes, and 96.07% of assemblies were accurately separated into AA- and BB-subgenomes. We demonstrate the efficiency of a current strategy for de novo assembly of a highly complex allotetraploid species, wild peanut (A. monticola), based on whole-genome shotgun sequencing, single molecule real-time sequencing, high-throughput chromosome conformation capture technology, and BioNano optical genome maps. These combined technologies produced a reference-quality genome of the allotetraploid wild peanut, which is valuable for understanding peanut domestication and evolution within the Arachis genus and among legume crops.

September 22, 2019

The impact of Staphylococcus aureus genomic variation on clinical phenotype of children with acute hematogenous osteomyelitis.

Children with acute hematogenous osteomyelitis (AHO) have a broad spectrum of illness ranging from mild to severe. The purpose of this study is to evaluate the impact of genomic variation of Staphylococcus aureus on clinical phenotype of affected children and determine which virulence genes correlate with severity of illness. De novo whole genome sequencing was conducted for a strain of Community Acquired Methicillin Resistant Staphylococcus aureus (CA-MRSA), using PacBio Hierarchical Genome Assembly Process (HGAP) from 6 Single Molecule Real Time (SMRT) Cells, as a reference for DNA library assembly of 71 Staphylococcus aureus isolates from children with AHO. Virulence gene annotation was based on exhaustive literature review and genomic data in NCBI for Staphylococcus aureus. Clinical phenotype was assessed using a validated severity score. Kruskal-Wallis rank sum test determined association between clinical severity and virulence gene presence using False Discovery Rate (FDR), significance < 0.01. PacBio produced an assembled genome of 2,898,306 bp and 2054 Open Reading Frames (ORFs). Annotation confirmed 201 virulence genes. Statistical analysis of gene presence by clinical severity found 40 genes significantly associated with severity of illness (FDR = 0.009). MRSA isolates encoded a significantly greater number of virulence genes than did MSSA (p < 0.0001). Phylogenetic analysis by maximum likelihood (PAML) demonstrated the relatedness of genomic distance to clinical phenotype. The Staphylococcus aureus genome contains virulence genes which are significantly associated with severity of illness in children with osteomyelitis. This study introduces a novel reference strain and detailed annotation of Staphylococcus aureus virulence genes. While this study does not address bacterial gene expression, a platform is created for future transcriptome investigations to elucidate the complex mechanisms involved in childhood osteomyelitis.
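The abstract reports significance under a false discovery rate threshold but does not say which FDR procedure was used. As a hedged illustration only, the sketch below assumes the common Benjamini-Hochberg adjustment and applies it to made-up per-gene p-values; the function name and the numbers are hypothetical and do not come from the study.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical p-values, one per virulence gene tested against severity.
toy_p = [0.001, 0.008, 0.039, 0.041, 0.6]
q_values = benjamini_hochberg(toy_p)

# Genes passing an FDR threshold of 0.01, as in the abstract's cut-off.
significant = [i for i, q in enumerate(q_values) if q < 0.01]
print(significant)  # prints "[0]"
```

In this toy case the second gene has a raw p-value below 0.01 but an adjusted value of 0.02, so it no longer passes once multiple testing is taken into account.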

September 22, 2019

Footprints of parasitism in the genome of the parasitic flowering plant Cuscuta campestris.

A parasitic lifestyle, where plants procure some or all of their nutrients from other living plants, has evolved independently in many dicotyledonous plant families and is a major threat to agriculture globally. Nevertheless, no genome sequence of a parasitic plant has been reported to date. Here we describe the genome sequence of the parasitic field dodder, Cuscuta campestris. The genome contains signatures of a fairly recent whole-genome duplication and lacks genes for pathways superfluous to a parasitic lifestyle. Specifically, genes needed for high photosynthetic activity are lost, explaining the low photosynthesis rates displayed by the parasite. Moreover, several genes involved in nutrient uptake processes from the soil are lost. On the other hand, evidence for horizontal gene transfer by way of genomic DNA integration from the parasite’s hosts is found. We conclude that the parasitic lifestyle has left characteristic footprints in the C. campestris genome.

September 22, 2019

Comparative genomics of Campylobacter concisus: Analysis of clinical strains reveals genome diversity and pathogenic potential.

In recent years, an increasing number of Campylobacter species have been associated with human gastrointestinal (GI) diseases including gastroenteritis, inflammatory bowel disease, and colorectal cancer. Campylobacter concisus, an oral commensal historically linked to gingivitis and periodontitis, has been increasingly detected in the lower GI tract. In the present study, we generated robust genome sequence data from C. concisus strains and undertook a comprehensive pangenome assessment to identify C. concisus virulence properties and to explain potential adaptations acquired while residing in specific ecological niche(s) of the GI tract. Genomes of 53 new C. concisus strains were sequenced, assembled, and annotated, including 36 strains from gastroenteritis patients, 13 strains from Crohn’s disease patients and four strains from colitis patients (three collagenous colitis and one lymphocytic colitis). When compared with previously published sequences, strains clustered into two main groups/genomospecies (GS), with phylogenetic clustering explained neither by disease phenotype nor sample location. Paired oral/faecal isolates from the same patient indicated that there are few genetic differences between oral and gut isolates, which suggests that gut isolates most likely reflect oral strain relocation. Type IV and VI secretion system genes, known to be important for pathogenicity in the Campylobacter genus, were present in the genome assemblies, with 82% containing Type VI secretion system genes. Our findings indicate that C. concisus strains are genetically diverse, and the variability in bacterial secretion system content may play an important role in their virulence potential.

September 22, 2019

Sea cucumber genome provides insights into saponin biosynthesis and aestivation regulation.

Echinoderms exhibit several fascinating evolutionary innovations that are rarely seen in the animal kingdom, but how these animals attained such features is not well understood. Here we report the sequencing and analysis of the genome and extensive transcriptomes of the sea cucumber Apostichopus japonicus, a species from a special echinoderm group with extraordinary potential for saponin synthesis, aestivation and organ regeneration. The sea cucumber does not possess a reorganized Hox cluster as previously assumed for all echinoderms, and the spatial expression of Hox7 and Hox11/13b potentially guides the embryo-to-larva axial transformation. Contrary to the typical production of lanosterol in animal cholesterol synthesis, the oxidosqualene cyclase of sea cucumber produces parkeol for saponin synthesis and has “plant-like” motifs suggestive of convergent evolution. The transcription factors Klf2 and Egr1 are identified as key regulators of aestivation, probably exerting their effects through a clock gene-controlled process. Intestinal hypometabolism during aestivation is driven by the DNA hypermethylation of various metabolic gene pathways, whereas the transcriptional network of intestine regeneration involves diverse signaling pathways, including Wnt, Hippo and FGF. Decoding the sea cucumber genome provides a new avenue for an in-depth understanding of the extraordinary features of sea cucumbers and other echinoderms.

September 22, 2019

Assembly of chromosome-scale contigs by efficiently resolving repetitive sequences with long reads

Due to the large number of repetitive sequences in complex eukaryotic genomes, fragmented and incompletely assembled genomes lose value as reference sequences, often because short contigs cannot be anchored or are mispositioned on chromosomes. Here we report a novel method, Highly Efficient Repeat Assembly (HERA), which includes a new concept called a connection graph as well as algorithms for constructing the graph. HERA resolves repeats at high efficiency with single-molecule sequencing data, and enables the assembly of chromosome-scale contigs by further integrating genome maps and Hi-C data. We tested HERA with the genomes of rice R498, maize B73, human HX1 and Tartary buckwheat Pinku1. HERA can correctly assemble most of the tandemly repetitive sequences in rice using single-molecule sequencing data only. Using the same maize and human sequencing data published by Jiao et al. (2017) and Shi et al. (2016), respectively, we dramatically improved on the sequence contiguity compared with the published assemblies, increasing the contig N50 from 1.3 Mb to 61.2 Mb in the maize B73 assembly and from 8.3 Mb to 54.4 Mb in the human HX1 assembly with HERA. We provided a high-quality maize reference genome with 96.9% of the gaps filled (only 76 gaps left) and several incorrectly positioned sequences fixed compared with the B73 RefGen_v4 assembly. Comparisons between the HERA assembly of HX1 and the human GRCh38 reference genome showed that many gaps in GRCh38 could be filled, and that GRCh38 contained some potential errors that could be fixed. We assembled the Pinku1 genome into 12 scaffolds with a contig N50 size of 27.85 Mb. HERA serves as a new genome assembly/phasing method to generate high-quality sequences for complex genomes and as a curation tool to improve the contiguity and completeness of existing reference genomes, including the correction of assembly errors in repetitive regions.

September 22, 2019

Fungal Epigenomics: Detection and Analysis.

Across Eukaryota, DNA modifications play an important role in regulation of gene expression. While 5-methylcytosine (5mC) has been explored in depth, other modifications such as 6-methyladenine (6mA) have historically been overlooked, in part due to technical difficulties in collecting/analyzing these data. However, recent technological advances have enabled exploration of these marks with much greater detail and on a larger scale. In this chapter, we discuss multiple methods for identifying and analyzing both 5mC and 6mA across fungi.

September 22, 2019

Computational Modeling of Multidrug-Resistant Bacteria

Understanding how complex phenotypes arise from individual molecules and their interactions is a primary challenge in biology, and computational approaches have been increasingly employed to tackle this task. In this chapter, we describe current efforts by FIOCRUZ and partners to develop integrated computational models of multidrug-resistant bacteria. The bacterium chosen as the main focus of this effort is Pseudomonas aeruginosa, an opportunistic pathogen associated with a broad spectrum of infections in humans. P. aeruginosa is now one of the main problems in healthcare-associated infections (HAI) worldwide, because of its great capacity for survival in hospital environments and its intrinsic resistance to many antibiotics. Our overall research objective is to use integrated computational models to accurately predict a wide range of observable cellular behaviors of multidrug-resistant P. aeruginosa CCBH4851, a strain belonging to the clone ST277, which is endemic in Brazil. In this chapter, after a brief introduction to P. aeruginosa biology, we discuss the construction of metabolic and gene regulatory networks of P. aeruginosa CCBH4851 from its genome. We also illustrate how these networks can be integrated into a single model, and we discuss methods for identifying potential therapeutic targets through integrated models.

September 22, 2019

The mutation rate and the age of the sex chromosomes in Silene latifolia.

Many aspects of sex chromosome evolution are common to both plants and animals [1], but the process of Y chromosome degeneration, where genes on the Y become non-functional over time, may be much slower in plants due to purifying selection against deleterious mutations in the haploid gametophyte [2, 3]. Testing for differences in Y degeneration between the kingdoms has been hindered by the absence of accurate age estimates for plant sex chromosomes. Here, we used genome resequencing to estimate the spontaneous mutation rate and the age of the sex chromosomes in white campion (Silene latifolia). Screening of single nucleotide polymorphisms (SNPs) in parents and 10 F1 progeny identified 39 de novo mutations and yielded a rate of 7.31 × 10⁻⁹ (95% confidence interval: 5.20 × 10⁻⁹ to 8.00 × 10⁻⁹) mutations per site per haploid genome per generation. Applying this mutation rate to the synonymous divergence between homologous X- and Y-linked genes (gametologs) gave age estimates of 11.00 and 6.32 million years for the old and young strata, respectively. Based on SNP segregation patterns, we inferred which genes were Y-linked and found that at least 47% are already dysfunctional. Applying our new estimates for the age of the sex chromosomes indicates that the rate of Y degeneration in S. latifolia is nearly 2-fold slower when compared to animal sex chromosomes of a similar age. Our revised estimates support Y degeneration taking place more slowly in plants, a discrepancy that may be explained by differences in the life cycles of animals and plants. Copyright © 2018 Elsevier Ltd. All rights reserved.
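The abstract does not spell out its formulas, so the following is only a sketch of the textbook estimators usually applied in this kind of analysis: a per-site, per-haploid-genome mutation rate from a parent-offspring screen, and a divergence time from synonymous divergence. The function names and every input value are hypothetical and chosen for round numbers; they are not the study's data.

```python
def mutation_rate(n_mutations, n_offspring, callable_sites):
    """De novo mutations per site per haploid genome per generation.

    Each diploid offspring carries two haploid genomes, hence the factor 2.
    """
    return n_mutations / (2 * n_offspring * callable_sites)


def divergence_generations(d_s, mu):
    """Generations since two lineages split, given synonymous divergence d_s.

    Divergence accumulates along both lineages (X and Y), hence 2 * mu.
    """
    return d_s / (2 * mu)


# Hypothetical inputs for illustration only.
mu = mutation_rate(n_mutations=10, n_offspring=5, callable_sites=1e8)
generations = divergence_generations(d_s=0.1, mu=mu)

# Converting to years requires a generation time, assumed here to be 1 year.
years = generations * 1.0
print(f"mu = {mu:.2e}, age = {years:.2e} years")
```

Because the age estimate scales inversely with the mutation rate, any uncertainty in the rate carries over directly into the estimated age of each stratum.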

September 22, 2019

Homogenization of sub-genome secretome gene expression patterns in the allodiploid fungus Verticillium longisporum

Allopolyploidization, genome duplication through interspecific hybridization, is an important evolutionary mechanism that can enable organisms to adapt to environmental changes or stresses. The increased adaptive potential of allopolyploids can be particularly relevant for plant pathogens in their ongoing quest for host immune response evasion. To this end, plant pathogens secrete a plethora of molecules that enable host colonization. Allodiploidization has resulted in the new plant pathogen Verticillium longisporum that infects different hosts than haploid Verticillium species. To reveal the impact of allodiploidization on plant pathogen evolution, we studied the genome and transcriptome dynamics of V. longisporum using next-generation sequencing. V. longisporum genome evolution is characterized by extensive chromosomal rearrangements, between as well as within parental chromosome sets, leading to a mosaic genome structure. In comparison to haploid Verticillium species, V. longisporum genes display stronger signs of positive selection. The expression patterns of the two sub-genomes show remarkable resemblance, suggesting that the parental gene expression patterns homogenized upon hybridization. Moreover, whereas V. longisporum genes encoding secreted proteins frequently display differential expression between the parental sub-genomes in culture medium, expression patterns homogenize upon plant colonization. Collectively, our results illustrate the adaptive potential of allodiploidy mediated by the interplay of two sub-genomes.

Author summary

Hybridization followed by whole-genome duplication, so-called allopolyploidization, provides genomic flexibility that is beneficial for survival under stressful conditions or invasiveness into new habitats. Allopolyploidization has mainly been studied in plants, but also occurs in other organisms, including fungi. Verticillium longisporum, an emerging fungal pathogen on brassicaceous plants, arose by allodiploidization between two Verticillium spp. We used comparative genomics to reveal the plastic nature of the V. longisporum genomes, showing that parental chromosome sets recombined extensively, resulting in a mosaic genome pattern. Furthermore, we show that non-synonymous substitutions frequently occurred in V. longisporum. Moreover, we reveal that expression patterns of genes encoding secreted proteins homogenized between the V. longisporum sub-genomes upon plant colonization. In conclusion, our results illustrate the large adaptive potential upon genome hybridization for fungi mediated by genomic plasticity and interaction between sub-genomes.

September 22, 2019

Improved de novo genome assembly and analysis of the Chinese cucurbit Siraitia grosvenorii, also known as monk fruit or luo-han-guo.

Luo-han-guo (Siraitia grosvenorii), also called monk fruit, is a member of the Cucurbitaceae family. Monk fruit has become an important area for research because of the pharmacological and economic potential of its noncaloric, extremely sweet components (mogrosides). It is also commonly used in traditional Chinese medicine for the treatment of lung congestion, sore throat, and constipation. Recently, a single reference genome became available for monk fruit, assembled from 36.9x genome coverage reads via Illumina sequencing platforms. This genome assembly has a relatively short (34.2 kb) contig N50 length and lacks integrated annotations. These drawbacks make it difficult to use as a reference in assembling transcriptomes and discovering novel functional genes. Here, we offer a new high-quality draft of the S. grosvenorii genome assembled using 31 Gb (~73.8x) long single molecule real time sequencing reads and polished with ~50 Gb Illumina paired-end reads. The final genome assembly is approximately 469.5 Mb, with a contig N50 length of 432,384 bp, representing a 12.6-fold improvement. We further annotated 237.3 Mb of repetitive sequence and 30,565 consensus protein coding genes with combined evidence. Phylogenetic analysis showed that S. grosvenorii diverged from members of the Cucurbitaceae family approximately 40.9 million years ago. With comprehensive transcriptomic analysis and differential expression testing, we identified 4,606 up-regulated genes in the early fruit compared to the leaf, a number of which were linked to metabolic pathways regulating fruit development and ripening. The availability of this new monk fruit genome assembly, as well as the annotations, will facilitate the discovery of new functional genes and the genetic improvement of monk fruit.
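The headline figures in this abstract can be checked with simple arithmetic. The short sketch below uses only numbers quoted above; the variable names are illustrative, and the repeat fraction is a derived value that the abstract itself does not state.

```python
# Contig N50 of the earlier Illumina-only assembly and of the new assembly.
old_n50_bp = 34_200    # 34.2 kb, as quoted in the abstract
new_n50_bp = 432_384   # as quoted in the abstract

fold_improvement = new_n50_bp / old_n50_bp
print(round(fold_improvement, 1))  # prints "12.6"

# Share of the assembly annotated as repetitive sequence (derived here).
assembly_mb = 469.5
repeat_mb = 237.3
repeat_percent = repeat_mb / assembly_mb * 100
print(round(repeat_percent, 1))  # prints "50.5"
```

The first result matches the 12.6-fold improvement reported in the abstract; the second suggests that roughly half of the assembled sequence is annotated as repeats.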

September 22, 2019

Co-culture of soil biofilm isolates enables the discovery of novel antibiotics

Bacterial natural products (NPs) are considered a promising source for drug discovery. However, the biosynthetic gene clusters (BGCs) for NPs are often not expressed, making them difficult to identify. Recently, studies of biofilm communities have shown that bacteria may gain competitive advantages by secreting antibiotics, implying a possible way to screen for antibiotics by evaluating the social behavior of bacteria. In this study, we describe an efficient workflow for novel antibiotic discovery that employs a bacterial social interaction strategy with biofilm cultivation, co-culture, transcriptomic and genomic methods. We showed that a biofilm-dominant species, Pseudomonas sp. G7, which was isolated from a cultivated soil biofilm community, was highly competitive in four-species biofilm communities, as synergistic combinations tended to exclude this strain whereas antagonistic combinations did not. Through analysis of transcriptomic changes in four-species co-culture and the complete genome of Pseudomonas sp. G7, we discovered two novel non-ribosomal peptide synthetase (NRPS) BGCs, whose products were predicted to have seven and six amino acid components, respectively. Furthermore, we provide evidence that these BGC genes are induced only when Pseudomonas sp. G7 is co-cultivated with at least two or three other bacterial species, suggesting that co-culture of soil biofilm isolates is critical to the discovery of novel antibiotics. In conclusion, we establish a model for applying microbial interactions to the discovery of new antibiotics.

September 22, 2019

Diversity of hepatitis E virus genotype 3

Hepatitis E virus genotype 3 (HEV-3) can lead to chronic infection in immunocompromised patients, and ribavirin is the treatment of choice. Recently, mutations in the polymerase gene have been associated with ribavirin failure, but their frequency before treatment according to HEV-3 subtypes has not been studied on a large data set. We used single-molecule real-time sequencing technology to sequence 115 new complete genomes of HEV-3 infecting French patients. We analyzed phylogenetic relationships, the length of the polyproline region, and mutations in the HEV polymerase gene. Eighty-five (74%) were in the HEV-3efg clade, 28 (24%) in the HEV-3chi clade, and 2 (2%) in the HEV-3ra clade. Using automated partitioning of maximum likelihood phylogenetic trees, complete genomes were classified into subtypes. Polyproline region length differs within HEV-3 clades (from 189 to 315 nt). Investigating mutations in the polymerase gene, distinct polymorphisms between HEV-3 subtypes were found (G1634R in 95% of HEV-3e, G1634K in 56% of HEV-3ra, and V1479I in all HEV-3efg, clade HEV-3ra, and HEV-3k strains). Subtype-specific polymorphisms in the HEV-3 polymerase have been identified. Our study provides new complete genome sequences of HEV-3 that could be useful for comparing strains circulating in humans and the animal reservoir.

September 22, 2019

genomeview – an extensible python-based genomics visualization engine

Visual inspection and analysis are integral to quality control, hypothesis generation, methods development and validation of genomic data. The richness and complexity of genomic data necessitate customized visualizations highlighting specific features of interest while hiding the often vast tide of irrelevant attributes. However, the majority of genome visualization occurs either in general-purpose tools such as IGV or the UCSC Genome Browser (which offer many options to adjust visualization parameters, but very little in the way of extensibility) or in narrowly focused tools aiming to solve a single visualization problem. Here, we present genomeview, a Python-based visualization engine which is easy to extend and simple to integrate into existing analysis pipelines.
