ENCODE Archives - Page 33 of 45

September 22, 2019

Variation in human chromosome 21 ribosomal RNA genes characterized by TAR cloning and long-read sequencing.

Despite the key role of the human ribosome in protein biosynthesis, little is known about the extent of sequence variation in ribosomal DNA (rDNA) or its pre-rRNA and rRNA products. We recovered ribosomal DNA segments from a single human chromosome 21 using transformation-associated recombination (TAR) cloning in yeast. Accurate long-read sequencing of 13 isolates covering ~0.82 Mb of the chromosome 21 rDNA complement revealed substantial variation among tandem repeat rDNA copies, several palindromic structures and potential errors in the previous reference sequence. These clones revealed 101 variant positions in the 45S transcription unit and 235 in the intergenic spacer sequence. Approximately 60% of the 45S variants were confirmed in independent whole-genome or RNA-seq data, with 47 of these further observed in mature 18S/28S rRNA sequences. TAR cloning and long-read sequencing enabled the accurate reconstruction of multiple rDNA units and a new, high-quality 44,838 bp rDNA reference sequence, which we have annotated with variants detected from chromosome 21 of a single individual. The large number of variants observed reveals heterogeneity in human rDNA, opening up the possibility of corresponding variations in ribosome dynamics.

September 22, 2019

A high-quality genome sequence of Rosa chinensis to elucidate ornamental traits.

Rose is the world’s most important ornamental plant, with economic, cultural and symbolic value. Roses are cultivated worldwide and sold as garden roses, cut flowers and potted plants. Roses are outbred and can have various ploidy levels. Our objectives were to develop a high-quality reference genome sequence for the genus Rosa by sequencing a doubled haploid, combining long and short reads, and anchoring to a high-density genetic map, and to study the genome structure and genetic basis of major ornamental traits. We produced a doubled haploid rose line (‘HapOB’) from Rosa chinensis ‘Old Blush’ and generated a rose genome assembly anchored to seven pseudo-chromosomes (512 Mb with N50 of 3.4 Mb and 564 contigs). The length of 512 Mb represents 90.1-96.1% of the estimated haploid genome size of rose. Of the assembly, 95% is contained in only 196 contigs. The anchoring was validated using high-density diploid and tetraploid genetic maps. We delineated hallmark chromosomal features, including the pericentromeric regions, through annotation of transposable element families and positioned centromeric repeats using fluorescent in situ hybridization. The rose genome displays extensive synteny with the Fragaria vesca genome, and we delineated only two major rearrangements. Genetic diversity was analysed using resequencing data of seven diploid and one tetraploid Rosa species selected from various sections of the genus. Combining genetic and genomic approaches, we identified potential genetic regulators of key ornamental traits, including prickle density and the number of flower petals. A rose APETALA2/TOE homologue is proposed to be the major regulator of petal number in rose. This reference sequence is an important resource for studying polyploidization, meiosis and developmental processes, as we demonstrated for flower and prickle development. It will also accelerate breeding through the development of molecular markers linked to traits, the identification of the genes underlying them and the exploitation of synteny across Rosaceae.
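For readers unfamiliar with the contiguity statistics quoted in this abstract (an N50 of 3.4 Mb, and 95% of the assembly in 196 contigs), the following is a minimal sketch of how such figures are conventionally computed from a list of contig lengths. The function names and the toy lengths are illustrative assumptions, not taken from the paper or its software.

```python
def n50(lengths):
    """Return the N50: the contig length L such that contigs of
    length >= L together cover at least half of the total assembly."""
    ordered = sorted(lengths, reverse=True)
    half = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half:
            return length
    raise ValueError("no contig lengths given")


def contigs_to_cover(lengths, fraction):
    """Return the smallest number of contigs (largest first) whose
    combined length reaches `fraction` of the total assembly."""
    ordered = sorted(lengths, reverse=True)
    target = fraction * sum(ordered)
    running = 0
    for count, length in enumerate(ordered, start=1):
        running += length
        if running >= target:
            return count
    raise ValueError("no contig lengths given")


# Toy contig lengths (hypothetical, not the rose assembly).
toy_lengths = [50, 40, 30, 20, 10]
toy_n50 = n50(toy_lengths)
toy_count = contigs_to_cover(toy_lengths, 0.9)
```

On the toy data the two largest contigs already span half of the 150 total, so the N50 is 40, and four contigs are needed to reach 90% of the total.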

September 22, 2019

A graph-based approach to diploid genome assembly.

Constructing high-quality haplotype-resolved de novo assemblies of diploid genomes is important for revealing the full extent of structural variation and its role in health and disease. Current assembly approaches often collapse the two sequences into one haploid consensus sequence and, therefore, fail to capture the diploid nature of the organism under study. Thus, building an assembler capable of producing accurate and complete diploid assemblies, while being resource-efficient with respect to sequencing costs, is a key challenge to be addressed by the bioinformatics community. We present a novel graph-based approach to diploid assembly, which combines accurate Illumina data and long-read Pacific Biosciences (PacBio) data. We demonstrate the effectiveness of our method on a pseudo-diploid yeast genome and show that we require as little as 50× coverage Illumina data and 10× PacBio data to generate accurate and complete assemblies. Additionally, we show that our approach has the ability to detect and phase structural variants. Availability: https://github.com/whatshap/whatshap. Supplementary data are available at Bioinformatics online.

September 22, 2019

Emergence of a novel mobile colistin resistance gene, mcr-8, in NDM-producing Klebsiella pneumoniae.

The rapid increase in carbapenem resistance among gram-negative bacteria has renewed focus on the importance of polymyxin antibiotics (colistin or polymyxin E). However, the recent emergence of plasmid-mediated colistin resistance determinants (mcr-1, -2, -3, -4, -5, -6, and -7), especially mcr-1, in carbapenem-resistant Enterobacteriaceae is a serious threat to global health. Here, we characterized a novel mobile colistin resistance gene, mcr-8, located on a transferrable 95,983-bp IncFII-type plasmid in Klebsiella pneumoniae. The deduced amino-acid sequence of MCR-8 showed 31.08%, 30.26%, 39.96%, 37.85%, 33.51%, 30.43%, and 37.46% identity to MCR-1, MCR-2, MCR-3, MCR-4, MCR-5, MCR-6, and MCR-7, respectively. Functional cloning indicated that the acquisition of the single mcr-8 gene significantly increased resistance to colistin in both Escherichia coli and K. pneumoniae. Notably, the coexistence of mcr-8 and the carbapenemase-encoding gene blaNDM was confirmed in K. pneumoniae isolates of livestock origin. Moreover, BLASTn analysis of mcr-8 revealed that this gene was present in a colistin- and carbapenem-resistant K. pneumoniae strain isolated from the sputum of a patient with pneumonia syndrome in the respiratory intensive care unit of a Chinese hospital in 2016. These findings indicated that mcr-8 has existed for some time and has disseminated among K. pneumoniae of both animal and human origin, further increasing the public health burden of antimicrobial resistance.

September 22, 2019

High-quality genomes reveal new differences between the great apes

High-quality genome sequences for some of the great apes have been assembled using state-of-the-art sequencing tools. The assemblies provide an unbiased comparison between humans and their closest evolutionary relatives.

September 22, 2019

Strand-seq enables reliable separation of long reads by chromosome via expectation maximization.

Current sequencing technologies are able to produce reads orders of magnitude longer than ever possible before. Such long reads have sparked a new interest in de novo genome assembly, which removes reference biases inherent to re-sequencing approaches and allows for a direct characterization of complex genomic variants. However, even with the latest algorithmic advances, assembling a mammalian genome from long error-prone reads incurs a significant computational burden and does not preclude occasional misassemblies. Both problems could potentially be mitigated if assembly could commence for each chromosome separately. To address this, we show how single-cell template strand sequencing (Strand-seq) data can be leveraged. We introduce a novel latent variable model and a corresponding Expectation Maximization algorithm, termed SaaRclust, and demonstrate its ability to reliably cluster long reads by chromosome. For each long read, this approach produces a posterior probability distribution over all chromosomes of origin and read directionalities. In this way, it allows us to assess the amount of uncertainty inherent to sparse Strand-seq data on the level of individual reads. Among the reads that our algorithm confidently assigns to a chromosome, we observed more than 99% correct assignments on a subset of Pacific Biosciences reads with 30.1× coverage. To our knowledge, SaaRclust is the first approach for the in silico separation of long reads by chromosome prior to assembly. Availability: https://github.com/daewoooo/SaaRclust.

September 22, 2019

GC content elevates mutation and recombination rates in the yeast Saccharomyces cerevisiae.

The chromosomes of many eukaryotes have regions of high GC content interspersed with regions of low GC content. In the yeast Saccharomyces cerevisiae, high-GC regions are often associated with high levels of meiotic recombination. In this study, we constructed URA3 genes that differ substantially in their base composition [URA3-AT (31% GC), URA3-WT (43% GC), and URA3-GC (63% GC)] but encode proteins with the same amino acid sequence. The strain with URA3-GC had an approximately sevenfold elevated rate of ura3 mutations compared with the strains with URA3-WT or URA3-AT. About half of these mutations were single-base substitutions and were dependent on the error-prone DNA polymerase ζ. About 30% were deletions or duplications between short (5-10 base) direct repeats resulting from DNA polymerase slippage. The URA3-GC gene also had elevated rates of meiotic and mitotic recombination relative to the URA3-AT or URA3-WT genes. Thus, base composition has a substantial effect on the basic parameters of genome stability and evolution. Copyright © 2018 the Author(s). Published by PNAS.
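The design described in this abstract rests on synonymous recoding: changing codons so that GC content shifts while the encoded protein stays the same. The sketch below illustrates that idea on two toy nine-base sequences. The sequences, the partial codon table, and the function names are illustrative assumptions and are not the URA3 constructs from the study.

```python
# Partial codon table covering only the codons used in the toy example.
CODON_TABLE = {
    "GCA": "A", "GCC": "A",  # alanine
    "AAA": "K", "AAG": "K",  # lysine
    "TTA": "L", "CTG": "L",  # leucine
}


def gc_fraction(seq):
    """Return the fraction of A/C/G/T bases in `seq` that are G or C."""
    bases = [b for b in seq.upper() if b in "ACGT"]
    if not bases:
        raise ValueError("sequence contains no A/C/G/T bases")
    return sum(b in "GC" for b in bases) / len(bases)


def translate(seq):
    """Translate an in-frame sequence using the partial table above."""
    seq = seq.upper()
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, len(seq), 3))


# Two hypothetical synonymous versions of the same three-residue peptide.
at_rich = "GCAAAATTA"
gc_rich = "GCCAAGCTG"

same_protein = translate(at_rich) == translate(gc_rich)
gc_at_rich = gc_fraction(at_rich)
gc_gc_rich = gc_fraction(gc_rich)
```

Both toy sequences translate to the same peptide (AKL), yet their GC fractions are 2/9 and 6/9 respectively, which is the kind of contrast the three URA3 variants were built to provide at gene scale.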

September 22, 2019

Genomic variation among and within six Juglans species.

Genomic analysis in Juglans (walnuts) is expected to transform the breeding and agricultural production of both nuts and lumber. To that end, we report here the determination of reference sequences for six additional relatives of Juglans regia: Juglans sigillata (also from section Dioscaryon), Juglans nigra, Juglans microcarpa, Juglans hindsii (from section Rhysocaryon), Juglans cathayensis (from section Cardiocaryon), and the closely related Pterocarya stenoptera. While these are ‘draft’ genomes, ranging in size between 640 Mbp and 990 Mbp, their contiguities and accuracies can support powerful annotations of genomic variation that are often the foundation of new avenues of research and breeding. We annotated nucleotide divergence and synteny by creating complete pairwise alignments of each reference genome to the remaining six. In addition, we have re-sequenced a sample of accessions from four Juglans species (including regia). The variation discovered in these surveys comprises a critical resource for experimentation and breeding, as well as a solid complementary annotation. To demonstrate the potential of these resources, the structural and sequence variation in and around the polyphenol oxidase loci, PPO1 and PPO2, were investigated. As reported for other seed crops, variation in this gene is implicated in the domestication of walnuts. The apparently Juglandaceae-specific PPO1 duplicate shows accelerated divergence and an excess of amino acid replacement on the lineage leading to accessions of the domesticated nut crop species, Juglans regia and sigillata. Copyright © 2018 Stevens et al.

September 22, 2019

Genome mining-mediated discovery of a new avermipeptin analogue in Streptomyces actuosus ATCC 25421.

Streptomyces actuosus ATCC 25421 is well known for producing the thiopeptide nosiheptide, which has been widely used as a feed additive for the promotion of animal growth. Herein, we report the complete genome sequence of S. actuosus ATCC 25421, which consists of an 8,145,579-bp circular chromosome with a G+C content of 72.53% containing 7,536 protein-coding genes. The antiSMASH 3.0 program was used to identify 49 biosynthetic gene clusters for putative secondary metabolites, including a putative lantipeptide gene cluster that showed 85% similarity to the reported informatipeptin biosynthetic gene cluster, indicating that the putative lantipeptide gene cluster has the ability to generate the informatipeptin analogue. Compared with avermipeptin, the lantipeptide precursor peptide (termed avermipeptin B) from S. actuosus ATCC 25421 contains a 14-aa leader peptide and a 24-aa core peptide, in which Ile15 was different from Val15 in avermipeptin. We also deduced the structure and the biosynthetic mechanism of avermipeptin B. Heterologous expression of the avermipeptin B biosynthetic gene cluster in S. lividans TK24 was characterized by high-resolution mass spectrometry (ESI-MS/MS). Finally, we found that avermipeptin B displayed strong activity against Gram-positive strains. The genome sequence reported here can encourage us to mine novel secondary metabolites and investigate their biosynthetic mechanism in the future.

September 22, 2019

The Chara genome: Secondary complexity and implications for plant terrestrialization.

Land plants evolved from charophytic algae, among which Charophyceae possess the most complex body plans. We present the genome of Chara braunii; comparison of the genome to those of land plants identified evolutionary novelties for plant terrestrialization and land plant heritage genes. C. braunii employs unique xylan synthases for cell wall biosynthesis, a phragmoplast (cell separation) mechanism similar to that of land plants, and many phytohormones. C. braunii plastids are controlled via land-plant-like retrograde signaling, and transcriptional regulation is more elaborate than in other algae. The morphological complexity of this organism may result from expanded gene families, with three cases of particular note: genes effecting tolerance to reactive oxygen species (ROS), LysM receptor-like kinases, and transcription factors (TFs). Transcriptomic analysis of sexual reproductive structures reveals intricate control by TFs, activity of the ROS gene network, and the ancestral use of plant-like storage and stress protection proteins in the zygote. Copyright © 2018 Elsevier Inc. All rights reserved.

September 22, 2019

Genome-based population structure analysis of the strawberry plant pathogen Xanthomonas fragariae reveals two distinct groups that evolved independently before its species description.

Xanthomonas fragariae is a quarantine organism in Europe, causing angular leaf spots on strawberry plants. It is spreading worldwide in strawberry-producing regions due to import of plant material through trade and human activities. In order to resolve the population structure at the strain level, we have employed high-resolution molecular typing tools on a comprehensive strain collection representing global and temporal distribution of the pathogen. Clustered regularly interspaced short palindromic repeat regions (CRISPRs) and variable number of tandem repeats (VNTRs) were identified within the reference genome of X. fragariae LMG 25863 as a potential source of variation. Strains from our collection were whole-genome sequenced and used in order to identify variable spacers and repeats for discriminative purposes. CRISPR spacer analysis and multiple-locus VNTR analysis (MLVA) displayed a congruent population structure, in which two major groups and a total of four subgroups were revealed. The two main groups were genetically separated before the first X. fragariae isolate was described and are potentially responsible for the worldwide expansion of the bacterial disease. Three primer sets were designed for discriminating CRISPR-associated markers in order to streamline group determination of novel isolates. Overall, this study describes typing methods to discriminate strains and monitor the pathogen population structure, especially in the event of a new outbreak of the pathogen.

September 22, 2019

Biosynthetic Baeyer-Villiger chemistry enables access to two anthracene scaffolds from a single gene cluster in Deep-Sea-derived Streptomyces olivaceus SCSIO T05.

Four known compounds, rishirilide B (1), rishirilide C (2), lupinacidin A (3), and galvaquinone B (4), representing two anthracene scaffolds typical of aromatic polyketides, were isolated from a culture of the deep-sea-derived Streptomyces olivaceus SCSIO T05. The rsd biosynthetic gene cluster (BGC) that drives rishirilide biosynthesis was cloned and sequenced from the S. olivaceus producer. Inactivation of the structural gene rsdK2 and heterologous expression of the rsd BGC confirmed that the single rsd BGC encodes construction of 1-4 and, thus, accounts for two anthracene scaffolds. Precursor incubation experiments with 13C-labeled acetate revealed that a Baeyer-Villiger-type rearrangement plays a central role in construction of 1-4. Two luciferase monooxygenase components, along with a reductase component, are presumably involved in the Baeyer-Villiger-type rearrangement reaction enabling access to the two anthracene scaffold variants. Engineering of the rsd BGC unveiled three SARP family transcriptional regulators, enhancing anthracene production. Inactivation of rsdR4, a MarR family transcriptional regulator, failed to impact production of 1-4, although production of 3 was slightly improved; most importantly, rsdR4 inactivation led to the new adduct 6 in high titer. Notably, inactivation of rsdH, a putative amidohydrolase, substantially improved the overall titers of 1-4 by more than 4-fold.

September 22, 2019

The complete mitochondrial genome of the Basidiomycete edible fungus Hypsizygus marmoreus

The complete mitochondrial genome of the edible fungus Hypsizygus marmoreus is reported in this paper. It was determined using PacBio and Illumina sequencing. The complete mitochondrial DNA (mtDNA) is 106,417 bp in length with a GC content of 31.74%, making it the fourth largest mitogenome in Agaricales. The circular mitogenome encodes 67 protein-coding genes and one ribosomal RNA (rns). Among these genes, 13 conserved protein-coding genes were identified in the genome, including six subunits of NADH dehydrogenase (nad1-4, 4L and 6), three cytochrome oxidases (cox1-3), one apocytochrome b (cob) and three ATP synthases (atp6, atp8 and atp9). The phylogenetic analysis confirmed that H. marmoreus (Lyophyllaceae) clustered together with Tricholoma matsutake (Tricholomataceae).

September 22, 2019

Nine draft genome sequences of Claviceps purpurea s.lat., including C. arundinis, C. humidiphila, and C. cf. spartinae, pseudomolecules for the pitch canker pathogen Fusarium circinatum, draft genome of Davidsoniella eucalypti, Grosmannia galeiformis, Quambalaria eucalypti, and Teratosphaeria destructans.

This genome announcement includes draft genomes from Claviceps purpurea s.lat., including C. arundinis, C. humidiphila and C. cf. spartinae. The draft genomes of Davidsoniella eucalypti, Quambalaria eucalypti and Teratosphaeria destructans, all three important eucalyptus pathogens, are presented. The insect associate Grosmannia galeiformis is also described. The pine pathogen genome of Fusarium circinatum has been assembled into pseudomolecules, based on additional sequence data and by harnessing the known synteny within the Fusarium fujikuroi species complex. This new assembly of the F. circinatum genome provides 12 pseudomolecules that correspond to the haploid chromosome number of F. circinatum. These are comparable to other chromosomal assemblies within the FFSC and will enable more robust genomic comparisons within this species complex.

September 22, 2019

Potential survival and pathogenesis of a novel strain, Vibrio parahaemolyticus FORC_022, isolated from a soy sauce marinated crab by genome and transcriptome analyses.

Vibrio parahaemolyticus can cause gastrointestinal illness through consumption of seafood. Despite frequent food-borne outbreaks of V. parahaemolyticus, only 19 strains have been subjected to complete whole-genome analysis. In this study, a novel strain of V. parahaemolyticus, designated FORC_022 (Food-borne pathogen Omics Research Center_022), was isolated from soy sauce marinated crabs, and its genome and transcriptome were analyzed to elucidate the pathogenic mechanisms. FORC_022 did not include major virulence factors of thermostable direct hemolysin (tdh) and TDH-related hemolysin (trh). However, FORC_022 showed high cytotoxicity and had several V. parahaemolyticus islands (VPaIs) and other virulence factors, such as various secretion systems (types I, II, III, IV, and VI), in comparative genome analysis with CDC_K4557 (the most similar strain) and RIMD2210633 (genome island marker strain). FORC_022 harbored additional virulence genes, including accessory cholera enterotoxin, zona occludens toxin, and tight adhesion (tad) locus, compared with CDC_K4557. In addition, an O3 serotype-specific gene and the marker gene of the pandemic O3:K6 serotype (toxRS) were detected in FORC_022. The expression levels of genes involved in adherence and carbohydrate transport were high, whereas those of genes involved in motility, arginine biosynthesis, and proline metabolism were low after exposure to crabs. Moreover, the virulence factors of the type III secretion system, tad locus, and thermolabile hemolysin were overexpressed. Therefore, the risk of foodborne illness may be high following consumption of FORC_022-contaminated crab. These results provided molecular information regarding the survival and pathogenesis of V. parahaemolyticus FORC_022 strain in contaminated crab and may have applications in food safety.
