P5-C3 Archives - Page 12 of 16

July 7, 2019

Complete gene sequence of spider attachment silk protein (PySp1) reveals novel linker regions and extreme repeat homogenization.

Spiders use a myriad of silk types for daily survival, and each silk type has a unique suite of task-specific mechanical properties. Of all spider silk types, pyriform silk is distinct because it is a combination of a dry protein fiber and wet glue. Pyriform silk fibers are coated with wet cement and extruded into “attachment discs” that adhere silks to each other and to substrates. The mechanical properties of spider silk types are linked to the primary and higher-level structures of spider silk proteins (spidroins). Spidroins are often enormous molecules (>250 kDa) and have a lengthy repetitive region that is flanked by relatively short (~100 amino acids), non-repetitive amino- and carboxyl-terminal regions. The amino acid sequence motifs in the repetitive region vary greatly among spidroin types, while motif length and number underlie the remarkable mechanical properties of spider silk fibers. Existing knowledge of pyriform spidroins is fragmented, making it difficult to define links between the structure and function of pyriform spidroins. Here, we present the full-length sequence of the gene encoding pyriform spidroin 1 (PySp1) from the silver garden spider Argiope argentata. The predicted protein is similar to previously reported PySp1 sequences but the A. argentata PySp1 has a uniquely long and repetitive “linker”, which bridges the amino-terminal and repetitive regions. Predictions of the hydrophobicity and secondary structure of A. argentata PySp1 identify regions important to protein self-assembly. Analysis of the full complement of A. argentata PySp1 repeats reveals extreme intragenic homogenization, and comparison of A. argentata PySp1 repeats with other PySp1 sequences identifies variability in two sub-repetitive expansion regions. Overall, the full-length A. argentata PySp1 sequence provides new evidence for understanding how pyriform spidroins contribute to the properties of pyriform silk fibers. Copyright © 2017 The Authors. Published by Elsevier Ltd. All rights reserved.

July 7, 2019

Genetic adaptation of porcine circovirus type 1 to cultured porcine kidney cells revealed by single-molecule long-read sequencing technology.

Porcine circovirus type 1 (PCV1) is a nonpathogenic circovirus and a contaminant of the porcine kidney (PK-15) cell line. We present the complete and annotated genome sequence of strain Szeged of PCV1, determined using the Pacific Biosciences RSII long-read sequencing platform. Copyright © 2017 Tombácz et al.

July 7, 2019

A pipeline for local assembly of minisatellite alleles from single-molecule sequencing data.

The advent of Next Generation Sequencing (NGS) has led to the generation of enormous volumes of short read sequence data, cheaply and in reasonable time scales. Nevertheless, the quality of genome assemblies generated using NGS technologies has suffered compared with that of assemblies generated using Sanger DNA sequencing. This is largely due to the inability of short read sequence data to scaffold repetitive structures, creating gaps, inversions and rearrangements and resulting in assemblies that are, at best, draft forms. Third generation single-molecule sequencing (SMS) technologies (e.g. the Pacific Biosciences Single Molecule Real Time (SMRT) system) address this challenge by generating sequences with increased read lengths, offering the prospect of better recovering these complex repetitive structures and concomitantly improving assembly quality. Here, we evaluate the ability of SMS data (specifically human genome Pacific Biosciences SMRT data) to recover poorly represented repetitive sequences (specifically, GC-rich human minisatellites). To do this we designed a pipeline for the collection, processing and local assembly of single-molecule sequence data to form accurate contiguous local reconstructions. Our results show the recovery of an allele of the non-coding minisatellite MS1 (located on chromosome 1 at 1p33-35) at greater than 97% identity to reference (GRCh38) from the unprocessed sequence data of a haploid complete hydatidiform mole (CHM1) cell line. Furthermore, our assembly revealed an allele of over 500 repeat units, much larger than the reference (GRCh38) but consistent in structure with naturally occurring alleles that are segregating in human populations. This local assembly’s reconstruction was validated with the release of the whole genome assemblies GCA_001297185.1 and GCA_000772585.3, where this allele occurs. Additionally, application of this pipeline to coding minisatellites in the PRDM9 and ZNF93 genes enabled recovery of high identity allele structures for these sequence regions, whose length was confirmed by PCR from cell line genomic DNA. The internal repeat structure of the PRDM9 allele recovered was consistent with common human-specific alleles. Code available at https://github.com/ndliberial/smrt_pipeline CONTACT: dno2@le.ac.uk. © The Author 2016. Published by Oxford University Press.

July 7, 2019

Single-Molecule sequencing of the Drosophila serrata genome.

Long-read sequencing technology promises to greatly enhance de novo assembly of genomes for nonmodel species. Although the error rates of long reads have been a stumbling block, sequencing at high coverage permits the self-correction of many errors. Here, we sequence and de novo assemble the genome of Drosophila serrata, a species from the montium subgroup that has been well-studied for latitudinal clines, sexual selection, and gene expression, but which lacks a reference genome. Using 11 PacBio single-molecule real-time (SMRT) cells, we generated 12 Gbp of raw sequence data comprising ~65× whole-genome coverage. Read lengths averaged 8940 bp (NRead50 12,200) with the longest read at 53 kbp. We self-corrected reads using the PBDagCon algorithm and assembled the genome using the MHAP algorithm within the PBcR assembler. Total genome length was 198 Mbp with an N50 just under 1 Mbp. Contigs displayed a high degree of chromosome arm-level conservation with the D. melanogaster genome and many could be sensibly placed on the D. serrata physical map. We also provide an initial annotation for this genome using in silico gene predictions that were supported by RNA-seq data. Copyright © 2017 Allen et al.
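As a rough illustration of the two summary statistics quoted in this abstract, the following Python sketch computes an N50 from a list of lengths and a sequencing depth from total bases and genome size. The function names `n50` and `coverage` and the toy contig lengths are illustrative assumptions, not part of the authors' workflow.

```python
def n50(lengths):
    """Return the N50 of a collection of contig (or read) lengths.

    N50 is the length L such that sequences of length >= L together
    contain at least half of the total bases.
    """
    if not lengths:
        raise ValueError("lengths must be non-empty")
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


def coverage(total_bases, genome_size):
    """Return average sequencing depth: sequenced bases per genome base."""
    if genome_size <= 0:
        raise ValueError("genome_size must be positive")
    return total_bases / genome_size


if __name__ == "__main__":
    # Toy contig lengths (hypothetical, for illustration only).
    print(n50([2, 3, 4, 5, 6, 7, 8, 9, 10]))
    # 12 Gbp of raw data over the 198 Mbp assembly length from the abstract.
    print(round(coverage(12e9, 198e6), 1))
```

Dividing the 12 Gbp of raw data by the 198 Mbp assembly length gives roughly 61×; the ~65× quoted in the abstract presumably rests on a different genome-size estimate, which the abstract does not state.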

July 7, 2019

Sequencing and de novo assembly of a near complete indica rice genome.

A high-quality reference genome is critical for understanding genome structure, genetic variation and evolution of an organism. Here we report the de novo assembly of an indica rice genome Shuhui498 (R498) through the integration of single-molecule sequencing and mapping data, genetic map and fosmid sequence tags. The 390.3 Mb assembly is estimated to cover more than 99% of the R498 genome and is more continuous than the current reference genomes of japonica rice Nipponbare (MSU7) and Arabidopsis thaliana (TAIR10). We annotate high-quality protein-coding genes in R498 and identify genetic variations between R498 and Nipponbare and presence/absence variations by comparing them to 17 draft genomes in cultivated rice and its closest wild relatives. Our results demonstrate how to de novo assemble a highly contiguous and near-complete plant genome through an integrative strategy. The R498 genome will serve as a reference for the discovery of genes and structural variations in rice.

July 7, 2019

Complete genome sequence of the original Escherichia coli isolate, strain NCTC86.

Escherichia coli is the most well-studied bacterium and a common colonizer of the lower mammalian gastrointestinal tract. We report here the complete genome sequence of the original Escherichia coli isolate, strain NCTC86, which was described by Theodor Escherich, for whom the genus is named. Copyright © 2017 Khetrapal et al.

July 7, 2019

High metabolic versatility of different toxigenic and non-toxigenic Clostridioides difficile isolates.

Clostridioides difficile (formerly Clostridium difficile) is a major nosocomial pathogen with an increasing number of community-acquired infections causing symptoms from mild diarrhea to life-threatening colitis. The pathogenicity of C. difficile is considered to be mainly associated with the production of genome-encoded toxins A and B. In addition, some strains also encode and express the binary toxin CDT. However, a large number of non-toxigenic C. difficile strains have been isolated from the human gut and the environment. In this study, we characterized the growth behavior, motility and fermentation product formation of 17 different C. difficile isolates comprising five different major genomic clades and five different toxin inventories in relation to the C. difficile model strains 630Δerm and R20291. Among the 33 fermentation products determined, we identified two previously undescribed products of C. difficile (5-methylhexanoate and 4-(methylthio)-butanoate). Our data revealed major differences in the fermentation products obtained after growth in a medium containing casamino acids and glucose as carbon and energy source. While the metabolism of branched chain amino acids remained comparable in all isolates, the aromatic amino acid uptake and metabolism and the central carbon metabolism-associated fermentation pathways varied strongly between the isolates. The patterns obtained followed neither the classification of the clades nor the ribotyping patterns nor the toxin distribution. As the toxin formation is strongly connected to the metabolism, our data allow an improved differentiation of C. difficile strains. The observed metabolic flexibility provides the optimal basis for adaptation in the course of infection and to changing conditions in different environments including the human gut. Copyright © 2017 Elsevier GmbH. All rights reserved.

July 7, 2019

Complete genome sequence of the uropathogenic Escherichia coli strain NU14.

Escherichia coli is the most common bacterium causing urinary tract infections in humans. We report here the complete genome sequence of the uropathogenic Escherichia coli strain NU14, a clinical pyelonephritis isolate used for studying pathogenesis. Copyright © 2017 Mehershahi and Chen.

July 7, 2019

Genome assembly of Chryseobacterium sp. strain IHBB 10212 from glacier top-surface soil in the Indian trans-Himalayas with potential for hydrolytic enzymes

Cold-active esterases are gaining importance because of their catalytic activities, which find applications in the chemical industry, as additives in food processing and the detergent industry, and as catalysts in the organic synthesis of unstable compounds. In the present study, we report the complete genome sequence (4,843,645 bp, with an average G + C content of 34.08% and 4260 protein-coding genes) of a novel low temperature-active, esterase-producing strain of Chryseobacterium isolated from the top-surface soil of a glacier in the cold deserts of the Indian trans-Himalayas. The genome contained two plasmids of 16,553 and 11,450 bp with 40.54 and 40.37% G + C contents, respectively. Several genes encoding enzymes that hydrolyze the ester linkages of triglycerides into fatty acids and glycerol were predicted in the genome. The annotation also predicted genes encoding proteases, lipases, amylases, β-glucosidases, endoglucanases and xylanases involved in biotechnological processes. The complete genome sequence of Chryseobacterium sp. strain IHBB 10212 and its two plasmids have been deposited under accession numbers CP015199, CP015200 and CP015201 at DDBJ/EMBL/GenBank.
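The G + C percentages reported for the chromosome and plasmids are a simple base-composition statistic. A minimal Python sketch of that calculation follows; the function name `gc_percent`, the choice to ignore ambiguous bases such as N, and the toy sequences are assumptions made for illustration, not the authors' annotation pipeline.

```python
def gc_percent(sequence):
    """Return the G + C content of a DNA sequence as a percentage.

    Only unambiguous bases (A, C, G, T) are counted in the denominator,
    so ambiguous symbols such as N do not dilute the result.
    """
    seq = sequence.upper()
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T")
    called = gc + at
    if called == 0:
        raise ValueError("sequence contains no A, C, G or T bases")
    return 100.0 * gc / called


if __name__ == "__main__":
    # Toy sequences (hypothetical), not taken from the deposited genome.
    print(gc_percent("ATGC"))
    print(round(gc_percent("GGCCNNAT"), 2))
```

Applied to a full replicon, the same function would be run once per sequence record (chromosome and each plasmid) to obtain per-replicon values like those quoted above.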

July 7, 2019

Genome sequencing reveals the origin of the allotetraploid Arabidopsis suecica.

Polyploidy is an example of instantaneous speciation when it involves the formation of a new cytotype that is incompatible with the parental species. Because new polyploid individuals are likely to be rare, establishment of a new species is unlikely unless polyploids are able to reproduce through self-fertilization (selfing), or asexually. Conversely, selfing (or asexuality) makes it possible for polyploid species to originate from a single individual, a bona fide speciation event. The extent to which this happens is not known. Here, we consider the origin of Arabidopsis suecica, a selfing allopolyploid between Arabidopsis thaliana and Arabidopsis arenosa, which has hitherto been considered to be an example of a unique origin. Based on whole-genome re-sequencing of 15 natural A. suecica accessions, we identify ubiquitous shared polymorphism with the parental species, and hence conclusively reject a unique origin in favor of multiple founding individuals. We further estimate that the species originated after the last glacial maximum in Eastern Europe or central Eurasia (rather than Sweden, as the name might suggest). Finally, annotation of the self-incompatibility loci in A. suecica revealed that both loci carry non-functional alleles. The locus inherited from the selfing A. thaliana is fixed for an ancestral non-functional allele, whereas the locus inherited from the outcrossing A. arenosa is fixed for a novel loss-of-function allele. Furthermore, the allele inherited from A. thaliana is predicted to transcriptionally silence the allele inherited from A. arenosa, suggesting that loss of self-incompatibility may have been instantaneous. © The Author 2017. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.

July 7, 2019

Complete genome sequence of Burkholderia stabilis FERMP-21014.

Cholesterol esterase (EC 3.1.1.13) was identified in a bacterium, Burkholderia stabilis strain FERMP-21014. Here, we report the complete genome sequence of B. stabilis FERMP-21014, which has been used in the commercial production of cholesterol esterase. The genome sequence information may be useful for improving production levels of cholesterol esterase. Copyright © 2017 Konishi et al.

July 7, 2019

Comparative analysis of Ralstonia solanacearum methylomes.

Ralstonia solanacearum is an important soil-borne plant pathogen with broad geographical distribution and the ability to cause wilt disease in many agriculturally important crops. Genome sequencing of multiple R. solanacearum strains has identified both unique and shared genetic traits influencing their evolution and ability to colonize plant hosts. Previous research has shown that DNA methylation can drive speciation and modulate virulence in bacteria, but the impact of epigenetic modifications on the diversification and pathogenesis of R. solanacearum is unknown. Sequencing of R. solanacearum strains GMI1000 and UY031 using Single Molecule Real-Time technology allowed us to perform a comparative analysis of R. solanacearum methylomes. Our analysis identified a novel methylation motif associated with a DNA methylase that is conserved in all complete Ralstonia spp. genomes and across the Burkholderiaceae, as well as a methylation motif associated with a phage-borne methylase unique to R. solanacearum UY031. Comparative analysis of the conserved methylation motif revealed that it is most prevalent in gene promoter regions, where it displays a high degree of conservation detectable through phylogenetic footprinting. Analysis of hyper- and hypo-methylated loci identified several genes involved in global and virulence regulatory functions whose expression may be modulated by DNA methylation. Analysis of genome-wide modification patterns identified a significant correlation between DNA modification and transposase genes in R. solanacearum UY031, driven by the presence of a high copy number of ISrso3 insertion sequences in this genome and pointing to a novel mechanism for regulation of transposition. These results set a firm foundation for experimental investigations into the role of DNA methylation in R. solanacearum evolution and its adaptation to different plants.

July 7, 2019

Genomics and comparative genomic analyses provide insight into the taxonomy and pathogenic potential of novel Emmonsia pathogens.

Over the last 50 years, newly described species of Emmonsia-like fungi have been implicated globally as sources of systemic human mycosis (emmonsiosis). Their ability to convert into yeast-like cells capable of replication and extra-pulmonary dissemination during the course of infection differentiates them from classical Emmonsia species. Immunocompromised patients are at highest risk of emmonsiosis and exhibit high mortality rates. In order to investigate the molecular basis for pathogenicity of the newly described Emmonsia species, genomic sequencing and comparative genomic analyses were performed on Emmonsia sp. 5z489, which was isolated from a non-deliberately immunosuppressed diabetic patient in China and represents a novel seventh isolate of Emmonsia-like fungi. The genome of 5z489 was 35.5 Mbp in length, which is ~5 Mbp larger than those of other Emmonsia strains. Further, 9,188 protein genes were predicted in the 5z489 genome and 16% of the assembly was identified as repetitive elements, which is the largest abundance in Emmonsia species. Phylogenetic analyses based on whole genome data classified 5z489 and CAC-2015a, another novel isolate, as members of the genus Emmonsia. Our analyses showed that divergences among Emmonsia occurred much earlier than in other genera within the family Ajellomycetaceae, suggesting relatively distant evolutionary relationships within the genus. Through comparisons of Emmonsia species, we discovered significant pathogenicity characteristics within the genus as well as putative virulence factors that may play a role in the infection and pathogenicity of the novel Emmonsia strains. Moreover, our analyses revealed a novel distribution mode of DNA methylation patterns across the genome of 5z489, with >50% of methylated bases located in intergenic regions. These methylation patterns differ considerably from those of other reported fungi, where most methylation occurs in repetitive loci. It is unclear if this difference is related to physiological adaptations of new Emmonsia, but this question warrants further investigation. Overall, our analyses provide a framework from which to further study the evolutionary dynamics of Emmonsia strains and identify the underlying molecular mechanisms that determine the infectious and pathogenic potency of these fungal pathogens, and also provide insight into potential targets for therapeutic intervention of emmonsiosis and further research.

July 7, 2019

Comparative genomics of Burkholderia multivorans, a ubiquitous pathogen with a highly conserved genomic structure.

The natural environment serves as a reservoir of opportunistic pathogens. A well-established method for studying the epidemiology of such opportunists is multilocus sequence typing, which in many cases has defined strains predisposed to causing infection. Burkholderia multivorans is an important pathogen in people with cystic fibrosis (CF) and its epidemiology suggests that strains are acquired from non-human sources such as the natural environment. This raises the central question of whether the isolation source (CF or environment) or the multilocus sequence type (ST) of B. multivorans better predicts their genomic content and functionality. We identified four pairs of B. multivorans isolates, representing distinct STs and consisting of one CF and one environmental isolate each. All genomes were sequenced using the PacBio SMRT sequencing technology, which resulted in eight high-quality B. multivorans genome assemblies. The present study demonstrated that the genomic structure of the examined B. multivorans STs is highly conserved and that the B. multivorans genomic lineages are defined by their ST. Orthologous protein families were not uniformly distributed among chromosomes, with core orthologs being enriched on the primary chromosome and ST-specific orthologs being enriched on the second and third chromosome. The ST-specific orthologs were enriched in genes involved in defense mechanisms and secondary metabolism, corroborating the strain-specificity of these virulence characteristics. Finally, the same B. multivorans genomic lineages occur in both CF and environmental samples and on different continents, demonstrating their ubiquity and evolutionary persistence.

July 7, 2019

Draft nuclear genome, complete chloroplast genome, and complete mitochondrial genome for the biofuel/bioproduct feedstock species Scenedesmus obliquus strain DOE0152z.

The green alga Scenedesmus obliquus is an emerging platform species for the industrial production of biofuels. Here, we report the draft assembly and annotation for the nuclear, plastid, and mitochondrial genomes of S. obliquus strain DOE0152z. Copyright © 2017 Starkenburg et al.
