Validation Archives - Page 26 of 29

September 22, 2019

Fast and inexpensive protocols for consistent extraction of high quality DNA and RNA from challenging plant and fungal samples for high-throughput SNP genotyping and sequencing applications.

Modern genotyping techniques, such as SNP analysis and genotyping by sequencing (GBS), are hampered by poor DNA quality and purity, particularly in challenging plant species rich in secondary metabolites. We therefore investigated the utility of a pre-wash step using a buffered sorbitol solution, prior to DNA extraction using a high salt CTAB extraction protocol, in a high throughput or miniprep setting. This pre-wash appears to remove interfering metabolites, such as polyphenols and polysaccharides, from tissue macerates. We also investigated the adaptability of the sorbitol pre-wash for RNA extraction using a lithium chloride-based protocol. The method was successfully applied to a variety of tissues, including leaf, cambium and fruit of diverse plant species including annual crops, forest and fruit trees, herbarium leaf material and lyophilized fungal mycelium. We consistently obtained good yields of high purity DNA or RNA in all species tested. The protocol has been validated for thousands of DNA samples by generating high data quality in dense SNP arrays. DNA extracted from Eucalyptus spp. leaf and cambium as well as mycelium from Trichoderma spp. was readily digested with restriction enzymes and performed consistently in AFLP assays. Scaled-up DNA extractions were also suitable for long read sequencing. Successful RNA quality control and good RNA-Seq data for Eucalyptus and cashew confirm the effectiveness of the sorbitol buffer pre-wash for high quality RNA extraction.

September 22, 2019

Type II restriction modification system in Ureaplasma parvum OMC-P162 strain.

Ureaplasma parvum serovar 3 strain, OMC-P162, was isolated from the human placenta of a preterm delivery at 26 weeks’ gestation. In this study, we sequenced the complete genome of OMC-P162 and compared it with other serovar 3 strains isolated from patients with different clinical conditions. Ten unique genes in OMC-P162, five of which encoded hypothetical proteins, were identified. Of these, genes UPV_229 and UPV_230 formed an operon whose open reading frames were predicted to code for a DNA methyltransferase and a hypothetical protein, respectively. DNA modification analysis of the OMC-P162 genome identified N4-methylcytosine (m4C) and N6-methyladenine (m6A), but not 5-methylcytosine (m5C). UPV230 recombinant protein displayed endonuclease activity and recognized the CATG sequence, resulting in a blunt cut between A and T. This restriction enzyme activity was identical to that of the cultivated OMC-P162 strain, suggesting that this restriction enzyme was naturally expressed in OMC-P162. We designated this enzyme as UpaP162. Treatment of pT7Blue plasmid with recombinant protein UPV229 completely blocked UpaP162 restriction enzyme activity. These results suggest that the UPV_229 and UPV_230 genes act as a type II restriction-modification system in Ureaplasma OMC-P162.
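The cut pattern described in this abstract (recognition of CATG with a blunt cut between A and T) can be illustrated with a small in silico digest. This is only a sketch: the function name `digest`, its parameters, and the example sequence are hypothetical and do not come from the paper, and the code ignores methylation, which the abstract reports as blocking cleavage.

```python
def digest(seq: str, site: str = "CATG", cut_offset: int = 2) -> list[str]:
    """Cut seq at every occurrence of site, cut_offset bases into the site.

    With site="CATG" and cut_offset=2 the cut falls between A and T.
    Because CATG is its own reverse complement, a cut in the middle of the
    site lands at the same position on both strands (a blunt end), so one
    strand is enough to model the fragments.
    """
    seq = seq.upper()
    fragments = []
    start = 0
    pos = seq.find(site)
    while pos != -1:
        cut = pos + cut_offset
        fragments.append(seq[start:cut])
        start = cut
        pos = seq.find(site, pos + 1)
    fragments.append(seq[start:])  # remainder after the last cut
    return fragments


# Hypothetical 16-base example with two CATG sites.
print(digest("GGCATGAATTCATGCC"))
```

A sequence with no CATG site is returned as a single uncut fragment.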

September 22, 2019

How long are long tandem repeats? A challenge for current methods of whole-genome sequence assembly: The case of satellites in Caenorhabditis elegans.

Repetitive genome regions have been difficult to sequence, mainly because of the comparatively small size of the fragments used in assembly. Satellites or tandem repeats are very abundant in nematodes and offer an excellent playground to evaluate different assembly methods. Here, we compare the structure of satellites found in three different assemblies of the Caenorhabditis elegans genome: the original sequence obtained by Sanger sequencing, an assembly based on PacBio technology, and an assembly using Nanopore sequencing reads. In general, satellites were found in equivalent genomic regions, but the new long-read methods (PacBio and Nanopore) tended to result in longer assembled satellites. Important differences exist between the assemblies resulting from the two long-read technologies, such as the sizes of long satellites. Our results also suggest that the lengths of some annotated genes with internal repeats which were assembled using Sanger sequencing are likely to be incorrect.

September 22, 2019

Repeat elements organise 3D genome structure and mediate transcription in the filamentous fungus Epichloë festucae.

Structural features of genomes, including the three-dimensional arrangement of DNA in the nucleus, are increasingly seen as key contributors to the regulation of gene expression. However, studies on how genome structure and nuclear organisation influence transcription have so far been limited to a handful of model species. This narrow focus limits our ability to draw general conclusions about the ways in which three-dimensional structures are encoded, and to integrate information from three-dimensional data to address a broader gamut of biological questions. Here, we generate a complete and gapless genome sequence for the filamentous fungus, Epichloë festucae. We use Hi-C data to examine the three-dimensional organisation of the genome, and RNA-seq data to investigate how Epichloë genome structure contributes to the suite of transcriptional changes needed to maintain symbiotic relationships with the grass host. Our results reveal a genome in which very repeat-rich blocks of DNA with discrete boundaries are interspersed by gene-rich sequences that are almost repeat-free. In contrast to other species reported to date, the three-dimensional structure of the genome is anchored by these repeat blocks, which act to isolate transcription in neighbouring gene-rich regions. Genes that are differentially expressed in planta are enriched near the boundaries of these repeat-rich blocks, suggesting that their three-dimensional orientation partly encodes and regulates the symbiotic relationship formed by this organism.

September 22, 2019

SKA: Split Kmer Analysis Toolkit for Bacterial Genomic Epidemiology

Genome sequencing is revolutionising infectious disease epidemiology, providing a huge step forward in sensitivity and specificity over more traditional molecular typing techniques. However, the complexity of genome data often means that its analysis and interpretation requires high-performance compute infrastructure and dedicated bioinformatics support. Furthermore, current methods have limitations that can differ between analyses and are often opaque to the user, and their reliance on multiple external dependencies makes reproducibility difficult. Here I introduce SKA, a toolkit for analysis of genome sequence data from closely-related, small, haploid genomes. SKA uses split kmers to rapidly identify variation between genome sequences, making it possible to analyse hundreds of genomes on a standard home computer. Tests on publicly available simulated and real-life data show that SKA is both faster and more efficient than the gold standard methods used today while retaining similar levels of accuracy for epidemiological purposes. SKA can take raw read data or genome assemblies as input and calculate pairwise distances, create single linkage clusters and align genomes to a reference genome or using a reference-free approach. SKA requires few decisions to be made by the user, which, along with its computational efficiency, allows genome analysis to become accessible to those with only basic bioinformatics training. The limitations of SKA are also far more transparent than for current approaches, and future improvements to mitigate these limitations are possible. Overall, SKA is a powerful addition to the armoury of the genomic epidemiologist. SKA source code is available from Github (https://github.com/simonrharris/SKA).
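The split k-mer idea can be sketched in a few lines. The code below assumes, as a simplification, that a split k-mer is a pair of k-base arms flanking a single middle base, and that a difference between two samples is counted when they share the same arms but disagree on the middle base. The function names, the tiny k, and the example sequences are illustrative assumptions; this is not SKA's implementation, and it ignores reverse complements, read coverage, and quality filtering.

```python
def split_kmers(seq: str, k: int = 3) -> dict[tuple[str, str], str]:
    """Map (left arm, right arm) -> middle base for each window of length 2k+1."""
    seq = seq.upper()
    table: dict[tuple[str, str], str] = {}
    for i in range(len(seq) - 2 * k):
        left = seq[i:i + k]
        middle = seq[i + k]
        right = seq[i + k + 1:i + 2 * k + 1]
        key = (left, right)
        # The same arms seen with conflicting middle bases are ambiguous.
        if key in table and table[key] != middle:
            table[key] = "N"
        else:
            table[key] = middle
    return table


def snp_distance(a: str, b: str, k: int = 3) -> int:
    """Count shared split k-mers whose middle bases differ between a and b."""
    ka, kb = split_kmers(a, k), split_kmers(b, k)
    shared = ka.keys() & kb.keys()
    return sum(
        1
        for key in shared
        if "N" not in (ka[key], kb[key]) and ka[key] != kb[key]
    )


# Two hypothetical sequences differing by a single substitution.
print(snp_distance("ACGTACGGTCA", "ACGTATGGTCA"))
```

Only the window whose middle base sits on the substitution is shared between the two samples; windows that carry the variant in an arm produce different keys and are simply not compared, which is why closely spaced variants can be missed by this kind of approach.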

September 22, 2019

Targeted genotyping of variable number tandem repeats with adVNTR.

Whole-genome sequencing is increasingly used to identify Mendelian variants in clinical pipelines. These pipelines focus on single-nucleotide variants (SNVs) and also structural variants, while ignoring more complex repeat sequence variants. Here, we consider the problem of genotyping Variable Number Tandem Repeats (VNTRs), composed of inexact tandem duplications of short (6-100 bp) repeating units. VNTRs span 3% of the human genome, are frequently present in coding regions, and have been implicated in multiple Mendelian disorders. Although existing tools recognize VNTR-carrying sequence, genotyping VNTRs (determining repeat unit count and sequence variation) from whole-genome sequencing reads remains challenging. We describe a method, adVNTR, that uses hidden Markov models to model each VNTR, count repeat units, and detect sequence variation. adVNTR models can be developed for short-read (Illumina) and single-molecule (Pacific Biosciences [PacBio]) whole-genome and whole-exome sequencing, and show good results on multiple simulated and real data sets. © 2018 Bakhtiari et al.; Published by Cold Spring Harbor Laboratory Press.
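To make "determining repeat unit count" concrete, here is a deliberately naive counter for inexact tandem copies of a known unit. It is not adVNTR and does not use a hidden Markov model: it swaps in a simple Hamming-distance scan that tolerates substitutions only, so insertions and deletions inside a unit (which an HMM can model) will break the run. The function names, the mismatch threshold, and the example read are assumptions made for illustration.

```python
def hamming(a: str, b: str) -> int:
    """Number of mismatching positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def count_repeat_units(read: str, unit: str, max_mismatches: int = 1) -> int:
    """Length of the longest run of consecutive near-copies of unit in read."""
    read, unit = read.upper(), unit.upper()
    n = len(unit)
    best = 0
    for start in range(len(read) - n + 1):
        count = 0
        pos = start
        # Extend the run one whole unit at a time while copies stay close.
        while pos + n <= len(read) and hamming(read[pos:pos + n], unit) <= max_mismatches:
            count += 1
            pos += n
        best = max(best, count)
    return best


# Hypothetical read: flank, three copies of a 6 bp unit (the middle copy
# carries one substitution), then flank.
read = "TT" + "ACGTGC" + "ACGAGC" + "ACGTGC" + "GGA"
print(count_repeat_units(read, "ACGTGC"))
```

Setting `max_mismatches=0` splits the run at the variant copy, which shows why exact matching undercounts inexact repeats.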

September 22, 2019

Constant conflict between Gypsy LTR retrotransposons and CHH methylation within a stress-adapted mangrove genome.

The evolutionary dynamics of the conflict between transposable elements (TEs) and their host genome remain elusive. This conflict will be intense in stress-adapted plants as stress can often reactivate TEs. Mangroves reduce TE load convergently in their adaptation to intertidal environments and thus provide a unique opportunity to address the host-TE conflict and its interaction with stress adaptation. Using the mangrove Rhizophora apiculata as a model, we investigated methylation and short interfering RNA (siRNA) targeting patterns in relation to the abundance and age of long terminal repeat (LTR) retrotransposons. We also examined the distance of LTR retrotransposons to genes, the impact on neighboring gene expression and population frequencies. We found differential accumulation amongst classes of LTR retrotransposons despite high overall methylation levels. This can be attributed to 24-nucleotide siRNA-mediated CHH methylation preferentially targeting Gypsy elements, particularly in their LTR regions. Old Gypsy elements possess unusually abundant siRNAs which show cross-mapping to young copies. Gypsy elements appear to be closer to genes and under stronger purifying selection than other classes. Our results suggest a continuous host-TE battle masked by the TE load reduction in R. apiculata. This conflict may enable mangroves, such as R. apiculata, to maintain genetic diversity and thus evolutionary potential during stress adaptation. © 2018 The Authors. New Phytologist © 2018 New Phytologist Trust.

September 22, 2019

Functional genomic analysis of phthalate acid ester (PAE) catabolism genes in the versatile PAE-mineralising bacterium Rhodococcus sp. 2G.

Microbial degradation is considered the most promising method for removing phthalate acid esters (PAEs) from polluted environments; however, a comprehensive genomic understanding of the entire PAE catabolic process is still lacking. In this study, the repertoire of PAE catabolism genes in the metabolically versatile bacterium Rhodococcus sp. 2G was examined using genomic, metabolic, and bioinformatic analyses. A total of 4930 coding genes were identified from the 5.6 Mb genome of the 2G strain, including 337 esterase/hydrolase genes and 48 transferase and decarboxylase genes that were involved in hydrolysing PAEs into phthalate acid (PA) and decarboxylating PA into benzoic acid (BA). One gene cluster (xyl) responsible for transforming BA into catechol and two catechol-catabolism gene clusters controlling the ortho (cat) and meta (xyl & mhp) cleavage pathways were also identified. The proposed PAE catabolism pathway and some key degradation genes were validated by intermediate-utilising tests and real-time quantitative polymerase chain reaction. Our results provide novel insight into the mechanisms of PAE biodegradation at the molecular level and useful information on gene resources for future studies. Copyright © 2018 Elsevier B.V. All rights reserved.

September 22, 2019

Computational tools to unmask transposable elements.

A substantial proportion of the genome of many species is derived from transposable elements (TEs). Moreover, through various self-copying mechanisms, TEs continue to proliferate in the genomes of most species. TEs have contributed numerous regulatory, transcript and protein innovations and have also been linked to disease. However, notwithstanding their demonstrated impact, many genomic studies still exclude them because their repetitive nature results in various analytical complexities. Fortunately, a growing array of methods and software tools are being developed to cater for them. This Review presents a summary of computational resources for TEs and highlights some of the challenges and remaining gaps to perform comprehensive genomic analyses that do not simply ‘mask’ repeats.

September 22, 2019

Genomic characterization reveals significant divergence within Chlorella sorokiniana (Chlorellales, Trebouxiophyceae)

Selection of highly productive algal strains is crucial for establishing economically viable biomass and bioproduct cultivation systems. Characterization of algal genomes, including understanding strain-specific differences in genome content and architecture, is a critical step in this process. Using genomic analyses, we demonstrate significant differences between three strains of Chlorella sorokiniana (strain 1228, UTEX 1230, and DOE1412). We found that unique, strain-specific genes comprise a substantial proportion of each genome, and genomic regions with >80% local nucleotide identity constitute <15% of each genome among the strains, indicating substantial strain-specific evolution. Furthermore, cataloging of meiosis and other sex-related genes in C. sorokiniana strains suggests strategic breeding could be utilized to improve biomass and bioproduct yields if a sexual cycle can be characterized. Finally, preliminary investigation of epigenetic machinery suggests the presence of potentially unique transcriptional regulation in each strain. Our data demonstrate that these three C. sorokiniana strains represent significantly different genomic content. Based on these findings, we propose individualized assessment of each strain for potential performance in cultivation systems.

September 22, 2019

Alpha- and beta-mannan utilization by marine Bacteroidetes.

Marine microscopic algae carry out about half of the global carbon dioxide fixation into organic matter. They provide organic substrates for marine microbes such as members of the Bacteroidetes that degrade algal polysaccharides using carbohydrate-active enzymes (CAZymes). In Bacteroidetes genomes, CAZyme-encoding genes are mostly grouped in distinct regions termed polysaccharide utilization loci (PULs). While some studies have shown involvement of PULs in the degradation of algal polysaccharides, the specific substrates are for the most part still unknown. We investigated four marine Bacteroidetes isolated from the southern North Sea that harbour putative mannan-specific PULs. These PULs are organized similarly to PULs in human gut Bacteroides that digest α- and β-mannans from yeasts and plants, respectively. Using proteomics and defined growth experiments with polysaccharides as sole carbon sources, we could show that the investigated marine Bacteroidetes express the predicted functional proteins required for α- and β-mannan degradation. Our data suggest that algal mannans play an as yet unknown important role in the marine carbon cycle, and that biochemical principles established for gut or terrestrial microbes also apply to marine bacteria, even though their PULs are evolutionarily distant. © 2018 The Authors. Environmental Microbiology published by Society for Applied Microbiology and John Wiley & Sons Ltd.

September 22, 2019

Genomic evidence for asymmetric introgression by sexual selection in the common wall lizard.

Strongly selected characters can be transferred from one lineage to another with limited genetic exchange, resulting in asymmetric introgression and a mosaic genome in the receiving population. However, systems are rarely sufficiently well studied to link the pattern of introgression to its underlying process. Male common wall lizards in western Italy exhibit exaggeration of a suite of sexually selected characters that make them outcompete males from a distantly related lineage that lack these characters. This results in asymmetric hybridization and adaptive introgression of the suite of characters following secondary contact. We developed genomewide markers to infer the demographic history of gene flow between different genetic lineages, identify the spread of the sexually selected syndrome, and test the prediction that introgression should be asymmetric and heterogeneous across the genome. Our results show that secondary contact was accompanied by gene flow in both directions across most of the genome, but with approximately 3% of the genome showing highly asymmetric introgression in the predicted direction. Demographic simulations reveal that this asymmetric gene flow is more recent than the initial secondary contact, and the data suggest that the exaggerated male sexual characters originated within the Italian lineage and subsequently spread throughout this lineage before eventually reaching the contact zone. These results demonstrate that sexual selection can cause a suite of characters to spread throughout both closely and distantly related lineages with limited gene flow across the genome at large. © 2018 John Wiley & Sons Ltd.

September 22, 2019

Comparative genomic and methylome analysis of non-virulent D74 and virulent Nagasaki Haemophilus parasuis isolates.

Haemophilus parasuis is a respiratory pathogen of swine and the etiological agent of Glässer’s disease. H. parasuis isolates can exhibit different virulence capabilities ranging from lethal systemic disease to subclinical carriage. To identify genomic differences between phenotypically distinct strains, we obtained the closed whole-genome sequence annotation and genome-wide methylation patterns for the highly virulent Nagasaki strain and for the non-virulent D74 strain. Evaluation of the virulence-associated genes contained within the genomes of D74 and Nagasaki led to the discovery of a large number of toxin-antitoxin (TA) systems within both genomes. Five predicted hemolysins were identified as unique to Nagasaki and seven putative contact-dependent growth inhibition toxin proteins were identified only in strain D74. Assessment of all potential vtaA genes revealed thirteen present in the Nagasaki genome and three in the D74 genome. Subsequent evaluation of the predicted protein structure revealed that none of the D74 VtaA proteins contain a collagen triple helix repeat domain. Additionally, the predicted protein sequence for two D74 VtaA proteins is substantially longer than any predicted Nagasaki VtaA proteins. Fifteen methylation sequence motifs were identified in D74 and fourteen methylation sequence motifs were identified in Nagasaki using SMRT sequencing analysis. Only one of the methylation sequence motifs was observed in both strains indicative of the diversity between D74 and Nagasaki. Subsequent analysis also revealed diversity in the restriction-modification systems harbored by D74 and Nagasaki. The collective information reported in this study will aid in the development of vaccines and intervention strategies to decrease the prevalence and disease burden caused by H. parasuis.

September 22, 2019

Complete genome sequencing of Comamonas kerstersii 8943, a causative agent for peritonitis.

Because of poor differentiation among the members of genus Comamonas using phenotypic methods, human infections caused by C. kerstersii are sporadically reported in the literature. Here, we present the first complete genome sequence of C. kerstersii 8943, which caused peritonitis in a patient with continuous ambulatory peritoneal dialysis (CAPD). The complete genome with no gaps was obtained using the third-generation Pacific Biosciences (PacBio) RSII sequencing system with single-molecule real-time (SMRT) analysis. Protein-coding genes, rRNAs and tRNAs were predicted. Functional annotations of the genome using different databases revealed several genes related to pathogenicity including antibiotic resistance genes and prophages. Our work demonstrates that whole genome sequencing can enhance the resolution of clinical investigations and our data can be used as a reference genome during the rapid diagnosis of C. kerstersii infections in the future.

September 22, 2019

Characterization of Streptococcus pluranimalium from a cattle with mastitis by whole genome sequencing and functional validation.

Streptococcus pluranimalium is a new member of the Streptococcus genus isolated from multiple different animal hosts. It has been identified as a pathogen associated with subclinical mastitis, valvular endocarditis and septicaemia in animals. Moreover, this bacterium has emerged as a new pathogen for human infective endocarditis and brain abscess. However, the patho-biological properties of S. pluranimalium remain virtually unknown. The aim of this study was to determine the complete genome sequence of S. pluranimalium strain TH11417 isolated from a cattle with mastitis, and to characterize its antimicrobial resistance, virulence, and carbon catabolism. The genome of S. pluranimalium TH11417, determined by single-molecule real-time (SMRT) sequencing, consists of 2,065,522 base pairs (bp) with a G + C content of 38.65%, 2,007 predicted coding sequences (CDS), 58 transfer RNA (tRNA) genes and five ribosomal RNA (rRNA) operons. It contains a novel ISSpl1 element (a member of the IS3 family) and a Φ11417.1 prophage that carries the mef(A), msr(D) and lnu(C) genes. Consistently, our antimicrobial susceptibility test confirmed that S. pluranimalium TH11417 was resistant to erythromycin and lincomycin. However, this strain did not show virulence in murine pneumonia (intranasal inoculation, 10^7 colony forming units, CFU) and sepsis (intraperitoneal inoculation, 10^7 CFU) models. Additionally, this strain is able to grow with glucose, lactose or galactose as the sole carbon source, and possesses a lactose-specific phosphoenolpyruvate-dependent phosphotransferase system (PTS). We reported the first whole genome sequence of S. pluranimalium isolated from a cattle with mastitis. It harbors a prophage carrying the mef(A), msr(D) and lnu(C) genes, and is avirulent in the murine infection model.
