Allele Archives

July 7, 2019

Botrytis, the good, the bad and the ugly

Botrytis spp. are efficient pathogens, causing devastating diseases and significant crop losses in a wide variety of plant species. Here we outline our review of these pathogens, as well as highlight the major advances of the past 10 years in studying Botrytis in interaction with its hosts. Progress in molecular genetics, and in particular the development of relevant phylogenetic markers, has resulted in the characterisation of approximately 30 species. The host range of Botrytis spp. includes plant species that are members of 170 families of cultivated plants.

July 7, 2019

Lesions from patients with sporadic cerebral cavernous malformations harbor somatic mutations in the CCM genes: evidence for a common biochemical pathway for CCM pathogenesis.

Cerebral cavernous malformations (CCMs) are vascular lesions affecting the central nervous system. CCM occurs either sporadically or in an inherited, autosomal dominant manner. Constitutional (germline) mutations in any of three genes, KRIT1, CCM2 and PDCD10, can cause the inherited form. Analysis of CCM lesions from inherited cases revealed biallelic somatic mutations, indicating that CCM follows a Knudsonian two-hit mutation mechanism. It is still unknown, however, if the sporadic cases of CCM also follow this genetic mechanism. We extracted DNA from 11 surgically excised lesions from sporadic CCM patients, and sequenced the three CCM genes in each specimen using a next-generation sequencing approach. Four sporadic CCM lesion samples (36%) were found to contain novel somatic mutations. Three of the lesions contained a single somatic mutation, and one lesion contained two biallelic somatic mutations. Herein, we also describe evidence of somatic mosaicism in a patient presenting with over 130 CCM lesions localized to one hemisphere of the brain. Finally, in a lesion regrowth sample, we found that the regrown CCM lesion contained the same somatic mutation as the original lesion. Together, these data bolster the idea that all forms of CCM have a genetic underpinning of the two-hit mutation mechanism in the known CCM genes. Recent studies have found aberrant Rho kinase activation in inherited CCM pathogenesis, and we present evidence that this pathway is activated in sporadic CCM patients. These results suggest that all CCM patients, including those with the more common sporadic form, are potentially amenable to the same therapy. © The Author 2014. Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions@oup.com.

July 7, 2019

The effects of read length, quality and quantity on microsatellite discovery and primer development: from Illumina to PacBio.

The advent of next-generation sequencing (NGS) technologies has transformed the way microsatellites are isolated for ecological and evolutionary investigations. Recent attempts to employ NGS for microsatellite discovery have used the 454, Illumina, and Ion Torrent platforms, but other methods including single-molecule real-time DNA sequencing (Pacific Biosciences or PacBio) remain viable alternatives. We outline a workflow from sequence quality control to microsatellite marker validation in three plant species using PacBio circular consensus sequencing (CCS). We then evaluate the performance of PacBio CCS in comparison with other NGS platforms for microsatellite isolation, through simulations that focus on variations in read length, read quantity and sequencing error rate. Although quality control of CCS reads reduced microsatellite yield by around 50%, hundreds of microsatellite loci that are expected to have improved conversion efficiency to functional markers were retrieved for each species. The simulations quantitatively validate the advantages of long reads and emphasize the detrimental effects of sequencing errors on NGS-enabled microsatellite development. In view of the continuing improvement in read length on NGS platforms, sequence quality and the corresponding strategies of quality control will become the primary factors to consider for effective microsatellite isolation. Among current options, PacBio CCS may be optimal for rapid, small-scale microsatellite development due to its flexibility in scaling sequencing effort, while platforms such as Illumina MiSeq will provide cost-efficient solutions for multispecies microsatellite projects. © 2014 John Wiley & Sons Ltd.
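The microsatellite discovery step described above can be illustrated with a minimal sketch. It is not the workflow used in the study: it simply scans a read for perfect tandem repeats with a regular expression. The function name `find_microsatellites` and the thresholds (motifs of 2 to 6 bases, at least 5 repeats) are assumptions chosen for illustration.

```python
import re

def find_microsatellites(seq, min_unit=2, max_unit=6, min_repeats=5):
    """Return (start, motif, repeat_count) for perfect tandem repeats in seq.

    Hypothetical helper for illustration; thresholds are assumptions.
    """
    seq = seq.upper()
    # Lazy motif match so the shortest repeating unit is reported
    # (for example "AC" rather than "ACAC").
    pattern = re.compile(
        r"([ACGT]{%d,%d}?)\1{%d,}" % (min_unit, max_unit, min_repeats - 1)
    )
    hits = []
    for m in pattern.finditer(seq):
        motif = m.group(1)
        # Skip motifs made of a single base (homopolymer runs).
        if len(set(motif)) == 1:
            continue
        hits.append((m.start(), motif, len(m.group(0)) // len(motif)))
    return hits

# Toy read: flanking sequence, an (AC)8 repeat, a spacer, and an (AGC)5 repeat.
read = "GGT" + "AC" * 8 + "TTG" + "AGC" * 5 + "CA"
for start, motif, count in find_microsatellites(read):
    print(start, motif, count)
```

Because only perfect repeats are matched, a single sequencing error inside a repeat splits or hides the locus, which is one simple way to see why read quality matters as much as read length for this task.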

July 7, 2019

A fault-tolerant method for HLA typing with PacBio data.

Human leukocyte antigen (HLA) genes are critical genes involved in important biomedical aspects, including organ transplantation, autoimmune diseases and infectious diseases. The gene family contains the most polymorphic genes in humans and the difference between two alleles is only a single base pair substitution in many cases. Next-generation sequencing (NGS) technologies could be used for high-throughput HLA typing, but in silico methods are still needed to correctly assign the alleles of a sample. Computer scientists have developed such methods for various NGS platforms, such as Illumina, Roche 454 and Ion Torrent, based on the characteristics of the reads they generate. However, methods for PacBio reads have received less attention, probably owing to the platform's high error rates. The PacBio system has the longest read length among available NGS platforms, and therefore is the only platform capable of having exon 2 and exon 3 of HLA genes on the same read to unequivocally solve the ambiguity problem caused by the “phasing” issue. We proposed a new method BayesTyping1 to assign HLA alleles for PacBio circular consensus sequencing reads using Bayes’ theorem. The method was applied to simulated data of the three loci HLA-A, HLA-B and HLA-DRB1. The experimental results showed its capability to tolerate the disturbance of sequencing errors and external noise reads. The BayesTyping1 method could overcome the problems of HLA typing using PacBio reads, which mostly arise from sequencing errors of PacBio reads and the divergence of HLA genes, to some extent.
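To make the idea of assigning alleles with Bayes' theorem concrete, here is a minimal sketch. It is not the BayesTyping1 algorithm, whose details the abstract does not give. It assumes a single allele per sample, reads already aligned to and the same length as each candidate allele, a uniform per-base error rate, and toy allele names and sequences (`allele_1`, `allele_2`) that are purely hypothetical.

```python
import math

def allele_posteriors(reads, alleles, error_rate=0.01, priors=None):
    """Posterior probability of each candidate allele given noisy reads.

    Illustrative assumptions: one allele per sample, reads pre-aligned to the
    full allele sequence, independent base errors at a uniform rate.
    """
    names = list(alleles)
    if priors is None:
        priors = {name: 1.0 / len(names) for name in names}
    log_post = {}
    for name in names:
        ref = alleles[name]
        lp = math.log(priors[name])
        for read in reads:
            for read_base, ref_base in zip(read, ref):
                if read_base == ref_base:
                    lp += math.log(1.0 - error_rate)
                else:
                    # An error is assumed equally likely to yield any other base.
                    lp += math.log(error_rate / 3.0)
        log_post[name] = lp
    # Normalise in log space to avoid underflow.
    top = max(log_post.values())
    total = sum(math.exp(v - top) for v in log_post.values())
    return {name: math.exp(v - top) / total for name, v in log_post.items()}

# Two hypothetical alleles differing by a single base at the last position.
alleles = {"allele_1": "ACGTACGT", "allele_2": "ACGTACGA"}
reads = ["ACGTACGT", "ACGTACGT", "ACGTACGA"]  # the third read carries one mismatch
print(allele_posteriors(reads, alleles))
```

With two alleles that differ at a single position, every read contributes evidence only at that position, so a handful of concordant reads can outweigh an occasional sequencing error.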

July 7, 2019

Dubowitz syndrome is a complex comprised of multiple, genetically distinct and phenotypically overlapping disorders.

Dubowitz syndrome is a rare disorder characterized by multiple congenital anomalies, cognitive delay, growth failure, an immune defect, and an increased risk of blood dyscrasia and malignancy. There is considerable phenotypic variability, suggesting genetic heterogeneity. We clinically characterized and performed exome sequencing and high-density array SNP genotyping on three individuals with Dubowitz syndrome, including a pair of previously-described siblings (Patients 1 and 2, brother and sister) and an unpublished patient (Patient 3). Given the siblings’ history of bone marrow abnormalities, we also evaluated telomere length and performed radiosensitivity assays. In the siblings, exome sequencing identified compound heterozygosity for a known rare nonsense substitution in the nuclear ligase gene LIG4 (rs104894419, NM_002312.3:c.2440C>T) that predicts p.Arg814X (MAF:0.0002) and an NM_002312.3:c.613delT variant that predicts a p.Ser205Leufs*29 frameshift. The frameshift mutation has not been reported in 1000 Genomes, ESP, or ClinSeq. These LIG4 mutations were previously reported in the sibling sister; her brother had not been previously tested. Western blotting showed an absence of a ligase IV band in both siblings. In the third patient, array SNP genotyping revealed a de novo ~3.89 Mb interstitial deletion at chromosome 17q24.2 (chr 17:62,068,463-65,963,102, hg18), which spanned the known Carney complex gene PRKAR1A. In all three patients, a median lymphocyte telomere length of ≤ 1st centile was observed and radiosensitivity assays showed increased sensitivity to ionizing radiation. Our work suggests that, in addition to dyskeratosis congenita, LIG4 and 17q24.2 syndromes also feature shortened telomeres; to confirm this, telomere length testing should be considered in both disorders.
Taken together, our work and other reports on Dubowitz syndrome, as currently recognized, suggest that it is not a unitary entity but instead a collection of phenotypically similar disorders. As a clinical entity, Dubowitz syndrome will need continual re-evaluation and re-definition as its constituent phenotypes are determined.

July 7, 2019

The oxygen-independent metabolism of cyclic monoterpenes in Castellaniella defragrans 65Phen.

The facultatively anaerobic betaproteobacterium Castellaniella defragrans 65Phen utilizes acyclic, monocyclic and bicyclic monoterpenes as sole carbon source under oxic as well as anoxic conditions. A biotransformation pathway of the acyclic β-myrcene required linalool dehydratase-isomerase as initial enzyme acting on the hydrocarbon. An in-frame deletion mutant did not use myrcene, but was able to grow on monocyclic monoterpenes. The genome sequence and a comparative proteome analysis together with a random transposon mutagenesis were conducted to identify genes involved in the monocyclic monoterpene metabolism. Metabolites accumulating in cultures of transposon and in-frame deletion mutants disclosed the degradation pathway. Castellaniella defragrans 65Phen oxidizes the monocyclic monoterpene limonene at the primary methyl group forming perillyl alcohol. The genome of 3.95 Mb contained a 70 kb genome island coding for over 50 proteins involved in the monoterpene metabolism. This island showed higher homology to genes of another monoterpene-mineralizing betaproteobacterium, Thauera terpenica 58EuT, than to genomes of the family Alcaligenaceae, which harbors the genus Castellaniella. A collection of 72 transposon mutants unable to grow on limonene contained 17 inactivated genes, with 46 mutants located in the two genes ctmAB (cyclic terpene metabolism). CtmA and ctmB were annotated as FAD-dependent oxidoreductases and clustered together with ctmE, a 2Fe-2S ferredoxin gene, and ctmF, coding for a NADH:ferredoxin oxidoreductase. Transposon mutants of ctmA, B or E did not grow aerobically or anaerobically on limonene, but on perillyl alcohol. The next steps in the pathway are catalyzed by the geraniol dehydrogenase GeoA and the geranial dehydrogenase GeoB, yielding perillic acid. Two transposon mutants had inactivated genes of the monoterpene ring cleavage (mrc) pathway. 2-Methylcitrate synthase and 2-methylcitrate dehydratase were also essential for the monoterpene metabolism but not for growth on acetate. The genome of Castellaniella defragrans 65Phen is related to other genomes of Alcaligenaceae, but contains a genomic island with genes of the monoterpene metabolism. Castellaniella defragrans 65Phen degrades limonene via a limonene dehydrogenase and the oxidation of perillyl alcohol. The initial oxidation at the primary methyl group is independent of molecular oxygen.

July 7, 2019

Insights into the preservation of the homomorphic sex-determining chromosome of Aedes aegypti from the discovery of a male-biased gene tightly linked to the M-locus.

The preservation of a homomorphic sex-determining chromosome in some organisms without transformation into a heteromorphic sex chromosome is a long-standing enigma in evolutionary biology. A dominant sex-determining locus (or M-locus) in an undifferentiated homomorphic chromosome confers the male phenotype in the yellow fever mosquito Aedes aegypti. Genetic evidence suggests that the M-locus is in a nonrecombining region. However, the molecular nature of the M-locus has not been characterized. Using a recently developed approach based on Illumina sequencing of male and female genomic DNA, we identified a novel gene, myo-sex, that is present almost exclusively in the male genome but can sporadically be found in the female genome due to recombination. For simplicity, we define sequences that are primarily found in the male genome as male-biased. Fluorescence in situ hybridization (FISH) on A. aegypti chromosomes demonstrated that the myo-sex probe localized to region 1q21, the established location of the M-locus. Myo-sex is a duplicated myosin heavy chain gene that is highly expressed in the pupa and adult male. Myo-sex shares 83% nucleotide identity and 97% amino acid identity with its closest autosomal paralog, consistent with ancient duplication followed by strong purifying selection. Compared with males, myo-sex is expressed at very low levels in the females that acquired it, indicating that myo-sex may be sexually antagonistic. This study establishes a framework to discover male-biased sequences within a homomorphic sex-determining chromosome and offers new insights into the evolutionary forces that have impeded the expansion of the nonrecombining M-locus in A. aegypti.

July 7, 2019

LUMPY: a probabilistic framework for structural variant discovery.

Comprehensive discovery of structural variation (SV) from whole genome sequencing data requires multiple detection signals including read-pair, split-read, read-depth and prior knowledge. Owing to technical challenges, extant SV discovery algorithms either use one signal in isolation, or at best use two sequentially. We present LUMPY, a novel SV discovery framework that naturally integrates multiple SV signals jointly across multiple samples. We show that LUMPY yields improved sensitivity, especially when SV signal is reduced owing to either low coverage data or low intra-sample variant allele frequency. We also report a set of 4,564 validated breakpoints from the NA12878 human genome. LUMPY is available at https://github.com/arq5x/lumpy-sv.

July 7, 2019

Association mapping, patterns of linkage disequilibrium and selection in the vicinity of the PHYTOCHROME C gene in pearl millet.

Linkage analysis confirmed the association in the region of PHYC in pearl millet. The comparison of genes found in this region suggests that PHYC is the best candidate. Major efforts are currently underway to dissect the phenotype-genotype relationship in plants and animals using existing populations. This method exploits historical recombinations accumulated in these populations. However, linkage disequilibrium sometimes extends over a relatively long distance, particularly in genomic regions containing polymorphisms that have been targets for selection. In this case, many genes in the region could be statistically associated with the trait shaped by the selected polymorphism. Statistical analyses could help in identifying the best candidate genes in a region where such an association is found. In a previous study, we proposed that a fragment of the PHYTOCHROME C gene (PHYC) is associated with flowering time and morphological variations in pearl millet. In the present study, we first performed linkage analyses using three pearl millet F2 families to confirm the presence of a QTL in the vicinity of PHYC. We then analyzed a wider genomic region of ~100 kb around PHYC to pinpoint the gene that best explains the association with the trait in this region. A panel of 90 pearl millet inbred lines was used to assess the association. We used a Markov chain Monte Carlo approach to compare 75 markers distributed along this 100-kb region. We found the best candidate markers on the PHYC gene. Signatures of selection in this region were assessed in an independent data set and pointed to the same gene. These results foster confidence in the likely role of PHYC in phenotypic variation and encourage the development of functional studies.

July 7, 2019

Dissecting a hidden gene duplication: the Arabidopsis thaliana SEC10 locus.

Repetitive sequences present a challenge for genome sequence assembly, and highly similar segmental duplications may disappear from assembled genome sequences. Having found a surprising lack of observable phenotypic deviations and non-Mendelian segregation in Arabidopsis thaliana mutants in SEC10, a gene encoding a core subunit of the exocyst tethering complex, we examined whether this could be explained by a hidden gene duplication. Re-sequencing and manual assembly of the Arabidopsis thaliana SEC10 (At5g12370) locus revealed that this locus, comprising a single gene in the reference genome assembly, indeed contains two paralogous genes in tandem, SEC10a and SEC10b, and that a sequence segment of 7 kb in length is missing from the reference genome sequence. Differences between the two paralogs are concentrated in non-coding regions, while the predicted protein sequences exhibit 99% identity, differing only by substitution of five amino acid residues and an indel of four residues. Both SEC10 genes are expressed, although varying transcript levels suggest differential regulation. Homozygous T-DNA insertion mutants in either paralog exhibit a wild-type phenotype, consistent with proposed extensive functional redundancy of the two genes. By these observations we demonstrate that recently duplicated genes may remain hidden even in well-characterized genomes, such as that of A. thaliana. Moreover, we show that the use of the existing A. thaliana reference genome sequence as a guide for sequence assembly of new Arabidopsis accessions or related species has at least in some cases led to error propagation.

July 7, 2019

Genome sequencing of two Neorhizobium galegae strains reveals a noeT gene responsible for the unusual acetylation of the nodulation factors.

The species Neorhizobium galegae comprises two symbiovars that induce nodules on Galega plants. Strains of both symbiovars, orientalis and officinalis, induce nodules on the same plant species, but fix nitrogen only in their own host species. The mechanism behind this strict host specificity is not yet known. In this study, genome sequences of representatives of the two symbiovars were produced, providing new material for studying properties of N. galegae, with a special interest in genomic differences that may play a role in host specificity. The genome sequences confirmed that the two representative strains are much alike at a whole-genome level. Analysis of orthologous genes showed that N. galegae has a higher number of orthologs shared with Rhizobium than with Agrobacterium. The symbiosis plasmid of strain HAMBI 1141 was shown to transfer by conjugation under optimal conditions. In addition, both sequenced strains have an acetyltransferase gene which was shown to modify the Nod factor on the residue adjacent to the non-reducing-terminal residue. The working hypothesis that this gene is of major importance in directing host specificity of N. galegae could not, however, be confirmed. Strains of N. galegae have many genes differentiating them from strains of Agrobacterium, Rhizobium and Sinorhizobium. However, the mechanism behind their ecological difference is not evident. Although the final determinant for the strict host specificity of N. galegae remains to be identified, the gene responsible for the species-specific acetylation of the Nod factors was identified in this study. We propose the name noeT for this gene to reflect its role in symbiosis.

July 7, 2019

Safety of the surrogate microorganism Enterococcus faecium NRRL B-2354 for use in thermal process validation.

Enterococcus faecium NRRL B-2354 is a surrogate microorganism used in place of pathogens for validation of thermal processing technologies and systems. We evaluated the safety of strain NRRL B-2354 based on its genomic and functional characteristics. The genome of E. faecium NRRL B-2354 was sequenced and found to comprise a 2,635,572-bp chromosome and a 214,319-bp megaplasmid. A total of 2,639 coding sequences were identified, including 45 genes unique to this strain. Hierarchical clustering of the NRRL B-2354 genome with 126 other E. faecium genomes as well as pbp5 locus comparisons and multilocus sequence typing (MLST) showed that the genotype of this strain is most similar to commensal, or community-associated, strains of this species. E. faecium NRRL B-2354 lacks antibiotic resistance genes, and both NRRL B-2354 and its clonal relative ATCC 8459 are sensitive to clinically relevant antibiotics. This organism also lacks, or contains nonfunctional copies of, enterococcal virulence genes including acm, cyl, the ebp operon, esp, gelE, hyl, IS16, and associated phenotypes. It does contain scm, sagA, efaA, and pilA, although either these genes were not expressed or their roles in enterococcal virulence are not well understood. Compared with the clinical strains TX0082 and 1,231,502, E. faecium NRRL B-2354 was more resistant to acidic conditions (pH 2.4) and high temperatures (60°C) and was able to grow in 8% ethanol. These findings support the continued use of E. faecium NRRL B-2354 in thermal process validation of food products.

July 7, 2019

Whole-genome analysis of Exserohilum rostratum from an outbreak of fungal meningitis and other infections.

Exserohilum rostratum was the cause of most cases of fungal meningitis and other infections associated with the injection of contaminated methylprednisolone acetate produced by the New England Compounding Center (NECC). Until this outbreak, very few human cases of Exserohilum infection had been reported, and very little was known about this dematiaceous fungus, which usually infects plants. Here, we report using whole-genome sequencing (WGS) for the detection of single nucleotide polymorphisms (SNPs) and phylogenetic analysis to investigate the molecular origin of the outbreak using 22 isolates of E. rostratum retrieved from 19 case patients with meningitis or epidural/spinal abscesses, 6 isolates from contaminated NECC vials, and 7 isolates unrelated to the outbreak. Our analysis indicates that all 28 isolates associated with the outbreak had nearly identical genomes of 33.8 Mb. A total of 8 SNPs were detected among the outbreak genomes, with no more than 2 SNPs separating any 2 of the 28 genomes. The outbreak genomes were separated from the next most closely related control strain by ~136,000 SNPs. We also observed significant genomic variability among strains unrelated to the outbreak, which may suggest the possibility of cryptic speciation in E. rostratum. Copyright © 2014, American Society for Microbiology. All Rights Reserved.

July 7, 2019

An evaluation of alternative methods for constructing phylogenies from whole genome sequence data: a case study with Salmonella.

Comparative genomics based on whole genome sequencing (WGS) is increasingly being applied to investigate questions within evolutionary and molecular biology, as well as questions concerning public health (e.g., pathogen outbreaks). Given the impact that conclusions derived from such analyses may have, we have evaluated the robustness of clustering individuals based on WGS data to three key factors: (1) next-generation sequencing (NGS) platform (HiSeq, MiSeq, IonTorrent, 454, and SOLiD), (2) algorithms used to construct a SNP (single nucleotide polymorphism) matrix (reference-based and reference-free), and (3) phylogenetic inference method (FastTreeMP, GARLI, and RAxML). We carried out these analyses on 194 whole genome sequences representing 107 unique Salmonella enterica subsp. enterica ser. Montevideo strains. Reference-based approaches for identifying SNPs produced trees that were significantly more similar to one another than those produced under the reference-free approach. Topologies inferred using a core matrix (i.e., no missing data) were significantly more discordant than those inferred using a non-core matrix that allows for some missing data. However, allowing for too much missing data likely results in a high false discovery rate of SNPs. When analyzing the same SNP matrix, we observed that the more thorough inference methods implemented in GARLI and RAxML produced more similar topologies than FastTreeMP. Our results also confirm that reproducibility varies among NGS platforms where the MiSeq had the lowest number of pairwise differences among replicate runs. Our investigation into the robustness of clustering patterns illustrates the importance of carefully considering how data from different platforms are combined and analyzed. We found clear differences in the topologies inferred, and certain methods performed significantly better than others for discriminating between the highly clonal organisms investigated here. 
The methods supported by our results represent a preliminary set of guidelines and a step towards developing validated standards for clustering based on whole genome sequence data.
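The distinction between a core SNP matrix (no missing data) and a non-core matrix (some missing data allowed) can be sketched as a column filter over an alignment. This is only an illustration under stated assumptions, not any of the pipelines evaluated in the study: the function name `snp_columns`, the `max_missing` threshold, the use of `N` and `-` as missing-data symbols, and the toy strain names are all hypothetical.

```python
def snp_columns(alignment, max_missing=0.0, missing="N-"):
    """Return indices of variable alignment columns that pass a missing-data cut.

    alignment: dict mapping strain name to an aligned sequence; all sequences
    are assumed to have equal length. max_missing=0.0 gives a core matrix;
    larger values give a non-core matrix.
    """
    seqs = list(alignment.values())
    n = len(seqs)
    keep = []
    for i in range(len(seqs[0])):
        column = [s[i] for s in seqs]
        called = [base for base in column if base not in missing]
        # Discard columns where too many strains lack a base call.
        if (n - len(called)) / n > max_missing:
            continue
        # A SNP column needs at least two distinct called bases.
        if len(set(called)) > 1:
            keep.append(i)
    return keep

# Toy alignment of four hypothetical strains; column 2 has one missing call.
alignment = {
    "strain_1": "ACGTA",
    "strain_2": "ACTTC",
    "strain_3": "ATNTA",
    "strain_4": "ACGTA",
}
print(snp_columns(alignment, max_missing=0.0))   # core matrix columns
print(snp_columns(alignment, max_missing=0.25))  # non-core matrix columns
```

Raising `max_missing` recovers more candidate SNP columns but also admits columns supported by fewer strains, which mirrors the trade-off the abstract describes between discordant core topologies and a higher false discovery rate.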

July 7, 2019

Pseudoautosomal region 1 length polymorphism in the human population.

The human sex chromosomes differ in sequence, except for the pseudoautosomal regions (PAR) at the terminus of the short and the long arms, denoted as PAR1 and PAR2. The boundary between PAR1 and the unique X and Y sequences was established during the divergence of the great apes. During a copy number variation screen, we noted a paternally inherited chromosome X duplication in 15 independent families. Subsequent genomic analysis demonstrated that an insertional translocation of X chromosomal sequence into the Y chromosome generates an extended PAR. The insertion is generated by non-allelic homologous recombination between a 548 bp LTR6B repeat within the Y chromosome PAR1 and a second LTR6B repeat located 105 kb from the PAR boundary on the X chromosome. The identification of the reciprocal deletion on the X chromosome in one family and the occurrence of the variant in different chromosome Y haplogroups demonstrate this is a recurrent genomic rearrangement in the human population. This finding represents a novel mechanism shaping sex chromosomal evolution.
