Haplotype Archives - Page 40 of 49

July 7, 2019

Non-toxin-producing Bacillus cereus strains belonging to the B. anthracis clade isolated from the International Space Station.

In an ongoing Microbial Observatory investigation of the International Space Station (ISS), 11 Bacillus strains (2 from the Kibo Japanese experimental module, 4 from the U.S. segment, and 5 from the Russian module) were isolated and their whole genomes were sequenced. A comparative analysis of the 16S rRNA gene sequences of these isolates showed the highest similarity (>99%) to the Bacillus anthracis-B. cereus-B. thuringiensis group. The fatty acid composition, polar lipid profile, peptidoglycan type, and matrix-assisted laser desorption ionization-time of flight profiles were consistent with the B. cereus sensu lato group. The phenotypic traits such as motile rods, enterotoxin production, lack of capsule, and resistance to gamma phage/penicillin observed in ISS isolates were not characteristics of B. anthracis. Whole-genome sequence characterizations showed that ISS strains had the plcR non-B. anthracis ancestral “C” allele and lacked anthrax toxin-encoding plasmids pXO1 and pXO2, excluding their identification as B. anthracis. The genetic identities of all 11 ISS isolates characterized via gyrB analyses arbitrarily identified them as members of the B. cereus group, but traditional DNA-DNA hybridization (DDH) showed that the ISS isolates are similar to B. anthracis (88% to 90%) but distant from the B. cereus (42%) and B. thuringiensis (48%) type strains. The DDH results were supported by average nucleotide identity (>98.5%) and digital DDH (>86%) analyses. However, the collective phenotypic traits and genomic evidence were the reasons to exclude the ISS isolates from B. anthracis. Nevertheless, multilocus sequence typing and whole-genome single nucleotide polymorphism analyses placed these isolates in a clade that is distinct from previously described members of the B. cereus sensu lato group but closely related to B. anthracis. 
IMPORTANCE The International Space Station Microbial Observatory (Microbial Tracking-1) study is generating a microbial census of the space station’s surfaces and atmosphere by using advanced molecular microbial community analysis techniques supported by traditional culture-based methods and modern bioinformatic computational modeling. This approach will lead to long-term, multigenerational studies of microbial population dynamics in a closed environment and address key questions, including whether microgravity influences the evolution and genetic modification of microorganisms. The spore-forming Bacillus cereus sensu lato group consists of pathogenic (B. anthracis), food poisoning (B. cereus), and biotechnologically useful (B. thuringiensis) microorganisms; their presence in a closed system such as the ISS might be a concern for the health of crew members. A detailed characterization of these potential pathogens would lead to the development of suitable countermeasures that are needed for long-term future missions and a better understanding of microorganisms associated with space missions.

July 7, 2019

Untangling heteroplasmy, structure, and evolution of an atypical mitochondrial genome by PacBio Sequencing.

The highly compact mitochondrial (mt) genome of terrestrial isopods (Oniscidae) presents two unusual features. First, several loci can individually encode two tRNAs, thanks to single nucleotide polymorphisms at anticodon sites. Within-individual variation (heteroplasmy) at these loci is thought to have been maintained for millions of years because individuals that do not carry all tRNA genes die, resulting in strong balancing selection. Second, the oniscid mtDNA genome comes in two conformations: a ~14 kb linear monomer and a ~28 kb circular dimer comprising two monomer units fused in palindrome. We hypothesized that heteroplasmy actually results from two genome units of the same dimeric molecule carrying different tRNA genes at mirrored loci. This hypothesis, however, contradicts the earlier proposition that dimeric molecules result from the replication of linear monomers, a process that should yield totally identical genome units within a dimer. To resolve this contradiction, we used the SMRT (PacBio) technology to sequence mirrored tRNA loci in single dimeric molecules. We show that dimers do present different tRNA genes at mirrored loci; thus covalent linkage, rather than balancing selection, maintains vital variation at anticodons. We also leveraged unique features of the SMRT technology to detect linear monomers closed by hairpins and carrying noncomplementary bases at anticodons. These molecules contain the necessary information to encode two tRNAs at the same locus, and suggest new mechanisms of transition between linear and circular mtDNA. Overall, our analyses clarify the evolution of an atypical mt genome where dimerization counterintuitively enabled further mtDNA compaction. Copyright © 2017 by the Genetics Society of America.

July 7, 2019

Genome sequencing reveals the origin of the allotetraploid Arabidopsis suecica.

Polyploidy is an example of instantaneous speciation when it involves the formation of a new cytotype that is incompatible with the parental species. Because new polyploid individuals are likely to be rare, establishment of a new species is unlikely unless polyploids are able to reproduce through self-fertilization (selfing), or asexually. Conversely, selfing (or asexuality) makes it possible for polyploid species to originate from a single individual, a bona fide speciation event. The extent to which this happens is not known. Here, we consider the origin of Arabidopsis suecica, a selfing allopolyploid between Arabidopsis thaliana and Arabidopsis arenosa, which has hitherto been considered to be an example of a unique origin. Based on whole-genome re-sequencing of 15 natural A. suecica accessions, we identify ubiquitous shared polymorphism with the parental species, and hence conclusively reject a unique origin in favor of multiple founding individuals. We further estimate that the species originated after the last glacial maximum in Eastern Europe or central Eurasia (rather than Sweden, as the name might suggest). Finally, annotation of the self-incompatibility loci in A. suecica revealed that both loci carry non-functional alleles. The locus inherited from the selfing A. thaliana is fixed for an ancestral non-functional allele, whereas the locus inherited from the outcrossing A. arenosa is fixed for a novel loss-of-function allele. Furthermore, the allele inherited from A. thaliana is predicted to transcriptionally silence the allele inherited from A. arenosa, suggesting that loss of self-incompatibility may have been instantaneous. © The Author 2017. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.

July 7, 2019

Whole genome sequence of the heterozygous clinical isolate Candida krusei 81-B-5.

Candida krusei is a diploid, heterozygous yeast that is an opportunistic fungal pathogen in immunocompromised patients. This species also is utilized for fermenting cocoa beans during chocolate production. One major concern in the clinical setting is the innate resistance of this species to the most commonly used antifungal drug fluconazole. Here we report a high-quality genome sequence and assembly for the first clinical isolate of C. krusei, strain 81-B-5, into 11 scaffolds generated with PacBio sequencing technology. Gene annotation and comparative analysis revealed a unique profile of transporters that could play a role in drug resistance or adaptation to different environments. In addition, we show that while 82% of the genome is highly heterozygous, a 2.0 Mb region of the largest scaffold has undergone loss of heterozygosity. This genome will serve as a reference for further genetic studies of this pathogen. Copyright © 2017 Author et al.

July 7, 2019

Discovery and genotyping of novel sequence insertions in many sequenced individuals

Motivation: Despite recent advances in algorithm design to characterize structural variation using high-throughput short read sequencing (HTS) data, characterization of novel sequence insertions longer than the average read length remains a challenging task. This is mainly due to both computational difficulties and the complexities imposed by genomic repeats in generating reliable assemblies to accurately detect both the sequence content and the exact location of such insertions. Additionally, de novo genome assembly algorithms typically require a very high depth of coverage, which may be a limiting factor for most genome studies. Therefore, characterization of novel sequence insertions is not a routine part of most sequencing projects. There are only a handful of algorithms that are specifically developed for novel sequence insertion discovery that can bypass the need for the whole genome de novo assembly. Still, most such algorithms rely on high depth of coverage, and to our knowledge there is only one method (PopIns) that can use multi-sample data to “collectively” obtain a very high coverage dataset to accurately find insertions common in a given population. Result: Here, we present Pamir, a new algorithm to efficiently and accurately discover and genotype novel sequence insertions using either single or multiple genome sequencing datasets. Pamir is able to detect breakpoint locations of the insertions and calculate their zygosity (i.e. heterozygous versus homozygous) by analyzing multiple sequence signatures, matching one-end-anchored sequences to small-scale de novo assemblies of unmapped reads, and conducting strand-aware local assembly. We test the efficacy of Pamir on both simulated and real data, and demonstrate its potential use in accurate and routine identification of novel sequence insertions in genome projects. Availability and implementation: Pamir is available at https://github.com/vpc-ccg/pamir.
Contact: fhach@sfu.ca, prostatecentre.com or calkan@cs.bilkent.edu.tr
Supplementary information: Supplementary data are available at Bioinformatics online.

July 7, 2019

Whole-genome restriction mapping by “subhaploid”-based RAD sequencing: An efficient and flexible approach for physical mapping and genome scaffolding.

Assembly of complex genomes using short reads remains a major challenge, which usually yields highly fragmented assemblies. Generation of ultradense linkage maps is promising for anchoring such assemblies, but traditional linkage mapping methods are hindered by the infrequency and unevenness of meiotic recombination that limit attainable map resolution. Here we develop a sequencing-based “in vitro” linkage mapping approach (called RadMap), where chromosome breakage and segregation are realized by generating hundreds of “subhaploid” fosmid/bacterial-artificial-chromosome clone pools, and by restriction site-associated DNA sequencing of these clone pools to produce an ultradense whole-genome restriction map to facilitate genome scaffolding. A bootstrap-based minimum spanning tree algorithm is developed for grouping and ordering of genome-wide markers and is implemented in a user-friendly, integrated software package (AMMO). We perform extensive analyses to validate the power and accuracy of our approach in the model plant Arabidopsis thaliana and human. We also demonstrate the utility of RadMap for enhancing the contiguity of a variety of whole-genome shotgun assemblies generated using either short Illumina reads (300 bp) or long PacBio reads (6-14 kb), with up to 15-fold improvement of N50 (~816 kb-3.7 Mb) and high scaffolding accuracy (98.1-98.5%). RadMap outperforms BioNano and Hi-C when input assembly is highly fragmented (contig N50 = 54 kb). RadMap can capture wide-range contiguity information and provide an efficient and flexible tool for high-resolution physical mapping and scaffolding of highly fragmented assemblies. Copyright © 2017 Dou et al.
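The N50 figures quoted in this abstract are a standard contiguity statistic: the length of the shortest sequence in the smallest set of longest sequences that together cover at least half of the assembly. A minimal sketch of how it could be computed, assuming a plain list of contig or scaffold lengths (the function name and the toy lengths are illustrative and not taken from RadMap or AMMO):

```python
def n50(lengths):
    """Return the N50 of a collection of contig or scaffold lengths."""
    if not lengths:
        raise ValueError("need at least one length")
    ordered = sorted(lengths, reverse=True)  # longest first
    half = sum(ordered) / 2                  # half of the total assembly size
    running = 0
    for length in ordered:
        running += length
        if running >= half:
            # This sequence pushes the running total past the halfway mark.
            return length


# Hypothetical lengths in kb, before and after joining contigs into scaffolds.
contigs = [54, 40, 30, 20, 10]
scaffolds = [94, 50, 10]
print(n50(contigs), n50(scaffolds))
```

Joining contigs leaves the total size unchanged while concentrating it in fewer, longer sequences, which is why scaffolding raises N50.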

July 7, 2019

Genome graphs

There is increasing recognition that a single, monoploid reference genome is a poor universal reference structure for human genetics, because it represents only a tiny fraction of human variation. Adding this missing variation results in a structure that can be described as a mathematical graph: a genome graph. We demonstrate that, in comparison to the existing reference genome (GRCh38), genome graphs can substantially improve the fractions of reads that map uniquely and perfectly. Furthermore, we show that this fundamental simplification of read mapping transforms the variant calling problem from one in which many non-reference variants must be discovered de novo to one in which the vast majority of variants are simply re-identified within the graph. Using standard benchmarks as well as a novel reference-free evaluation, we show that a simplistic variant calling procedure on a genome graph can already call variants at least as well as, and in many cases better than, a state-of-the-art method on the linear human reference genome. We anticipate that graph-based references will supplant linear references in humans and in other applications where cohorts of sequenced individuals are available.
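To make the idea concrete, a genome graph can be pictured as sequence-labelled nodes joined by directed edges, with each haplotype spelled out by one path. The sketch below is a toy illustration only: the node sequences, the `haplotypes` and `maps_perfectly` helpers, and the brute-force path enumeration are assumptions made for this example and do not reflect how the authors' software works.

```python
# Toy graph with one SNP bubble: node id -> sequence, node id -> successors.
NODES = {1: "ACGT", 2: "A", 3: "G", 4: "TTCA"}
EDGES = {1: [2, 3], 2: [4], 3: [4], 4: []}


def haplotypes(nodes, edges, start):
    """Yield every sequence spelled by a path from start to a sink node."""
    stack = [(start, nodes[start])]
    while stack:
        node, seq = stack.pop()
        successors = edges.get(node, [])
        if not successors:
            yield seq  # reached a sink: one complete haplotype
        for nxt in successors:
            stack.append((nxt, seq + nodes[nxt]))


def maps_perfectly(read, nodes, edges, start):
    """True if the read matches some path through the graph exactly."""
    return any(read in hap for hap in haplotypes(nodes, edges, start))


linear_reference = "ACGTATTCA"  # the path through node 2 only
read = "GTGTT"                  # carries the alternate allele from node 3
print(read in linear_reference)               # no perfect match on the linear reference
print(maps_perfectly(read, NODES, EDGES, 1))  # perfect match within the graph
```

Enumerating all paths is exponential in the number of bubbles, so this only works at toy scale; it is meant to show why a read carrying a known variant can map perfectly to a graph but not to a single linear reference.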

July 7, 2019

The MHC locus and genetic susceptibility to autoimmune and infectious diseases.

In the past 50 years, variants in the major histocompatibility complex (MHC) locus, also known as the human leukocyte antigen (HLA), have been reported as major risk factors for complex diseases. Recent advances, including large genetic screens, imputation, and analyses of non-additive and epistatic effects, have contributed to a better understanding of the shared and specific roles of MHC variants in different diseases. We review these advances and discuss the relationships between MHC variants involved in autoimmune and infectious diseases. Further work in this area will help to distinguish between alternative hypotheses for the role of pathogens in autoimmune disease development.

July 7, 2019

Whole genome sequencing predicts novel human disease models in rhesus macaques.

Rhesus macaques are an important pre-clinical model of human disease. To advance our understanding of genomic variation that may influence disease, we surveyed genome-wide variation in 21 rhesus macaques. We employed best-practice variant calling, validated with Mendelian inheritance. Next, we used alignment data from our cohort to detect genomic regions likely to produce inaccurate genotypes, potentially due to either gene duplication or structural variation between individuals. We generated a final dataset of >16 million high confidence variants, including 13 million in Chinese-origin rhesus macaques, an increasingly important disease model. We detected an average of 131 mutations predicted to severely alter protein coding per animal, and identified 45 such variants that coincide with known pathogenic human variants. These data suggest that expanded screening of existing breeding colonies will identify novel models of human disease, and that increased genomic characterization can help inform research studies in macaques. Copyright © 2017 Elsevier Inc. All rights reserved.
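Validating genotype calls against Mendelian inheritance generally means checking that each child genotype can be formed from one allele of each parent. A minimal sketch of that check, assuming autosomal diploid genotypes written as allele pairs and ignoring de novo mutation (the function name and the genotype encoding are hypothetical, not the authors' pipeline):

```python
from itertools import product


def mendelian_consistent(child, mother, father):
    """Return True if the child genotype can be built from one maternal
    and one paternal allele. Genotypes are pairs such as ("A", "G")."""
    return any(
        sorted((m, f)) == sorted(child)
        for m, f in product(mother, father)
    )


# Hypothetical trio genotypes at two sites.
print(mendelian_consistent(("A", "G"), ("A", "A"), ("G", "G")))
print(mendelian_consistent(("G", "G"), ("A", "A"), ("A", "G")))
```

The fraction of sites that fail such a check across known pedigrees can serve as a rough indicator of genotyping error.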

July 7, 2019

A large gene family in fission yeast encodes spore killers that subvert Mendel’s law.

Spore killers in fungi are selfish genetic elements that distort Mendelian segregation in their favor. It remains unclear how many species harbor them and how diverse their mechanisms are. Here, we discover two spore killers from a natural isolate of the fission yeast Schizosaccharomyces pombe. Both killers belong to the previously uncharacterized wtf gene family with 25 members in the reference genome. These two killers act in strain-background-independent and genome-location-independent manners to perturb the maturation of spores not inheriting them. Spores carrying one killer are protected from its killing effect but not that of the other killer. The killing and protecting activities can be uncoupled by mutation. The numbers and sequences of wtf genes vary considerably between S. pombe isolates, indicating rapid divergence. We propose that wtf genes contribute to the extensive intraspecific reproductive isolation in S. pombe, and represent ideal models for understanding how segregation-distorting elements act and evolve.

July 7, 2019

The genetic basis of resistance and matching-allele interactions of a host-parasite system: The Daphnia magna-Pasteuria ramosa model.

Negative frequency-dependent selection (NFDS) is an evolutionary mechanism suggested to govern host-parasite coevolution and the maintenance of genetic diversity at host resistance loci, such as the vertebrate MHC and R-genes in plants. Matching-allele interactions of hosts and parasites that prevent the emergence of host and parasite genotypes that are universally resistant and infective are a genetic mechanism predicted to underpin NFDS. The underlying genetics of matching-allele interactions are unknown even in host-parasite systems with empirical support for coevolution by NFDS, as is the case for the planktonic crustacean Daphnia magna and the bacterial pathogen Pasteuria ramosa. We fine-map one locus associated with D. magna resistance to P. ramosa and genetically characterize two haplotypes of the Pasteuria resistance (PR-) locus using de novo genome and transcriptome sequencing. Sequence comparison finds dramatic structural polymorphisms between PR-locus haplotypes, including a large portion of each haplotype being composed of non-homologous sequences, resulting in haplotypes that differ in size by 66 kb. The high divergence of PR-locus haplotypes suggests a history of multiple, diverse and repeated instances of structural mutation events and restricted recombination. Annotation of the haplotypes reveals striking differences in gene content. In particular, a group of glycosyltransferase genes is present in the susceptible but absent in the resistant haplotype. Moreover, in natural populations, we find that the PR-locus polymorphism is associated with variation in resistance to different P. ramosa genotypes, pointing to the PR-locus polymorphism as being responsible for the matching-allele interactions that have been previously described for this system. Our results conclusively identify a genetic basis for the matching-allele interaction observed in a coevolving host-parasite system and provide a first insight into its molecular basis.

July 7, 2019

Genome-wide identification of the mutation underlying fleece variation and discriminating ancestral hairy species from modern woolly sheep.

The composition and structure of fleece variation observed in mammals is a consequence of a strong selective pressure for fiber production after domestication. In sheep, fleece variation discriminates ancestral species carrying a long and hairy fleece from modern domestic sheep (Ovis aries) owning a short and woolly fleece. Here, we report that the “woolly” allele results from the insertion of an antisense EIF2S2 retrogene (called asEIF2S2) into the 3′ UTR of the IRF2BP2 gene leading to an abnormal IRF2BP2 transcript. We provide evidence that this chimeric IRF2BP2/asEIF2S2 messenger 1) targets the genuine sense EIF2S2 RNA and 2) creates a long endogenous double-stranded RNA which alters the expression of both EIF2S2 and IRF2BP2 mRNA. This represents a unique example of a phenotype arising via an RNA-RNA hybrid, itself generated through a retroposition mechanism. Our results bring new insights into the sheep population history thanks to the identification of the molecular origin of an evolutionary phenotypic variation. © The Author 2017. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.

July 7, 2019

Critical points for an accurate human genome analysis.

Next-generation sequencing is radically changing how DNA diagnostic laboratories operate. What started as a single-gene profession is now developing into gene panel sequencing and whole-exome and whole-genome sequencing (WES/WGS) analyses. With further advances in sequencing technology and concomitant price reductions, WGS will soon become the standard and be routinely offered. Here, we focus on the critical steps involved in performing WGS, with a particular emphasis on points where WGS differs from WES, the important variables that should be taken into account, and the quality control measures that can be taken to monitor the process. The points discussed here, combined with recent publications on guidelines for reporting variants, will facilitate the routine implementation of WGS into a diagnostic setting. © 2017 Wiley Periodicals, Inc.

July 7, 2019

Genetic control of plasticity of oil yield for combined abiotic stresses using a joint approach of crop modelling and genome-wide association.

Understanding the genetic basis of phenotypic plasticity is crucial for predicting and managing climate change effects on wild plants and crops. Here, we combined crop modelling and quantitative genetics to study the genetic control of oil yield plasticity for multiple abiotic stresses in sunflower. First, we developed stress indicators to characterize 14 environments for three abiotic stresses (cold, drought and nitrogen) using the SUNFLO crop model and phenotypic variations of three commercial varieties. The computed plant stress indicators better explain yield variation than descriptors at the climatic or crop levels. In those environments, we observed oil yield of 317 sunflower hybrids and regressed it on three selected stress indicators. The slopes of the cold stress reaction norms were used as plasticity phenotypes in the following genome-wide association study. Among the 65 534 tested Single Nucleotide Polymorphisms (SNPs), we identified nine quantitative trait loci controlling oil yield plasticity to cold stress. Associated single nucleotide polymorphisms are localized in genes previously shown to be involved in cold stress responses: oligopeptide transporters, lipid transfer protein, cystatin, alternative oxidase or root development. This novel approach opens new perspectives to identify genomic regions involved in genotype-by-environment interaction of complex traits to multiple stresses in realistic natural or agronomical conditions. © 2017 John Wiley & Sons Ltd.
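The plasticity phenotype described here is the slope obtained when a hybrid's oil yield is regressed on a stress indicator across environments. A minimal sketch of that step, assuming ordinary least squares on a single indicator and using made-up numbers (the function name and the data are illustrative and do not come from the study or from SUNFLO):

```python
def reaction_norm_slope(stress, yields):
    """Least-squares slope of yield regressed on a stress indicator.

    stress and yields hold one value per environment for a single hybrid.
    """
    n = len(stress)
    if n != len(yields) or n < 2:
        raise ValueError("need two equally long series with at least 2 points")
    mean_x = sum(stress) / n
    mean_y = sum(yields) / n
    sxx = sum((x - mean_x) ** 2 for x in stress)
    if sxx == 0:
        raise ValueError("stress indicator must vary across environments")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(stress, yields))
    return sxy / sxx


# Hypothetical cold-stress indicator and oil yield for one hybrid.
cold_stress = [0.0, 1.0, 2.0]
oil_yield = [10.0, 8.0, 6.0]
print(reaction_norm_slope(cold_stress, oil_yield))
```

Computing one slope per hybrid yields a single quantitative value per genotype, which can then be used as the trait in an association scan in place of yield itself.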

