Variant detection Archives - Page 56 of 65

July 7, 2019

Complete genome sequence of emm4 Streptococcus pyogenes MEW427, a throat isolate from a child meeting clinical criteria for pediatric autoimmune neuropsychiatric disorders associated with streptococcus (PANDAS).

We report the complete genome assembly of the Streptococcus pyogenes type emm4 strain MEW427 (also referred to as strain UM001 in the Pediatric Acute-Onset Neuropsychiatric Syndrome [PANS] Research Consortium), a throat isolate from a child with acute-onset neuropsychiatric symptoms meeting clinical criteria for PANDAS (pediatric autoimmune neuropsychiatric disorders associated with streptococcus). The genome length is 1,814,455 bp with a G+C content of 38.51%. Copyright © 2016 Jacob et al.

July 7, 2019

Draft genome sequences of three European laboratory derivatives from enterohemorrhagic Escherichia coli O157:H7 strain EDL933, including two plasmids.

Escherichia coli O157:H7 EDL933, isolated in 1982 in the United States, was the first enterohemorrhagic E. coli (EHEC) strain sequenced. Unfortunately, European labs can no longer receive the original strain. We checked three European EDL933 derivatives and found major genetic deviations (deletions, inversions) in two strains. All EDL933 strains contain the cryptic EHEC-plasmid, not reported before. Copyright © 2016 Fellner et al.

July 7, 2019

Exploring structural variants in environmentally sensitive gene families.

Environmentally sensitive plant gene families, such as NBS-LRRs, receptor kinases, defensins and others, are known to be highly variable. However, most existing strategies for discovering and describing structural variation in complex gene families provide incomplete and imperfect results. The move to de novo genome assemblies for multiple accessions or individuals within a species is enabling more comprehensive and accurate insights about gene family variation. Earlier array-based genome hybridization and sequence-based read mapping methods were limited by their reliance on a reference genome and by misplacement of paralogous sequences. Variant discovery based on de novo genome assemblies overcomes the problems arising from a reference genome and reduces sequence misplacement. As de novo genome sequencing moves to the use of longer reads, artifacts will be minimized, intact tandem gene clusters will be constructed accurately, and insights into rapid evolution will become feasible. Copyright © 2016 Elsevier Ltd. All rights reserved.

July 7, 2019

The Atlantic salmon genome provides insights into rediploidization.

The whole-genome duplication 80 million years ago of the common ancestor of salmonids (salmonid-specific fourth vertebrate whole-genome duplication, Ss4R) provides unique opportunities to learn about the evolutionary fate of a duplicated vertebrate genome in 70 extant lineages. Here we present a high-quality genome assembly for Atlantic salmon (Salmo salar), and show that large genomic reorganizations, coinciding with bursts of transposon-mediated repeat expansions, were crucial for the post-Ss4R rediploidization process. Comparisons of duplicate gene expression patterns across a wide range of tissues with orthologous genes from a pre-Ss4R outgroup unexpectedly demonstrate far more instances of neofunctionalization than subfunctionalization. Surprisingly, we find that genes that were retained as duplicates after the teleost-specific whole-genome duplication 320 million years ago were not more likely to be retained after the Ss4R, and that the duplicate retention was not influenced to a great extent by the nature of the predicted protein interactions of the gene products. Finally, we demonstrate that the Atlantic salmon assembly can serve as a reference sequence for the study of other salmonids for a range of purposes.

July 7, 2019

Conservation of the essential genome among Caulobacter and Brevundimonas species.

When the genomes of Caulobacter isolates NA1000 and K31 were compared, numerous genome rearrangements were observed. In contrast, similar comparisons of closely related species of other bacterial genera revealed nominal rearrangements. A phylogenetic analysis of the 16S rRNA indicated that K31 is more closely related to Caulobacter henricii CB4 than to other known Caulobacters. Therefore, we sequenced the CB4 genome and compared it to all of the available Caulobacter genomes to study genome rearrangements, discern the conservation of the NA1000 essential genome, and address concerns about using 16S rRNA to group Caulobacter species. We also sequenced a novel bacterium, Brevundimonas DS20, a representative of the genus most closely related to Caulobacter, and used it as part of an outgroup for phylogenetic comparisons. We expected to find that there would be fewer rearrangements when comparing more closely related Caulobacters. However, we found that relatedness was not correlated with the amount of observed “genome scrambling.” We also discovered that nearly all of the essential genes previously identified for C. crescentus are present in the other Caulobacter genomes and in the Brevundimonas genomes as well. However, a few of these essential genes were only found in NA1000, and some were missing in a combination of one or more species, while other proteins were 100% identical across species. Also, phylogenetic comparisons of highly conserved genomic regions revealed clades similar to those identified by 16S rRNA-based phylogenies, verifying that 16S rRNA sequence comparisons are a valid method for grouping Caulobacters.

July 7, 2019

Gene duplication confers enhanced expression of 27-kDa γ-zein for endosperm modification in quality protein maize.

The maize opaque2 (o2) mutant has a high nutritional value but it develops a chalky endosperm that limits its practical use. Genetic selection for o2 modifiers can convert the normally chalky endosperm of the mutant into a hard, vitreous phenotype, yielding what is known as quality protein maize (QPM). Previous studies have shown that enhanced expression of 27-kDa γ-zein in QPM is essential for endosperm modification. Taking advantage of genome-wide association study analysis of a natural population, linkage mapping analysis of a recombinant inbred line population, and map-based cloning, we identified a quantitative trait locus (qγ27) affecting expression of 27-kDa γ-zein. qγ27 was mapped to the same region as the major o2 modifier (o2 modifier1) on chromosome 7 near the 27-kDa γ-zein locus. qγ27 resulted from a 15.26-kb duplication at the 27-kDa γ-zein locus, which increases the level of gene expression. This duplication occurred before maize domestication; however, the gene structure of qγ27 appears to be unstable and DNA rearrangement frequently occurs at this locus. Because enhanced expression of 27-kDa γ-zein is critical for endosperm modification in QPM, qγ27 is expected to be under artificial selection. This discovery provides a useful molecular marker that can be used to accelerate QPM breeding.

July 7, 2019

Single-molecule sequencing assists genome assembly improvement and structural variation inference.

Dear editor, The single-molecule real-time (SMRT) sequencing platform presented by Pacific Biosciences (PacBio) is regarded as a third-generation sequencing technology (Eid et al., 2009, Roberts et al., 2013). PacBio delivers long reads from several to tens of kilobases (kbs), which are ideal for filling unsequenced gaps due to unusual sequence contexts, such as high-GC content or repeat-rich regions (Bashir et al., 2012, Berlin et al., 2015, Chaisson et al., 2015). PacBio long reads are also favorable for detecting large DNA fragments harboring structural variations (SVs), such as inversions, translocations, duplications, and large insertions/deletions (indels) (Ritz et al., 2010, English et al., 2014). However, one drawback of PacBio is the high error rate of base calling for single pass coverage of the genome (Au et al., 2012, Koren et al., 2012). This drawback can be mitigated by increasing sequencing coverage to achieve high consensus accuracy, but the requirements may be prohibitive for the de novo assembly of large- or medium-size genomes using only PacBio when considering both budgetary and computational costs. Alternatively, PacBio may be used for assembly improvement of near-finished reference genomes, especially for filling gaps in which unsequenced bases are represented by the letter N (English et al., 2012). Here, we combined PacBio (~15x) with Illumina reads (~40x) to improve the genome assemblies of African wild (Oryza barthii) and cultivated rice (O. glaberrima), and to infer large SVs between O. barthii and O. glaberrima.
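As a rough illustration of what "gaps in which unsequenced bases are represented by the letter N" means in practice, the following minimal Python sketch locates runs of N in an assembled sequence. It is not the authors' pipeline; the function name `find_gaps` and the `min_len` parameter are hypothetical, introduced only for this example.

```python
import re
from typing import List, Tuple


def find_gaps(sequence: str, min_len: int = 1) -> List[Tuple[int, int]]:
    """Return 0-based, half-open (start, end) coordinates of runs of N or n.

    `find_gaps` and `min_len` are illustrative names, not part of any
    published tool.
    """
    return [
        (m.start(), m.end())
        for m in re.finditer(r"[Nn]+", sequence)
        if m.end() - m.start() >= min_len
    ]


if __name__ == "__main__":
    # Toy sequence with two gaps: a 4-base run and a 2-base run.
    toy = "ACGTNNNNACGTnnACGT"
    for start, end in find_gaps(toy):
        print(start, end, end - start)
```

The coordinates of such gaps are the candidate regions that long reads spanning both flanks could, in principle, be used to fill.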

July 7, 2019

Complete genome sequence of Salmonella enterica serovar Typhimurium strain SO3 (sequence type 302) isolated from a baby with meningitis in Mexico.

The complete genome of Salmonella enterica serovar Typhimurium strain SO3 (sequence type 302), isolated from a fatal meningitis infection in Mexico, was determined using PacBio technology. The chromosome hosts six complete prophages and is predicted to harbor 51 genomic islands, including 13 pathogenicity islands (SPIs). It carries the Salmonella virulence plasmid (pSTV). Copyright © 2016 Vinuesa et al.

July 7, 2019

First complete genome sequence of the Dutch veterinary Coxiella burnetii strain NL3262, originating from the largest global Q fever outbreak, and draft genome sequence of its epidemiologically linked chronic human isolate NLhu3345937

The largest global Q fever outbreak occurred in The Netherlands during 2007 to 2010. Goats and sheep were identified as the major sources of disease. Here, we report the first complete genome sequence of Coxiella burnetii goat outbreak strain NL3262 and that of an epidemiologically linked chronic human strain, both having the outbreak-related CbNL01 multilocus variable-number tandem-repeat analysis (MLVA) genotype. Copyright © 2016 Kuley et al.

July 7, 2019

Evolutionary redesign of the Atlantic cod (Gadus morhua L.) Toll-like receptor repertoire by gene losses and expansions.

Genome sequencing of the teleost Atlantic cod demonstrated loss of the Major Histocompatibility Complex (MHC) class II, an extreme gene expansion of MHC class I and gene expansions and losses in the innate pattern recognition receptor (PRR) family of Toll-like receptors (TLR). In a comparative genomic setting, using an improved version of the genome, we characterize PRRs in Atlantic cod with emphasis on TLRs demonstrating the loss of TLR1/6, TLR2 and TLR5 and expansion of TLR7, TLR8, TLR9, TLR22 and TLR25. We find that Atlantic cod TLR expansions are strongly influenced by diversifying selection likely to increase the detectable ligand repertoire through neo- and subfunctionalization. Using RNAseq we find that Atlantic cod TLRs display likely tissue or developmental stage-specific expression patterns. In a broader perspective, a comprehensive vertebrate TLR phylogeny reveals that the Atlantic cod TLR repertoire is extreme with regards to losses and expansions compared to other teleosts. In addition we identify a substantial shift in TLR repertoires following the evolutionary transition from an aquatic vertebrate (fish) to a terrestrial (tetrapod) life style. Collectively, our findings provide new insight into the function and evolution of TLRs in Atlantic cod as well as the evolutionary history of vertebrate innate immunity.

July 7, 2019

Understanding the genetics of APOE and TOMM40 and role of mitochondrial structure and function in clinical pharmacology of Alzheimer’s disease.

The methodology of Genome-Wide Association Screening (GWAS) has been applied for more than a decade. Translation to clinical utility has been limited, especially in Alzheimer’s Disease (AD). It has become standard practice in the analyses of more than two dozen AD GWAS studies to exclude the apolipoprotein E (APOE) region because of its extraordinary statistical support, unique thus far in complex human diseases. New genes associated with AD are proposed frequently based on SNPs associated with odds ratio (OR) < 1.2. Most of these SNPs are not located within the associated gene exons or introns but are located variable distances away. Often pathologic hypotheses for these genes are presented, with little or no experimental support. By eliminating the analyses of the APOE-TOMM40 linkage disequilibrium region, the relationship and data of several genes that are co-located in that LD region have been largely ignored. Early negative interpretations limited the interest of understanding the genetic data derived from GWAS, particularly regarding the TOMM40 gene. This commentary describes the history and problem(s) in interpretation of the genetic interrogation of the "APOE" region and provides insight into a metabolic mitochondrial basis for the etiology of AD using both APOE and TOMM40 genetics. Copyright © 2016 The Authors. Published by Elsevier Inc. All rights reserved.

July 7, 2019

Alpha-CENTAURI: assessing novel centromeric repeat sequence variation with long read sequencing.

Long arrays of near-identical tandem repeats are a common feature of centromeric and subtelomeric regions in complex genomes. These sequences present a source of repeat structure diversity that is commonly ignored by standard genomic tools. Unlike reads shorter than the underlying repeat structure that rely on indirect inference methods, e.g. assembly, long reads allow direct inference of satellite higher order repeat structure. To automate characterization of local centromeric tandem repeat sequence variation we have designed Alpha-CENTAURI (ALPHA satellite CENTromeric AUtomated Repeat Identification), which takes advantage of Pacific Biosciences long reads from whole-genome sequencing datasets. By operating on reads prior to assembly, our approach provides a more comprehensive set of repeat-structure variants and is not impacted by rearrangements or sequence underrepresentation due to misassembly. We demonstrate the utility of Alpha-CENTAURI in characterizing repeat structure for alpha satellite containing reads in the hydatidiform mole (CHM1, haploid-like) genome. The pipeline is designed to report local repeat organization summaries for each read, thereby monitoring rearrangements in repeat units, shifts in repeat orientation and sites of array transition into non-satellite DNA, typically defined by transposable element insertion. We validate the method by showing consistency with existing centromere high order repeat references. Alpha-CENTAURI can, in principle, run on any sequence data, offering a method to generate a sequence repeat resolution that could be readily performed using consensus sequences available for other satellite families in genomes without high-quality reference assemblies. Documentation and source code for Alpha-CENTAURI are freely available at http://github.com/volkansevim/alpha-CENTAURI. Contact: ali.bashir@mssm.edu. Supplementary information: Supplementary data are available at Bioinformatics online. © The Author 2016. Published by Oxford University Press.

July 7, 2019

Near-Complete Genome Sequence of Clostridium paradoxum Strain JW-YL-7.

Clostridium paradoxum strain JW-YL-7 is a moderately thermophilic anaerobic alkaliphile isolated from the municipal sewage treatment plant in Athens, GA. We report the near-complete genome sequence of C. paradoxum strain JW-YL-7 obtained by using PacBio DNA sequencing and Pilon for sequence assembly refinement with Illumina data. Copyright © 2016 Lancaster et al.

July 7, 2019

Next-generation sequencing-based detection of germline L1-mediated transductions.

While active LINE-1 (L1) elements possess the ability to mobilize flanking sequences to different genomic loci through a process termed transduction, influencing genomic content and structure, an approach for detecting polymorphic germline non-reference transductions in massively-parallel sequencing data has been lacking. Here we present the computational approach TIGER (Transduction Inference in GERmline genomes), enabling the discovery of non-reference L1-mediated transductions by combining L1 discovery with detection of unique insertion sequences and detailed characterization of insertion sites. We employed TIGER to characterize polymorphic transductions in fifteen genomes from non-human primate species (chimpanzee, orangutan and rhesus macaque), as well as in a human genome. We achieved high accuracy as confirmed by PCR and two single molecule DNA sequencing techniques, and uncovered differences in relative rates of transduction between primate species. By enabling detection of polymorphic transductions, TIGER makes this form of relevant structural variation amenable for population and personal genome analysis.

July 7, 2019

Insight into the evolution of the Solanaceae from the parental genomes of Petunia hybrida.

Petunia hybrida is a popular bedding plant that has a long history as a genetic model system. We report the whole-genome sequencing and assembly of inbred derivatives of its two wild parents, P. axillaris N and P. inflata S6. The assemblies include 91.3% and 90.2% coverage of their diploid genomes (1.4 Gb; 2n = 14) containing 32,928 and 36,697 protein-coding genes, respectively. The genomes reveal that the Petunia lineage has experienced at least two rounds of hexaploidization: the older gamma event, which is shared with most Eudicots, and a more recent Solanaceae event that is shared with tomato and other solanaceous species. Transcription factors involved in the shift from bee to moth pollination reside in particularly dynamic regions of the genome, which may have been key to the remarkable diversity of floral colour patterns and pollination systems. The high-quality genome sequences will enhance the value of Petunia as a model system for research on unique biological phenomena such as small RNAs, symbiosis, self-incompatibility and circadian rhythms.
