Stanford University Archives - Page 13 of 13

July 7, 2019

Multiplication of blaOXA-23 is common in clinical Acinetobacter baumannii, but does not enhance carbapenem resistance.

To investigate the copy number of blaOXA-23 and its correlation with carbapenem resistance in carbapenem-resistant Acinetobacter baumannii (CRAB). A total of 113 blaOXA-23-positive clinical CRAB isolates were collected from two hospitals in Zhejiang province, China. Their genetic relatedness was determined by MLST. The MIC of imipenem was determined using the agar diffusion method and the copy number of blaOXA-23 was measured using quantitative real-time PCR (qRT-PCR). The complete genomes of five clinical CRAB strains were sequenced using PacBio technology to investigate the multiplication mechanism of blaOXA-23. Most of the isolates (100/113) belonged to global clone II and the MIC of imipenem ranged from 16 to 96 mg/L. The gene blaOXA-23 resided exclusively in Tn2006 or Tn2009. Approximately 38% of the isolates carried two or more copies of blaOXA-23. The copy number of blaOXA-23 was not correlated with the MIC of imipenem. Within the five sequenced strains, multiple copies of blaOXA-23 were either tandemly clustered or independently inserted at different genomic sites. Multiplication of blaOXA-23 is common in CRAB, but does not enhance carbapenem resistance. Multiplication can be present in the form of either tandem amplifications or independent insertions at different sites. © The Author 2016. Published by Oxford University Press on behalf of the British Society for Antimicrobial Chemotherapy. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.

July 7, 2019

A photoreceptor contributes to the natural variation of diapause induction in Daphnia magna.

Diapause is an adaptation that allows organisms to survive harsh environmental conditions. In species occurring over broad habitat ranges, both the timing and the intensity of diapause induction can vary across populations, revealing patterns of local adaptation. Understanding the genetic architecture of this fitness-related trait would help clarify how populations adapt to their local environments. In the cyclical parthenogenetic crustacean Daphnia magna, diapause induction is a phenotypic plastic life history trait linked to sexual reproduction, as asexual females have the ability to switch to sexual reproduction and produce resting stages, their sole strategy for surviving habitat deterioration. We have previously shown that the induction of resting stage production correlates with changes in photoperiod that indicate the imminence of habitat deterioration and have identified a Quantitative Trait Locus (QTL) responsible for some of the variation in the induction of resting stages. Here, new data allows us to anchor the QTL to a large scaffold and then, using a combination of a new mapping panel, targeted association mapping and selection analysis in natural populations, to identify candidate genes within the QTL. Our results show that variation in a rhodopsin photoreceptor gene plays a significant role in the variation observed in resting stage induction. This finding provides a mechanistic explanation for the link between diapause and day-length perception that has been suggested in diverse arthropod taxa. © The Author 2016. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.

July 7, 2019

Information-optimal genome assembly via sparse read-overlap graphs.

In the context of third-generation long-read sequencing technologies, read-overlap-based approaches are expected to play a central role in the assembly step. A fundamental challenge in assembling from a read-overlap graph is that the true sequence corresponds to a Hamiltonian path on the graph, and, under most formulations, the assembly problem becomes NP-hard, restricting practical approaches to heuristics. In this work, we avoid this seemingly fundamental barrier by first setting the computational complexity issue aside, and seeking an algorithm that targets information limits. In particular, we consider a basic feasibility question: when does the set of reads contain enough information to allow unambiguous reconstruction of the true sequence? Based on insights from this information feasibility question, we present an algorithm, the Not-So-Greedy algorithm, to construct a sparse read-overlap graph. Unlike most other assembly algorithms, Not-So-Greedy comes with a performance guarantee: whenever information feasibility conditions are satisfied, the algorithm reduces the assembly problem to an Eulerian path problem on the resulting graph, and can thus be solved in linear time. In practice, this theoretical guarantee translates into assemblies of higher quality. Evaluations on both simulated reads from real genomes and a PacBio Escherichia coli K12 dataset demonstrate that Not-So-Greedy compares favorably with standard string graph approaches in terms of accuracy of the resulting read-overlap graph and contig N50. Available at github.com/samhykim/nsg. Contact: courtade@eecs.berkeley.edu or dntse@stanford.edu. Supplementary data are available at Bioinformatics online. © The Author 2016. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.
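The abstract's key point is that an Eulerian path, unlike a Hamiltonian path, can be found in time linear in the number of edges. The sketch below illustrates that step only, using the textbook Hierholzer algorithm on a small directed graph. It is not the Not-So-Greedy implementation; the function name `eulerian_path` and the toy edge list are assumptions made for illustration.

```python
from collections import defaultdict


def eulerian_path(edges):
    """Return an Eulerian path as a list of nodes, or None if none exists.

    `edges` is a list of directed (u, v) pairs. This is the generic
    Hierholzer algorithm, not code from the Not-So-Greedy repository.
    """
    out_edges = defaultdict(list)
    in_deg = defaultdict(int)
    out_deg = defaultdict(int)
    for u, v in edges:
        out_edges[u].append(v)
        out_deg[u] += 1
        in_deg[v] += 1

    nodes = set(out_deg) | set(in_deg)
    # A directed Eulerian path needs at most one node with one extra
    # outgoing edge (the start) and one with one extra incoming edge (the end).
    starts = [n for n in nodes if out_deg[n] - in_deg[n] == 1]
    ends = [n for n in nodes if in_deg[n] - out_deg[n] == 1]
    balanced = all(abs(out_deg[n] - in_deg[n]) <= 1 for n in nodes)
    if not balanced or len(starts) > 1 or len(starts) != len(ends):
        return None
    if not edges:
        return []

    start = starts[0] if starts else edges[0][0]
    stack, path = [start], []
    while stack:
        u = stack[-1]
        if out_edges[u]:
            # Follow an unused edge; each edge is consumed exactly once.
            stack.append(out_edges[u].pop())
        else:
            # Dead end: this node is final in the path built so far.
            path.append(stack.pop())
    path.reverse()

    # If some edges were unreachable, the graph was not connected.
    if len(path) != len(edges) + 1:
        return None
    return path


# Toy graph (hypothetical): a cycle A -> B -> C -> A plus a tail A -> D.
toy_edges = [("A", "B"), ("B", "C"), ("C", "A"), ("A", "D")]
print(eulerian_path(toy_edges))
```

Each edge is pushed and popped once, which is where the linear running time comes from. In an assembly setting the nodes and edges would come from the sparse read-overlap graph rather than a hand-written list.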

July 7, 2019

svclassify: a method to establish benchmark structural variant calls.

The human genome contains variants ranging in size from small single nucleotide polymorphisms (SNPs) to large structural variants (SVs). High-quality benchmark small variant calls for the pilot National Institute of Standards and Technology (NIST) Reference Material (NA12878) have been developed by the Genome in a Bottle Consortium, but no similar high-quality benchmark SV calls exist for this genome. Since SV callers output highly discordant results, we developed methods to combine multiple forms of evidence from multiple sequencing technologies to classify candidate SVs into likely true or false positives. Our method (svclassify) calculates annotations from one or more aligned bam files from many high-throughput sequencing technologies, and then builds a one-class model using these annotations to classify candidate SVs as likely true or false positives. We first used pedigree analysis to develop a set of high-confidence breakpoint-resolved large deletions. We then used svclassify to cluster and classify these deletions as well as a set of high-confidence deletions from the 1000 Genomes Project and a set of breakpoint-resolved complex insertions from Spiral Genetics. We find that likely SVs cluster separately from likely non-SVs based on our annotations, and that the SVs cluster into different types of deletions. We then developed a supervised one-class classification method that uses a training set of random non-SV regions to determine whether candidate SVs have abnormal annotations different from most of the genome.
To test this classification method, we use our pedigree-based breakpoint-resolved SVs, SVs validated by the 1000 Genomes Project, and assembly-based breakpoint-resolved insertions, along with semi-automated visualization using svviz. We find that candidate SVs with high scores from multiple technologies have high concordance with PCR validation and an orthogonal consensus method MetaSV (99.7% concordant), and candidate SVs with low scores are questionable. We distribute a set of 2676 high-confidence deletions and 68 high-confidence insertions with high svclassify scores from these call sets for benchmarking SV callers. We expect these methods to be particularly useful for establishing high-confidence SV calls for benchmark samples that have been characterized by multiple technologies.

July 7, 2019

Collection and storage of HLA NGS genotyping data for the 17th International HLA and Immunogenetics Workshop.

For over 50 years, the International HLA and Immunogenetics Workshops (IHIW) have advanced the fields of histocompatibility and immunogenetics (H&I) via community sharing of technology, experience and reagents, and the establishment of ongoing collaborative projects. Held in the fall of 2017, the 17th IHIW focused on the application of next generation sequencing (NGS) technologies for clinical and research goals in the H&I fields. NGS technologies have the potential to allow dramatic insights and advances in these fields, but the scope and sheer quantity of data associated with NGS raise challenges for their analysis, collection, exchange and storage. The 17th IHIW adopted a centralized approach to these issues, and we developed the tools, services and systems to create an effective system for capturing and managing these NGS data. We worked with NGS platform and software developers to define a set of distinct but equivalent NGS typing reports that record NGS data in a uniform fashion. The 17th IHIW database applied our standards, tools and services to collect, validate and store those structured, multi-platform data in an automated fashion. We have created community resources to enable exploration of the vast store of curated sequence and allele-name data in the IPD-IMGT/HLA Database, with the goal of creating a long-term community resource that integrates these curated data with new NGS sequence and polymorphism data, for advanced analyses and applications. Copyright © 2017 American Society for Histocompatibility and Immunogenetics. Published by Elsevier Inc. All rights reserved.

July 7, 2019

Full genome sequence of the Western Reserve strain of vaccinia virus determined by third-generation sequencing.

The vaccinia virus is a large, complex virus belonging to the Poxviridae family. Here, we report the complete, annotated genome sequence of the neurovirulent Western Reserve laboratory strain of this virus, which was sequenced on the Pacific Biosciences RS II and Oxford Nanopore MinION platforms. Copyright © 2018 Prazsák et al.

July 7, 2019

A fast approximate algorithm for mapping long reads to large reference databases.

Emerging single-molecule sequencing technologies from Pacific Biosciences and Oxford Nanopore have revived interest in long-read mapping algorithms. Alignment-based seed-and-extend methods demonstrate good accuracy, but face limited scalability, while faster alignment-free methods typically trade decreased precision for efficiency. In this article, we combine a fast approximate read mapping algorithm based on minimizers with a novel MinHash identity estimation technique to achieve both scalability and precision. In contrast to prior methods, we develop a mathematical framework that defines the types of mapping targets we uncover, establish probabilistic estimates of p-value and sensitivity, and demonstrate tolerance for alignment error rates up to 20%. With this framework, our algorithm automatically adapts to different minimum length and identity requirements and provides both positional and identity estimates for each mapping reported. For mapping human PacBio reads to the hg38 reference, our method is 290× faster than Burrows-Wheeler Aligner-MEM with a lower memory footprint and recall rate of 96%. We further demonstrate the scalability of our method by mapping noisy PacBio reads (each ≥5 kbp in length) to the complete NCBI RefSeq database containing 838 Gbp of sequence and >60,000 genomes.
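To make the two ingredients named above concrete, here is a minimal sketch of minimizer sampling followed by a Jaccard-based identity estimate. It is not the authors' implementation. It assumes lexicographic ordering of k-mers instead of a hash function, exact set Jaccard instead of a MinHash sketch, and the Mash-style conversion from Jaccard similarity to distance, D = -(1/k) ln(2J / (1 + J)). All function names and the parameters `k` and `w` are hypothetical.

```python
import math


def minimizers(seq, k, w):
    """Return the set of minimizer k-mers of `seq`.

    For every window of `w` consecutive k-mers, keep the smallest one.
    Lexicographic order is used here for simplicity (an assumption);
    practical tools usually order k-mers by a hash value instead.
    Sequences with fewer than `w` k-mers yield an empty set.
    """
    kmers = [seq[i:i + k] for i in range(len(seq) - k + 1)]
    picked = set()
    for i in range(len(kmers) - w + 1):
        picked.add(min(kmers[i:i + w]))
    return picked


def jaccard(a, b):
    """Jaccard similarity of two sets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def identity_estimate(j, k):
    """Convert Jaccard similarity `j` of k-mer sets to an identity estimate.

    Uses the Mash-style distance D = -(1/k) * ln(2j / (1 + j)) and
    returns 1 - D, clamped at 0. This conversion is an assumption of
    the sketch, not a formula quoted from the abstract.
    """
    if j <= 0:
        return 0.0
    distance = -math.log(2 * j / (1 + j)) / k
    return max(0.0, 1.0 - distance)


# Toy usage with made-up sequences and small, illustrative parameters.
read = "ACGTACGTTGCAACGT"
ref_window = "ACGTACGTTGCAACGA"
k, w = 5, 3
j = jaccard(minimizers(read, k, w), minimizers(ref_window, k, w))
print(j, identity_estimate(j, k))
```

Sampling only minimizers keeps the index small, while the Jaccard-to-identity conversion is what lets an alignment-free method report an approximate identity for each candidate mapping.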

July 7, 2019

Darwin: A genomics co-processor provides up to 15,000 X acceleration on long read assembly

of life in fundamental ways. Genomics data, however, is far outpacing Moore’s Law. Third-generation sequencing technologies produce 100× longer reads than second generation technologies and reveal a much broader mutation spectrum of disease and evolution. However, these technologies incur prohibitively high computational costs. Over 1,300 CPU hours are required for reference-guided assembly of the human genome (using [47]), and over 15,600 CPU hours are required for de novo assembly [57]. This paper describes “Darwin”, a co-processor for genomic sequence alignment that, without sacrificing sensitivity, provides up to 15,000× speedup over the state-of-the-art software for reference-guided assembly of third-generation reads. Darwin achieves this speedup through hardware/algorithm co-design, trading more easily accelerated alignment for less memory-intensive filtering, and by optimizing the memory system for filtering. Darwin combines a hardware-accelerated version of D-SOFT, a novel filtering algorithm, with a hardware-accelerated version of GACT, a novel alignment algorithm. GACT generates near-optimal alignments of arbitrarily long genomic sequences using constant memory for the compute-intensive step. Darwin is adaptable, with tunable speed and sensitivity to match emerging sequencing technologies and to meet the requirements of genomic applications beyond read assembly.

July 7, 2019

Meeting report: mobile genetic elements and genome plasticity 2018

The Mobile Genetic Elements and Genome Plasticity conference was hosted by Keystone Symposia in Santa Fe, NM, USA, February 11–15, 2018. The organizers were Marlene Belfort, Evan Eichler, Henry Levin and Lynn Maquat. The goal of this conference was to bring together scientists from around the world to discuss the function of transposable elements and their impact on host species. Central themes of the meeting included recent innovations in genome analysis and the role of mobile DNA in disease and evolution. The conference included 200 scientists who participated in poster presentations, short talks selected from abstracts, and invited talks. A total of 58 talks were organized into eight sessions and two workshops. The topics varied from mechanisms of mobilization, to the structure of genomes and their defense strategies to protect against transposable elements.

July 7, 2019

Long-read-based genome sequences of pandemic and environmental Vibrio cholerae strains.

The bacterium Vibrio cholerae exhibits two distinct lifestyles, one as an aquatic bacterium and the other as the etiological agent of the pandemic human disease cholera. Here, we report closed genome sequences of two seventh pandemic V. cholerae O1 El Tor strains, A1552 and N16961, and the environmental strain Sa5Y.
