Variant detection Archives - Page 63 of 65

July 7, 2019

Complete genome sequences of Canadian epidemic methicillin-resistant Staphylococcus aureus strains CMRSA3 and CMRSA6.

Methicillin-resistant Staphylococcus aureus (MRSA) clonal complex 8 (CC8) sequence type 239 (ST239) represents a predominant hospital-associated MRSA sublineage present worldwide. The Canadian epidemic MRSA strains CMRSA3 and CMRSA6 are moderately virulent members of this group but are closely related to the highly virulent strain TW20. Whole-genome sequencing of CMRSA3 and CMRSA6 was conducted to identify genetic determinants associated with their virulence.

July 7, 2019

Meeting report: mobile genetic elements and genome plasticity 2018

The Mobile Genetic Elements and Genome Plasticity conference was hosted by Keystone Symposia in Santa Fe, NM USA, February 11–15, 2018. The organizers were Marlene Belfort, Evan Eichler, Henry Levin and Lynn Maquat. The goal of this conference was to bring together scientists from around the world to discuss the function of transposable elements and their impact on host species. Central themes of the meeting included recent innovations in genome analysis and the role of mobile DNA in disease and evolution. The conference included 200 scientists who participated in poster presentations, short talks selected from abstracts, and invited talks. A total of 58 talks were organized into eight sessions and two workshops. The topics varied from mechanisms of mobilization, to the structure of genomes and their defense strategies to protect against transposable elements.

July 7, 2019

Fast-SG: an alignment-free algorithm for hybrid assembly.

Long-read sequencing technologies are the ultimate solution for genome repeats, allowing near reference-level reconstructions of large genomes. However, long-read de novo assembly pipelines are computationally intense and require a considerable amount of coverage, thereby hindering their broad application to the assembly of large genomes. Alternatively, hybrid assembly methods that combine short- and long-read sequencing technologies can reduce the time and cost required to produce de novo assemblies of large genomes. Here, we propose a new method, called Fast-SG, that uses a new ultrafast alignment-free algorithm specifically designed for constructing a scaffolding graph using light-weight data structures. Fast-SG can construct the graph from either short or long reads. This allows the reuse of efficient algorithms designed for short-read data and permits the definition of novel modular hybrid assembly pipelines. Using comprehensive standard datasets and benchmarks, we show how Fast-SG outperforms the state-of-the-art short-read aligners when building the scaffolding graph and can be used to extract linking information from either raw or error-corrected long reads. We also show how a hybrid assembly approach using Fast-SG with shallow long-read coverage (5X) and moderate computational resources can produce long-range and accurate reconstructions of the genomes of Arabidopsis thaliana (Ler-0) and human (NA12878). Fast-SG opens a door to achieve accurate hybrid long-range reconstructions of large genomes with low effort, high portability, and low cost.
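The abstract does not spell out the algorithm, so the following is only a rough illustration of what "alignment-free" linking can look like in general, not Fast-SG's actual method. It assumes a plain dictionary of k-mers that occur once across all contigs, looks up each read of a pair by exact k-mer match, and counts pairs whose mates land on different contigs. The function names, the toy sequences, and the choice to ignore reverse complements are all assumptions made for brevity.

```python
from collections import defaultdict


def unique_kmer_index(contigs, k):
    """Map every k-mer that occurs exactly once across all contigs to its contig name."""
    first_seen = {}
    duplicated = set()
    for name, seq in contigs.items():
        for i in range(len(seq) - k + 1):
            kmer = seq[i:i + k]
            if kmer in first_seen:
                duplicated.add(kmer)
            else:
                first_seen[kmer] = name
    return {kmer: name for kmer, name in first_seen.items() if kmer not in duplicated}


def locate(read, index, k):
    """Return the contig hit by the first unique k-mer found in the read, or None."""
    for i in range(len(read) - k + 1):
        hit = index.get(read[i:i + k])
        if hit is not None:
            return hit
    return None


def scaffold_edges(pairs, index, k):
    """Count read pairs whose two mates are located on different contigs."""
    edges = defaultdict(int)
    for read1, read2 in pairs:
        c1, c2 = locate(read1, index, k), locate(read2, index, k)
        if c1 is not None and c2 is not None and c1 != c2:
            edges[tuple(sorted((c1, c2)))] += 1
    return dict(edges)


# Toy data (hypothetical); reads are given in contig orientation for simplicity.
contigs = {"ctgA": "ACGTTGCATGCCGATAGC", "ctgB": "TTAGGCACCTGAAGTCCA"}
pairs = [
    ("GTTGCATG", "GCACCTGA"),  # mates on ctgA and ctgB
    ("CATGCCGA", "CCTGAAGT"),  # mates on ctgA and ctgB
    ("ACGTTGCA", "TGCATGCC"),  # both mates on ctgA, so no link
]
index = unique_kmer_index(contigs, k=6)
edges = scaffold_edges(pairs, index, k=6)
print(edges)
```

In this toy case the two cross-contig pairs yield a single edge between ctgA and ctgB with weight 2. A real scaffolder would also need strand, orientation, and insert-size information.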

July 7, 2019

Complete genome sequence of Acinetobacter schindleri SGAir0122 isolated from Singapore air.

Acinetobacter schindleri strain SGAir0122 was isolated from tropical air samples collected in Singapore. The prevalence of nosocomial infection caused by this Gram-negative bacterium indicates its clinical significance as an opportunistic human pathogen. Its complete genome consists of one chromosome of 3.105 Mb and a plasmid of 181 kb. Copyright © 2018 Kee et al.

July 7, 2019

Genome sequence of the soybean cyst nematode (Heterodera glycines) endosymbiont “Candidatus Cardinium hertigii” strain cHgTN10.

In this study, we present the genome sequence of the “Candidatus Cardinium hertigii” strain cHgTN10, an endosymbiotic bacterium of the plant-parasitic nematode Heterodera glycines. This is the first genome assembly reported for an endosymbiont directly sequenced from a tylenchid nematode. Copyright © 2018 Showmaker et al.

July 7, 2019

Genome sequence of Geobacillus thermoleovorans SGAir0734, isolated from Singapore air.

The thermophilic bacterium Geobacillus thermoleovorans was isolated from a tropical air sample collected in Singapore. The genome was sequenced on the PacBio RS II platform and consists of one chromosome of 3.6 Mb and one plasmid of 75 kb. The genome comprises 3,509 protein-coding genes, 88 tRNAs, and 27 rRNAs. Copyright © 2018 Gaultier et al.

July 7, 2019

Genome sequence of Bacillus velezensis SGAir0473, isolated from tropical air collected in Singapore.

Bacillus velezensis strain SGAir0473 (Firmicutes) was isolated from tropical air collected in Singapore. Its genome was assembled using short reads and single-molecule real-time sequencing and comprises one chromosome of 4.18 Mb. The genome consists of 3,937 protein-coding genes, 86 tRNAs, and 27 rRNAs. Copyright © 2018 Lim et al.

July 7, 2019

Genome sequence of Pantoea ananatis SGAir0210, isolated from outdoor air in Singapore.

Pantoea ananatis SGAir0210 was isolated from outdoor air collected in Singapore. The genome was assembled from long reads generated by single-molecule real-time sequencing complemented with short reads. The genome size was approximately 4.81 Mb, with 4,303 protein-coding genes, 80 tRNAs, and 22 rRNAs identified. Copyright © 2018 Luhung et al.

July 7, 2019

A universal SNP and small-indel variant caller using deep neural networks.

Despite rapid advances in sequencing technologies, accurately calling genetic variants present in an individual genome from billions of short, errorful sequence reads remains challenging. Here we show that a deep convolutional neural network can call genetic variation in aligned next-generation sequencing read data by learning statistical relationships between images of read pileups around putative variant and true genotype calls. The approach, called DeepVariant, outperforms existing state-of-the-art tools. The learned model generalizes across genome builds and mammalian species, allowing nonhuman sequencing projects to benefit from the wealth of human ground-truth data. We further show that DeepVariant can learn to call variants in a variety of sequencing technologies and experimental designs, including deep whole genomes from 10X Genomics and Ion Ampliseq exomes, highlighting the benefits of using more automated and generalizable techniques for variant calling.
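To make the idea of a "read pileup image" concrete, here is a minimal sketch of how aligned reads around a candidate site could be turned into a fixed-size grid for a convolutional network. The abstract does not describe the encoding, so the three channels used here (base code, base quality, strand), the read tuple layout, and the function name `encode_pileup` are assumptions for illustration only, not DeepVariant's actual input format.

```python
# Hypothetical integer codes for bases; 0 is reserved for "no base here".
BASE_CODE = {"A": 1, "C": 2, "G": 3, "T": 4}


def encode_pileup(reads, window_start, width, max_reads):
    """Encode reads as a max_reads x width grid of (base, quality, strand) cells.

    reads: list of (start_position, sequence, qualities, is_reverse) tuples,
    assumed to be gap-free alignments on the same reference coordinates.
    """
    image = [[(0, 0, 0)] * width for _ in range(max_reads)]
    for row, (start, seq, quals, is_reverse) in enumerate(reads[:max_reads]):
        for offset, (base, qual) in enumerate(zip(seq, quals)):
            col = start + offset - window_start
            if 0 <= col < width:
                strand = 2 if is_reverse else 1
                image[row][col] = (BASE_CODE.get(base, 0), qual, strand)
    return image


# Two toy reads overlapping a 5-column window that starts at position 101.
reads = [
    (100, "ACGT", [30, 30, 20, 40], False),
    (102, "GTTA", [10, 35, 35, 35], True),
]
image = encode_pileup(reads, window_start=101, width=5, max_reads=3)
for row in image:
    print(row)
```

Each row is one read and each column one reference position; empty cells stay at `(0, 0, 0)`. A network would then be trained to map grids like this to genotype labels.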

July 7, 2019

Spalter: A meta machine learning approach to distinguish true DNA variants from sequencing artefacts

Being able to distinguish between true DNA variants and technical sequencing artefacts is a fundamental task in whole genome, exome or targeted gene analysis. Variant calling tools provide diagnostic parameters, such as strand bias or an aggregated overall quality for each called variant, to help users make an informed choice about which variants to accept or discard. Having several such quality indicators poses a problem for the users of variant callers because they need to set or adjust thresholds for each such indicator. Alternatively, machine learning methods can be used to train a classifier based on these indicators. This approach needs large sets of labeled training data, which is not easily available. The new approach presented here relies on the idea that a true DNA variant exists independently of technical features of the read in which it appears (e.g. base quality, strand, position in the read). Therefore the nucleotide separability classification problem – predicting the nucleotide state of each read in a given pileup based on technical features only – should be near impossible to solve for true variants. Nucleotide separability, i.e. achievable classification accuracy, can either be used to distinguish between true variants and technical artefacts directly, using a thresholding approach, or it can be used as a meta-feature to train a separability-based classifier. This article explores both possibilities with promising results, showing accuracies around 90%.
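The separability idea can be sketched in a few lines. The example below is an illustration under stated assumptions, not the authors' implementation: it uses a leave-one-out 1-nearest-neighbour classifier as the stand-in model, three technical features per read (base quality, strand, position in read), and simulated pileups. The function names and the simulated value ranges are hypothetical.

```python
import random


def loo_nn_accuracy(features, labels):
    """Leave-one-out accuracy of a 1-nearest-neighbour classifier.

    A high value means the allele carried by a read can be predicted from
    technical features alone, which is the signature of an artefact.
    """
    n = len(features)
    # Min-max scale each feature so that no single one dominates the distance.
    cols = list(zip(*features))
    low = [min(c) for c in cols]
    span = [(max(c) - min(c)) or 1.0 for c in cols]
    scaled = [[(v - l) / s for v, l, s in zip(row, low, span)] for row in features]
    correct = 0
    for i in range(n):
        nearest = min(
            (j for j in range(n) if j != i),
            key=lambda j: sum((a - b) ** 2 for a, b in zip(scaled[i], scaled[j])),
        )
        correct += labels[nearest] == labels[i]
    return correct / n


def make_pileup(rng, n_ref, n_alt, artefact):
    """Simulate (quality, strand, position) per read; label 1 marks the alt allele."""
    features, labels = [], []
    for read in range(n_ref + n_alt):
        is_alt = read >= n_ref
        if artefact and is_alt:
            # Artefact: alt bases only at low quality, on one strand, near read ends.
            row = (rng.uniform(5, 12), 1, rng.uniform(95, 100))
        else:
            # Otherwise technical features are drawn independently of the allele.
            row = (rng.uniform(30, 40), rng.randint(0, 1), rng.uniform(0, 90))
        features.append(row)
        labels.append(int(is_alt))
    return features, labels


rng = random.Random(0)
artefact_acc = loo_nn_accuracy(*make_pileup(rng, 60, 40, artefact=True))
variant_acc = loo_nn_accuracy(*make_pileup(rng, 60, 40, artefact=False))
print(f"artefact separability: {artefact_acc:.2f}")
print(f"true-variant separability: {variant_acc:.2f}")
```

In this simulation the artefact pileup is almost perfectly separable, while the true variant stays near chance. The score could then be thresholded directly or passed on as a meta-feature to a second classifier, as the abstract describes.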

July 7, 2019

STRetch: detecting and discovering pathogenic short tandem repeat expansions.

Short tandem repeat (STR) expansions have been identified as the causal DNA mutation in dozens of Mendelian diseases. Most existing tools for detecting STR variation with short reads do so within the read length and so are unable to detect the majority of pathogenic expansions. Here we present STRetch, a new genome-wide method to scan for STR expansions at all loci across the human genome. We demonstrate the use of STRetch for detecting STR expansions using short-read whole-genome sequencing data at known pathogenic loci as well as novel STR loci. STRetch is open source software, available from github.com/Oshlack/STRetch.

July 7, 2019

MOB-suite: software tools for clustering, reconstruction and typing of plasmids from draft assemblies.

Large-scale bacterial population genetics studies are now routine due to cost-effective Illumina short-read sequencing. However, analysing plasmid content remains difficult due to incomplete assembly of plasmids. Bacterial isolates can contain any number of plasmids and assembly remains complicated due to the presence of repetitive elements. Numerous tools have been developed to analyse plasmids but the performance and functionality of the tools are variable. The MOB-suite was developed as a set of modular tools for reconstruction and typing of plasmids from draft assembly data to facilitate characterization of plasmids. Using a set of closed genomes with publicly available Illumina data, the MOB-suite identified contigs of plasmid origin with both high sensitivity and specificity (95 and 88%, respectively). In comparison, plasmidfinder demonstrated high specificity (99%) but limited sensitivity (50%). Using the same dataset of 377 known plasmids, MOB-recon accurately reconstructed 207 plasmids so that they were assigned to a single grouping without other plasmid or chromosomal sequences, whereas plasmidSPAdes was only able to accurately reconstruct 102 plasmids. In general, plasmidSPAdes has a tendency to merge different plasmids together, with 208 plasmids undergoing merge events. The MOB-suite reduces the number of errors but produces more hybrid plasmids, with 84 plasmids undergoing both splits and merges. The MOB-suite also provides replicon typing similar to plasmidfinder but with the inclusion of relaxase typing and prediction of conjugation potential. The MOB-suite is written in Python 3 and is available from https://github.com/phac-nml/mob-suite.
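For readers who want to relate the reported figures to one another, the short sketch below shows how sensitivity and specificity are computed from contig-level counts and converts the reconstruction counts from the abstract into fractions of the 377 known plasmids. The contig counts passed to `sensitivity` and `specificity` are hypothetical, chosen only to reproduce the reported 95% and 88%; the abstract does not give the underlying confusion matrix.

```python
def sensitivity(true_pos, false_neg):
    """Fraction of true plasmid contigs that were called plasmid."""
    return true_pos / (true_pos + false_neg)


def specificity(true_neg, false_pos):
    """Fraction of chromosomal contigs that were correctly not called plasmid."""
    return true_neg / (true_neg + false_pos)


# Hypothetical counts per 100 plasmid and 100 chromosomal contigs.
print(f"sensitivity: {sensitivity(95, 5):.2f}")   # 0.95
print(f"specificity: {specificity(88, 12):.2f}")  # 0.88

# Reconstruction counts taken from the abstract (377 known plasmids in total).
TOTAL_PLASMIDS = 377
reconstructed = {"MOB-recon": 207, "plasmidSPAdes": 102}
percent = {
    tool: round(100 * count / TOTAL_PLASMIDS, 1)
    for tool, count in reconstructed.items()
}
print(percent)  # {'MOB-recon': 54.9, 'plasmidSPAdes': 27.1}
```

On these numbers, MOB-recon cleanly reconstructed roughly twice as many plasmids as plasmidSPAdes (about 55% versus 27% of the set).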

July 7, 2019

Genome resequencing and analysis of d-lactic acid fermentation ability of Leuconostoc mesenteroides subsp. mesenteroides ATCC 8293

Genome resequencing of D-lactic acid-producing Leuconostoc mesenteroides ATCC 8293 revealed 28 base errors in the version published in 2017. Based on the revised genome annotation, four genes encoding putative D-lactate dehydrogenases were identified. The transcriptional expression of each gene was analyzed at different growth phases and the functionality of each gene was studied in Escherichia coli. Bioreactor studies indicated that L. mesenteroides ATCC 8293 produced D-lactic acid and ethanol at a ratio of 1.7:1 (g/g) regardless of the glucose concentration.

July 7, 2019

Traditional Norwegian kveik are a genetically distinct group of domesticated Saccharomyces cerevisiae brewing yeasts.

The widespread production of fermented food and beverages has resulted in the domestication of Saccharomyces cerevisiae yeasts specifically adapted to beer production. While there is evidence beer yeast domestication was accelerated by industrialization of beer, there also exists a farmhouse brewing culture in western Norway which has passed down yeasts referred to as kveik for generations. This practice has resulted in ale yeasts which are typically highly flocculant, phenolic off flavor negative (POF-), and exhibit a high rate of fermentation, similar to previously characterized lineages of domesticated yeast. Additionally, kveik yeasts are reportedly high-temperature tolerant, likely due to the traditional practice of pitching yeast into warm (>28°C) wort. Here, we characterize kveik yeasts from 9 different Norwegian sources via PCR fingerprinting, whole genome sequencing of selected strains, phenotypic screens, and lab-scale fermentations. Phylogenetic analysis suggests that kveik yeasts form a distinct group among beer yeasts. Additionally, we identify a novel POF- loss-of-function mutation, as well as SNPs and CNVs potentially relevant to the thermotolerance, high ethanol tolerance, and high fermentation rate phenotypes of kveik strains. We also identify domestication markers related to flocculation in kveik. Taken together, the results suggest that Norwegian kveik yeasts are a genetically distinct group of domesticated beer yeasts with properties highly relevant to the brewing sector.

July 7, 2019

Annotated draft genome sequence of the apple scab pathogen Venturia inaequalis

Apple scab is one of the most economically important diseases of apples worldwide. The disease is caused by the haploid ascomycete Venturia inaequalis. We present here an annotated V. inaequalis whole-genome sequence of 72 Mb, assembled into 238 contigs, with 13,761 predicted genes.
