Read length Archives - Page 17 of 29

September 22, 2019

Construction and characterization of bacterial artificial chromosomes harboring the full-length genome of a highly attenuated vaccinia virus LC16m8.

LC16m8 (m8), a highly attenuated vaccinia virus (VAC) strain, was developed as a smallpox vaccine, and its safety and immunogenicity have been confirmed. Here, we aimed to develop a system that recovers infectious m8 from a bacterial artificial chromosome (BAC) that retains the full-length viral genomic DNA (m8-BAC system). The infectious virus was successfully recovered from a VAC-BAC plasmid, named pLC16m8-BAC. Furthermore, the bacterial replicon-free virus was generated by intramolecular homologous recombination and was successfully recovered from a modified VAC-BAC plasmid, named pLC16m8.8S-BAC. Also, the growth of the recovered virus was indistinguishable from that of authentic m8. The full genome sequence of the plasmid, which harbors inverted terminal repeats (ITRs) identical to those of authentic m8, was determined by long-read next-generation sequencing (NGS). The ITRs contain 18 to 32 copies of the 70-bp and 30 to 45 copies of the 54-bp tandem repeats, and the number of tandem repeats differed between the left and right ITRs. Since the virus recovered from pLC16m8.8S-BAC was expected to retain a viral genome identical to that of m8, including the ITRs, a reference-based alignment following short-read NGS was performed to validate the sequence of the recovered virus. Based on the pattern of coverage depth in the ITRs, no remarkable differences were observed between the virus and m8, and the rest of the genome was confirmed to be identical as well. In summary, this new system can recover a virus that is genotypically and phenotypically indistinguishable from authentic m8.

September 22, 2019

Analyzing AbrB-knockout effects through genome and transcriptome sequencing of Bacillus licheniformis DW2.

As an industrial bacterium, Bacillus licheniformis DW2 produces bacitracin, which is an important antibiotic against many pathogenic microorganisms. Our previous study showed that AbrB knockout could significantly increase the production of bacitracin. Accordingly, it was meaningful to understand its genome features, the expression differences between the wild-type and AbrB-knockout (ΔAbrB) strains, and the regulation of bacitracin biosynthesis. Here, we sequenced, de novo assembled and annotated its genome, and also sequenced the transcriptomes in three growth phases. The genome of DW2 contained a DNA molecule of 4,468,952 bp with 45.93% GC content and 4,717 protein-coding genes. The transcriptome reads were mapped to the assembled genome, yielding 4,102 to 4,536 expressed genes in the different samples. We investigated transcription changes in B. licheniformis DW2 and showed that ΔAbrB caused up-regulation and down-regulation of hundreds of genes in different growth phases. We identified a complete bacitracin synthetase gene cluster, including the location and length of bacABC, bcrABC, and bacT, as well as their arrangement. The gene cluster bcrABC was significantly up-regulated in the ΔAbrB strain, which supported the hypothesis from a previous study that bcrABC transports bacitracin out of the cell to avoid self-intoxication, and was consistent with the previous experimental result that ΔAbrB could yield more bacitracin. This study provided a high-quality reference genome for B. licheniformis DW2, and the transcriptome data, which depicted global alterations across two strains and three phases, offered an understanding of AbrB regulation and bacitracin biosynthesis through gene expression.

September 22, 2019

Simulating the dynamics of targeted capture sequencing with CapSim.

Targeted sequencing using capture probes has become increasingly popular in clinical applications due to its scalability and cost-effectiveness. The approach also allows for higher sequencing coverage of the targeted regions, resulting in better statistical power for analysis. However, because of the dynamics of the hybridization process, it is difficult to evaluate the efficiency of the probe design prior to the experiments, which are time consuming and costly. We developed CapSim, a software package for simulation of targeted sequencing. Given a genome sequence and a set of probes, CapSim simulates the fragmentation, the dynamics of probe hybridization and the sequencing of the captured fragments on Illumina and PacBio sequencing platforms. The simulated data can be used for evaluating the performance of the analysis pipeline, as well as the efficiency of the probe design. Parameters of the various stages in the sequencing process can also be evaluated in order to optimize the experiments. CapSim is publicly available under BSD license at https://github.com/Devika1/capsim. Contact: l.coin@imb.uq.edu.au. Supplementary data are available at Bioinformatics online. © The Author (2017). Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions@oup.com
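The fragmentation and capture stages described above can be pictured with a toy simulation. The sketch below is not CapSim's model and does not reproduce its hybridization dynamics: the function names, the exponential fragment-length distribution, and the rule that capture probability scales with the fraction of a probe covered are all assumptions made for illustration only.

```python
import random


def fragment(genome_length, mean_fragment, rng):
    """Tile [0, genome_length) with consecutive fragments whose lengths
    are drawn from an exponential distribution (an assumed model)."""
    fragments = []
    start = 0
    while start < genome_length:
        size = max(1, int(rng.expovariate(1.0 / mean_fragment)))
        end = min(genome_length, start + size)
        fragments.append((start, end))
        start = end
    return fragments


def overlap(a, b):
    """Number of bases shared by two half-open intervals."""
    return max(0, min(a[1], b[1]) - max(a[0], b[0]))


def capture(fragments, probes, rng, efficiency=0.8):
    """Keep a fragment with probability efficiency * (best fraction of any
    probe that the fragment covers). Hypothetical rule, for illustration."""
    captured = []
    for frag in fragments:
        best = max(overlap(frag, p) / (p[1] - p[0]) for p in probes)
        if rng.random() < efficiency * best:
            captured.append(frag)
    return captured


rng = random.Random(42)                  # fixed seed for repeatable runs
probes = [(1000, 1120), (5000, 5120)]    # hypothetical probe coordinates
frags = fragment(10_000, 300, rng)
kept = capture(frags, probes, rng)
print(len(frags), "fragments,", len(kept), "captured")
```

Even a crude model like this makes the design question concrete: changing the mean fragment length or the probe positions changes how many fragments are retained, which is the kind of effect a full simulator lets one explore before running an experiment.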

September 22, 2019

Assembly and analysis of a qingke reference genome demonstrate its close genetic relation to modern cultivated barley.

Qingke, the local name of hulless barley in the Tibetan Plateau, is a staple food for Tibetans. The availability of its reference genome sequences could be useful for studies on breeding and molecular evolution. Taking advantage of the third-generation sequencer (PacBio), we de novo assembled a 4.84-Gb genome sequence of qingke, cv. Zangqing320, and anchored a 4.59-Gb sequence to seven chromosomes. Of the 46,787 annotated ‘high-confidence’ genes, 31,564 were validated by RNA-sequencing data of 39 wild and cultivated barley genotypes with wide genetic diversity, and the results were also confirmed against the nonredundant protein database from NCBI. As some gaps in the reference genome of Morex were covered in the reference genome of Zangqing320 by PacBio reads, we believe that the Zangqing320 genome provides a useful supplement to the Morex genome. Using the qingke genome as a reference, we conducted a genome comparison, revealing a close genetic relationship between a hulled barley (cv. Morex) and a hulless barley (cv. Zangqing320), which is strongly supported by the low-diversity regions in the two genomes. Considering the origin of Morex from its breeding pedigree, we then demonstrated a close genomic relationship between modern cultivated barley and qingke. Given this genomic relationship and the large genetic diversity between qingke and modern cultivated barley, we propose that qingke could provide elite genes for barley improvement. © 2017 The Authors. Plant Biotechnology Journal published by Society for Experimental Biology and The Association of Applied Biologists and John Wiley & Sons Ltd.

September 22, 2019

Enterobacter bugandensis: a novel enterobacterial species associated with severe clinical infection.

Nosocomial pathogens can cause life-threatening infections in neonates and immunocompromised patients. E. bugandensis (EB-247) is a recently described species of Enterobacter, associated with neonatal sepsis. Here we demonstrate that the extended-spectrum β-lactamase (ESBL)-producing isolate EB-247 is highly virulent in both Galleria mellonella and mouse models of infection. Infection studies in a streptomycin-treated mouse model showed that EB-247 is as efficient as Salmonella Typhimurium in inducing systemic infection and release of proinflammatory cytokines. Sequencing and analysis of the complete genome and plasmid revealed that virulence properties are associated with the chromosome, while antibiotic-resistance genes are exclusively present on a 299 kb IncHI plasmid. EB-247 grew in high concentrations of human serum, indicating septicemic potential. Using whole genome-based transcriptome analysis, we found that 7% of the genome was mobilized for growth in serum. Upregulated genes include those involved in iron uptake and storage as well as metabolism. The lasso peptide microcin J25 (MccJ25), an inhibitor of iron uptake and RNA polymerase activity, inhibited EB-247 growth. Our studies indicate that Enterobacter bugandensis is a highly pathogenic species of the genus Enterobacter. Further studies on the colonization and virulence potential of E. bugandensis and its association with septicemic infection are now warranted.

September 22, 2019

Anisogamy evolved with a reduced sex-determining region in volvocine green algae

Male and female gametes differing in size (anisogamy) emerged independently from isogamous ancestors in various eukaryotic lineages, although the genetic bases of this emergence are still unknown. Volvocine green algae are a model lineage for investigating the transition from isogamy to anisogamy. Here we focus on two closely related volvocine genera that bracket this transition: isogamous Yamagishiella and anisogamous Eudorina. We generated de novo nuclear genome assemblies of both sexes of Yamagishiella and Eudorina to identify the dimorphic sex-determining chromosomal region or mating-type locus (MT) from each. In contrast to the large (>1 Mb) and complex MT of oogamous Volvox, Yamagishiella and Eudorina MT are smaller (7–268 kb) and simpler, with only two sex-limited genes: the minus/male-limited MID and the plus/female-limited FUS1. No prominently dimorphic gametologs were identified in either species. Thus, the first step to anisogamy in volvocine algae presumably occurred without an increase in MT size and complexity.

September 22, 2019

A validation approach of an end-to-end whole genome sequencing workflow for source tracking of Listeria monocytogenes and Salmonella enterica.

Whole genome sequencing (WGS), using high throughput sequencing technology, reveals the complete sequence of the bacterial genome in a few days. WGS is increasingly being used for source tracking, pathogen surveillance and outbreak investigation due to its high discriminatory power. In the food industry, WGS used for source tracking is beneficial to support contamination investigations. Despite its increased use, no standards or guidelines are available today for the use of WGS in outbreak and/or trace-back investigations. Here we present a validation of our complete (end-to-end) WGS workflow for Listeria monocytogenes and Salmonella enterica including: subculture of isolates, DNA extraction, sequencing and bioinformatics analysis. This end-to-end WGS workflow was evaluated according to the following performance criteria: stability, repeatability, reproducibility, discriminatory power, and epidemiological concordance. The current study showed that few single nucleotide polymorphisms (SNPs) were observed for L. monocytogenes and S. enterica when comparing genome sequences from five independent colonies from the first subculture and five independent colonies after the tenth subculture. Consequently, the stability of the WGS workflow for L. monocytogenes and S. enterica was demonstrated despite the few genomic variations that can occur during subculturing steps. Repeatability and reproducibility were also demonstrated. The WGS workflow was shown to have a high discriminatory power and the ability to show genetic relatedness. Additionally, the WGS workflow was able to reproduce published outbreak investigation results, illustrating its capability of showing epidemiological concordance. The current study proposes a validation approach comprising all steps of a WGS workflow and demonstrates that the workflow can be applied to L. monocytogenes or S. enterica.

September 22, 2019

Targeted long-read sequencing of a locus under long-term balancing selection in Capsella.

Rapid advances in short-read DNA sequencing technologies have revolutionized population genomic studies, but there are genomic regions where this technology reaches its limits. Limitations mostly arise due to the difficulties in assembly or alignment to genomic regions of high sequence divergence and high repeat content, which are typical characteristics for loci under strong long-term balancing selection. Studying genetic diversity at such loci therefore remains challenging. Here, we investigate the feasibility and error rates associated with targeted long-read sequencing of a locus under balancing selection. For this purpose, we generated bacterial artificial chromosomes (BACs) containing the Brassicaceae S-locus, a region under strong negative frequency-dependent selection which has previously proven difficult to assemble in its entirety using short reads. We sequence S-locus BACs with single-molecule long-read sequencing technology and conduct de novo assembly of these S-locus haplotypes. By comparing repeated assemblies resulting from independent long-read sequencing runs on the same BAC clone we do not detect any structural errors, suggesting that reliable assemblies are generated, but we estimate an indel error rate of 5.7×10⁻⁵. A similar error rate was estimated based on comparison of Illumina short-read sequences and BAC assemblies. Our results show that, until de novo assembly of multiple individuals using long-read sequencing becomes feasible, targeted long-read sequencing of loci under balancing selection is a viable option with low error rates for single nucleotide polymorphisms or structural variation. We further find that short-read sequencing is a valuable complement, allowing correction of the relatively high rate of indel errors that result from this approach. Copyright © 2018 Bachmann et al.

September 22, 2019

Whole genome sequencing of greater amberjack (Seriola dumerili) for SNP identification on aligned scaffolds and genome structural variation analysis using parallel resequencing

Greater amberjack (Seriola dumerili) is distributed in tropical and temperate waters worldwide and is an important aquaculture fish. We carried out de novo sequencing of the greater amberjack genome to construct a reference genome sequence to identify single nucleotide polymorphisms (SNPs) for breeding amberjack by marker-assisted or gene-assisted selection as well as to identify functional genes for biological traits. We obtained 200× coverage and constructed a high-quality genome assembly using next generation sequencing technology. The assembled sequences were aligned onto a yellowtail (Seriola quinqueradiata) radiation hybrid (RH) physical map by sequence homology. A total of 215 of the longest amberjack sequences, with a total length of 622.8 Mbp (92% of the total length of the genome scaffolds), were lined up on the yellowtail RH map. We resequenced the whole genomes of 20 greater amberjacks and mapped the resulting sequences onto the reference genome sequence. About 186,000 nonredundant SNPs were successfully ordered on the reference genome. Further, we found differences in the genome structural variations between two greater amberjack populations using BreakDancer. We also analyzed the greater amberjack transcriptome and mapped the annotated sequences onto the reference genome sequence.

September 22, 2019

Reproducible integration of multiple sequencing datasets to form high-confidence SNP, indel, and reference calls for five human genome reference materials

Benchmark small variant calls from the Genome in a Bottle Consortium (GIAB) for the CEPH/HapMap genome NA12878 (HG001) have been used extensively for developing, optimizing, and demonstrating performance of sequencing and bioinformatics methods. Here, we develop a reproducible, cloud-based pipeline to integrate multiple sequencing datasets and form benchmark calls, enabling application to arbitrary human genomes. We use these reproducible methods to form high-confidence calls with respect to GRCh37 and GRCh38 for HG001 and 4 additional broadly-consented genomes from the Personal Genome Project that are available as NIST Reference Materials. These new genomes’ broad, open consent with few restrictions on availability of samples and data is enabling a uniquely diverse array of applications. Our new methods produce 17% more high-confidence SNPs, 176% more indels, and 12% larger regions than our previously published calls. To demonstrate that these calls can be used for accurate benchmarking, we compare other high-quality callsets to ours (e.g., Illumina Platinum Genomes), and we demonstrate that the majority of discordant calls are errors in the other callsets. We also highlight challenges in interpreting performance metrics when benchmarking against imperfect high-confidence calls. We show that benchmarking tools from the Global Alliance for Genomics and Health can be used with our calls to stratify performance metrics by variant type and genome context and elucidate strengths and weaknesses of a method.

September 22, 2019

Long-read genome sequence and assembly of Leptopilina boulardi: a specialist Drosophila parasitoid

Background: Leptopilina boulardi is a specialist parasitoid belonging to the order Hymenoptera, which attacks the larval stages of Drosophila. The Leptopilina genus has enormous value in the biological control of pests as well as in understanding several aspects of host-parasitoid biology. However, no member of the family Figitidae has had its genome sequenced. In order to improve the understanding of the parasitoid wasps by generating genomic resources, we sequenced the whole genome of L. boulardi. Findings: Here, we report a high-quality genome of L. boulardi, assembled from 70 Gb of Illumina reads and 10.5 Gb of PacBio reads, forming a total coverage of 230X. The 375 Mb draft genome has an N50 of 275 Kb with 6315 scaffolds >500 bp, and encompasses >95% complete BUSCOs. The GC content of the genome is 28.26%, and RepeatMasker identified 868105 repeat elements covering 43.9% of the assembly. A total of 25259 protein-coding genes were predicted using a combination of ab initio and RNA-Seq based methods, with an average gene size of 3.9 Kb. Of the predicted genes, 78.11% could be annotated with at least one function. Conclusion: Our study provides a highly reliable assembly of this parasitoid wasp, which will be a valuable resource to researchers studying parasitoids. In particular, it can help delineate the host-parasitoid mechanisms that are part of the Drosophila-Leptopilina model system.
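For readers unfamiliar with the N50 statistic quoted above, it is the length of the shortest scaffold in the smallest set of longest scaffolds that together cover at least half of the assembly. A minimal sketch follows; the function name and the scaffold lengths are made up for illustration and are not taken from the L. boulardi assembly.

```python
def n50(lengths):
    """Return the N50 of a collection of scaffold or contig lengths."""
    if not lengths:
        raise ValueError("need at least one length")
    ordered = sorted(lengths, reverse=True)   # longest first
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:             # half the assembly is covered
            return length


# Hypothetical scaffold lengths in Kb (total 2000, so the halfway mark is 1000).
scaffold_lengths = [600, 500, 400, 300, 200]
print(n50(scaffold_lengths))  # 500
```

Because N50 depends only on the length distribution, it describes contiguity and says nothing about correctness, which is why assemblies are usually reported together with a completeness measure such as BUSCO.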

September 22, 2019

The genome sequence of a new strain of Mycobacterium ulcerans ecovar Liflandii, emerging as a sturgeon pathogen

Mycobacterium ulcerans ecovar Liflandii (MuLiflandii) is emerging as a non-tuberculous mycobacterial pathogen in amphibians. Here, we make the first report on the prevalence of a new strain of MuLiflandii infection in Chinese sturgeon. All the diseased fish showed the classic clinical symptoms of ascites and/or muscle ulceration. A new slow-growing and acid-fast bacillus, strain ASM001, was obtained from the ascites of infected fish; this strain demonstrated pathogenicity when tested in hybrid sturgeon. The complete genome sequence of MuLiflandii ASM001 is a circular chromosome of 6,167,296 bp, with a G + C content of 65.57%, containing 4518 predicted coding DNA sequences and 999 pseudogenes, 3 rRNA operons, and 47 transfer RNA sequences. In addition, we found 245 copies of IS2404, 34 microsatellites, and 36 CRISPR sequences in the whole MuLiflandii ASM001 genome. Among the predicted genes of MuLiflandii ASM001, we found orthologs of 203 virulence factors of clinical MuLiflandii 128FXT operating in host cell invasion, modulation of phagocyte function, and survival inside the macrophages. These virulence factor candidates provide a key basis for understanding their pathogenic mechanisms at the molecular level. A comparative analysis that used complete, existing genomes showed that MuLiflandii ASM001 has high synteny with MuLiflandii 128FXT. We anticipate the availability of the complete MuLiflandii ASM001 genome sequence will provide a valuable resource for comparative genomic studies of MuLiflandii isolates, as well as provide new insights into the host, ecological, and functional diversity of the genus Mycobacterium.

September 22, 2019

Ploidy variation in Kluyveromyces marxianus separates dairy and non-dairy isolates.

Kluyveromyces marxianus is traditionally associated with fermented dairy products, but can also be isolated from diverse non-dairy environments. Because of thermotolerance, rapid growth and other traits, many different strains are being developed for food and industrial applications but there is, as yet, little understanding of the genetic diversity or population genetics of this species. K. marxianus shows a high level of phenotypic variation but the only phenotype that has been clearly linked to a genetic polymorphism is lactose utilisation, which is controlled by variation in the LAC12 gene. The genomes of several strains have been sequenced in recent years and, in this study, we sequenced a further nine strains from different origins. Analysis of the single nucleotide polymorphisms (SNPs) in 14 strains was carried out to examine genome structure and genetic diversity. SNP diversity in K. marxianus is relatively high, with up to 3% DNA sequence divergence between alleles. It was found that the isolates include haploid, diploid, and triploid strains, as shown by both SNP analysis and flow cytometry. Diploids and triploids contain long genomic tracts showing loss of heterozygosity (LOH). All six isolates from dairy environments were diploid or triploid, whereas 6 out of 7 isolates from non-dairy environments were haploid. This also correlated with the presence of functional LAC12 alleles only in dairy haplotypes. The diploids were hybrids between a non-dairy and a dairy haplotype, whereas triploids included three copies of a dairy haplotype.

September 22, 2019

CliqueSNV: Scalable reconstruction of intra-host viral populations from NGS reads

Highly mutable RNA viruses such as influenza A virus, human immunodeficiency virus and hepatitis C virus exist in infected hosts as highly heterogeneous populations of closely related genomic variants. The presence of low-frequency variants with few mutations with respect to major strains may result in immune escape, emergence of drug resistance, and an increase in virulence and infectivity. Next-generation sequencing technologies permit sampling of the intra-host viral population at great depth, thus providing an opportunity to access low-frequency variants. Long read lengths offered by single-molecule sequencing technologies allow all viral variants to be sequenced in a single pass. However, high sequencing error rates limit the ability to study heterogeneous viral populations composed of rare, closely related variants. In this article, we present CliqueSNV, a novel reference-based method for reconstruction of viral variants from NGS data. It efficiently constructs an allele graph based on linkage between single nucleotide variations and identifies true viral variants by merging cliques of that graph using combinatorial optimization techniques. The new method outperforms existing methods in both accuracy and running time on experimental and simulated NGS data for titrated levels of known viral variants. For PacBio reads, it accurately reconstructs variants with frequency as low as 0.1%. For Illumina reads, it fully reconstructs main variants. The open source implementation of CliqueSNV is freely available for download at https://github.com/vyacheslav-tsivina/CliqueSNV
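The idea of linking single nucleotide variations in a graph and reading variants off its cliques can be illustrated with a toy example. This sketch is not the CliqueSNV implementation: it assumes pre-aligned reads of the same length as the reference with no indels, uses a simple co-occurrence count (the hypothetical `min_support` parameter) in place of a statistical linkage test, enumerates maximal cliques with plain Bron-Kerbosch, and omits the clique-merging step.

```python
from collections import Counter
from itertools import combinations


def linked_snv_graph(reads, reference, min_support=2):
    """Link two non-reference alleles when they co-occur in at least
    `min_support` reads. Nodes are (position, base) tuples."""
    pair_counts = Counter()
    for read in reads:
        snvs = sorted((pos, base) for pos, base in enumerate(read)
                      if base != reference[pos] and base != "N")
        for a, b in combinations(snvs, 2):
            pair_counts[(a, b)] += 1
    graph = {}
    for (a, b), count in pair_counts.items():
        if count >= min_support:
            graph.setdefault(a, set()).add(b)
            graph.setdefault(b, set()).add(a)
    return graph


def maximal_cliques(graph):
    """Enumerate maximal cliques with the basic Bron-Kerbosch recursion."""
    cliques = []

    def expand(current, candidates, excluded):
        if not candidates and not excluded:
            cliques.append(sorted(current))
            return
        for node in list(candidates):
            expand(current | {node},
                   candidates & graph[node],
                   excluded & graph[node])
            candidates = candidates - {node}
            excluded = excluded | {node}

    expand(set(), set(graph), set())
    return cliques


reference = "ACGTACGT"
reads = [
    "ATGTAGGT", "ATGTAGGT",   # variant carrying C1T and C5G together
    "ACATACTT", "ACATACTT",   # variant carrying G2A and G6T together
    "ACGTACGT",               # reference-like read
    "ATGTACGT",               # lone C1T, forms no linked pair
]
for clique in sorted(maximal_cliques(linked_snv_graph(reads, reference))):
    print(clique)
```

Each printed clique is a set of mutually linked alleles, a candidate haplotype signature. Isolated sequencing errors rarely co-occur repeatedly with the same partner, so they tend not to form edges, which is the intuition behind using linkage to separate true low-frequency variants from noise.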

September 22, 2019

Genome analysis of Fimbriiglobus ruber SP5T, a planctomycete with confirmed chitinolytic capability.

Members of the bacterial order Planctomycetales have often been observed in associations with Crustacea. The ability to degrade chitin, however, has never been reported for any of the cultured planctomycetes, although utilization of N-acetylglucosamine (GlcNAc) as a sole carbon and nitrogen source is well recognized for these bacteria. Here, we demonstrate the chitinolytic capability of a member of the family Gemmataceae, Fimbriiglobus ruber SP5T, which was isolated from a peat bog. As revealed by metatranscriptomic analysis of chitin-amended peat, the pool of 16S rRNA reads from F. ruber increased in response to chitin availability. Strain SP5T displayed only weak growth on amorphous chitin as a sole source of carbon but grew well with chitin as a source of nitrogen. The genome of F. ruber SP5T is 12.364 Mb in size and is the largest among all currently determined planctomycete genomes. It encodes several enzymes putatively involved in chitin degradation, including two chitinases affiliated with the glycoside hydrolase (GH) family GH18, a GH20 family β-N-acetylglucosaminidase, and the complete set of enzymes required for utilization of GlcNAc. The gene encoding one of the predicted chitinases was expressed in Escherichia coli, and the endochitinase activity of the recombinant enzyme was confirmed. The genome also contains genes required for the assembly of type IV pili, which may be used to adhere to chitin and possibly other biopolymers. The ability to use chitin as a source of nitrogen is of special importance for planctomycetes that inhabit N-depleted ombrotrophic wetlands.

IMPORTANCE Planctomycetes represent an important part of the microbial community in Sphagnum-dominated peatlands, but their potential functions in these ecosystems remain poorly understood. This study reports the presence of chitinolytic potential in one of the recently described peat-inhabiting members of the family Gemmataceae, Fimbriiglobus ruber SP5T. This planctomycete uses chitin, a major constituent of fungal cell walls and exoskeletons of peat-inhabiting arthropods, as a source of nitrogen in N-depleted ombrotrophic Sphagnum-dominated peatlands. This study reports the chitin-degrading capability of representatives of the order Planctomycetales. Copyright © 2018 American Society for Microbiology.

