Pacbio reads Archives - Page 49 of 53

July 7, 2019

Complete genome sequence of the sesame pathogen Ralstonia solanacearum strain SEPPX 05.

Ralstonia solanacearum is a soil-borne phytopathogen associated with bacterial wilt disease of sesame. R. solanacearum is the predominant agent causing damping-off from tropical to temperate regions. Because bacterial wilt has decreased the sesame industry yield, we sequenced the SEPPX05 genome using PacBio and Illumina HiSeq 2500 systems and revealed that R. solanacearum strain SEPPX05 carries a bipartite genome consisting of a 3,930,849 bp chromosome and a 2,066,085 bp megaplasmid with 66.84% G+C content that harbors 5,427 coding sequences. Based on the whole genome, phylogenetic analysis showed that strain SEPPX05 is grouped with two phylotype I strains (EP1 and GMI1000). Pan-genomic analysis shows that R. solanacearum is a complex species with high biological diversity and was able to colonize various environments during evolution. Despite deletions, insertions, and inversions, most genes of strain SEPPX05 have relatively high levels of synteny compared with strain GMI1000. We identified 104 genes involved in virulence-related factors in the SEPPX05 genome and eight absent genes encoding T3Es of GMI1000. Comparing SEPPX05 with other species, we found highly conserved secretion systems central to modulating interactions of host bacteria. These data may provide important clues for understanding underlying pathogenic mechanisms of R. solanacearum and help in the control of sesame bacterial wilt.

July 7, 2019

Complete genome sequence of multiple-antibiotic-resistant Streptococcus parauberis strain SPOF3K, isolated from diseased olive flounder (Paralichthys olivaceus).

Here, we report the complete genome sequence of multiple-antibiotic-resistant Streptococcus parauberis strain SPOF3K, isolated from the kidney of a diseased olive flounder in South Korea in 2013. Sequencing using a PacBio platform yielded a circular chromosome of 2,128,740 bp and a plasmid of 23,538 bp, harboring 2,123 and 24 protein-coding genes, respectively. Copyright © 2018 Lee et al.

July 7, 2019

A fast approximate algorithm for mapping long reads to large reference databases.

Emerging single-molecule sequencing technologies from Pacific Biosciences and Oxford Nanopore have revived interest in long-read mapping algorithms. Alignment-based seed-and-extend methods demonstrate good accuracy, but face limited scalability, while faster alignment-free methods typically trade decreased precision for efficiency. In this article, we combine a fast approximate read mapping algorithm based on minimizers with a novel MinHash identity estimation technique to achieve both scalability and precision. In contrast to prior methods, we develop a mathematical framework that defines the types of mapping targets we uncover, establish probabilistic estimates of p-value and sensitivity, and demonstrate tolerance for alignment error rates up to 20%. With this framework, our algorithm automatically adapts to different minimum length and identity requirements and provides both positional and identity estimates for each mapping reported. For mapping human PacBio reads to the hg38 reference, our method is 290× faster than Burrows-Wheeler Aligner-MEM with a lower memory footprint and recall rate of 96%. We further demonstrate the scalability of our method by mapping noisy PacBio reads (each ≥5 kbp in length) to the complete NCBI RefSeq database containing 838 Gbp of sequence and >60,000 genomes.
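The general idea of pairing minimizer sampling with a MinHash-style identity estimate can be sketched as follows. This is an illustrative sketch only, not the authors' implementation: the function names, the CRC32 hash, the parameter defaults, and the use of the Mash distance formula to turn a Jaccard estimate into an identity estimate are all assumptions made for this example.

```python
import math
import zlib


def kmer_hash(kmer: str) -> int:
    # Deterministic hash (assumption: CRC32 is good enough for a toy example).
    return zlib.crc32(kmer.encode("ascii"))


def minimizers(seq: str, k: int = 5, w: int = 4) -> set:
    """Return the set of minimizer hashes: the smallest k-mer hash in each
    window of w consecutive k-mers."""
    hashes = [kmer_hash(seq[i:i + k]) for i in range(len(seq) - k + 1)]
    if not hashes:
        return set()
    # Sequences with fewer than w k-mers are treated as a single window.
    n_windows = max(len(hashes) - w + 1, 1)
    return {min(hashes[start:start + w]) for start in range(n_windows)}


def minhash_jaccard(a: set, b: set, s: int = 64) -> float:
    """Bottom-s MinHash estimate of the Jaccard similarity of two hash sets:
    take the s smallest hashes of the union and count those present in both."""
    sketch = sorted(a | b)[:s]
    if not sketch:
        return 0.0
    shared = sum(1 for h in sketch if h in a and h in b)
    return shared / len(sketch)


def estimated_identity(jaccard: float, k: int) -> float:
    """Convert a Jaccard estimate to an identity estimate using the Mash
    distance formula (an assumption for this sketch)."""
    if jaccard <= 0.0:
        return 0.0
    mash_distance = -math.log(2.0 * jaccard / (1.0 + jaccard)) / k
    return max(0.0, 1.0 - mash_distance)


if __name__ == "__main__":
    read = "ACGTTGCAAGGCTCATACGGATTACA"
    region = "ACGTTGCAAGGCTCTTACGGATTACA"  # one substitution relative to read
    j = minhash_jaccard(minimizers(read), minimizers(region))
    print(f"Jaccard estimate: {j:.3f}, identity estimate: {estimated_identity(j, 5):.3f}")
```

Because only minimizers are compared, the work per read grows with the number of sampled k-mers rather than with full alignment cost, which is the kind of trade-off the abstract describes.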

July 7, 2019

Tigmint: correcting assembly errors using linked reads from large molecules.

Genome sequencing yields the sequence of many short snippets of DNA (reads) from a genome. Genome assembly attempts to reconstruct the original genome from which these reads were derived. This task is difficult due to gaps and errors in the sequencing data, repetitive sequence in the underlying genome, and heterozygosity. As a result, assembly errors are common. In the absence of a reference genome, these misassemblies may be identified by comparing the sequencing data to the assembly and looking for discrepancies between the two. Once identified, these misassemblies may be corrected, improving the quality of the assembled sequence. Although tools exist to identify and correct misassemblies using Illumina paired-end and mate-pair sequencing, no such tool yet exists that makes use of the long distance information of the large molecules provided by linked reads, such as those offered by the 10x Genomics Chromium platform. We have developed the tool Tigmint to address this gap. To demonstrate the effectiveness of Tigmint, we applied it to assemblies of a human genome using short reads assembled with ABySS 2.0 and other assemblers. Tigmint reduced the number of misassemblies identified by QUAST in the ABySS assembly by 216 (27%). While scaffolding with ARCS alone more than doubled the scaffold NGA50 of the assembly from 3 to 8 Mbp, the combination of Tigmint and ARCS improved the scaffold NGA50 of the assembly over five-fold to 16.4 Mbp. This notable improvement in contiguity highlights the utility of assembly correction in refining assemblies. We demonstrate the utility of Tigmint in correcting the assemblies of multiple tools, as well as in using Chromium reads to correct and scaffold assemblies of long single-molecule sequencing. Scaffolding an assembly that has been corrected with Tigmint yields a final assembly that is both more correct and substantially more contiguous than an assembly that has not been corrected. Using single-molecule sequencing in combination with linked reads enables a genome sequence assembly that achieves both a high sequence contiguity as well as high scaffold contiguity, a feat not currently achievable with either technology alone.
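The contiguity figures quoted above are length statistics of the NG50 family. As a rough illustration of how such a statistic behaves, the sketch below computes a plain NG50 from a list of scaffold lengths. It is a simplified stand-in, not the NGA50 that QUAST reports (NGA50 is computed on aligned blocks after breaking at misassemblies), and the function name and toy lengths are assumptions for this example.

```python
def ng50(lengths, genome_size):
    """Return the largest length L such that sequences of length >= L
    together cover at least half of genome_size (0 if they never do)."""
    covered = 0
    for length in sorted(lengths, reverse=True):
        covered += length
        if covered * 2 >= genome_size:
            return length
    return 0


if __name__ == "__main__":
    genome_size = 15
    # Toy scaffold lengths in arbitrary units (hypothetical numbers).
    print(ng50([8, 4, 2, 1], genome_size))
    # Cutting the longest scaffold in two, as a misassembly break would.
    print(ng50([4, 4, 4, 2, 1], genome_size))
```

Breaking a scaffold at a misassembly lowers this statistic on its own, which is why the abstract pairs correction with a subsequent scaffolding step.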

July 7, 2019

Complete genome sequence of Sphingobium sp. strain YG1, a lignin model dimer-metabolizing bacterium isolated from sediment in Kagoshima Bay, Japan.

Sphingobium sp. strain YG1 is a lignin model dimer-metabolizing bacterium newly isolated from sediment in Kagoshima, Japan, at a depth of 102 m. Here, we report the complete genome nucleotide sequence of strain YG1. Copyright © 2018 Ohta et al.

July 7, 2019

Complete genome sequence of Lactobacillus plantarum subsp. plantarum strain LB1-2, isolated from the hindgut of European honeybees, Apis mellifera L., from the Philippines.

Lactobacillus plantarum subsp. plantarum strain LB1-2, isolated from the hindgut of European honeybees in the Philippines, is active against Paenibacillus larvae and has broad activity against several Gram-positive and Gram-negative bacteria. The complete genome sequence reported herein contains gene clusters for multiple bacteriocins and extensive gene inventories for carbohydrate metabolism. Copyright © 2018 Ilagan-Cruzada et al.

July 7, 2019

Genome sequences of five Mycobacterium bovis strains isolated from farmed animals and wildlife in Canada.

Mycobacterium bovis is the causative agent of bovine tuberculosis, an infectious disease that affects both animals and humans and thus presents a risk to public health and the livestock industry. Here, we report the genome sequences of five Mycobacterium bovis strains that represent major genotype clusters observed in farmed animals and wildlife in Canada. © Crown copyright 2018.

July 7, 2019

Complete genome sequence of Altererythrobacter sp. strain B11, an aromatic monomer-degrading bacterium, isolated from deep-sea sediment under the seabed off Kashima, Japan.

Altererythrobacter sp. strain B11 is an aromatic monomer-degrading bacterium newly isolated from sediment under the seabed off Kashima, Japan, at a depth of 2,100 m. Here, we report the complete nucleotide sequence of the genome of strain B11. Copyright © 2018 Maeda et al.

July 7, 2019

Complete genome sequence of the live attenuated vaccine strain Brucella melitensis Rev.1.

Live attenuated vaccines are essential elements in control programs for the prevention of brucellosis. Here, we report the whole-genome sequence of the original Elberg Brucella melitensis Rev.1 vaccine strain, passage 101 (1970). Commercial lines of the original strain have been successfully used in small ruminants worldwide. Copyright © 2018 Salmon-Divon et al.

July 7, 2019

Complete genome sequence of the symbiotic strain Bradyrhizobium icense LMTR 13T, isolated from lima bean (Phaseolus lunatus) in Peru.

The complete genome sequence of Bradyrhizobium icense LMTR 13T, a root nodule bacterium isolated from the legume Phaseolus lunatus, is reported here. The genome consists of a circular 8,322,773-bp chromosome which codes for a large and novel symbiotic island as well as genes putatively involved in soil and root colonization. Copyright © 2018 Ormeño-Orrillo et al.

July 7, 2019

A draft genome sequence for the Ixodes scapularis cell line, ISE6

Background: The tick cell line ISE6, derived from Ixodes scapularis, is commonly used for amplification and detection of arboviruses in environmental or clinical samples. Methods: To assist with sequence-based assays, we sequenced the ISE6 genome with single-molecule, long-read technology. Results: The draft assembly appears near complete based on gene content analysis, though it appears to lack some instances of repeats in this highly repetitive genome. The assembly appears to have separated the haplotypes at many loci. DNA short read pairs, used for validation only, mapped to the cell line assembly at a higher rate than they mapped to the Ixodes scapularis reference genome sequence. Conclusions: The assembly could be useful for filtering host genome sequence from sequence data obtained from cells infected with pathogens.

July 7, 2019

Short genome report of cellulose-producing commensal Escherichia coli 1094.

Bacterial surface colonization and biofilm formation often rely on the production of an extracellular polymeric matrix that mediates cell-cell and cell-surface contacts. In Escherichia coli and many Betaproteobacteria and Gammaproteobacteria cellulose is often the main component of the extracellular matrix. Here we report the complete genome sequence of the cellulose producing strain E. coli 1094 and compare it with five other closely related genomes within E. coli phylogenetic group A. We present a comparative analysis of the regions encoding genes responsible for cellulose biosynthesis and discuss the changes that could have led to the loss of this important adaptive advantage in several E. coli strains. Data deposition: The annotated genome sequence has been deposited at the European Nucleotide Archive under the accession number PRJEB21000.

July 7, 2019

The complete mitochondrial genome of Sanghuangporus sanghuang (Hymenochaetaceae, Basidiomycota)

Sanghuang is a polypore mushroom, which has been widely used in oriental medicine. Since recent molecular phylogenetic studies elucidated its species delimitation, Sanghuangporus sanghuang became the official name of this fungus. In this study, the complete sequence of the mitochondrial DNA of S. sanghuang was determined. The whole genome was 112,060 bp containing 14 proteins, 2 ribosomal RNA subunits, and 45 transfer RNAs. The overall GC content of the genome was 23.21%. A neighbour-joining tree based on atp6 sequence data showed its close relationship with the species of Ganoderma and Trametes.

July 7, 2019

Darwin: A genomics co-processor provides up to 15,000 X acceleration on long read assembly

Genomics is transforming medicine and our understanding of life in fundamental ways. Genomics data, however, is far outpacing Moore’s Law. Third-generation sequencing technologies produce 100× longer reads than second generation technologies and reveal a much broader mutation spectrum of disease and evolution. However, these technologies incur prohibitively high computational costs. Over 1,300 CPU hours are required for reference-guided assembly of the human genome (using [47]), and over 15,600 CPU hours are required for de novo assembly [57]. This paper describes “Darwin”, a co-processor for genomic sequence alignment that, without sacrificing sensitivity, provides up to 15,000× speedup over the state-of-the-art software for reference-guided assembly of third-generation reads. Darwin achieves this speedup through hardware/algorithm co-design, trading more easily accelerated alignment for less memory-intensive filtering, and by optimizing the memory system for filtering. Darwin combines a hardware-accelerated version of D-SOFT, a novel filtering algorithm, with a hardware-accelerated version of GACT, a novel alignment algorithm. GACT generates near-optimal alignments of arbitrarily long genomic sequences using constant memory for the compute-intensive step. Darwin is adaptable, with tunable speed and sensitivity to match emerging sequencing technologies and to meet the requirements of genomic applications beyond read assembly.
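The abstract does not spell out how the filtering step works, so the following is only a generic software sketch of seed-based filtering by diagonal band, the broad family of techniques that a filter-then-align pipeline relies on. It is not D-SOFT itself and says nothing about the hardware design; the function names, the band width, and the threshold are hypothetical choices for this example.

```python
from collections import defaultdict


def build_seed_index(reference: str, k: int) -> dict:
    """Map every k-mer in the reference to the positions where it occurs."""
    index = defaultdict(list)
    for pos in range(len(reference) - k + 1):
        index[reference[pos:pos + k]].append(pos)
    return index


def candidate_bands(query: str, index: dict, k: int,
                    band_width: int = 8, min_bases: int = 12) -> list:
    """Return diagonal bands whose seed hits cover at least min_bases
    distinct query bases. Only these bands would be passed on to the
    (much more expensive) alignment step."""
    covered = defaultdict(set)
    for q in range(len(query) - k + 1):
        for r in index.get(query[q:q + k], ()):
            band = (r - q) // band_width  # hits on nearby diagonals share a band
            covered[band].update(range(q, q + k))
    return sorted(band for band, bases in covered.items()
                  if len(bases) >= min_bases)


if __name__ == "__main__":
    reference = "TTTTTTTTTT" + "ACGTTGCAAGGCTCAT" + "GGGGGGGGGG"
    index = build_seed_index(reference, k=4)
    # The query matches the reference at offset 10, i.e. diagonal band 10 // 8.
    print(candidate_bands("ACGTTGCAAGGCTCAT", index, k=4))
```

Raising `min_bases` makes the filter stricter (faster, less sensitive) and lowering it does the opposite, which mirrors the tunable speed and sensitivity mentioned in the abstract.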

July 7, 2019

Genome sequence resources for the wheat stripe rust pathogen (Puccinia striiformis f. sp. tritici) and the barley stripe rust pathogen (Puccinia striiformis f. sp. hordei)

Puccinia striiformis f. sp. tritici causes devastating stripe (yellow) rust on wheat and P. striiformis f. sp. hordei causes stripe rust on barley. Several P. striiformis f. sp. tritici genomes are available, but no P. striiformis f. sp. hordei genome is available. More genomes of P. striiformis f. sp. tritici and P. striiformis f. sp. hordei are needed to understand the genome evolution and molecular mechanisms of their pathogenicity. We sequenced P. striiformis f. sp. tritici isolate 93-210 and P. striiformis f. sp. hordei isolate 93TX-2, using PacBio and Illumina technologies and RNA sequencing. Their genomic sequences were assembled to contigs with high continuity and showed significant structural differences. The circular mitochondria genomes of both were complete. These genomes provide high-quality resources for deciphering the genomic basis of rapid evolution and host adaptation, identifying genes for avirulence and other important traits, and studying host-pathogen interactions.
