PacBio RSII Archives - Page 18 of 37

July 7, 2019

Complete genome sequence of the extremely thermoacidophilic archaeon Acidianus manzaensis YN-25.

The complete genome of Acidianus manzaensis YN-25 consists of a chromosome of 2,687,463 bp, with a G+C content of 30.62% and 2,746 coding DNA sequences. This archaeon contains a series of specific genes involved in the oxidation of elemental sulfur and reduced inorganic sulfur compounds. Copyright © 2017 Ma et al.

July 7, 2019

Comparative genomic and phylogenetic analysis of a toxigenic clinical isolate of Corynebacterium diphtheriae strain B-D-16-78 from Malaysia.

In this study, we report the comparative genomics and phylogenetic analysis of Corynebacterium diphtheriae strain B-D-16-78 that was isolated from a clinical specimen in 2016. The complete genome of C. diphtheriae strain B-D-16-78 was sequenced using PacBio Single Molecule, Real-Time sequencing technology and consists of a 2,474,151-bp circular chromosome with an average GC content of 53.56%. The core genome of C. diphtheriae was also deduced from a total of 74 strains with complete or draft genome sequences and the core genome-based phylogenetic analysis revealed a close genetic relationship among strains that shared the same MLST allelic profile. In the context of the CRISPR-Cas system, which confers adaptive immunity against re-invading DNA, 73 out of 86 spacer sequences were found to be unique to Malaysian strains which harboured only type-II-C and/or type-I-E-a systems. A total of 48 tox genes which code for the diphtheria toxin were retrieved from the 74 genomes and with the exception of one truncated gene, only nucleotide substitutions were detected when compared to the tox gene sequence of PW8. More than half were synonymous substitutions and only two were nonsynonymous substitutions whereby H24Y was predicted to have a damaging effect on the protein function whilst T262V was predicted to be tolerated. Both toxigenic and non-toxigenic toxin-gene bearing strains have been isolated in Malaysia but the repeated isolation of toxigenic strains with the same MLST profile suggests that some of these strains may be circulating in the population. Hence, efforts to increase herd immunity should be continued and supported by an effective monitoring and surveillance system to track, manage and control outbreaks. Copyright © 2017 Elsevier B.V. All rights reserved.

July 7, 2019

Trichoderma reesei complete genome sequence, repeat-induced point mutation, and partitioning of CAZyme gene clusters.

Trichoderma reesei (Ascomycota, Pezizomycotina) QM6a is a model fungus for a broad spectrum of physiological phenomena, including plant cell wall degradation, industrial production of enzymes, light responses, conidiation, sexual development, polyketide biosynthesis, and plant-fungal interactions. The genomes of QM6a and its high enzyme-producing mutants have been sequenced by second-generation-sequencing methods and are publicly available from the Joint Genome Institute. While these genome sequences have offered useful information for genomic and transcriptomic studies, their limitations and especially their short read lengths make them poorly suited for some particular biological problems, including assembly, genome-wide determination of chromosome architecture, and genetic modification or engineering. We integrated Pacific Biosciences and Illumina sequencing platforms for the highest-quality genome assembly yet achieved, revealing seven telomere-to-telomere chromosomes (34,922,528 bp; 10877 genes) with 1630 newly predicted genes and >1.5 Mb of new sequences. Most new sequences are located on AT-rich blocks, including 7 centromeres, 14 subtelomeres, and 2329 interspersed AT-rich blocks. The seven QM6a centromeres separately consist of 24 conserved repeats and 37 putative centromere-encoded genes. These findings open up a new perspective for future centromere and chromosome architecture studies. Next, we demonstrate that sexual crossing readily induced cytosine-to-thymine point mutations on both tandem and unlinked duplicated sequences. We also show by bioinformatic analysis that T. reesei has evolved a robust repeat-induced point mutation (RIP) system to accumulate AT-rich sequences, with longer AT-rich blocks having more RIP mutations. The widespread distribution of AT-rich blocks correlates genome-wide partitions with gene clusters, explaining why clustering of genes has been reported to not influence gene expression in T. reesei. Compartmentation of ancestral gene clusters by AT-rich blocks might promote flexibilities that are evolutionarily advantageous in this fungus’ soil habitats and other natural environments. Our analyses, together with the complete genome sequence, provide a better blueprint for biotechnological and industrial applications.

July 7, 2019

ConcatSeq: A method for increasing throughput of single molecule sequencing by concatenating short DNA fragments.

Single molecule sequencing (SMS) platforms enable base sequences to be read directly from individual strands of DNA in real-time. Though capable of long read lengths, SMS platforms currently suffer from low throughput compared to competing short-read sequencing technologies. Here, we present a novel strategy for sequencing library preparation, dubbed ConcatSeq, which increases the throughput of SMS platforms by generating long concatenated templates from pools of short DNA molecules. We demonstrate adaptation of this technique to two target enrichment workflows, commonly used for oncology applications, and feasibility using PacBio single molecule real-time (SMRT) technology. Our approach is capable of increasing the sequencing throughput of the PacBio RSII platform by more than five-fold, while maintaining the ability to correctly call allele frequencies of known single nucleotide variants. ConcatSeq provides a versatile new sample preparation tool for long-read sequencing technologies.

July 7, 2019

Evidence for contemporary switching of the O-antigen gene cluster between Shiga toxin-producing Escherichia coli strains colonizing cattle.

Shiga toxin-producing Escherichia coli (STEC) comprise a group of zoonotic enteric pathogens with ruminants, especially cattle, as the main reservoir. O-antigens are instrumental for host colonization and bacterial niche adaptation. They are highly immunogenic and, therefore, targeted by the adaptive immune system. The O-antigen is one of the most diverse bacterial cell constituents and variation not only exists between different bacterial species, but also between individual isolates/strains within a single species. We recently identified STEC persistently infecting cattle and belonging to the different serotypes O156:H25 (n = 21) and O182:H25 (n = 15) that were of the MLST sequence types ST300 or ST688. These STs differ by a single nucleotide in purA only. Fitness-, virulence-associated genome regions, and CRISPR/CAS (clustered regularly interspaced short palindromic repeats/CRISPR associated sequence) arrays of these STEC O156:H25 and O182:H25 isolates were highly similar, and identical genomic integration sites for the stx converting bacteriophages and the core LEE, identical Shiga toxin converting bacteriophage genes for stx1a, identical complete LEE loci, and identical sets of chemotaxis and flagellar genes were identified. In contrast to this genomic similarity, the nucleotide sequences of the O-antigen gene cluster (O-AGC) regions between galF and gnd and very few flanking genes differed fundamentally and were specific for the respective serotype. Sporadic aEPEC O156:H8 isolates (n = 5) were isolated in temporal and spatial proximity. While the O-AGC and the corresponding 5′ and 3′ flanking regions of these aEPEC isolates were identical to the respective region in the STEC O156:H25 isolates, the core genome, the virulence associated genome regions and the CRISPR/CAS elements differed profoundly. Our cumulative epidemiological and molecular data suggests a recent switch of the O-AGC between isolates with O156:H8 strains having served as DNA donors. Such O-antigen switches can affect the evaluation of a strain’s pathogenic and virulence potential, suggesting that NGS methods might lead to a more reliable risk assessment.

July 7, 2019

Updated reference genome sequence and annotation of Mycobacterium bovis AF2122/97.

We report here an update to the reference genome sequence of the bovine tuberculosis bacillus Mycobacterium bovis AF2122/97, generated using an integrative multiomics approach. The update includes 42 new coding sequences (CDSs), 14 modified annotations, 26 single-nucleotide polymorphism (SNP) corrections, and disclosure that the RD900 locus, previously described as absent from the genome, is in fact present. Copyright © 2017 Malone et al.

July 7, 2019

Whole-genome restriction mapping by “subhaploid”-based RAD sequencing: An efficient and flexible approach for physical mapping and genome scaffolding.

Assembly of complex genomes using short reads remains a major challenge, which usually yields highly fragmented assemblies. Generation of ultradense linkage maps is promising for anchoring such assemblies, but traditional linkage mapping methods are hindered by the infrequency and unevenness of meiotic recombination that limit attainable map resolution. Here we develop a sequencing-based “in vitro” linkage mapping approach (called RadMap), where chromosome breakage and segregation are realized by generating hundreds of “subhaploid” fosmid/bacterial-artificial-chromosome clone pools, and by restriction site-associated DNA sequencing of these clone pools to produce an ultradense whole-genome restriction map to facilitate genome scaffolding. A bootstrap-based minimum spanning tree algorithm is developed for grouping and ordering of genome-wide markers and is implemented in a user-friendly, integrated software package (AMMO). We perform extensive analyses to validate the power and accuracy of our approach in the model plant Arabidopsis thaliana and human. We also demonstrate the utility of RadMap for enhancing the contiguity of a variety of whole-genome shotgun assemblies generated using either short Illumina reads (300 bp) or long PacBio reads (6-14 kb), with up to 15-fold improvement of N50 (~816 kb-3.7 Mb) and high scaffolding accuracy (98.1-98.5%). RadMap outperforms BioNano and Hi-C when input assembly is highly fragmented (contig N50 = 54 kb). RadMap can capture wide-range contiguity information and provide an efficient and flexible tool for high-resolution physical mapping and scaffolding of highly fragmented assemblies. Copyright © 2017 Dou et al.
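The N50 figure quoted in this abstract is a standard contiguity metric: the length of the shortest contig (or scaffold) in the smallest set of longest sequences that together cover at least half of the total assembly length. The following is a minimal sketch of that calculation, not code from the RadMap or AMMO software; the function name `n50` and the toy lengths are assumptions chosen only for illustration.

```python
def n50(lengths):
    """Return the N50 of a collection of contig or scaffold lengths.

    Sort lengths from longest to shortest and walk down the list until
    the running total reaches half of the summed length; the length at
    which that happens is the N50.
    """
    if not lengths:
        raise ValueError("need at least one sequence length")
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


# Hypothetical toy data (arbitrary units), not values from the paper.
toy_contigs = [80, 70, 50, 40, 30, 20, 10]   # total 300, fragmented
toy_scaffolds = [150, 90, 40, 20]            # same total, fewer pieces

print(n50(toy_contigs))    # 70
print(n50(toy_scaffolds))  # 150
```

Joining the same bases into fewer, longer pieces raises N50 without changing the total length, which is why the metric is commonly used to report scaffolding improvements.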

July 7, 2019

Reclassification of the specialized metabolite producer Pseudomonas mesoacidophila ATCC 31433 as a member of the Burkholderia cepacia complex.

Pseudomonas mesoacidophila ATCC 31433 is a Gram-negative bacterium, first isolated from Japanese soil samples, that produces the monobactam isosulfazecin and the β-lactam-potentiating bulgecins. To characterize the biosynthetic potential of P. mesoacidophila ATCC 31433, its complete genome was determined using single-molecule real-time DNA sequence analysis. The 7.8-Mb genome comprised four replicons, three chromosomal (each encoding rRNA) and one plasmid. Phylogenetic analysis demonstrated that P. mesoacidophila ATCC 31433 was misclassified at the time of its deposition and is a member of the Burkholderia cepacia complex, most closely related to Burkholderia ubonensis. The sequenced genome shows considerable additional biosynthetic potential; known gene clusters for malleilactone, ornibactin, isosulfazecin, alkylhydroxyquinoline, and pyrrolnitrin biosynthesis and several uncharacterized biosynthetic gene clusters for polyketides, nonribosomal peptides, and other metabolites were identified. Furthermore, P. mesoacidophila ATCC 31433 harbors many genes associated with environmental resilience and antibiotic resistance and was resistant to a range of antibiotics and metal ions. In summary, this bioactive strain should be designated B. cepacia complex strain ATCC 31433, pending further detailed taxonomic characterization. IMPORTANCE: This work reports the complete genome sequence of Pseudomonas mesoacidophila ATCC 31433, a known producer of bioactive compounds. Large numbers of both known and novel biosynthetic gene clusters were identified, indicating that P. mesoacidophila ATCC 31433 is an untapped resource for discovery of novel bioactive compounds. Phylogenetic analysis demonstrated that P. mesoacidophila ATCC 31433 is in fact a member of the Burkholderia cepacia complex, most closely related to the species Burkholderia ubonensis. Further investigation of the classification and biosynthetic potential of P. mesoacidophila ATCC 31433 is warranted. Copyright © 2017 Loveridge et al.

July 7, 2019

Genome sequence of Acinetobacter lactucae OTEC-02, isolated from hydrocarbon-contaminated soil.

Acinetobacter lactucae OTEC-02 was isolated from hydrocarbon-contaminated soils. Whole-genome sequence analysis was performed to learn more about the strain’s ability to degrade different types of recalcitrant toxic monoaromatic hydrocarbons. The genome of this bacterium revealed its genomic properties and versatile metabolic features, as well as a complete prophage. Copyright © 2017 Rogel-Hernandez et al.

July 7, 2019

Genome sequence of Pasteurella multocida Razi 0002 of avian origin.

We report here on the genome sequence of Pasteurella multocida Razi 0002 of avian origin, isolated in Iran. The genome has a size of 2,289,036 bp, a G+C content of 40.3%, and is predicted to contain 2,079 coding sequences. Copyright © 2017 Sprague and Tadayon.
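Several entries on this page report a G+C content alongside genome size. As a minimal sketch of how that percentage is usually derived from a sequence (assuming ambiguous bases such as N are excluded from the denominator, which is a common but not universal convention), the hypothetical helper `gc_percent` below counts G and C among the unambiguous bases. The example string is made up and is not taken from any of the genomes listed here.

```python
def gc_percent(sequence):
    """Return G+C content as a percentage of unambiguous bases (A, C, G, T)."""
    seq = sequence.upper()
    counted = sum(seq.count(base) for base in "ACGT")
    if counted == 0:
        raise ValueError("sequence contains no A, C, G or T bases")
    gc = seq.count("G") + seq.count("C")
    return 100.0 * gc / counted


# Hypothetical 10-base example: 2 G + 2 C out of 10 bases.
print(gc_percent("ATGCGCATTA"))  # 40.0
```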

July 7, 2019

Hybrid assembly with long and short reads improves discovery of gene family expansions.

Long-read and short-read sequencing technologies offer competing advantages for eukaryotic genome sequencing projects. Combinations of both may be appropriate for surveys of within-species genomic variation. We developed a hybrid assembly pipeline called “Alpaca” that can operate on 20X long-read coverage plus about 50X short-insert and 50X long-insert short-read coverage. To preclude collapse of tandem repeats, Alpaca relies on base-call-corrected long reads for contig formation. Compared to two other assembly protocols, Alpaca demonstrated the most reference agreement and repeat capture on the rice genome. On three accessions of the model legume Medicago truncatula, Alpaca generated the most agreement to a conspecific reference and predicted tandemly repeated genes absent from the other assemblies. Our results suggest Alpaca is a useful tool for investigating structural and copy number variation within de novo assemblies of sampled populations.

July 7, 2019

Multiple genome sequences of Lactobacillus plantarum strains.

We report here the genome sequences of four Lactobacillus plantarum strains which vary in surface hydrophobicity. Bioinformatic analysis, using additional genomes of Lactobacillus plantarum strains, revealed a possible correlation between the cell wall teichoic acid type and cell surface hydrophobicity and provides the basis for subsequent analyses. Copyright © 2017 Kafka et al.

July 7, 2019

Improved high-quality draft genome sequence and annotation of Burkholderia contaminans LMG 23361T.

Burkholderia contaminans LMG 23361 is the type strain of the species isolated from the milk of a dairy sheep with mastitis. Some pharmaceutical products contain disinfectants such as benzalkonium chloride (BZK) and previously we reported that B. contaminans LMG 23361(T) possesses the ability to inactivate BZK with high biodegradation rates. Here, we report an improved high-quality draft genome sequence of this strain. Copyright © 2017 Jung et al.

July 7, 2019

Genome sequence of Microbacterium sp. strain TPU 3598, a lumichrome producer.

We report here the genome sequence of Microbacterium sp. strain TPU 3598, previously described as a producer of lumichrome. The sequenced genome size is 3,787,270 bp, the average G+C content is 68.39%, and 3,674 protein-coding sequences are predicted. Copyright © 2017 Yamamoto and Asano.

July 7, 2019

Comparative genomics of all three Campylobacter sputorum biovars and a novel cattle-associated C. sputorum clade.

Campylobacter sputorum is a non-thermotolerant campylobacter that is primarily isolated from food animals such as cattle and sheep. C. sputorum is also infrequently associated with human illness. Based on catalase and urease activity, three biovars are currently recognized within C. sputorum: bv. sputorum (catalase negative, urease negative), bv. fecalis (catalase positive, urease negative), and bv. paraureolyticus (catalase negative, urease positive). A multi-locus sequence typing (MLST) method was recently constructed for C. sputorum. MLST typing of several cattle-associated C. sputorum isolates suggested that they are members of a divergent C. sputorum clade. Although catalase positive, and thus technically bv. fecalis, the taxonomic position of these strains could not be determined solely by MLST. To further characterize C. sputorum, the genomes of four strains, representing all three biovars and the divergent clade, were sequenced to completion. Here we present a comparative genomic analysis of the four C. sputorum genomes. This analysis indicates that the three biovars and the cattle-associated strains are highly-related at the genome level with similarities in gene content. Furthermore, the four genomes are strongly syntenic with one or two minor inversions. However, substantial differences in gene content were observed among the three biovars. Finally, although the strain representing the cattle-associated isolates was shown to be C. sputorum, it is possible that this strain is a member of a novel C. sputorum subspecies; thus, these cattle-associated strains may form a second taxon within C. sputorum. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution 2017. This work is written by US Government employees and is in the public domain in the US.
