Hybrid assembly Archives

July 7, 2019

Genome sequence of Weissella ceti NC36, an emerging pathogen of farmed rainbow trout in the United States.

Novel Weissella sp. bacteria have recently been reported to be associated with disease outbreaks in cultured rainbow trout (Oncorhynchus mykiss) at commercial farms in China, Brazil, and the United States. Here we present the first genome sequence of this novel Weissella species, isolated from the southeastern United States.

July 7, 2019

Genome sequence of the human abscess isolate Streptococcus intermedius BA1.

Streptococcus intermedius is a human pathogen with a propensity for abscess formation. We report a high-quality draft genome sequence of S. intermedius strain BA1, an isolate from a human epidural abscess. This sequence provides insight into the biology of S. intermedius and will aid investigations of pathogenicity.

July 7, 2019

Noncontiguous finished genome sequence of Staphylococcus aureus KLT6, a staphylococcal enterotoxin B-positive strain involved in a food poisoning outbreak in Switzerland.

We present the first complete genome sequence of a Staphylococcus aureus strain assigned to clonal complex 12. The strain was isolated in a food poisoning outbreak due to contaminated potato salad in Switzerland in 2009, and it produces staphylococcal enterotoxin B.

July 7, 2019

Complete genome sequence of the encephalomyelitic Burkholderia pseudomallei strain MSHR305.

We describe the complete genome sequence of Burkholderia pseudomallei MSHR305, a clinical isolate taken from a fatal encephalomyelitis case, a rare form of melioidosis. This sequence will be used for comparisons to identify the genes that are involved in neurological cases.

July 7, 2019

Complete genome sequence of Bacillus subtilis strain PY79.

Bacillus subtilis is a Gram-positive, soil-dwelling, endospore-forming bacterium in the phylum Firmicutes. B. subtilis strain PY79 is a prototrophic laboratory strain that has been widely used for studying a variety of cellular pathways. Here, we announce the complete whole-genome sequence of B. subtilis PY79.

July 7, 2019

The genome of the anaerobic fungus Orpinomyces sp. strain C1A reveals the unique evolutionary history of a remarkable plant biomass degrader.

Anaerobic gut fungi represent a distinct early-branching fungal phylum (Neocallimastigomycota) and reside in the rumen, hindgut, and feces of ruminant and nonruminant herbivores. The genome of an anaerobic fungal isolate, Orpinomyces sp. strain C1A, was sequenced using a combination of Illumina and PacBio single-molecule real-time (SMRT) technologies. The large genome (100.95 Mb, 16,347 genes) displayed extremely low G+C content (17.0%), large noncoding intergenic regions (73.1%), proliferation of microsatellite repeats (4.9%), and multiple gene duplications. Comparative genomic analysis identified multiple genes and pathways that are absent in Dikarya genomes but present in early-branching fungal lineages and/or nonfungal Opisthokonta. These included genes for posttranslational fucosylation, the production of specific intramembrane proteases and extracellular protease inhibitors, the formation of a complete axoneme and intraflagellar trafficking machinery, and a near-complete focal adhesion machinery. Analysis of the lignocellulolytic machinery in the C1A genome revealed an extremely rich repertoire, with evidence of horizontal gene acquisition from multiple bacterial lineages. Experimental analysis indicated that strain C1A is a remarkable biomass degrader, capable of simultaneous saccharification and fermentation of the cellulosic and hemicellulosic fractions in multiple untreated grasses and crop residues examined, with the process significantly enhanced by mild pretreatments. This capability, acquired during its separate evolutionary trajectory in the rumen, along with its resilience and invasiveness compared to prokaryotic anaerobes, renders anaerobic fungi promising agents for consolidated bioprocessing schemes in biofuels production.

July 7, 2019

Hammondia hammondi, an avirulent relative of Toxoplasma gondii, has functional orthologs of known T. gondii virulence genes.

Toxoplasma gondii is a ubiquitous protozoan parasite capable of infecting all warm-blooded animals, including humans. Its closest extant relative, Hammondia hammondi, has never been found to infect humans and, in contrast to T. gondii, is highly attenuated in mice. To better understand the genetic bases for these phenotypic differences, we sequenced the genome of an H. hammondi isolate (HhCatGer041) and found the genomic synteny between H. hammondi and T. gondii to be >95%. We used this genome to determine the H. hammondi primary sequence of two major T. gondii mouse virulence genes, TgROP5 and TgROP18. When we expressed these genes in T. gondii, we found that H. hammondi orthologs of TgROP5 and TgROP18 were functional. Similar to T. gondii, the HhROP5 locus is expanded, and two distinct HhROP5 paralogs increased the virulence of a T. gondii TgROP5 knockout strain. We also identified a 107-base-pair promoter region, absent only in type III TgROP18, which is necessary for TgROP18 expression. This result indicates that the ROP18 promoter was active in the most recent common ancestor of these two species and that it was subsequently inactivated in progenitors of the type III lineage. Overall, these data suggest that the virulence differences between these species are not solely due to the functionality of these key virulence factors. This study provides evidence that other mechanisms, such as differences in gene expression or the lack of currently uncharacterized virulence factors, may underlie the phenotypic differences between these species.

July 7, 2019

Cerulean: A hybrid assembly using high throughput short and long reads

Genome assembly from high-throughput short reads arguably remains an unresolvable task in repetitive genomes: when the length of a repeat exceeds the read length, it becomes difficult to unambiguously connect the flanking regions. The emergence of third-generation sequencing (Pacific Biosciences) with long reads offers the opportunity to resolve complicated repeats that could not be resolved with short-read data. However, these long reads have a high error rate, and it is an uphill task to assemble the genome without additional high-quality short reads. Recently, Koren et al. (2012) proposed using high-quality short-read data to correct these long reads and thus make assembly from long reads possible. However, because both datasets (short and long reads) are large, error correction of the long reads requires excessive computational resources, even for small bacterial genomes. In this work, instead of error-correcting the long reads, we first assemble the short reads and then map the long reads onto the assembly graph to resolve repeats.
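The idea of using long reads to resolve a repeat in a short-read assembly graph can be illustrated with a toy sketch. This is not Cerulean's actual algorithm or data model: the contig names, the `resolve_repeat` function, and the `min_support` threshold are all hypothetical, and each long read is assumed to have already been mapped to an ordered list of contigs.

```python
from collections import Counter


def resolve_repeat(repeat, long_read_paths, min_support=2):
    """Return the (predecessor, successor) contig pairs bridged across `repeat`.

    Each long read is represented as the ordered list of contigs it maps to
    (an assumption of this sketch). A pair is kept only if at least
    `min_support` reads span the repeat with that pair on either side.
    """
    support = Counter()
    for path in long_read_paths:
        # Only interior positions have both a left and a right neighbour.
        for i in range(1, len(path) - 1):
            if path[i] == repeat:
                support[(path[i - 1], path[i + 1])] += 1
    return {pair: n for pair, n in support.items() if n >= min_support}


# Toy graph: contigs A and C both enter repeat R; R exits to B and D.
# Short reads alone cannot tell whether A continues to B or to D.
long_read_paths = [
    ["A", "R", "B"],
    ["A", "R", "B"],
    ["C", "R", "D"],
    ["C", "R", "D"],
    ["A", "R", "D"],  # a single conflicting read, below the support threshold
]

resolved = resolve_repeat("R", long_read_paths)
print(resolved)
```

Here the long reads that span R support the pairings A to B and C to D, while the single conflicting read is discarded. A real tool would also have to handle alignment errors, partial overlaps, and repeats nested inside other repeats.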

July 7, 2019

Complete genome sequence of Staphylococcus aureus Tager 104, a sequence type 49 ancestor.

We report here the complete genome sequence of Staphylococcus aureus Tager 104, originally isolated from a cutaneous abscess in 1947 by Morris Tager. Sequence typing of the strain revealed its membership in sequence type 49 (ST49), a previously unknown multilocus sequence type (MLST) in clinical samples.

July 7, 2019

Finished bacterial genomes from shotgun sequence data.

Exceptionally accurate genome reference sequences have proven to be of great value to microbial researchers. Thus, to date, about 1800 bacterial genome assemblies have been “finished” at great expense with the aid of manual laboratory and computational processes that typically iterate over a period of months or even years. By applying a new laboratory design and new assembly algorithm to 16 samples, we demonstrate that assemblies exceeding finished quality can be obtained from whole-genome shotgun data and automated computation. Cost and time requirements are thus dramatically reduced.

July 7, 2019

A hybrid approach for the automated finishing of bacterial genomes.

Advances in DNA sequencing technology have improved our ability to characterize most genomic diversity. However, accurate resolution of large structural events is challenging because of the short read lengths of second-generation technologies. Third-generation sequencing technologies, which can yield longer multikilobase reads, have the potential to address limitations associated with genome assembly. Here we combine sequencing data from second- and third-generation DNA sequencing technologies to assemble the two-chromosome genome of a recent Haitian cholera outbreak strain into two nearly finished contigs at >99.9% accuracy. Complex regions with clinically relevant structure were completely resolved. In separate control assemblies on experimental and simulated data for the canonical N16961 cholera reference strain, we obtained 14 scaffolds of greater than 1 kb for the experimental data and 8 scaffolds of greater than 1 kb for the simulated data, which allowed us to correct several errors in contigs assembled from the short-read data alone. This work provides a blueprint for the next generation of rapid microbial identification and full-genome assembly.

July 7, 2019

Genomic epidemiology of the Escherichia coli O104:H4 outbreaks in Europe, 2011.

The degree to which molecular epidemiology reveals information about the sources and transmission patterns of an outbreak depends on the resolution of the technology used and the samples studied. Isolates of Escherichia coli O104:H4 from the outbreak centered in Germany in May-July 2011, and the much smaller outbreak in southwest France in June 2011, were indistinguishable by standard tests. We report a molecular epidemiological analysis using multiplatform whole-genome sequencing and analysis of multiple isolates from the German and French outbreaks. Isolates from the German outbreak showed remarkably little diversity, with only two single nucleotide polymorphisms (SNPs) found in isolates from four individuals. Surprisingly, we found much greater diversity (19 SNPs) in isolates from seven individuals infected in the French outbreak. The German isolates form a clade within the more diverse French outbreak strains. Moreover, five isolates derived from a single infected individual from the French outbreak had extremely limited diversity. The striking difference in diversity between the German and French outbreak samples is consistent with several hypotheses, including a bottleneck that purged diversity in the German isolates, variation in mutation rates in the two E. coli outbreak populations, or uneven distribution of diversity in the seed populations that led to each outbreak.

July 7, 2019

Complete genome sequence of Liberibacter crescens BT-1.

Liberibacter crescens BT-1, a Gram-negative, rod-shaped bacterial isolate, was previously recovered from mountain papaya to gain insight into Huanglongbing (HLB) and Zebra Chip (ZC) diseases. The genome of BT-1 was sequenced at the Interdisciplinary Center for Biotechnology Research (ICBR) at the University of Florida. A finished assembly and annotation yielded one chromosome with a length of 1,504,659 bp and a G+C content of 35.4%. Compared to other species in the Liberibacter genus, L. crescens has many more genes in thiamine and essential amino acid biosynthesis. This likely explains why L. crescens BT-1 is culturable while the known Liberibacter strains have not yet been cultured. Similar to Candidatus L. asiaticus psy62, the L. crescens BT-1 genome contains two prophage regions.

July 7, 2019

Draft genome sequence of Salimicrobium sp. strain MJ3, isolated from Myulchi-Jeot, Korean fermented seafood.

Salimicrobium sp. strain MJ3 was isolated from myulchi-jeot, a traditional fermented seafood made from anchovy in South Korea. Here we announce the draft genome sequence of Salimicrobium sp. MJ3 with 2,717,782 bp, which consists of 45 contigs (>500 bp in size), and provide a description of its annotation.

July 7, 2019

Next generation sequencing technologies and the changing landscape of phage genomics.

The dawn of next generation sequencing technologies has opened up exciting possibilities for whole genome sequencing of a plethora of organisms. The 2nd and 3rd generation sequencing technologies, based on cloning-free, massively parallel sequencing, have enabled the generation of a deluge of genomic sequences of both prokaryotic and eukaryotic origin in the last seven years. However, whole genome sequencing of bacterial viruses has not kept pace with this revolution, despite the fact that their genomes are orders of magnitude smaller in size compared with bacteria and other organisms. Sequencing phage genomes poses several challenges: (1) obtaining pure phage genomic material, (2) PCR amplification biases, and (3) the complex nature of their genetic material due to features such as methylated bases and repeats that are inherently difficult to sequence and assemble. Here we describe conclusions drawn from our efforts in sequencing hundreds of bacteriophage genomes from a variety of Gram-positive and Gram-negative bacteria using Sanger, 454, Illumina and PacBio technologies. Based on our experience we propose several general considerations regarding sample quality, the choice of technology and a “blended approach” for generating reliable whole genome sequences of phages.
