July 7, 2019

Assemblathon 2: evaluating de novo methods of genome assembly in three vertebrate species.

The process of generating raw genome sequence data continues to become cheaper, faster, and more accurate. However, assembly of such data into high-quality, finished genome sequences remains challenging. Many genome assembly tools are available, but they differ greatly in terms of their performance (speed, scalability, hardware requirements, acceptance of newer read technologies) and in their final output (composition of assembled sequence). More importantly, it remains largely unclear how to best assess the quality of assembled genome sequences. The Assemblathon competitions are intended to assess current state-of-the-art methods in genome assembly. In Assemblathon 2, we provided a variety of sequence data to be assembled for three vertebrate species (a bird, a fish, and a snake). This resulted in a total of 43 submitted assemblies from 21 participating teams. We evaluated these assemblies using a combination of optical map data, Fosmid sequences, and several statistical methods. From over 100 different metrics, we chose ten key measures by which to assess the overall quality of the assemblies. Many current genome assemblers produced useful assemblies, containing a significant representation of their genes and overall genome structure. However, the high degree of variability between the entries suggests that there is still much room for improvement in the field of genome assembly and that approaches which work well in assembling the genome of one species may not necessarily work well for another.


July 7, 2019

Precise breakpoint localization of large genomic deletions using PacBio and Illumina next-generation sequencers.

Herein we present the applicability of single-molecule (PacBio RS) and second-generation sequencing technology (Illumina) to the characterization of large genomic deletions. By testing samples previously characterized using a Sanger approach, our methods determined that both next-generation sequencing platforms were able to identify the position of deletion breakpoints. Our results point out various advantages of next-generation sequencing platforms when characterizing genomic deletions; however, special attention must be dedicated to identical sequences flanking the breakpoints, such as poly(N) motifs.


July 7, 2019

PBSIM: PacBio reads simulator–toward accurate genome assembly.

PacBio sequencers produce two types of characteristic reads (continuous long reads: long with a high error rate; circular consensus sequencing: short with a low error rate), both of which could be useful for de novo assembly of genomes. Currently, there is no available simulator that targets the specific generation of PacBio libraries. Our analysis of 13 PacBio datasets showed characteristic features of PacBio reads (e.g. the read length of PacBio reads follows a log-normal distribution). We have developed a read simulator, PBSIM, that captures these features using either a model-based or sampling-based method. Using PBSIM, we conducted several hybrid error correction and assembly tests for PacBio reads, suggesting that a continuous long reads coverage depth of at least 15 in combination with a circular consensus sequencing coverage depth of at least 30 achieved extensive assembly results. PBSIM is freely available from the web under the GNU GPL v2 license (http://code.google.com/p/pbsim/).
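The log-normal read-length observation above can be illustrated with a minimal sketch. This is not PBSIM's implementation; the function name `sample_read_lengths` and the mean and standard deviation used here are hypothetical values chosen only for illustration.

```python
import math
import random


def sample_read_lengths(n, mean_len, sd_len, seed=0):
    """Draw n read lengths from a log-normal distribution.

    mean_len and sd_len are the desired arithmetic mean and standard
    deviation of the read lengths (illustrative values, not PBSIM defaults).
    """
    # Convert the arithmetic mean/sd into the mu and sigma of the
    # underlying normal distribution.
    sigma2 = math.log(1.0 + (sd_len / mean_len) ** 2)
    mu = math.log(mean_len) - sigma2 / 2.0
    sigma = math.sqrt(sigma2)

    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    # Round to whole bases and never return a read shorter than 1 bp.
    return [max(1, round(rng.lognormvariate(mu, sigma))) for _ in range(n)]


# Hypothetical parameters: 10,000 reads, mean 3,000 bp, sd 2,300 bp.
lengths = sample_read_lengths(10000, mean_len=3000, sd_len=2300)
mean_observed = sum(lengths) / len(lengths)
```

The sample mean of `lengths` should land close to the requested mean, while the distribution keeps the long right tail typical of a log-normal.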


July 7, 2019

Complete genome sequence of Leifsonia xyli subsp. cynodontis strain DSM46306, a gram-positive bacterial pathogen of grasses.

We announce the complete genome sequence of Leifsonia xyli subsp. cynodontis, a vascular pathogen of Bermuda grass. The species also comprises Leifsonia xyli subsp. xyli, a sugarcane pathogen. Since these two subspecies have genome sequences available, a comparative analysis will contribute to our understanding of the differences in their biology and host specificity.


July 7, 2019

Genomes of “Spiribacter”, a streamlined, successful halophilic bacterium.

Thalassosaline waters produced by the concentration of seawater are widespread and common extreme aquatic habitats. Their salinity varies from that of sea water (ca. 3.5%) to saturation for NaCl (ca. 37%). Obviously the microbiota varies dramatically throughout this range. Recent metagenomic analysis of intermediate salinity waters (19%) indicated the presence of an abundant and yet undescribed gamma-proteobacterium. Two strains belonging to this group have been isolated from saltern ponds of intermediate salinity in two Spanish salterns and were named “Spiribacter”. The genomes of two isolates of “Spiribacter” have been fully sequenced and assembled. The analysis of metagenomic datasets indicates that microbes of this genus are widespread worldwide in medium salinity habitats representing the first ecologically defined moderate halophile. The genomes indicate that the two isolates belong to different species within the same genus. Both genomes are streamlined with high coding densities, have few regulatory mechanisms and no motility or chemotactic behavior. Metabolically they are heterotrophs with a subgroup II xanthorhodopsin as an additional energy source when light is available. This is the first bacterium that has been proven by culture independent approaches to be prevalent in hypersaline habitats of intermediate salinity (halfway between the sea and NaCl saturation). Predictions from the proteome and analysis of transporter genes, together with a complete ectoine biosynthesis gene cluster are consistent with these microbes having the salt-out-organic-compatible solutes type of osmoregulation. All these features are also consistent with a well-adapted fully planktonic microbe while other halophiles with more complex genomes such as Salinibacter ruber might have particle associated microniches.


July 7, 2019

Structure of the type IV secretion system in different strains of Anaplasma phagocytophilum.

Anaplasma phagocytophilum is an intracellular organism in the Order Rickettsiales that infects diverse animal species and is causing an emerging disease in humans, dogs and horses. Different strains have very different cell tropisms and virulence. For example, in the U.S., strains have been described that infect ruminants but not dogs or rodents. An intriguing question is how the strains of A. phagocytophilum differ and what different genome loci are involved in cell tropisms and/or virulence. Type IV secretion systems (T4SS) are responsible for translocation of substrates across the cell membrane by mechanisms that require contact with the recipient cell. They are especially important in organisms such as the Rickettsiales which require T4SS to aid colonization and survival within both mammalian and tick vector cells. We determined the structure of the T4SS in 7 strains from the U.S. and Europe and revised the sequence of the repetitive virB6 locus of the human HZ strain. Although in all strains the T4SS conforms to the previously described split loci for vir genes, there is great diversity within these loci among strains. This is particularly evident in the virB2 and virB6 which are postulated to encode the secretion channel and proteins exposed on the bacterial surface. VirB6-4 has an unusual highly repetitive structure and can have a molecular weight greater than 500,000. For many of the virs, phylogenetic trees position A. phagocytophilum strains infecting ruminants in the U.S. and Europe distant from strains infecting humans and dogs in the U.S. Our study reveals evidence of gene duplication and considerable diversity of T4SS components in strains infecting different animals. The diversity in virB2 is in both the total number of copies, which varied from 8 to 15 in the herein characterized strains, and in the sequence of each copy. The diversity in virB6 is in the sequence of each of the 4 copies in the single locus and the presence of varying numbers of repetitive units in virB6-3 and virB6-4. These data suggest that the T4SS should be investigated further for a potential role in strain virulence of A. phagocytophilum.


July 7, 2019

Genome sequence of “Candidatus Microthrix parvicella” Bio17-1, a long-chain-fatty-acid-accumulating filamentous actinobacterium from a biological wastewater treatment plant.

Candidatus Microthrix bacteria are deeply branching filamentous actinobacteria which occur at the water-air interface of biological wastewater treatment plants, where they are often responsible for foaming and bulking. Here, we report the first draft genome sequence of a strain from this genus: “Candidatus Microthrix parvicella” strain Bio17-1.


July 7, 2019

A hybrid approach for the automated finishing of bacterial genomes.

Advances in DNA sequencing technology have improved our ability to characterize most genomic diversity. However, accurate resolution of large structural events is challenging because of the short read lengths of second-generation technologies. Third-generation sequencing technologies, which can yield longer multikilobase reads, have the potential to address limitations associated with genome assembly. Here we combine sequencing data from second- and third-generation DNA sequencing technologies to assemble the two-chromosome genome of a recent Haitian cholera outbreak strain into two nearly finished contigs at >99.9% accuracy. Complex regions with clinically relevant structure were completely resolved. In separate control assemblies on experimental and simulated data for the canonical N16961 cholera reference strain, we obtained 14 scaffolds of greater than 1 kb for the experimental data and 8 scaffolds of greater than 1 kb for the simulated data, which allowed us to correct several errors in contigs assembled from the short-read data alone. This work provides a blueprint for the next generation of rapid microbial identification and full-genome assembly.


July 7, 2019

Complete genome sequence of Liberibacter crescens BT-1.

Liberibacter crescens BT-1, a Gram-negative, rod-shaped bacterial isolate, was previously recovered from mountain papaya to gain insight into Huanglongbing (HLB) and Zebra Chip (ZC) diseases. The genome of BT-1 was sequenced at the Interdisciplinary Center for Biotechnology Research (ICBR) at the University of Florida. A finished assembly and annotation yielded one chromosome with a length of 1,504,659 bp and a G+C content of 35.4%. Compared with other species in the Liberibacter genus, L. crescens has many more genes in thiamine and essential amino acid biosynthesis. This likely explains why L. crescens BT-1 is culturable while the known Liberibacter strains have not yet been cultured. Similar to Candidatus L. asiaticus psy62, the L. crescens BT-1 genome contains two prophage regions.
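For readers unfamiliar with the G+C content statistic reported above, the following is a minimal sketch of how such a percentage is typically computed from a sequence. The function name `gc_content` and the toy sequence are hypothetical; this is not the pipeline the authors used.

```python
def gc_content(seq):
    """Return the G+C percentage of a DNA sequence.

    Only unambiguous bases (A, C, G, T) are counted in the denominator,
    so ambiguity codes such as N do not skew the result.
    """
    seq = seq.upper()
    gc = sum(1 for base in seq if base in "GC")
    acgt = sum(1 for base in seq if base in "ACGT")
    if acgt == 0:
        raise ValueError("sequence contains no A, C, G, or T bases")
    return 100.0 * gc / acgt


# Toy 10-base sequence with 2 G and 2 C, for illustration only.
toy_gc = gc_content("ATGCGCATTA")
```

Applied to a finished chromosome sequence read from a FASTA file, the same function would yield the genome-wide figure quoted in an announcement like this one.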

