July 19, 2019

Single molecule sequencing and genome assembly of a clinical specimen of Loa loa, the causative agent of loiasis.

More than 20% of the world’s population is at risk for infection by filarial nematodes and >180 million people worldwide are already infected. Along with infection comes significant morbidity that has a socioeconomic impact. The eight filarial nematodes that infect humans are Wuchereria bancrofti, Brugia malayi, Brugia timori, Onchocerca volvulus, Loa loa, Mansonella perstans, Mansonella streptocerca, and Mansonella ozzardi, of which three have published draft genome sequences. Since all have humans as the definitive host, standard avenues of research that rely on culturing and genetics have often not been possible. Therefore, genome sequencing provides an important window into understanding the biology of these parasites. The need for large amounts of high quality genomic DNA from homozygous, inbred lines; the availability of only short sequence reads from next-generation sequencing platforms at a reasonable expense; and the lack of random large insert libraries has limited our ability to generate high quality genome sequences for these parasites. However, the Pacific Biosciences single molecule, real-time sequencing platform holds great promise in reducing input amounts and generating sufficiently long sequences that bypass the need for large insert paired libraries. Here, we report on efforts to generate a more complete genome assembly for L. loa using genetically heterogeneous DNA isolated from a single clinical sample and sequenced on the Pacific Biosciences platform. To obtain the best assembly, numerous assemblers and sequencing datasets were analyzed, combined, and compared. Quiver-informed trimming of an assembly of only Pacific Biosciences reads by HGAP2 was selected as the final assembly of 96.4 Mbp in 2,250 contigs. This results in ~9% more of the genome in ~85% fewer contigs from ~80% less starting material at a fraction of the cost of previous Roche 454-based sequencing efforts. The result is the most complete filarial nematode assembly produced thus far and demonstrates the utility of single molecule sequencing on the Pacific Biosciences platform for genetically heterogeneous metazoan genomes.


July 19, 2019

Comparative genomic analysis and virulence differences in closely related Salmonella enterica serotype Heidelberg isolates from humans, retail meats, and animals.

Salmonella enterica subsp. enterica serovar Heidelberg (S. Heidelberg) is one of the top serovars causing human salmonellosis. Recently, an antibiotic-resistant strain of this serovar was implicated in a large 2011 multistate outbreak resulting from consumption of contaminated ground turkey that involved 136 confirmed cases, with one death. In this study, we assessed the evolutionary diversity of 44 S. Heidelberg isolates using whole-genome sequencing (WGS) generated by the 454 GS FLX (Roche) platform. The isolates, including 30 with nearly indistinguishable (one band difference) XbaI pulsed-field gel electrophoresis patterns (JF6X01.0032, JF6X01.0058), were collected from various sources between 1982 and 2011 and included nine isolates associated with the 2011 outbreak. Additionally, we determined the complete sequence for the chromosome and three plasmids from a clinical isolate associated with the 2011 outbreak using the Pacific Biosciences (PacBio) system. Using single-nucleotide polymorphism (SNP) analyses, we were able to distinguish highly clonal isolates, including strains isolated at different times in the same year. The isolates from the recent 2011 outbreak clustered together with a mean SNP variation of only 17 SNPs. The S. Heidelberg isolates carried a variety of phages, such as prophage P22, P4, lambda-like prophage Gifsy-2, and the P2-like phage which carries the sopE1 gene; virulence genes, including 62 pathogenicity and 13 fimbrial markers; and resistance plasmids of the incompatibility (Inc)I1, IncA/C, and IncHI2 groups. Twenty-one strains contained an IncX plasmid carrying a type IV secretion system. On the basis of the recent and historical isolates used in this study, our results demonstrated that, in addition to providing detailed genetic information for the isolates, WGS can identify SNP targets that can be utilized for differentiating highly clonal S. Heidelberg isolates.


July 19, 2019

Unlocking the mystery of the hard-to-sequence phage genome: PaP1 methylome and bacterial immunity.

Whole-genome sequencing is an important method to understand the genetic information, gene function, biological characteristics and survival mechanisms of organisms. Sequencing large genomes is very simple at present. However, we encountered a hard-to-sequence genome of Pseudomonas aeruginosa phage PaP1. The shotgun sequencing method failed to complete the sequence of this genome. After persevering for 10 years and going over three generations of sequencing techniques, we successfully completed the sequence of the PaP1 genome with a length of 91,715 bp. Single-molecule real-time sequencing results revealed that this genome contains 51 N-6-methyladenines and 152 N-4-methylcytosines. Three significant modified sequence motifs were predicted, but not all of the sites found in the genome were methylated in these motifs. Further investigations revealed a novel immune mechanism of bacteria, in which host bacteria can recognise and repel inserts containing modified bases on a large scale. This mechanism could account for the failure of the shotgun method in PaP1 genome sequencing. This problem was resolved using the nfi- mutant of Escherichia coli DH5α as a host bacterium to construct a shotgun library. This work provided insights into the hard-to-sequence phage PaP1 genome and discovered a new mechanism of bacterial immunity. The methylome of phage PaP1 is responsible for the failure of shotgun sequencing and for bacterial immunity mediated by enzyme Endo V activity; this methylome also provides a valuable resource for future studies on PaP1 genome replication and modification, as well as on gene regulation and host interaction.


July 19, 2019

Recently published Streptomyces genome sequences.

Many readers of this journal will need no introduction to the bacterial genus Streptomyces, which includes several hundred species, many of which produce biotechnologically useful secondary metabolites. The last 2 years have seen numerous publications describing Streptomyces genome sequences (Table 1), mostly as short genome announcements restricted to just 500 words and therefore allowing little description and analysis. Our aim in this current manuscript is to survey these recent publications and to dig a little deeper where appropriate. The genus Streptomyces is now one of the most highly sequenced, with 19 finished genomic sequences (Table 2) and a further 125 draft assemblies available in the GenBank database as of 3rd of May 2014; by the time this is published, no doubt there will be more. The reasons given for sequencing this latest crop of Streptomyces include production of industrially important enzymes, degradation of lignin, antibiotic production, rapid growth and halo-tolerance and an endophytic lifestyle (Table 1).


July 19, 2019

The genome sequence of African rice (Oryza glaberrima) and evidence for independent domestication.

The cultivation of rice in Africa dates back more than 3,000 years. Interestingly, African rice is not of the same origin as Asian rice (Oryza sativa L.) but rather is an entirely different species (i.e., Oryza glaberrima Steud.). Here we present a high-quality assembly and annotation of the O. glaberrima genome and detailed analyses of its evolutionary history of domestication and selection. Population genomics analyses of 20 O. glaberrima and 94 Oryza barthii accessions support the hypothesis that O. glaberrima was domesticated in a single region along the Niger river as opposed to noncentric domestication events across Africa. We detected evidence for artificial selection at a genome-wide scale, as well as with a set of O. glaberrima genes orthologous to O. sativa genes that are known to be associated with domestication, thus indicating convergent yet independent selection of a common set of genes during two geographically and culturally distinct domestication processes.


July 19, 2019

Reducing assembly complexity of microbial genomes with single-molecule sequencing.

The short reads output by first- and second-generation DNA sequencing instruments cannot completely reconstruct microbial chromosomes. Therefore, most genomes have been left unfinished due to the significant resources required to manually close gaps in draft assemblies. Third-generation, single-molecule sequencing addresses this problem by greatly increasing sequencing read length, which simplifies the assembly problem. To measure the benefit of single-molecule sequencing on microbial genome assembly, we sequenced and assembled the genomes of six bacteria and analyzed the repeat complexity of 2,267 complete bacteria and archaea. Our results indicate that the majority of known bacterial and archaeal genomes can be assembled without gaps, at finished-grade quality, using a single PacBio RS sequencing library. These single-library assemblies are also more accurate than typical short-read assemblies and hybrid assemblies of short and long reads. Automated assembly of long, single-molecule sequencing data reduces the cost of microbial finishing to $1,000 for most genomes, and future advances in this technology are expected to drive the cost lower. This is expected to increase the number of completed genomes, improve the quality of microbial genome databases, and enable high-fidelity, population-scale studies of pan-genomes and chromosomal organization.


July 19, 2019

Population structure of KPC-producing Klebsiella pneumoniae isolates from midwestern U.S. hospitals.

Genome sequencing of carbapenem-resistant Klebsiella pneumoniae isolates from regional U.S. hospitals was used to characterize strain diversity and the bla(KPC) genetic context. A phylogeny based on core single-nucleotide variants (SNVs) supports a division of sequence type 258 (ST258) into two distinct groups. The primary differences between the groups are in the capsular polysaccharide locus (cps) and their plasmid contents. A strict association between clade and KPC variant was found. The bla(KPC) gene was found on variants of two plasmid backbones. This study indicates that highly similar K. pneumoniae subpopulations coexist within the same hospitals over time. Copyright © 2014, American Society for Microbiology. All Rights Reserved.


July 19, 2019

Single-molecule sequencing to track plasmid diversity of hospital-associated carbapenemase-producing Enterobacteriaceae.

Public health officials have raised concerns that plasmid transfer between Enterobacteriaceae species may spread resistance to carbapenems, an antibiotic class of last resort, thereby rendering common health care-associated infections nearly impossible to treat. To determine the diversity of carbapenemase-encoding plasmids and assess their mobility among bacterial species, we performed comprehensive surveillance and genomic sequencing of carbapenem-resistant Enterobacteriaceae in the National Institutes of Health (NIH) Clinical Center patient population and hospital environment. We isolated a repertoire of carbapenemase-encoding Enterobacteriaceae, including multiple strains of Klebsiella pneumoniae, Klebsiella oxytoca, Escherichia coli, Enterobacter cloacae, Citrobacter freundii, and Pantoea species. Long-read genome sequencing with full end-to-end assembly revealed that these organisms carry the carbapenem resistance genes on a wide array of plasmids. K. pneumoniae and E. cloacae isolated simultaneously from a single patient harbored two different carbapenemase-encoding plasmids, indicating that plasmid transfer between organisms was unlikely within this patient. We did, however, find evidence of horizontal transfer of carbapenemase-encoding plasmids between K. pneumoniae, E. cloacae, and C. freundii in the hospital environment. Our data, including full plasmid identification, challenge assumptions about horizontal gene transfer events within patients and identify possible connections between patients and the hospital environment. In addition, we identified a new carbapenemase-encoding plasmid of potentially high clinical impact carried by K. pneumoniae, E. coli, E. cloacae, and Pantoea species, in unrelated patients and in the hospital environment. Copyright © 2014, American Association for the Advancement of Science.


July 19, 2019

Technology: SMRT move?

One of the major challenges of de novo mammalian genome assembly arises from the presence of large, interspersed segmental duplications with high levels of sequence identity. These regions are particularly difficult to assemble using current short-read high-throughput sequencing methods. Combining long-read single-molecule, real-time (SMRT) sequencing with a hierarchical genome-assembly process (HGAP), as well as the consensus and variant caller Quiver, enabled these complex genomic regions to be resolved in a more cost- and time-effective manner than previously possible.


July 19, 2019

Reconstructing complex regions of genomes using long-read sequencing technology.

Obtaining high-quality sequence continuity of complex regions of recent segmental duplication remains one of the major challenges of finishing genome assemblies. In the human and mouse genomes, this was achieved by targeting large-insert clones using costly and laborious capillary-based sequencing approaches. Sanger shotgun sequencing of clone inserts, however, has now been largely abandoned, leaving most of these regions unresolved in newer genome assemblies generated primarily by next-generation sequencing hybrid approaches. Here we show that it is possible to resolve regions that are complex in a genome-wide context but simple in isolation for a fraction of the time and cost of traditional methods using long-read single molecule, real-time (SMRT) sequencing and assembly technology from Pacific Biosciences (PacBio). We sequenced and assembled BAC clones corresponding to a 1.3-Mbp complex region of chromosome 17q21.31, demonstrating 99.994% identity to Sanger assemblies of the same clones. We targeted 44 differences using Illumina sequencing and find that PacBio and Sanger assemblies share a comparable number of validated variants, albeit with different sequence context biases. Finally, we targeted a poorly assembled 766-kbp duplicated region of the chimpanzee genome and resolved the structure and organization for a fraction of the cost and time of traditional finishing approaches. Our data suggest a straightforward path for upgrading genomes to a higher quality finished state.


July 19, 2019

Whole genome complete resequencing of Bacillus subtilis natto by combining long reads with high-quality short reads.

De novo microbial genome sequencing reached a turning point with third-generation sequencing (TGS) platforms, and several microbial genomes have been improved by TGS long reads. Bacillus subtilis natto is closely related to the laboratory standard strain B. subtilis Marburg 168, and it has a function in the production of the traditional Japanese fermented food “natto.” The B. subtilis natto BEST195 genome was previously sequenced with short reads, but it included some incomplete regions. We resequenced the BEST195 genome using a PacBio RS sequencer, and we successfully obtained a complete genome sequence from one scaffold without any gaps, and we also applied Illumina MiSeq short reads to enhance quality. Compared with the previous BEST195 draft genome and Marburg 168 genome, we found that incomplete regions in the previous genome sequence were attributed to GC-bias and repetitive sequences, and we also identified some novel genes that are found only in the new genome.


July 19, 2019

The extant World War 1 dysentery bacillus NCTC1: a genomic analysis.

Shigellosis (previously bacillary dysentery) was the primary diarrhoeal disease of World War 1, but outbreaks still occur in military operations, and shigellosis causes hundreds of thousands of deaths per year in developing nations. We aimed to generate a high-quality reference genome of the historical Shigella flexneri isolate NCTC1 and to examine the isolate for resistance to antimicrobials. In this genomic analysis, we sequenced the oldest extant Shigella flexneri serotype 2a isolate using single-molecule real-time (SMRT) sequencing technology. Isolated from a soldier with dysentery from the British forces fighting on the Western Front in World War 1, this bacterium, NCTC1, was the first isolate accessioned into the National Collection of Type Cultures. We created a reference sequence for NCTC1, investigated the isolate for antimicrobial resistance, and undertook comparative genetics with S flexneri reference strains isolated during the 100 years since World War 1. We discovered that NCTC1 belonged to a 2a lineage of S flexneri, with which it shares common characteristics and a large core genome. NCTC1 was resistant to penicillin and erythromycin, and contained a complement of chromosomal antimicrobial resistance genes similar to that of more recent isolates. Genomic islands gained in the S flexneri 2a lineage over time were predominately associated with additional antimicrobial resistances, virulence, and serotype conversion. This S flexneri 2a lineage is a well adapted pathogen that has continued to respond to selective pressures. We have created a valuable historical benchmark for shigellae in the form of a high-quality reference sequence for a publicly available isolate. The Wellcome Trust. Copyright © 2014 Baker et al. Open Access article distributed under the terms of CC BY. Published by Elsevier Ltd. All rights reserved.


July 19, 2019

Resolving the complexity of the human genome using single-molecule sequencing.

The human genome is arguably the most complete mammalian reference assembly, yet more than 160 euchromatic gaps remain and aspects of its structural variation remain poorly understood ten years after its completion. To identify missing sequence and genetic variation, here we sequence and analyse a haploid human genome (CHM1) using single-molecule, real-time DNA sequencing. We close or extend 55% of the remaining interstitial gaps in the human GRCh37 reference genome–78% of which carried long runs of degenerate short tandem repeats, often several kilobases in length, embedded within (G+C)-rich genomic regions. We resolve the complete sequence of 26,079 euchromatic structural variants at the base-pair level, including inversions, complex insertions and long tracts of tandem repeats. Most have not been previously reported, with the greatest increases in sensitivity occurring for events less than 5 kilobases in size. Compared to the human reference, we find a significant insertional bias (3:1) in regions corresponding to complex insertions and long short tandem repeats. Our results suggest a greater complexity of the human genome in the form of variation of longer and more complex repetitive DNA that can now be largely resolved with the application of this longer-read sequencing technology.


July 19, 2019

Evolution of mosquito preference for humans linked to an odorant receptor.

Female mosquitoes are major vectors of human disease and the most dangerous are those that preferentially bite humans. A ‘domestic’ form of the mosquito Aedes aegypti has evolved to specialize in biting humans and is the main worldwide vector of dengue, yellow fever, and chikungunya viruses. The domestic form coexists with an ancestral, ‘forest’ form that prefers to bite non-human animals and is found along the coast of Kenya. We collected the two forms, established laboratory colonies, and document striking divergence in preference for human versus non-human animal odour. We further show that the evolution of preference for human odour in domestic mosquitoes is tightly linked to increases in the expression and ligand-sensitivity of the odorant receptor AaegOr4, which we found recognizes a compound present at high levels in human odour. Our results provide a rare example of a gene contributing to behavioural evolution and provide insight into how disease-vectoring mosquitoes came to specialize on humans.


July 19, 2019

Comparative genome analysis of Wolbachia strain wAu

BACKGROUND: Wolbachia intracellular bacteria can manipulate the reproduction of their arthropod hosts, including inducing sterility between populations known as cytoplasmic incompatibility (CI). Certain strains have been identified that are unable to induce or rescue CI, including wAu from Drosophila. Genome sequencing and comparison with CI-inducing related strain wMel was undertaken in order to better understand the molecular basis of the phenotype. RESULTS: Although the genomes were broadly similar, several rearrangements were identified, particularly in the prophage regions. Many orthologous genes contained single nucleotide polymorphisms (SNPs) between the two strains, but a subset containing major differences that would likely cause inactivation in wAu were identified, including the absence of the wMel ortholog of a gene recently identified as a CI candidate in a proteomic study. The comparative analyses also focused on a family of transcriptional regulator genes implicated in CI in previous work, and revealed numerous differences between the strains, including those that would have major effects on predicted function. CONCLUSIONS: The study provides support for existing candidates and novel genes that may be involved in CI, and provides a basis for further functional studies to examine the molecular basis of the phenotype.

