Plasmodium knowlesi, a common parasite of macaques, is recognised as a significant cause of human malaria in Malaysia. The P. knowlesi A1-H.1 line has been adapted to continuous culture in human erythrocytes, successfully providing an in vitro model to study the parasite. We have assembled a reference genome for the PkA1-H.1 line using PacBio long-read combined with Illumina short-read sequence data. Compared with the H-strain reference, the new reference has improved genome coverage and a novel description of methylation sites. The PkA1-H.1 reference will enhance the capabilities of the in vitro model to improve the understanding of P.…
Bacteria use quorum sensing (QS) to regulate gene expression. We identified a group A Streptococcus (GAS) strain possessing the QS system sil, which produces functional bacteriocins through a sequential signaling pathway integrating host and bacterial signals. Host cells infected by GAS release asparagine (ASN), which is sensed by the bacteria to alter their gene expression and rate of proliferation. We show that upon ASN sensing, GAS upregulates expression of the QS autoinducer peptide SilCR. Initial SilCR expression activates the autoinduction cycle for further SilCR production. The autoinduction process propagates throughout the GAS population, resulting in bacteriocin production. Subcutaneous co-injection of…
Orientia tsutsugamushi is a clinically important but neglected obligate intracellular bacterial pathogen of the Rickettsiaceae family that causes the potentially life-threatening human disease scrub typhus. In contrast to the genome reduction seen in many obligate intracellular bacteria, early genetic studies of Orientia have revealed one of the most repetitive bacterial genomes sequenced to date. The dramatic expansion of mobile elements has hampered efforts to generate complete genome sequences using short-read sequencing methodologies, and consequently there have been few studies of the comparative genomics of this neglected species. We report new high-quality genomes of O. tsutsugamushi, generated using PacBio single molecule…
Type 1 pili (T1P) are major virulence factors for uropathogenic Escherichia coli (UPEC), which cause both acute and recurrent urinary tract infections. T1P expression therefore is of direct relevance for disease. T1P are phase variable (both piliated and nonpiliated bacteria exist in a clonal population) and are controlled by an invertible DNA switch (fimS), which contains the promoter for the fim operon encoding T1P. Inversion of fimS is stochastic but may be biased by environmental conditions and other signals that ultimately converge at fimS itself. Previous studies of fimS sequences important for T1P phase variation have focused on laboratory-adapted E.…
DNA methylation is an epigenetic modification of the genome involved in regulating crucial cellular processes, including transcription and chromosome stability. Advances in PacBio sequencing technologies can be used to robustly reveal methylation sites. The methylome of the Mycobacterium tuberculosis complex is poorly understood but may be involved in virulence, hypoxic survival and the emergence of drug resistance. In the most extensive study to date, we characterise the methylome across the 4 major lineages of M. tuberculosis and 2 lineages of M. africanum, the leading causes of tuberculosis disease in humans. We reveal lineage-specific methylated motifs and strain-specific mutations that are…
Malaria infection during pregnancy, caused by the sequestering of Plasmodium falciparum parasites in the placenta, leads to high infant mortality and maternal morbidity. The parasite-placenta adherence mechanism is mediated by the VAR2CSA protein, a target for naturally occurring immunity. Currently, vaccine development is based on its ID1-DBL2Xb domain; however, little is known about the global genetic diversity of the encoding var2csa gene, which could influence vaccine efficacy. In a comprehensive analysis of the var2csa gene in >2,000 P. falciparum field isolates across 23 countries, we found that var2csa is duplicated in high prevalence (>25%), African and Oceanian populations harbour a much…
Escherichia coli represents the primary etiological agent responsible for urinary tract infections, one of the most common infections in humans. We report here the complete genome sequence of uropathogenic Escherichia coli strain CI5, a clinical pyelonephritis isolate used for studying pathogenesis. Copyright © 2015 Mehershahi et al.
Streptococcus agalactiae (group B Streptococcus) is a common commensal strain in the human gastrointestinal tract that can also cause invasive disease in humans and other animals. We report here the complete genome sequence of S. agalactiae SG-M1, a serotype III, multilocus sequence type 283 strain, isolated from a Singaporean patient suffering from meningitis. Copyright © 2015 Mehershahi et al.
Escherichia coli is the most well-studied bacterium and a common colonizer of the lower mammalian gastrointestinal tract. We report here the complete genome sequence of the original Escherichia coli isolate, strain NCTC86, which was described by Theodor Escherich, for whom the genus is named. Copyright © 2017 Khetrapal et al.
Escherichia coli is the most common bacterium causing urinary tract infections in humans. We report here the complete genome sequence of the uropathogenic Escherichia coli strain NU14, a clinical pyelonephritis isolate used for studying pathogenesis. Copyright © 2017 Mehershahi and Chen.
Streptococcus agalactiae (group B Streptococcus [GBS]) has not been described as a foodborne pathogen. However, in 2015, a large outbreak of severe invasive sequence type (ST) 283 GBS infections in adults epidemiologically linked to the consumption of raw freshwater fish occurred in Singapore. We attempted to determine the scale of the outbreak, define the clinical spectrum of disease, and link the outbreak to contaminated fish. Time-series analysis was performed on microbiology laboratory data. Food handlers and fishmongers were screened for enteric carriage of GBS. A retrospective cohort study was conducted to assess differences in demographic and clinical characteristics of patients with…
The mycalesine butterfly Bicyclus anynana, the ‘Squinting bush brown’, is a model organism in the study of lepidopteran ecology, development and evolution. Here, we present a draft genome sequence for B. anynana to serve as a genomics resource for current and future studies of this important model species. Seven libraries with insert sizes ranging from 350 bp to 20 kb were constructed using DNA from an inbred female and sequenced using both Illumina and PacBio technology. 128 Gb raw Illumina data were filtered to 124 Gb and assembled to a final size of 475 Mb (~260X assembly coverage). Contigs were…
Complete and accurate genome assembly and annotation is a crucial foundation for comparative and functional genomics. Despite this, few complete eukaryotic genomes are available, and genome annotation remains a major challenge. Here, we present a complete genome assembly of the skin commensal yeast Malassezia sympodialis and demonstrate how proteogenomics can substantially improve gene annotation. Through long-read DNA sequencing, we obtained a gap-free genome assembly for M. sympodialis (ATCC 42132), comprising eight nuclear and one mitochondrial chromosome. We also sequenced and assembled four M. sympodialis clinical isolates, and showed their value for understanding Malassezia reproduction by confirming four alternative allele combinations…
Viral populations are complex, dynamic, and fast evolving. The evolution of groups of closely related viruses in a competitive environment is termed quasispecies. To fully understand the role that quasispecies play in viral evolution, characterizing the trajectories of viral genotypes in an evolving population is key. In particular, long-range haplotype information for thousands of individual viruses is critical; yet generating this information is non-trivial. Popular deep sequencing methods generate relatively short reads that do not preserve linkage information, while third-generation sequencing methods have higher error rates that make detection of low-frequency mutations a bioinformatics challenge. Here we…
The assembly of large, repeat-rich eukaryotic genomes represents a significant challenge in genomics. While long-read technologies have made the high-quality assembly of small, microbial genomes increasingly feasible, data generation can be expensive for larger genomes. OPERA-LG is a scalable, exact algorithm for the scaffold assembly of large, repeat-rich genomes, outperforming state-of-the-art programs for scaffold correctness and contiguity. It provides a rigorous framework for scaffolding of repetitive sequences and a systematic approach for combining data from different second-generation and third-generation sequencing technologies. OPERA-LG provides an avenue for systematic augmentation and improvement of thousands of existing draft eukaryotic genome assemblies.