The newer hierarchical genome assembly process (HGAP) performs de novo assembly using data from a single PacBio long insert library. To assess the benefits of this method, DNA from several Salmonella enterica serovars was isolated from a pure culture, and genome sequencing was performed using Pacific Biosciences RS sequencing technology. The HGAP process enabled us to close sixteen Salmonella enterica subsp. enterica genomes and their associated mobile elements. The ten serotypes represented are Salmonella enterica subsp. enterica serovar Enteritidis (S. Enteritidis), S. Bareilly, S. Heidelberg, S. Cubana, S. Javiana, S. Typhimurium, S. Newport, S. Montevideo, S. Agona, and S. Tennessee. In addition, we were able to detect novel methyltransferases (MTases) by using the Pacific Biosciences kinetic score distributions, which showed that each serovar appears to have a novel methylation pattern. For example, while all Salmonella serovars examined so far have methylase activity specific for 5′-GATC-3′/3′-CTAG-5′ and 5′-CAGAG-3′/3′-GTCTC-5′ (underlined base indicates a modification), among the samples examined so far S. Heidelberg alone has methylase activity specific for 5′-ACCANCC-3′/3′-TGGTNGG-5′, and S. Typhimurium alone has methylase activity specific for 5′-GATCAG-3′/3′-CTAGTC-5′ sites. We believe that this may be due to the unique environments and phages to which these serotypes have been exposed. Furthermore, our analysis identified and closed a variety of plasmids, such as mobilization plasmids, antimicrobial resistance plasmids and IncX plasmids carrying a Type IV secretion system (T4SS). The VirB/D4 T4SS apparatus is important in that it assists with rapid dissemination of antibiotic resistance and virulence determinants. Presently, only limited information exists regarding the genotypic characterization of drug resistance in S. Heidelberg isolates derived from various host species. Here, we characterize two S. Heidelberg isolates from two different outbreaks.
Both isolates contain an IncX plasmid of approximately 35 kb that carries the genes virB1, virB2, virB3/4, virB5, virB6, virB7, virB8, virB9, virB10, virB11, virD2, and virD4, which are associated with the T4SS. In addition, the outbreak isolate associated with ground turkey carries a 4,473 bp mobilization plasmid and an incompatibility group (Inc) I1 antimicrobial resistance plasmid encoding resistance to gentamicin (aacC2), beta-lactam (bl2b_tem), streptomycin (aadAI) and tetracycline (tetA, tetR), while the outbreak isolate associated with chicken breast carries an IncI1 plasmid encoding resistance to gentamicin (aacC2), streptomycin (aadAI) and sulfisoxazole (sul1). Using this new technology, we explored the genetic elements present in resistant pathogens, which will help achieve a better understanding of the evolution of Salmonella.
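The double-stranded motif notation used in this abstract (for example 5′-ACCANCC-3′/3′-TGGTNGG-5′) can be illustrated with a minimal Python sketch. It is not part of the HGAP or kinetic-analysis software. The function names `partner_strand` and `find_sites` and the example sequence are hypothetical, and N is assumed to stand for any of A, C, G or T.

```python
import re

# Base-for-base complement table; N (any base) pairs with N.
COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def partner_strand(motif):
    """Return the complementary strand written 3'->5', base for base.

    This matches the notation 5'-GATC-3'/3'-CTAG-5', where the second
    string is the complement without reversal.
    """
    return motif.translate(COMPLEMENT)


def find_sites(sequence, motif):
    """Return 0-based start positions of `motif` on the given strand.

    N in the motif matches any of A, C, G, T. A lookahead is used so
    that overlapping occurrences are also reported.
    """
    pattern = "(?=" + motif.replace("N", "[ACGT]") + ")"
    return [m.start() for m in re.finditer(pattern, sequence)]


if __name__ == "__main__":
    for motif in ("GATC", "CAGAG", "ACCANCC", "GATCAG"):
        print(f"5'-{motif}-3' / 3'-{partner_strand(motif)}-5'")
    # Hypothetical example sequence, not taken from any genome.
    print(find_sites("TTACCATCCGATC", "ACCANCC"))
```

The sketch only locates candidate motif occurrences on one strand; whether a site is actually methylated is determined from the sequencing kinetics, which this example does not model.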
Haplotyping of full-length transcript reads from long-read sequencing can reveal allelic imbalances in isoform expression
The Pacific Biosciences Iso-Seq method, which can produce high-quality isoform sequences of 10 kb and longer, has been used to annotate many important plant and animal genomes. Here, we develop an algorithm called IsoPhase that postprocesses Iso-Seq data to retrieve allele-specific isoform information. Using simulated data, we show that for both diploid and tetraploid genomes, IsoPhase results in good SNP recovery with low FDR at error rates consistent with CCS reads. We apply IsoPhase to a haplotype-resolved genome assembly and a multiple-fetal-tissue Iso-Seq dataset from an F1 cross of Angus x Brahman cattle subspecies. IsoPhase-called haplotypes were validated by the phased assembly and demonstrate the potential for revealing allelic imbalances in isoform expression.
PAG Conference: Using cattle subspecies crosses to explore chromosome of origin expression through Iso-seq analysis
In this PAG 2018 presentation, John Williams of the University of Adelaide presents research on using PacBio SMRT Sequencing to explore the genetic origins of cattle subspecies, Angus (Bos taurus taurus)…
Background: Assemblies of diploid genomes are generally unphased, pseudo-haploid representations that do not correctly reconstruct the two parental haplotypes present in the individual sequenced. Instead, the assembly alternates between parental haplotypes and may contain duplications in regions where the parental haplotypes are sufficiently different. Trio binning is an approach to genome assembly that uses short reads from both parents to classify long reads from the offspring according to maternal or paternal haplotype origin, and is thus helped rather than impeded by heterozygosity. Using this approach, it is possible to derive two assemblies from an individual, accurately representing both parental contributions in their entirety with higher continuity and accuracy than is possible with other methods. Results: We used trio binning to assemble reference genomes for two species from a single individual using an interspecies cross of yak (Bos grunniens) and cattle (Bos taurus). The high heterozygosity inherent to interspecies hybrids allowed us to confidently assign >99% of long reads from the F1 offspring to parental bins using unique k-mers from parental short reads. Both the maternal (yak) and paternal (cattle) assemblies contain over one third of the acrocentric chromosomes, including the two largest chromosomes, in single haplotigs. Conclusions: These haplotigs are the first vertebrate chromosome arms to be assembled gap-free and fully phased, and the first time assemblies for two species have been created from a single individual. Both assemblies are the most continuous currently available for non-model vertebrates. Abbreviations: Mb, megabases; kb, kilobases; MYA, millions of years ago; MHC, major histocompatibility complex; SMRT, single molecule real time.
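The read-classification idea behind trio binning, described above, can be sketched in a few lines of Python. This is a toy illustration under stated assumptions, not the authors' implementation. The function names are hypothetical, reverse complements and sequencing errors are ignored, and a read is assigned by a raw count of parent-specific k-mers, whereas a production tool would typically normalise those counts and use much larger k.

```python
def kmers(seq, k):
    """Return the set of all k-mers in a sequence."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def parent_specific_kmers(maternal_reads, paternal_reads, k):
    """Return (maternal-only, paternal-only) k-mer sets from parental short reads."""
    mat = set().union(*(kmers(r, k) for r in maternal_reads))
    pat = set().union(*(kmers(r, k) for r in paternal_reads))
    return mat - pat, pat - mat


def bin_read(read, mat_only, pat_only, k):
    """Assign an offspring long read to a parental bin.

    Simplifying assumption: the bin with more parent-specific k-mer hits
    wins; ties and reads without informative k-mers stay unassigned.
    """
    read_kmers = kmers(read, k)
    mat_hits = len(read_kmers & mat_only)
    pat_hits = len(read_kmers & pat_only)
    if mat_hits > pat_hits:
        return "maternal"
    if pat_hits > mat_hits:
        return "paternal"
    return "unassigned"


if __name__ == "__main__":
    # Toy sequences that differ at a single position (hypothetical data).
    mat_only, pat_only = parent_specific_kmers(["ACGTACGTAA"], ["ACGTTCGTAA"], k=4)
    for read in ("GTACGT", "GTTCGT", "ACGT"):
        print(read, bin_read(read, mat_only, pat_only, k=4))
```

The example also shows why heterozygosity helps this approach: the more the parental sequences differ, the more parent-specific k-mers exist and the fewer reads are left unassigned.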
New technologies and analysis methods are enabling genomic structural variants (SVs) to be detected with ever-increasing accuracy, resolution, and comprehensiveness. Translating these methods to routine research and clinical practice requires robust benchmark sets. We developed the first benchmark set for identification of both false negative and false positive germline SVs, which complements recent efforts emphasizing increasingly comprehensive characterization of SVs. To create this benchmark for a broadly consented son in a Personal Genome Project trio with broadly available cells and DNA, the Genome in a Bottle (GIAB) Consortium integrated 19 sequence-resolved variant calling methods, both alignment- and de novo assembly-based, from short-, linked-, and long-read sequencing, as well as optical and electronic mapping. The final benchmark set contains 12,745 isolated, sequence-resolved insertion and deletion calls ≥50 base pairs (bp) discovered by at least 2 technologies or 5 callsets, genotyped as heterozygous or homozygous variants by long reads. The Tier 1 benchmark regions, for which any extra calls are putative false positives, cover 2.66 Gbp and 9,641 SVs supported by at least one diploid assembly. Support for SVs was assessed using svviz with short-, linked-, and long-read sequence data. In general, there was strong support from multiple technologies for the benchmark SVs, with 90% of the Tier 1 SVs having support in reads from more than one technology. The Mendelian genotype error rate was 0.3%, and genotype concordance with manual curation was >98.7%. We demonstrate the utility of the benchmark set by showing it reliably identifies both false negatives and false positives in high-quality SV callsets from short-, linked-, and long-read sequencing and optical mapping.
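A Mendelian genotype error rate such as the one reported above can be computed along the following lines. This is a minimal sketch and not the GIAB pipeline. It assumes diploid autosomal genotypes encoded as pairs of allele indices (0 for reference, 1 for the variant), and the function names and toy genotypes are hypothetical.

```python
def is_mendelian_consistent(child, mother, father):
    """True if the child could inherit one allele from each parent.

    Genotypes are pairs of allele indices, e.g. (0, 1) for heterozygous.
    Sex chromosomes and de novo mutations are deliberately not modelled.
    """
    a, b = child
    return (a in mother and b in father) or (b in mother and a in father)


def mendelian_error_rate(trios):
    """Fraction of (child, mother, father) genotype triples that are inconsistent."""
    trios = list(trios)
    if not trios:
        raise ValueError("at least one trio genotype is required")
    errors = sum(not is_mendelian_consistent(c, m, f) for c, m, f in trios)
    return errors / len(trios)


if __name__ == "__main__":
    # Hypothetical genotypes at four variant sites: (child, mother, father).
    sites = [
        ((0, 1), (0, 0), (1, 1)),  # consistent
        ((1, 1), (0, 0), (0, 1)),  # inconsistent: mother carries no variant allele
        ((0, 0), (0, 1), (0, 1)),  # consistent
        ((1, 1), (1, 1), (0, 1)),  # consistent
    ]
    print(mendelian_error_rate(sites))
```

A low rate on real data suggests that genotypes are mostly accurate, although some errors (for example a heterozygous site called homozygous in all three individuals) remain invisible to this check.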
Tigecycline is one of the last-resort antibiotics to treat complicated infections caused by both multidrug-resistant Gram-negative and Gram-positive bacteria [1]. Tigecycline resistance has sporadically occurred in recent years, primarily due to chromosome-encoded mechanisms, such as overexpression of efflux pumps and ribosome protection [2,3]. Here, we report the emergence of the plasmid-mediated mobile tigecycline resistance mechanism Tet(X4) in Escherichia coli isolates from China, which is capable of degrading all tetracyclines, including tigecycline and the newly US FDA-approved eravacycline. The tet(X4)-harbouring IncQ1 plasmid is highly transferable, and can be successfully mobilized and stabilized in recipient clinical and laboratory strains of Enterobacteriaceae bacteria. It is noteworthy that tet(X4)-positive E. coli strains, including isolates co-harbouring mcr-1, have been widely detected in pigs, chickens, soil and dust samples in China. In vivo murine models demonstrated that the presence of Tet(X4) led to tigecycline treatment failure. Consequently, the emergence of plasmid-mediated Tet(X4) challenges the clinical efficacy of the entire family of tetracycline antibiotics. Importantly, our study raises concern that plasmid-mediated tigecycline resistance may further spread into various ecological niches and into clinical high-risk pathogens. Collective efforts are urgently needed to preserve the potency of these essential antibiotics.
Haplotype-resolved genome assemblies are important for understanding how combinations of variants impact phenotypes. These assemblies can be created in various ways, such as use of tissues that contain single-haplotype (haploid) genomes, or by co-sequencing of parental genomes, but these approaches can be impractical in many situations. We present FALCON-Phase, which integrates long-read sequencing data and ultra-long-range Hi-C chromatin interaction data of a diploid individual to create high-quality, phased diploid genome assemblies. The method was evaluated by application to three datasets, including human, cattle, and zebra finch, for which high-quality, fully haplotype resolved assemblies were available for benchmarking. Phasing algorithm accuracy was affected by heterozygosity of the individual sequenced, with higher accuracy for cattle and zebra finch (>97%) compared to human (82%). In addition, scaffolding with the same Hi-C chromatin contact data resulted in phased chromosome-scale scaffolds.
Microbial diversity in the tick Argas japonicus (Acari: Argasidae) with a focus on Rickettsia pathogens.
The soft tick Argas japonicus mainly infests birds and can cause human dermatitis; however, no pathogen has been identified from this tick species in China. In the present study, the microbiota in A. japonicus collected from an epidemic community was explored, and some putative Rickettsia pathogens were further characterized. The results obtained indicated that bacteria in A. japonicus were mainly ascribed to the phyla Proteobacteria, Firmicutes and Actinobacteria. At the genus level, the male A. japonicus harboured more diverse bacteria than the females and nymphs. The bacteria Alcaligenes, Pseudomonas, Rickettsia and Staphylococcus were common in nymphs and adults. The abundance of bacteria belonging to the Rickettsia genus in females and males was 7.27% and 10.42%, respectively. Furthermore, the 16S rRNA gene of Rickettsia was amplified and sequenced, and phylogenetic analysis revealed that 13 sequences were clustered with the spotted fever group rickettsiae (Rickettsia heilongjiangensis and Rickettsia japonica) and three were clustered with Rickettsia limoniae, which suggested that the characterized Rickettsia in A. japonicus were novel putative pathogens and also that the residents were at considerable risk for infection by tick-borne pathogens. © 2019 The Royal Entomological Society.
Complete genome sequence of Bacillus velezensis JT3-1, a microbial germicide isolated from yak feces
Bacillus velezensis JT3-1 is a probiotic strain isolated from feces of the domestic yak (Bos grunniens) in the Gansu province of China. It has strong antagonistic activity against Listeria monocytogenes, Staphylococcus aureus, Escherichia coli, Salmonella Typhimurium, Mannheimia haemolytica, Staphylococcus hominis, Clostridium perfringens, and Mycoplasma bovis. These properties have made the JT3-1 strain the focus of commercial interest. In this study, we describe the complete genome sequence of JT3-1, with a genome size of 3,929,799 bp, 3761 encoded genes and an average GC content of 46.50%. Whole genome sequencing of Bacillus velezensis JT3-1 will lay a good foundation for elucidation of the mechanisms of its antimicrobial activity, and for its future application.
Genome sequence analysis of 91 Salmonella Enteritidis isolates from mice caught on poultry farms in the mid 1990s.
A total of 91 draft genome sequences were used to analyze isolates of Salmonella enterica serovar Enteritidis obtained from feral mice caught on poultry farms in Pennsylvania. One objective was to find mutations disrupting open reading frames (ORFs) and another was to determine if ORF-disruptive mutations were present in isolates obtained from other sources. A total of 83 mice were obtained between 1995 and 1998. Isolates separated into two genomic clades and 12 subgroups due to 742 mutations. Nineteen ORF-disruptive mutations were found, and in addition, bigA had exceptional heterogeneity requiring additional evaluation. The TRAMS algorithm detected only 6 ORF disruptions. The sefD mutation was the most frequently encountered mutation and it was prevalent in human, poultry, environmental and mouse isolates. These results confirm previous assessments of the mouse as a rich source of Salmonella enterica serovar Enteritidis that varies in genotype and phenotype. Copyright © 2019. Published by Elsevier Inc.
Evolution and global transmission of a multidrug-resistant, community-associated MRSA lineage from the Indian subcontinent
The evolution and global transmission of antimicrobial resistance has been well documented in Gram-negative bacteria and healthcare-associated epidemic pathogens, often emerging from regions with heavy antimicrobial use. However, the degree to which similar processes occur with Gram-positive bacteria in the community setting is less well understood. Here, we trace the recent origins and global spread of a multidrug-resistant, community-associated Staphylococcus aureus lineage from the Indian subcontinent, the Bengal Bay clone (ST772). We generated whole genome sequence data of 340 isolates from 14 countries, including the first isolates from Bangladesh and India, to reconstruct the evolutionary history and genomic epidemiology of the lineage. Our data show that the clone emerged on the Indian subcontinent in the early 1970s and disseminated rapidly in the 1990s. Short-term outbreaks in community and healthcare settings occurred following intercontinental transmission, typically associated with travel and family contacts on the subcontinent, but ongoing endemic transmission was uncommon. Acquisition of a multidrug resistance integrated plasmid was instrumental in the divergence of a single dominant and globally disseminated clade in the early 1990s. Phenotypic data on biofilm, growth and toxicity point to antimicrobial resistance as the driving force in the evolution of ST772. The Bengal Bay clone therefore combines the multidrug resistance of traditional healthcare-associated clones with the epidemiological transmission of community-associated MRSA. Our study demonstrates the importance of whole genome sequencing for tracking the evolution of emerging and resistant pathogens. It provides a critical framework for ongoing surveillance of the clone on the Indian subcontinent and elsewhere. Importance: The Bengal Bay clone (ST772) is a community-acquired and multidrug-resistant Staphylococcus aureus lineage first isolated from Bangladesh and India in 2004.
In this study, we show that the Bengal Bay clone emerged from a virulent progenitor circulating on the Indian subcontinent. Its subsequent global transmission was associated with travel or family contact in the region. ST772 progressively acquired specific resistance elements at limited cost to its fitness and continues to be exported globally resulting in small-scale community and healthcare outbreaks. The Bengal Bay clone therefore combines the virulence potential and epidemiology of community-associated clones with the multidrug-resistance of healthcare-associated S. aureus lineages. This study demonstrates the importance of whole genome sequencing for the surveillance of highly antibiotic resistant pathogens, which may emerge in the community setting of regions with poor antibiotic stewardship and rapidly spread into hospitals and communities across the world.
Draft Genome Sequence of Streptomyces sp. Strain RKND-216, an Antibiotic Producer Isolated from Marine Sediment in Prince Edward Island, Canada.
Streptomyces sp. strain RKND-216 was isolated from marine sediment collected in Prince Edward Island, Canada, and produces a putatively novel bioactive natural product with antitubercular activity. The genome assembly consists of two contigs covering 5.61 Mb. Genome annotation identified 4,618 predicted protein-coding sequences and 19 predicted natural product biosynthetic gene clusters. Copyright © 2019 Liang et al.
Whole-Genome Sequencing of a Brucella melitensis Strain (BMWS93) Isolated from a Bank Clerk and Exhibiting Complete Resistance to Rifampin.
Human brucellosis has become the most severe public health problem in the Ulanqab region of Inner Mongolia, China. Brucella melitensis BMWS93 was obtained from a blood sample taken from a bank clerk in the Ulanqab region of Inner Mongolia, China, and antimicrobial susceptibility testing in vitro showed no zone of inhibition, which confirmed resistance to rifampin. Therefore, whole-genome sequencing of this isolate was performed to better understand the mechanism of this resistance. Copyright © 2019 Liu et al.
The Genome Sequence of the Halobacterium salinarum Type Strain Is Closely Related to That of Laboratory Strains NRC-1 and R1.
High-coverage long-read sequencing of the Halobacterium salinarum type strain (91-R6) revealed a 2.17-Mb chromosome and two large plasmids (148 and 102 kb). Population heterogeneity and long repeats were observed. Strain 91-R6 and laboratory strain R1 showed 99.63% sequence identity in common chromosomal regions and only 38 strain-specific segments. This information resolves the previously uncertain relationship between type and laboratory strains. Copyright © 2019 Pfeiffer et al.
Genomic and Functional Analysis of Emerging Virulent and Multidrug-Resistant Escherichia coli Lineage Sequence Type 648.
The pathogenic extended-spectrum-beta-lactamase (ESBL)-producing Escherichia coli lineage ST648 is increasingly reported from multiple origins. Our study of a large and global ST648 collection from various hosts (87 whole-genome sequences) combining core and accessory genomics with functional analyses and in vivo experiments suggests that ST648 is a nascent and generalist lineage, lacking clear phylogeographic and host association signals. By including large numbers of ST131 (n = 107) and ST10 (n = 96) strains for comparative genomics and phenotypic analysis, we demonstrate that the combination of multidrug resistance and high-level virulence is the hallmark of ST648, similar to international high-risk clonal lineage ST131. Specifically, our in silico, in vitro, and in vivo results demonstrate that ST648 is well equipped with biofilm-associated features, while ST131 shows sophisticated signatures indicative of adaption to urinary tract infection, potentially conveying individual ecological niche adaptation. In addition, we used a recently developed NFDS (negative frequency-dependent selection) population model, which suggests that ST648 will increase significantly in frequency as a cause of bacteremia within the next few years. Also, ESBL plasmids impacting biofilm formation aided in shaping and maintaining ST648 strains to successfully emerge worldwide across different ecologies. Our study contributes to understanding what factors drive the evolution and spread of emerging international high-risk clonal lineages. Copyright © 2019 American Society for Microbiology.