We characterized 170 complete genome assemblies from clinical Bordetella pertussis isolates representing geographic and temporal diversity in the United States. These data capture genotypic shifts, including increased pertactin deficiency, occurring amid the current pertussis disease resurgence and provide a foundation for needed research to direct future public health control strategies.
The discovery of mutations associated with human genetic disease is an exercise in comparative genomics (see Glossary). Although there are many different strategies and approaches, the central premise is that affected persons harbor a significant excess of pathogenic DNA variants as compared with a group of unaffected persons (controls) that is either clinically defined¹ or established by surveying large swaths of the general population.² The more exclusive the variant is to the disease, the greater its penetrance, the larger its effect size, and the more relevant it becomes to both disease diagnosis and future therapeutic investigation. The most popular approach used by researchers in human genetics is the case–control design, but there are others that can be used to track variants and disease in a family context or that consider the probability of different classes of mutations based on evolutionary patterns of divergence or de novo mutational change.³,⁴ Although the approaches may be straightforward, the discovery of pathogenic variation and its mechanism of action often is less trivial, and decades of research can be required in order to identify the variants underlying both mendelian and complex genetic traits.
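The central premise of the case–control design, an excess of variant carriers among affected persons, can be illustrated with a one-sided Fisher's exact test. The following is a minimal sketch under stated assumptions, not the method of any particular study: the function name `fisher_one_sided`, its parameters, and the carrier counts in the usage lines are all hypothetical and chosen only for illustration.

```python
from math import comb


def fisher_one_sided(case_carriers, case_total, control_carriers, control_total):
    """One-sided Fisher's exact test for an excess of carriers among cases.

    Returns the probability, under the null hypothesis that carrier status
    is independent of disease status, of seeing at least `case_carriers`
    carriers among the cases (hypergeometric upper tail).
    """
    total = case_total + control_total
    carriers = case_carriers + control_carriers
    # Number of ways to choose which individuals are the cases.
    denom = comb(total, case_total)
    # Cases cannot contain more carriers than exist, or than there are cases.
    upper = min(carriers, case_total)
    p_value = 0.0
    for k in range(case_carriers, upper + 1):
        ways = comb(carriers, k) * comb(total - carriers, case_total - k)
        p_value += ways / denom
    return p_value


# Hypothetical counts: 12 carriers among 1000 cases, 2 among 1000 controls.
p = fisher_one_sided(12, 1000, 2, 1000)
print(f"one-sided p-value: {p:.4g}")
```

A small p-value here indicates only statistical enrichment in cases; penetrance and effect size, as discussed above, require separate estimation.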
The incomplete identification of structural variants (SVs) from whole-genome sequencing data limits studies of human genetic diversity and disease association. Here, we apply a suite of long-read, short-read, strand-specific sequencing technologies, optical mapping, and variant discovery algorithms to comprehensively analyze three trios to define the full spectrum of human genetic variation in a haplotype-resolved manner. We identify 818,054 indel variants (<50 bp) and 27,622 SVs (≥50 bp) per genome. We also discover 156 inversions per genome and 58 of the inversions intersect with the critical regions of recurrent microdeletion and microduplication syndromes. Taken together, our SV callsets represent a three- to sevenfold increase in SV detection compared to most standard high-throughput sequencing studies, including those from the 1000 Genomes Project. The methods and the dataset presented serve as a gold standard for the scientific community allowing us to make recommendations for maximizing structural variation sensitivity for future genome sequencing studies.
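The size convention used above (indels shorter than 50 bp, SVs of 50 bp or more) can be sketched as a simple classifier over reference and alternate alleles. This is an illustrative sketch only: the names `SV_MIN_LENGTH` and `classify_variant` are hypothetical, and the length-difference rule covers insertions and deletions but not balanced events such as inversions, which need breakpoint or orientation evidence instead.

```python
# Assumed threshold taken from the size convention in the text: >= 50 bp is an SV.
SV_MIN_LENGTH = 50


def classify_variant(ref, alt):
    """Classify an insertion/deletion by the length difference of its alleles.

    Returns "substitution" when the alleles have equal length, "indel" when
    the length difference is below SV_MIN_LENGTH, and "SV" otherwise.
    Balanced rearrangements (e.g. inversions) are not detected by this rule.
    """
    size = abs(len(ref) - len(alt))
    if size == 0:
        return "substitution"
    return "indel" if size < SV_MIN_LENGTH else "SV"


# Hypothetical alleles for illustration.
print(classify_variant("A", "ATTT"))          # 3 bp insertion
print(classify_variant("A" + "C" * 75, "A"))  # 75 bp deletion
```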
A Pathovar of Xanthomonas oryzae Infecting Wild Grasses Provides Insight Into the Evolution of Pathogenicity in Rice Agroecosystems
Xanthomonas oryzae (Xo) are critical rice pathogens. Virulent lineages from Africa and Asia and less virulent strains from the US have been well characterized. X. campestris pv. leersiae (Xcl), first described in 1957, causes bacterial streak on the perennial grass Leersia hexandra and is a close relative of Xo. L. hexandra, a member of the Poaceae, is highly similar to rice phylogenetically, is globally ubiquitous around rice paddies, and is a reservoir of pathogenic Xo. We used long-read, single-molecule, real-time (SMRT) genome sequences of five strains of Xcl from Burkina Faso, China, Mali and Uganda to determine the genetic relatedness of this organism with Xo. Novel Transcription Activator-Like Effectors (TALEs) were discovered in all five strains of Xcl. Predicted TALE target sequences were identified in the L. perrieri genome and compared to rice susceptibility gene homologs. Pathogenicity screening on L. hexandra and diverse rice cultivars confirmed that Xcl are able to colonize rice and produce weak but not progressive symptoms. Overall, based on average nucleotide identity, type III effector repertoires and disease phenotype, we propose to rename Xcl to X. oryzae pv. leersiae (Xol) and use this parallel system to improve understanding of the evolution of bacterial pathogenicity in rice agroecosystems.
Comparative genomics reveals structural and functional features specific to the genome of a foodborne Escherichia coli O157:H7.
Escherichia coli O157:H7 (O157) has been linked to numerous foodborne disease outbreaks. The ability to rapidly sequence and analyze genomes is important for understanding epidemiology, virulence, survival, and evolution of outbreak strains. In the current study, we performed comparative genomics to determine structural and functional features of the genome of a foodborne O157 isolate NADC 6564 and infer its evolutionary relationship to other O157 strains. The chromosome of NADC 6564 contained 5466 kb compared to reference strains Sakai (5498 kb) and EDL933 (5547 kb) and shared 41 of its 43 Linear Conserved Blocks (LCB) with the reference strains. However, 18 of 41 LCB had inverse orientation in NADC 6564 compared to the reference strains. NADC 6564 shared 18 of 19 bacteriophages with reference strains except that the chromosomal positioning of some of the phages differed among these strains. The additional phage (P19) of NADC 6564 was located on a 39-kb insertion element (IE) encoding several hypothetical proteins, an integrase, transposases, transcriptional regulators, an adhesin, and a phosphoethanolamine transferase (PEA). The complete homologs of the 39-kb IE were found in E. coli PCN061 of porcine origin. The IE-encoded PEA showed low homology (32-33%) to four other PEA in NADC 6564 and PEA linked to mobilizable colistin resistance in E. coli but was highly homologous (95%) to a PEA of uropathogenic, avian pathogenic, and enteroaggregative E. coli. NADC 6564 showed slightly higher minimum inhibitory concentration of colistin compared to the reference strains. The 39-kb IE also contained dndBCDE and dptFGH operons encoding DNA S-modification and a restriction pathway, linked to oxidative stress tolerance and self-defense against foreign DNA, respectively.
Evolutionary tree analysis grouped NADC 6564 with lineage I O157 strains. These results indicated that differential phage counts and different chromosomal positioning of many bacteriophages and genomic islands might have resulted in recombination events causing altered chromosomal organization in NADC 6564. Evolutionary analysis grouped NADC 6564 with lineage I strains and suggested its earlier divergence from these strains. The ability to perform S-DNA modification might affect tolerance of NADC 6564 to various stressors.