To start Day 1 of the PacBio User Group Meeting, Jonas Korlach, PacBio CSO, provides an update on the latest releases and performance metrics for the Sequel II System. The longest reads generated on this system with the SMRT Cell 8M now go beyond 175,000 bases, while maintaining extremely high accuracy. HiFi mode, for example, uses circular consensus sequencing to achieve accuracy of Q40 or even Q50.
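The Q40 and Q50 figures above are most naturally read as Phred-scaled quality values, where Q = -10 · log10(P_error). Under that assumption, Q40 corresponds to one expected error per 10,000 bases and Q50 to one per 100,000. The short Python sketch below performs the conversion in both directions; the function names are illustrative and are not part of any PacBio software.

```python
import math


def phred_to_error_probability(q: float) -> float:
    """Convert a Phred-scaled quality value Q to an error probability."""
    return 10 ** (-q / 10)


def error_probability_to_phred(p: float) -> float:
    """Convert an error probability (0 < p <= 1) to a Phred-scaled quality value."""
    return -10 * math.log10(p)


if __name__ == "__main__":
    # Q20 is included only as a familiar point of comparison.
    for q in (20, 40, 50):
        p = phred_to_error_probability(q)
        print(f"Q{q}: error probability {p:.0e}, accuracy {100 * (1 - p):.3f}%")
```

Each step of 10 in Q divides the expected error rate by ten, which is why the move from Q40 to Q50 is a tenfold improvement rather than a 25% one.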
Although the accuracy of the human reference genome is critical for basic and clinical research, structural variants (SVs) have been difficult to assess because data capable of resolving them have been limited. To address potential bias, we sequenced a diversity panel of nine human genomes to high depth using long-read, single-molecule, real-time sequencing data. Systematically identifying and merging SVs ≥50 bp in length for these nine and one public genome yielded 83,909 sequence-resolved insertions, deletions, and inversions. Among these, 2,839 (2.0 Mbp) are shared among all discovery genomes with an additional 13,349 (6.9 Mbp) present in the majority of humans,…
Our studies reveal that the oral colonizer and cause of infective endocarditis Streptococcus oralis subsp. dentisani displays a striking monolateral distribution of surface fibrils. Furthermore, our data suggest that these fibrils impact the structure of adherent bacterial chains. Mutagenesis studies indicate that these fibrils are dependent on three serine-rich repeat proteins (SRRPs), here named fibril-associated protein A (FapA), FapB, and FapC, and that each SRRP forms a different fibril with a distinct distribution. SRRPs are a family of bacterial adhesins that have diverse roles in adhesion and that can bind to different receptors through modular nonrepeat region domains. Amino acid…
Current diagnostic testing for genetic disorders involves serial use of specialized assays spanning multiple technologies. In principle, genome sequencing (GS) can detect all genomic pathogenic variant types on a single platform. Here we evaluate copy-number variant (CNV) calling as part of a clinically accredited GS test. We performed analytical validation of CNV calling on 17 reference samples, compared the sensitivity of GS-based variants with those from a clinical microarray, and set a bound on precision using orthogonal technologies. We developed a protocol for family-based analysis of GS-based CNV calls, and deployed this across a clinical cohort of 79 rare and undiagnosed…
Structural variants (SVs) in human genomes are implicated in a variety of human diseases. Long-read sequencing delivers much longer read lengths than short-read sequencing and may greatly improve SV detection. However, due to the relatively high cost of long-read sequencing, it is unclear what coverage is needed and how to optimally use the aligners and SV callers. In this study, we developed NextSV, a meta-caller to perform SV calling from low coverage long-read sequencing data. NextSV integrates three aligners and three SV callers and generates two integrated call sets (sensitive/stringent) for different analysis purposes. We evaluated SV calling performance of NextSV…
Escherichia coli NCM3722 is a prototrophic K-12 strain with robust physiologic phenotypes. We report the complete 4,678,045-bp chromosome and 67,545-bp F-like plasmid of this unique model organism. Copyright © 2015 Brown and Jun.
Amoebae are unicellular eukaryotes that consume microbial prey through phagocytosis, playing a role in shaping microbial foodwebs. Many amoebal species can be cultivated axenically in rich media or monoxenically with single bacterial prey species. Here we characterize heterolobosean amoeba LPG3, a recent natural isolate, which is unable to grow on unicellular cyanobacteria, its primary food source, in the absence of a heterotrophic bacterium, a Pseudomonas species coisolate. To investigate the molecular basis of this requirement for heterotrophic bacteria, we performed a screen using a defined non-redundant transposon library of Vibrio cholerae which implicated genes in corrinoid uptake and biosynthesis. Furthermore,…
Aliphatic compounds on plant surfaces, called epicuticular waxes, are the first line of defense against pathogens and pests, contribute to reducing water loss and determine other important phenotypes. Aliphatics can form crystals affecting light refraction, resulting in a color change and allowing identification of mutants in their synthesis or transport. The present study discloses three such Eceriferum (cer) genes in barley – Cer-c, Cer-q and Cer-u – known to be tightly linked and functioning in a biochemical pathway forming dominating amounts of β-diketone and hydroxy-β-diketones plus some esterified alkan-2-ols. These aliphatics are present in many Triticeae as well as dicotyledons…
Strains of the species Komagataella phaffii are the most frequently used “Pichia pastoris” strains employed for recombinant protein production as well as studies on peroxisome biogenesis, autophagy and secretory pathway analyses. Genome sequencing of several different P. pastoris strains has provided the foundation for understanding these cellular functions in recent genomics, transcriptomics and proteomics experiments. This experimentation has identified mistakes, gaps and incorrectly annotated open reading frames in the previously published draft genome sequences. Here, a refined reference genome is presented, generated with genome and transcriptome sequencing data from multiple P. pastoris strains. Twelve major sequence gaps from 20 to…
DNA methylation in prokaryotes is widespread. The most common modification of the genome is the methylation of adenine at the N-6 position. In Escherichia coli K-12 and many gammaproteobacteria, this modification is catalyzed by DNA adenine methyltransferase (Dam) at the GATC consensus sequence and is known to modulate cellular processes including transcriptional regulation of gene expression, initiation of chromosomal replication, and DNA mismatch repair. While studies thus far have focused on the motifs associated with methylated adenine (meA), the frequency of meA across the genome, and temporal dynamics during early periods of incubation, here we conduct the first study on…