To further our understanding of the genetic etiology of autism, we generated and analyzed genome sequence data from 516 idiopathic autism families (2,064 individuals). This resource includes >59 million single-nucleotide variants (SNVs) and 9,212 private copy number variants (CNVs), of which 133,992 and 88 are de novo mutations (DNMs), respectively. We estimate a mutation rate of ~1.5 × 10⁻⁸ SNVs per site per generation with a significantly higher mutation rate in repetitive DNA. Comparing probands and unaffected siblings, we observe several DNM trends. Probands carry more gene-disruptive CNVs and SNVs, resulting in severe missense mutations and mapping to predicted fetal brain promoters…
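As a rough illustration of what a "per site per generation" rate means, the sketch below divides a count of de novo SNVs by the number of haploid sites at which they could have been observed. The function name, its parameters, and the example numbers are all hypothetical; they are not the study's pipeline or data, and a real estimate would also correct for callability and false-negative rates.

```python
def per_site_mutation_rate(n_dnms, n_offspring, callable_sites_haploid):
    """Estimate de novo SNVs per site per generation.

    Each offspring receives two haploid genomes (one per parent), so the
    denominator is 2 * n_offspring * callable_sites_haploid.
    All inputs are illustrative assumptions, not values from the study.
    """
    if n_offspring <= 0 or callable_sites_haploid <= 0:
        raise ValueError("n_offspring and callable_sites_haploid must be positive")
    return n_dnms / (2 * n_offspring * callable_sites_haploid)


# Hypothetical example: 80 de novo SNVs in one child over 2.7e9 callable sites.
example_rate = per_site_mutation_rate(80, 1, 2.7e9)
print(f"{example_rate:.2e} SNVs per site per generation")
```

With these made-up inputs the estimate lands near the order of magnitude quoted in the abstract, which is all the example is meant to show.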
Xenoturbella is a group of marine benthic animals lacking an anus and a centralized nervous system. Molecular phylogenetic analyses group the animal together with the Acoelomorpha, forming the Xenacoelomorpha. This group has been suggested to be either a sister group to the Nephrozoa or a deuterostome, and therefore it may provide important insights into origins of bilaterian traits such as an anus, the nephron, feeding larvae and centralized nervous systems. However, only five Xenoturbella species have been reported and the evolutionary history of xenoturbellids and Xenacoelomorpha remains obscure. Here we describe a new Xenoturbella species from the western Pacific Ocean, and…
In the basic fitness landscape metaphor for molecular evolution, evolutionary pathways are presumed to follow uphill steps of increasing fitness. How evolution can cross fitness valleys is an open question. One possibility is that environmental changes alter the fitness landscape such that low-fitness sequences reside on a hill in alternate environments. We experimentally test this hypothesis on the antibiotic resistance gene TEM-15 β-lactamase by comparing four evolutionary strategies shaped by environmental changes. The strategy that included initial steps of selecting for low antibiotic resistance (negative selection) produced superior alleles compared with the other three strategies. We comprehensively examined possible evolutionary…
Sequencing plant genomes is often challenging because of their complex architecture and high content of repetitive sequences. Sugarcane has one of the most complex genomes. It is highly polyploid, preserves intact homeologous chromosomes from its parental species and contains >55% repetitive sequences. Although bacterial artificial chromosome (BAC) libraries have emerged as an alternative for accessing the sugarcane genome, sequencing individual clones is laborious and expensive. Here, we present a strategy for sequencing and assembling reads produced from the DNA of pooled BAC clones. A set of 178 BAC clones, randomly sampled from the SP80-3280 sugarcane BAC library, was pooled and…
The characterization of the ABO blood group status is vital for blood transfusion and solid organ transplantation. Several methods for the molecular characterization of the ABO gene, which encodes the alleles that give rise to the different ABO blood groups, have been described. However, the application of those methods has so far been restricted to selected samples and has not been extended to population-scale analysis. We describe a cost-effective method for high-throughput genotyping of the ABO system by next-generation sequencing. Sample-specific barcodes and sequencing adaptors are introduced during PCR, rendering the products suitable for direct sequencing on Illumina MiSeq or…
Although human LINE-1 (L1) elements are actively mobilized in many cancers, a role for somatic L1 retrotransposition in tumor initiation has not been conclusively demonstrated. Here, we identify a novel somatic L1 insertion in the APC tumor suppressor gene that provided us with a unique opportunity to determine whether such insertions can actually initiate colorectal cancer (CRC), and if so, how this might occur. Our data support a model whereby a hot L1 source element on Chromosome 17 of the patient’s genome evaded somatic repression in normal colon tissues and thereby initiated CRC by mutating the APC gene. This insertion…
Transcription factors regulate their target genes by binding to regulatory regions in the genome. Although the binding preferences of TP53 are known, it remains unclear what distinguishes functional enhancers from nonfunctional binding. In addition, the genome is scattered with recognition sequences that remain unoccupied. Using two complementary techniques of multiplex enhancer-reporter assays, we discovered that functional enhancers could be discriminated from nonfunctional binding events by the occurrence of a single TP53 canonical motif. By combining machine learning with a meta-analysis of TP53 ChIP-seq data sets, we identified a core set of more than 1000 responsive enhancers in the human genome.…
The blaIMP-14 carbapenem resistance gene has previously been observed mainly in Pseudomonas aeruginosa and Acinetobacter spp. As part of global surveillance and sequencing of carbapenem-resistant E. coli, we identified an ST131 strain harboring blaIMP-14 within a class 1 integron, itself nested within a ~54 kb multi-drug resistance region on an epidemic IncA/C2 plasmid. The emergence of blaIMP-14 in this context in the ST131 lineage is of potential clinical concern. Copyright © 2016 Stoesser et al.
Microsatellites are DNA sequences consisting of repeated, short (1-6 bp) sequence motifs that are highly mutable by enzymatic slippage during replication. Due to their high intrinsic variability, microsatellites have important applications in population genetics, forensics, genome mapping, as well as cancer diagnostics and prognosis. The current analytical standard for microsatellites is based on length scoring by high-precision electrophoresis, but, owing to their increasing efficiency, next-generation sequencing techniques may provide a viable alternative. Here, we evaluated single-molecule real-time (SMRT) sequencing, implemented in the PacBio series of sequencing apparatuses, as a means of microsatellite length scoring. To this end we…
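To make "length scoring" from sequence reads concrete, the following minimal sketch counts the longest uninterrupted run of a repeat motif in each read and tallies the resulting allele lengths. It is an illustration only: the function names and example reads are hypothetical, exact motif matching is assumed, and it is not the scoring procedure used in the study (real reads would need handling of sequencing errors and flanking-sequence anchoring).

```python
import re
from collections import Counter


def longest_repeat_run(read, motif):
    """Return the number of motif copies in the longest perfect tandem run."""
    pattern = re.compile("(?:%s)+" % re.escape(motif))
    best = 0
    for match in pattern.finditer(read):
        best = max(best, len(match.group(0)) // len(motif))
    return best


def score_alleles(reads, motif):
    """Tally repeat counts across reads (one count per read)."""
    return Counter(longest_repeat_run(read, motif) for read in reads)


# Hypothetical reads covering a (CA)n microsatellite.
reads = ["GGCACACACATT", "ACACACAG", "CACA"]
print(score_alleles(reads, "CA"))
```

In practice the most frequent one or two counts across many reads would be taken as the candidate alleles, which is the sequencing analogue of reading peaks from an electropherogram.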
Sequential expression of outer membrane protein antigenic variants is an evolutionarily convergent mechanism used by bacterial pathogens to escape host immune clearance and establish persistent infection. Variants must be sufficiently structurally distinct to escape existing immune effectors yet retain core structural elements required for localization and function within the outer membrane. We examined this balance using Anaplasma marginale, which generates antigenic variants in the outer membrane protein Msp2 using gene conversion. The overwhelming majority of Msp2 variants expressed during long-term persistent infection are mosaics, derived by recombination of oligonucleotide segments from multiple alleles to form unique hypervariable regions (HVR). As…
Gaucher disease (GD) is a genetic disease caused by mutations in the GBA1 gene which result in reduced enzymatic activity of β-glucocerebrosidase (GCase). This study identified the progranulin (PGRN) gene (GRN) as another gene associated with GD. Serum levels of PGRN were measured from 115 GD patients and 99 healthy controls, the whole GRN gene from 40 GD patients was sequenced, and the genotyping of 4 SNPs identified in GD patients was performed in 161 GD and 142 healthy control samples. Development of GD in PGRN-deficient mice was characterized, and the therapeutic effect of rPGRN on GD analyzed. Serum PGRN levels were significantly…
ST8/SCCmecIV community-associated methicillin-resistant Staphylococcus aureus (CA-MRSA) has been a common threat, with large USA300 epidemics in the United States. The global geographical structure of ST8/SCCmecIV has not yet been fully elucidated. We herein determined the complete circular genome sequence of ST8/SCCmecIVc strain OC8 from Siberian Russia. We found that 36.0% of the genome was inverted relative to USA300. Two oppositely oriented IS256 copies at IS256-enriched hot spots were implicated in the one-megabase genomic inversion (MbIN) and the νSaβ split. The behavior of IS256 was flexible: its insertion site (att) sequences on the genome and junction sequences of extrachromosomal circular DNA were all…
The widely distributed marine cyanobacterium Synechococcus is thought to exert an influence on the marine silicon (Si) cycle through its high cellular Si relative to organic content. There are few measurements of Si in natural populations of Synechococcus, however, and the degree to which Synechococcus from various oligotrophic field sites and depths accumulate the element is unknown. We used synchrotron x-ray fluorescence to measure Si quotas in individual Synechococcus cells collected during three cruises in the western North Atlantic Ocean in the summer and fall, focusing on cells from the surface mixed layer (SML;
An ammonia-oxidizing bacterium, strain D1FHS, was enriched into pure culture from a sediment sample retrieved in Jiaozhou Bay, a hyper-eutrophic semi-closed water body hosting the metropolitan area of Qingdao, China. Based on initial 16S rRNA gene sequence analysis, strain D1FHS was classified in the genus Nitrosococcus, family Chromatiaceae, order Chromatiales, class Gammaproteobacteria; the 16S rRNA gene sequence with highest level of identity to that of D1FHS was obtained from Nitrosococcus halophilus Nc4ᵀ. The average nucleotide identity between the genomes of strain D1FHS and N. halophilus strain Nc4 is 89.5%. Known species in the genus Nitrosococcus are obligate aerobic chemolithotrophic ammonia-oxidizing…
Multilocus sequence typing (MLST) has become the preferred method for genotyping many biological species. It can be used to identify major phylogenetic clades, molecular groups, or subpopulations of a species, as well as individual strains or clones. However, conventional MLST is costly and time-consuming, which limits its power for genotyping large numbers of samples. Here, we describe a new MLST method that uses next-generation sequencing, a multiplexing protocol, and appropriate analytical software to provide accurate, rapid, and economical MLST genotyping of 96 or more isolates in a single assay.
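The final genotyping step of MLST in general can be sketched as a table lookup: each isolate's allele numbers at a fixed set of loci form a profile, and the profile maps to a sequence type (ST). The scheme below, including its locus names, allele numbers, and ST labels, is entirely hypothetical and is not the software or scheme described in the abstract.

```python
def call_sequence_type(allele_calls, scheme):
    """Map per-locus allele numbers to a sequence type, or None if unknown.

    allele_calls: dict of locus name -> allele number for one isolate.
    scheme: dict with "loci" (ordered locus names) and "profiles"
            (tuple of allele numbers -> sequence type label).
    """
    missing = [locus for locus in scheme["loci"] if locus not in allele_calls]
    if missing:
        raise KeyError("no allele call for loci: " + ", ".join(missing))
    profile = tuple(allele_calls[locus] for locus in scheme["loci"])
    return scheme["profiles"].get(profile)


# Hypothetical three-locus scheme with made-up profiles.
toy_scheme = {
    "loci": ["locusA", "locusB", "locusC"],
    "profiles": {(1, 1, 1): "ST1", (1, 2, 1): "ST2"},
}

isolates = {
    "isolate_01": {"locusA": 1, "locusB": 2, "locusC": 1},
    "isolate_02": {"locusA": 3, "locusB": 1, "locusC": 1},
}
for name, calls in isolates.items():
    print(name, call_sequence_type(calls, toy_scheme))
```

A profile that returns None here would correspond to a novel allele combination, which in a real scheme would be submitted for a new ST assignment rather than discarded.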