Copy number variation (CNV) contributes to disease and has restructured the genomes of great apes. The diversity and rate of this process, however, have not been extensively explored among great ape lineages. We analyzed 97 deeply sequenced great ape and human genomes and estimate 16% (469 Mb) of the hominid genome has been affected by recent CNV. We identify a comprehensive set of fixed gene deletions (n = 340) and duplications (n = 405) as well as >13.5 Mb of sequence that has been specifically lost on the human lineage. We compared the diversity and rates of copy number and single nucleotide variation across the hominid phylogeny. We find that CNV diversity partially correlates with single nucleotide diversity (r² = 0.5) and recapitulates the phylogeny of apes with few exceptions. Duplications significantly outpace deletions (2.8-fold). The load of segregating duplications remains significantly higher in bonobos, Western chimpanzees, and Sumatran orangutans, populations that have experienced recent genetic bottlenecks (P = 0.0014, 0.02, and 0.0088, respectively). The rate of fixed deletion has been more clocklike with the exception of the chimpanzee lineage, where we observe a twofold increase in the chimpanzee-bonobo ancestor (P = 4.79 × 10⁻⁹) and increased deletion load among Western chimpanzees (P = 0.002). The latter includes the first genomic disorder in a chimpanzee with features resembling Smith-Magenis syndrome mediated by a chimpanzee-specific increase in segmental duplication complexity. We hypothesize that demographic effects, such as bottlenecks, have contributed to larger and more gene-rich segments being deleted in the chimpanzee lineage and that this effect, more generally, may account for episodic bursts in CNV during hominid evolution.
The field of nonhuman primate genomics is undergoing rapid change and making impressive progress. Exploiting new technologies for DNA sequencing, researchers have generated new whole-genome sequence assemblies for multiple primate species over the past 6 years. In addition, investigations of within-species genetic variation, gene expression and RNA sequences, conservation of non-protein-coding regions of the genome, and other aspects of comparative genomics are moving at an accelerating speed. This progress is opening a wide array of new research opportunities in the analysis of comparative primate genome content and evolution. It also creates new possibilities for the use of nonhuman primates as model organisms in biomedical research. This transition, based on both new technology and the new information being generated in regard to human genetics, provides an important justification for reevaluating the research goals, strategies, and study designs used in primate genetics and genomics.
Feasibility of real time next generation sequencing of cancer genes linked to drug response: results from a clinical trial.
The successes of targeted drugs with companion predictive biomarkers and the technological advances in gene sequencing have generated enthusiasm for evaluating personalized cancer medicine strategies using genomic profiling. We assessed the feasibility of incorporating real-time analysis of somatic mutations within exons of 19 genes into patient management. Blood, tumor biopsy and archived tumor samples were collected from 50 patients recruited from four cancer centers. Samples were analyzed using three technologies: targeted exon sequencing using Pacific Biosciences PacBio RS, multiplex somatic mutation genotyping using Sequenom MassARRAY and Sanger sequencing. An expert panel reviewed results prior to reporting to clinicians. A clinical laboratory verified actionable mutations. Fifty patients were recruited. Nineteen actionable mutations were identified in 16 (32%) patients. Across technologies, results were in agreement in 100% of biopsy specimens and 95% of archival specimens. Profiling results from paired archival/biopsy specimens were concordant in 30/34 (88%) patients. We demonstrated that the use of next generation sequencing for real-time genomic profiling in advanced cancer patients is feasible. Additionally, actionable mutations identified in this study were relatively stable between archival and biopsy samples, implying that cancer mutations that are good predictors of drug response may remain constant across clinical stages. Copyright © 2012 UICC.
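The paired-specimen concordance reported above (30/34, about 88%) is a simple per-patient comparison. The sketch below shows one way such a figure could be computed, assuming each specimen's profile is reduced to a set of mutation labels. The function name `concordance`, the patient IDs, and the mutation labels are all hypothetical illustrations, not data or methods from the study.

```python
from typing import Dict, FrozenSet, Tuple

Profile = Dict[str, FrozenSet[str]]  # patient ID -> set of mutation labels


def concordance(archival: Profile, biopsy: Profile) -> Tuple[int, int, float]:
    """Count patients whose archival and biopsy profiles match exactly.

    Only patients present in both dictionaries (paired specimens) are compared.
    Returns (number concordant, number of pairs, fraction concordant).
    """
    paired = archival.keys() & biopsy.keys()
    if not paired:
        raise ValueError("no paired specimens to compare")
    matches = sum(1 for patient in paired if archival[patient] == biopsy[patient])
    return matches, len(paired), matches / len(paired)


# Toy, made-up profiles purely for illustration.
toy_archival = {
    "P1": frozenset({"GENE_A:mut1"}),
    "P2": frozenset(),
    "P3": frozenset({"GENE_B:mut2"}),
}
toy_biopsy = {
    "P1": frozenset({"GENE_A:mut1"}),
    "P2": frozenset(),
    "P3": frozenset(),  # mutation not seen in the biopsy: discordant
    "P4": frozenset(),  # no archival specimen, so not a pair
}

n_match, n_pairs, fraction = concordance(toy_archival, toy_biopsy)
```

With this definition, a pair is concordant only when both specimens yield exactly the same set of calls; a looser criterion (for example, agreement on actionable mutations only) would be an equally plausible reading of the abstract.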
Mutation in the C-di-AMP cyclase dacA affects fitness and resistance of methicillin resistant Staphylococcus aureus.
Faster growing and more virulent strains of methicillin resistant Staphylococcus aureus (MRSA) are increasingly displacing highly resistant MRSA. Elevated fitness in these MRSA is often accompanied by decreased and heterogeneous levels of methicillin resistance; however, the mechanisms for this phenomenon are not yet fully understood. Whole genome sequencing was used to investigate the genetic basis of this apparent correlation, in an isogenic MRSA strain pair that differed in methicillin resistance levels and fitness, with respect to growth rate. Sequencing revealed only one single nucleotide polymorphism (SNP) in the diadenylate cyclase gene dacA in the faster growing but less resistant strain. Diadenylate cyclases were recently discovered to synthesize the new second messenger cyclic diadenosine monophosphate (c-di-AMP). Introduction of this mutation into the highly resistant but slower growing strain reduced resistance and increased its growth rate, suggesting a direct connection between the dacA mutation and the phenotypic differences of these strains. Quantification of cellular c-di-AMP revealed that the dacA mutation decreased c-di-AMP levels, resulting in reduced autolysis, increased salt tolerance and a reduction in the basal expression of the cell wall stress stimulon. These results indicate that c-di-AMP affects cell envelope-related signalling in S. aureus. The influence of c-di-AMP on growth rate and methicillin resistance in MRSA indicates that altering c-di-AMP levels could be a mechanism by which MRSA strains can increase their fitness levels by reducing their methicillin resistance levels.
Tiny photosynthetic microorganisms that form the picoplankton (between 0.3 and 3 µm in diameter) are at the base of the food web in many marine ecosystems, and their adaptability to environmental change hinges on standing genetic variation. Although the genomic and phenotypic diversity of the bacterial component of the oceans has been intensively studied, little is known about the genomic and phenotypic diversity within each of the diverse eukaryotic species present. We report the level of genomic diversity in a natural population of Ostreococcus tauri (Chlorophyta, Mamiellophyceae), the smallest photosynthetic eukaryote. Contrary to the expectations of clonal evolution or cryptic species, the spectrum of genomic polymorphism observed suggests a large panmictic population (an effective population size of 1.2 × 10⁷) with pervasive evidence of sexual reproduction. De novo assemblies of low-coverage chromosomes reveal two large candidate mating-type loci with suppressed recombination, whose origin may pre-date the speciation events in the class Mamiellophyceae. This high genetic diversity is associated with large phenotypic differences between strains. Strikingly, resistance of isolates to large double-stranded DNA viruses, which abound in their natural environment, is positively correlated with the size of a single hypervariable chromosome, which contains 44 to 156 kb of strain-specific sequences. Our findings highlight the role of viruses in shaping genome diversity in marine picoeukaryotes.
Next-generation sequencing is radically changing how DNA diagnostic laboratories operate. What started as a single-gene profession is now developing into gene panel sequencing and whole-exome and whole-genome sequencing (WES/WGS) analyses. With further advances in sequencing technology and concomitant price reductions, WGS will soon become the standard and be routinely offered. Here, we focus on the critical steps involved in performing WGS, with a particular emphasis on points where WGS differs from WES, the important variables that should be taken into account, and the quality control measures that can be taken to monitor the process. The points discussed here, combined with recent publications on guidelines for reporting variants, will facilitate the routine implementation of WGS into a diagnostic setting. © 2017 Wiley Periodicals, Inc.
The identification of genetic variation with next-generation sequencing is confounded by the complexity of the human genome sequence and by biases that arise during library preparation, sequencing and analysis. We have developed a set of synthetic DNA standards, termed ‘sequins’, that emulate human genetic features and constitute qualitative and quantitative spike-in controls for genome sequencing. Sequencing reads derived from sequins align exclusively to an artificial in silico reference chromosome, rather than the human reference genome, which allows them to be partitioned for parallel analysis. Here we use this approach to represent common and clinically relevant genetic variation, ranging from single nucleotide variants to large structural rearrangements and copy-number variation. We validate the design and performance of sequin standards by comparison to examples in the NA12878 reference genome, and we demonstrate their utility during the detection and quantification of variants. We provide sequins as a standardized, quantitative resource against which human genetic variation can be measured and diagnostic performance assessed.
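The partitioning of sequin-derived reads described above amounts to splitting alignments by the reference sequence they map to. The following is a minimal sketch of that idea, assuming alignments are already available as (read name, reference name) pairs and that the in silico chromosome is called `chrIS`. Both that name and the helper `partition_reads` are hypothetical and are not taken from the sequins software.

```python
from typing import Iterable, List, Tuple

# Hypothetical name for the artificial in silico reference chromosome.
SEQUIN_CHROM = "chrIS"

Alignment = Tuple[str, str]  # (read name, reference sequence name)


def partition_reads(
    alignments: Iterable[Alignment],
    sequin_chrom: str = SEQUIN_CHROM,
) -> Tuple[List[Alignment], List[Alignment]]:
    """Split alignments into (sequin reads, sample reads) by reference name."""
    sequin_reads: List[Alignment] = []
    sample_reads: List[Alignment] = []
    for alignment in alignments:
        _, ref_name = alignment
        if ref_name == sequin_chrom:
            sequin_reads.append(alignment)
        else:
            sample_reads.append(alignment)
    return sequin_reads, sample_reads


# Toy alignments purely for illustration.
toy_alignments = [
    ("read1", "chr1"),
    ("read2", "chrIS"),
    ("read3", "chr7"),
    ("read4", "chrIS"),
]
sequin_reads, sample_reads = partition_reads(toy_alignments)
```

Because the two sets are disjoint by construction, the control reads can be analyzed in parallel with the sample reads and then compared against the known sequin design.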
Despite rapid advances in sequencing technologies, accurately calling genetic variants present in an individual genome from billions of short, errorful sequence reads remains challenging. Here we show that a deep convolutional neural network can call genetic variation in aligned next-generation sequencing read data by learning statistical relationships between images of read pileups around putative variant and true genotype calls. The approach, called DeepVariant, outperforms existing state-of-the-art tools. The learned model generalizes across genome builds and mammalian species, allowing nonhuman sequencing projects to benefit from the wealth of human ground-truth data. We further show that DeepVariant can learn to call variants in a variety of sequencing technologies and experimental designs, including deep whole genomes from 10X Genomics and Ion Ampliseq exomes, highlighting the benefits of using more automated and generalizable techniques for variant calling.