September 22, 2019

Glyphosate resistance and EPSPS gene duplication: Convergent evolution in multiple plant species.

One of the increasingly widespread mechanisms of resistance to the herbicide glyphosate is copy number variation (CNV) of the 5-enolpyruvylshikimate-3-phosphate synthase (EPSPS) gene. EPSPS gene duplication has been reported in eight weed species, ranging from 3-5 extra copies to more than 150 extra copies. In the case of Palmer amaranth (Amaranthus palmeri), a section of >300 kb containing EPSPS and many other genes has been replicated and inserted at new loci throughout the genome, resulting in a significant increase in total genome size. The replicated sequence contains several classes of mobile genetic elements including helitrons, raising the intriguing possibility of extra-chromosomal replication of the EPSPS-containing sequence. In kochia (Kochia scoparia), from three to more than 10 extra EPSPS copies are arranged as a tandem gene duplication at one locus. In the remaining six weed species that exhibit EPSPS gene duplication, little is known about the underlying mechanisms of gene duplication or their entire sequence. There is mounting evidence that adaptive gene amplification is an important mode of evolution in the face of intense human-mediated selection pressure. The convergent evolution of CNVs for glyphosate resistance in weeds, through at least two different mechanisms, may be indicative of a more general importance for this mechanism of adaptation in plants. CNVs warrant further investigation across plant functional genomics for adaptation to biotic and abiotic stresses, particularly for adaptive evolution on rapid time scales. © The American Genetic Association 2017. All rights reserved.


September 21, 2019

Mistranslation drives the evolution of robustness in TEM-1 β-lactamase.

How biological systems such as proteins achieve robustness to ubiquitous perturbations is a fundamental biological question. Such perturbations include errors that introduce phenotypic mutations into nascent proteins during the translation of mRNA. These errors are remarkably frequent. They are also costly, because they reduce protein stability and help create toxic misfolded proteins. Adaptive evolution might reduce these costs of protein mistranslation by two principal mechanisms. The first increases the accuracy of translation via synonymous “high fidelity” codons at especially sensitive sites. The second increases the robustness of proteins to phenotypic errors via amino acids that increase protein stability. To study how these mechanisms are exploited by populations evolving in the laboratory, we evolved the antibiotic resistance gene TEM-1 in Escherichia coli hosts with either normal or high rates of mistranslation. We analyzed TEM-1 populations that evolved under relaxed and stringent selection for antibiotic resistance by single molecule real-time sequencing. Under relaxed selection, mistranslating populations reduce mistranslation costs by reducing TEM-1 expression. Under stringent selection, they efficiently purge destabilizing amino acid changes. More importantly, they accumulate stabilizing amino acid changes rather than synonymous changes that increase translational accuracy. In the large populations we study, and on short evolutionary timescales, the path of least resistance in TEM-1 evolution consists of reducing the consequences of translation errors rather than the errors themselves.


September 21, 2019

Potato late blight field resistance from QTL dPI09c is conferred by the NB-LRR gene R8.

Following the often short-lived protection that major nucleotide binding, leucine-rich-repeat (NB-LRR) resistance genes offer against the potato pathogen Phytophthora infestans, field resistance was thought to provide a more durable alternative to prevent late blight disease. We previously identified the QTL dPI09c on potato chromosome 9 as a more durable field resistance source against late blight. Here, the resistance QTL was fine-mapped to a 186 kb region. The interval corresponds to a larger, 389 kb, genomic region in the potato reference genome of Solanum tuberosum Group Phureja doubled monoploid clone DM1-3 (DM) and from which functional NB-LRRs R8, R9a, Rpi-moc1, and Rpi_vnt1 have arisen independently in wild species. dRenSeq analysis of parental clones alongside resistant and susceptible bulks of the segregating population B3C1HP showed full sequence representation of R8. This was independently validated using long-range PCR and screening of a bespoke bacterial artificial chromosome library. The latter enabled a comparative analysis of the sequence variation in this locus in diverse Solanaceae. We reveal for the first time that broad spectrum and durable field resistance against P. infestans is conferred by the NB-LRR gene R8, which is thought to provide narrow spectrum race-specific resistance.


September 21, 2019

Nonhybrid, finished microbial genome assemblies from long-read SMRT sequencing data.

We present a hierarchical genome-assembly process (HGAP) for high-quality de novo microbial genome assemblies using only a single, long-insert shotgun DNA library in conjunction with Single Molecule, Real-Time (SMRT) DNA sequencing. Our method uses the longest reads as seeds to recruit all other reads for construction of highly accurate preassembled reads through a directed acyclic graph-based consensus procedure, which we follow with assembly using off-the-shelf long-read assemblers. In contrast to hybrid approaches, HGAP does not require highly accurate raw reads for error correction. We demonstrate efficient genome assembly for several microorganisms using as few as three SMRT Cell zero-mode waveguide arrays of sequencing and for BACs using just one SMRT Cell. Long repeat regions can be successfully resolved with this workflow. We also describe a consensus algorithm that incorporates SMRT sequencing primary quality values to produce de novo genome sequence exceeding 99.999% accuracy.
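As a worked illustration of the accuracy figure quoted above, the following minimal sketch converts a consensus accuracy into the Phred-scaled quality value (QV) commonly used to report assembly quality. The function names are hypothetical and are not part of HGAP; only the standard Phred relation QV = -10 * log10(error rate) is assumed.

```python
import math


def accuracy_to_phred(accuracy: float) -> float:
    """Convert a per-base accuracy (0 <= accuracy < 1) to a Phred-scaled QV."""
    if not 0.0 <= accuracy < 1.0:
        raise ValueError("accuracy must be in the range [0, 1)")
    error_rate = 1.0 - accuracy
    return -10.0 * math.log10(error_rate)


def phred_to_accuracy(qv: float) -> float:
    """Convert a Phred-scaled QV back to a per-base accuracy."""
    return 1.0 - 10.0 ** (-qv / 10.0)


def expected_errors(length_bp: int, accuracy: float) -> float:
    """Expected number of erroneous bases in a sequence of the given length."""
    return length_bp * (1.0 - accuracy)


if __name__ == "__main__":
    # 99.999% accuracy is the threshold quoted in the abstract.
    print(round(accuracy_to_phred(0.99999), 3))
    print(round(expected_errors(1_000_000, 0.99999), 3))
```

Under this relation, 99.999% accuracy corresponds to roughly QV50, or about 10 expected errors per megabase of consensus sequence.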


September 21, 2019

The advantages of SMRT sequencing.

Of the current next-generation sequencing technologies, SMRT sequencing is sometimes overlooked. However, attributes such as long reads, modified base detection and high accuracy make SMRT a useful technology and an ideal approach to the complete sequencing of small genomes.


September 21, 2019

Identification of a novel RASD1 somatic mutation in a USP8-mutated corticotroph adenoma.

Cushing’s disease (CD) is caused by pituitary corticotroph adenomas that secrete excess adrenocorticotropic hormone (ACTH). In these tumors, somatic mutations in the gene USP8 have been identified as recurrent and pathogenic and are the sole known molecular driver for CD. Although other somatic mutations were reported in these studies, their contribution to the pathogenesis of CD remains unexplored. No molecular drivers have been established for a large proportion of CD cases and tumor heterogeneity has not yet been investigated using genomics methods. Also, even in USP8-mutant tumors, a possibility may exist of additional contributing mutations, following a paradigm from other neoplasm types where multiple somatic alterations contribute to neoplastic transformation. The current study utilizes whole-exome discovery sequencing on the Illumina platform, followed by targeted amplicon-validation sequencing on the Pacific Biosciences platform, to interrogate the somatic mutation landscape in a corticotroph adenoma resected from a CD patient. In this USP8-mutated tumor, we identified an interesting somatic mutation in the gene RASD1, which is a component of the corticotropin-releasing hormone receptor signaling system. This finding may provide insight into a novel mechanism involving loss of feedback control to the corticotropin-releasing hormone receptor and subsequent deregulation of ACTH production in corticotroph tumors.


September 21, 2019

A Sequel to Sanger: amplicon sequencing that scales.

Although high-throughput sequencers (HTS) have largely displaced their Sanger counterparts, the short read lengths and high error rates of most platforms constrain their utility for amplicon sequencing. The present study tests the capacity of single molecule, real-time (SMRT) sequencing implemented on the SEQUEL platform to overcome these limitations, employing 658 bp amplicons of the mitochondrial cytochrome c oxidase I gene as a model system. By examining templates from more than 5000 species and 20,000 specimens, the performance of SMRT sequencing was tested with amplicons showing wide variation in GC composition and varied sequence attributes. SMRT and Sanger sequences were very similar, but SMRT sequencing provided more complete coverage, especially for amplicons with homopolymer tracts. Because it can characterize amplicon pools from 10,000 DNA extracts in a single run, the SEQUEL can greatly reduce sequencing costs in comparison to first (Sanger) and second generation platforms (Illumina, Ion). SMRT analysis generates high-fidelity sequences from amplicons with varying GC content and is resilient to homopolymer tracts. Analytical costs are low, substantially less than those for first or second generation sequencers. When implemented on the SEQUEL platform, SMRT analysis enables massive amplicon characterization because each instrument can recover sequences from more than 5 million DNA extracts a year.


September 21, 2019

Detecting AGG interruptions in females with a FMR1 premutation by long-read Single-Molecule Sequencing: A 1 year clinical experience.

Fragile X syndrome arises from the FMR1 CGG expansion of a premutation (55-200 repeats) to a full mutation allele (>200 repeats) and is the most frequent cause of inherited X-linked intellectual disability. The risk for a premutation to expand to a full mutation allele depends on the repeat length and on the AGG triplets interrupting this repeat. In genetic counseling it is important to have information on both these parameters to provide an accurate risk estimate to women who carry a premutation allele and are considering having children. For example, if the risk is small a woman might opt for a natural pregnancy followed by prenatal diagnosis, whereas she might choose preimplantation genetic diagnosis (PGD) if the risk is high. Unfortunately, the detection of AGG interruptions was previously hampered by technical difficulties that complicated its use in diagnostics. Therefore we recently developed, validated and implemented a new methodology which uses long-read single-molecule sequencing to identify AGG interruptions in females with an FMR1 premutation. Here we report on the assets of AGG interruption detection by sequencing and the impact of implementing the assay on genetic counseling.
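The two parameters named above, repeat length and AGG interruptions, can be read off a sequenced repeat tract with very little code. The sketch below is illustrative only and is not the validated clinical assay: the function name, the input format (a tract already trimmed to whole CGG/AGG triplets) and the label for alleles below 55 repeats are assumptions; the 55-200 and >200 thresholds are the ones quoted in the abstract.

```python
def summarize_fmr1_tract(tract: str) -> dict:
    """Summarize a repeat tract made only of CGG and AGG triplets.

    Returns the total triplet count, the 1-based positions of AGG
    interruptions, and a size category using the thresholds from the text.
    """
    tract = tract.upper()
    if len(tract) % 3 != 0:
        raise ValueError("tract length must be a multiple of 3")

    triplets = [tract[i:i + 3] for i in range(0, len(tract), 3)]
    unexpected = sorted(set(triplets) - {"CGG", "AGG"})
    if unexpected:
        raise ValueError(f"unexpected triplets in tract: {unexpected}")

    total = len(triplets)
    agg_positions = [i + 1 for i, t in enumerate(triplets) if t == "AGG"]

    if total > 200:
        category = "full mutation"
    elif total >= 55:
        category = "premutation"
    else:
        # The abstract does not subdivide alleles below 55 repeats.
        category = "below premutation range"

    return {
        "total_repeats": total,
        "agg_positions": agg_positions,
        "agg_count": len(agg_positions),
        "category": category,
    }


if __name__ == "__main__":
    # Hypothetical allele: 60 triplets with AGG at positions 10 and 20.
    example = "CGG" * 9 + "AGG" + "CGG" * 9 + "AGG" + "CGG" * 40
    print(summarize_fmr1_tract(example))
```

Translating these two values into an expansion risk requires published risk tables and is deliberately left out of the sketch.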


September 21, 2019

Repair of double-strand breaks induced by CRISPR-Cas9 leads to large deletions and complex rearrangements.

CRISPR-Cas9 is poised to become the gene editing tool of choice in clinical contexts. Thus far, exploration of Cas9-induced genetic alterations has been limited to the immediate vicinity of the target site and distal off-target sequences, leading to the conclusion that CRISPR-Cas9 was reasonably specific. Here we report significant on-target mutagenesis, such as large deletions and more complex genomic rearrangements at the targeted sites in mouse embryonic stem cells, mouse hematopoietic progenitors and a human differentiated cell line. Using long-read sequencing and long-range PCR genotyping, we show that DNA breaks introduced by single-guide RNA/Cas9 frequently resolved into deletions extending over many kilobases. Furthermore, lesions distal to the cut site and crossover events were identified. The observed genomic damage in mitotically active cells caused by CRISPR-Cas9 editing may have pathogenic consequences.


July 19, 2019

Differing patterns of selection and geospatial genetic diversity within two leading Plasmodium vivax candidate vaccine antigens.

Although Plasmodium vivax is a leading cause of malaria around the world, only a handful of vivax antigens are being studied for vaccine development. Here, we investigated genetic signatures of selection and geospatial genetic diversity of two leading vivax vaccine antigens: Plasmodium vivax merozoite surface protein 1 (pvmsp-1) and Plasmodium vivax circumsporozoite protein (pvcsp). Using scalable next-generation sequencing, we deep-sequenced amplicons of the 42 kDa region of pvmsp-1 (n = 44) and the complete gene of pvcsp (n = 47) from Cambodian isolates. These sequences were then compared with global parasite populations obtained from GenBank. Using a combination of statistical and phylogenetic methods to assess for selection and population structure, we found strong evidence of balancing selection in the 42 kDa region of pvmsp-1, which varied significantly over the length of the gene, consistent with immune-mediated selection. In pvcsp, the highly variable central repeat region also showed patterns consistent with immune selection, which were lacking outside the repeat. The patterns of selection seen in both genes differed from their P. falciparum orthologs. In addition, we found that, similar to merozoite antigens from P. falciparum malaria, genetic diversity of pvmsp-1 sequences showed no geographic clustering, while the non-merozoite antigen, pvcsp, showed strong geographic clustering. These findings suggest that while immune selection may act on both vivax vaccine candidate antigens, the geographic distribution of genetic variability differs greatly between these two genes. The selective forces driving this diversification could lead to antigen escape and vaccine failure. Better understanding the geographic distribution of genetic variability in vaccine candidate antigens will be key to designing and implementing efficacious vaccines.


July 19, 2019

A benchmark study on error assessment and quality control of CCS reads derived from the PacBio RS.

PacBio RS, a newly emerging third-generation DNA sequencing platform, is based on a real-time, single-molecule, nano-nitch sequencing technology that can generate very long reads (up to 20-kb) in contrast to the shorter reads produced by the first and second generation sequencing technologies. As a new platform, it is important to assess the sequencing error rate, as well as the quality control (QC) parameters associated with the PacBio sequence data. In this study, a mixture of 10 previously known, closely related DNA amplicons was sequenced using the PacBio RS sequencing platform. After aligning Circular Consensus Sequence (CCS) reads derived from the above sequencing experiment to the known reference sequences, we found that the median error rate was 2.5% without read QC, and improved to 1.3% with an SVM-based multi-parameter QC method. In addition, a de novo assembly was used as a downstream application to evaluate the effects of different QC approaches. This benchmark study indicates that even though CCS reads are post error-corrected, it is still necessary to perform appropriate QC on CCS reads in order to produce successful downstream bioinformatics analytical results.


July 19, 2019

Genome rearrangements and pervasive meiotic drive cause hybrid infertility in fission yeast.

Hybrid sterility is one of the earliest postzygotic isolating mechanisms to evolve between two recently diverged species. Here we identify causes underlying hybrid infertility of two recently diverged fission yeast species Schizosaccharomyces pombe and S. kambucha, which mate to form viable hybrid diploids that efficiently complete meiosis, but generate few viable gametes. We find that chromosomal rearrangements and related recombination defects are major but not sole causes of hybrid infertility. At least three distinct meiotic drive alleles, one on each S. kambucha chromosome, independently contribute to hybrid infertility by causing nonrandom spore death. Two of these driving loci are linked by a chromosomal translocation and thus constitute a novel type of paired meiotic drive complex. Our study reveals how quickly multiple barriers to fertility can arise. In addition, it provides further support for models in which genetic conflicts, such as those caused by meiotic drive alleles, can drive speciation. DOI: http://dx.doi.org/10.7554/eLife.02630.001. Copyright © 2014, Zanders et al.


July 19, 2019

Sequencing the unsequenceable: expanded CGG-repeat alleles of the fragile X gene.

The human fragile X mental retardation 1 (FMR1) gene contains a (CGG)n trinucleotide repeat in its 5′ untranslated region (5′ UTR). Expansions of this repeat result in a number of clinical disorders with distinct molecular pathologies, including fragile X syndrome (FXS; full mutation range, greater than 200 CGG repeats) and fragile X-associated tremor/ataxia syndrome (FXTAS; premutation range, 55-200 repeats). Study of these diseases has been limited by an inability to sequence expanded CGG repeats, particularly in the full mutation range, with existing DNA sequencing technologies. Single-molecule, real-time (SMRT) sequencing provides an approach to sequencing that is fundamentally different from other “next-generation” sequencing platforms, and is well suited for long, repetitive DNA sequences. We report the first sequence data for expanded CGG-repeat FMR1 alleles in the full mutation range that reveal the confounding effects of CGG-repeat tracts on both cloning and PCR. A unique feature of SMRT sequencing is its ability to yield real-time information on the rates of nucleoside addition by the tethered DNA polymerase; for the CGG-repeat alleles, we find a strand-specific effect of CGG-repeat DNA on the interpulse distance. This kinetic signature reveals a novel aspect of the repeat element; namely, that the particular G bias within the CGG/CCG-repeat element influences polymerase activity in a manner that extends beyond simple nearest-neighbor effects. These observations provide a baseline for future kinetic studies of repeat elements, as well as for studies of epigenetic and other chemical modifications thereof.


July 19, 2019

Rapid detection of expanded short tandem repeats in personal genomics using hybrid sequencing.

Long expansions of short tandem repeats (STRs), i.e. DNA repeats of 2-6 nt, are associated with some genetic diseases. Cost-efficient high-throughput sequencing can quickly produce billions of short reads that would be useful for uncovering disease-associated STRs. However, enumerating STRs in short reads remains largely unexplored because of the difficulty in elucidating STRs much longer than 100 bp, the typical length of short reads. We propose ab initio procedures for sensing and locating long STRs promptly by using the frequency distribution of all STRs and paired-end read information. We validated the reproducibility of this method using biological replicates and used it to locate an STR associated with a brain disease (SCA31). Subsequently, we sequenced this STR site in 11 SCA31 samples using SMRT sequencing (Pacific Biosciences), determined 2.3-3.1 kb sequences at nucleotide resolution and revealed that (TGGAA)- and (TAAAATAGAA)-repeat expansions determined the instability of the repeat expansions associated with SCA31. Our method could also identify common STRs, (AAAG)- and (AAAAG)-repeat expansions, which are remarkably expanded at four positions in an SCA31 sample. This is the first proposed method for rapidly finding disease-associated long STRs in personal genomes using hybrid sequencing of short and long reads. Our TRhist software is available at http://trhist.gi.k.u-tokyo.ac.jp/. Contact: moris@cb.k.u-tokyo.ac.jp. Supplementary data are available at Bioinformatics online.
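To make the notion of an STR inside a single read concrete, the sketch below scans a read for tandem runs of a 2-6 nt unit using a back-referencing regular expression. It is a simplified illustration and is not the TRhist algorithm, which additionally uses the frequency distribution of all STRs and paired-end information; the function name and the `min_copies` parameter are assumptions.

```python
import re


def find_strs(read: str, min_copies: int = 3) -> list:
    """Return (unit, copies, start) for each tandem repeat of a 2-6 nt unit.

    The lazy quantifier makes the regex try the shortest unit first, so
    "ATATAT" is reported with unit "AT" rather than "ATAT". Runs of a
    single base (homopolymers) are skipped.
    """
    if min_copies < 2:
        raise ValueError("min_copies must be at least 2")

    # Group 1 is the repeat unit; it must be followed by at least
    # (min_copies - 1) further exact copies of itself.
    pattern = re.compile(r"([ACGT]{2,6}?)\1{%d,}" % (min_copies - 1))

    hits = []
    for match in pattern.finditer(read.upper()):
        unit = match.group(1)
        if len(set(unit)) == 1:
            continue  # homopolymer run, not a 2-6 nt STR
        copies = len(match.group(0)) // len(unit)
        hits.append((unit, copies, match.start()))
    return hits


if __name__ == "__main__":
    # Hypothetical read carrying five copies of the TGGAA unit named in the text.
    read = "ACGC" + "TGGAA" * 5 + "CATC"
    print(find_strs(read))
```

A scan like this only sees repeats shorter than the read itself, which is the limitation that motivates combining short reads with long-read sequencing.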


July 19, 2019

The origin of the Haitian cholera outbreak strain.

Although cholera has been present in Latin America since 1991, it had not been epidemic in Haiti for at least 100 years. Recently, however, there has been a severe outbreak of cholera in Haiti. We used third-generation single-molecule real-time DNA sequencing to determine the genome sequences of 2 clinical Vibrio cholerae isolates from the current outbreak in Haiti, 1 strain that caused cholera in Latin America in 1991, and 2 strains isolated in South Asia in 2002 and 2008. Using primary sequence data, we compared the genomes of these 5 strains and a set of previously obtained partial genomic sequences of 23 diverse strains of V. cholerae to assess the likely origin of the cholera outbreak in Haiti. Both single-nucleotide variations and the presence and structure of hypervariable chromosomal elements indicate that there is a close relationship between the Haitian isolates and variant V. cholerae El Tor O1 strains isolated in Bangladesh in 2002 and 2008. In contrast, analysis of genomic variation of the Haitian isolates reveals a more distant relationship with circulating South American isolates. The Haitian epidemic is probably the result of the introduction, through human activity, of a V. cholerae strain from a distant geographic source. (Funded by the National Institute of Allergy and Infectious Diseases and the Howard Hughes Medical Institute.)

