June 1, 2021  |  

Amplification-free, CRISPR-Cas9 targeted enrichment and SMRT Sequencing of repeat-expansion disease causative genomic regions

Targeted sequencing has proven to be economical for obtaining sequence information for defined regions of the genome. However, most target enrichment methods rely on some form of amplification, which can negatively impact downstream analysis. For example, amplification removes epigenetic marks present in native DNA, including nucleotide methylation, which are hypothesized to contribute to disease mechanisms in some disorders. In addition, some genomic regions known to be causative of many genetic disorders have extreme GC content and/or repetitive sequences that tend to be recalcitrant to faithful amplification. We have developed a novel, amplification-free enrichment technique that employs the CRISPR/Cas9 system to target individual genes. This method, in conjunction with the long reads, high consensus accuracy, and uniform coverage of SMRT Sequencing, allows accurate sequence analysis of complex genomic regions that cannot be investigated with other technologies. Using this strategy, we have successfully targeted a number of repeat expansion disorder loci (HTT, FMR1, ATXN10, C9orf72). With these data, we demonstrate the ability to isolate thousands of individual on-target molecules and, using the Sequel System, accurately sequence through long repeats regardless of extreme GC content. The method is compatible with multiplexing of multiple target loci and multiple samples in a single reaction. Furthermore, because there is no amplification step, this technique also preserves native DNA molecules for sequencing, allowing for the direct detection and characterization of epigenetic signatures. To this end, we demonstrate the detection of 5-mC in the CGG repeat of the FMR1 gene that is responsible for Fragile X syndrome.
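As a rough illustration of the two sequence properties this abstract highlights (GC content and repeat length), the sketch below measures the GC fraction of a read and the longest uninterrupted run of a repeat unit such as CGG. It is not the analysis method used in the study: the function names and the toy read are hypothetical, and the exact-match search does not handle interrupted repeats or sequencing errors.

```python
import re


def gc_fraction(seq):
    """Fraction of bases in seq that are G or C (0.0 for an empty string)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return sum(base in "GC" for base in seq) / len(seq)


def longest_repeat_run(seq, unit="CGG"):
    """Largest number of back-to-back exact copies of `unit` found in seq."""
    pattern = re.compile(f"(?:{re.escape(unit.upper())})+")
    runs = [len(m.group(0)) // len(unit) for m in pattern.finditer(seq.upper())]
    return max(runs, default=0)


# Hypothetical toy read: flanking sequence, 30 CGG copies, a spacer, 5 more copies.
toy_read = "ATTA" + "CGG" * 30 + "TTAG" + "CGG" * 5

print(longest_repeat_run(toy_read))      # longest uninterrupted CGG run
print(round(gc_fraction(toy_read), 3))   # overall GC fraction of the read
```

Real repeat-expansion analysis would typically work from consensus reads and tolerate interruptions within the repeat; this sketch only makes the two quantities concrete.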

April 21, 2020  |  

Tandem repeats lead to sequence assembly errors and impose multi-level challenges for genome and protein databases.

The widespread occurrence of repetitive stretches of DNA in genomes of organisms across the tree of life imposes fundamental challenges for sequencing, genome assembly, and automated annotation of genes and proteins. This multi-level problem can lead to errors in genome and protein databases that are often not recognized or acknowledged. As a consequence, end users working with sequences with repetitive regions are faced with ‘ready-to-use’ deposited data whose trustworthiness is difficult to determine, let alone to quantify. Here, we provide a review of the problems associated with tandem repeat sequences that originate from different stages during the sequencing-assembly-annotation-deposition workflow, and that may proliferate in public database repositories affecting all downstream analyses. As a case study, we provide examples of the Atlantic cod genome, whose sequencing and assembly were hindered by a particularly high prevalence of tandem repeats. We complement this case study with examples from other species, where mis-annotations and sequencing errors have propagated into protein databases. With this review, we aim to raise the awareness level within the community of database users, and alert scientists working in the underlying workflow of database creation that the data they omit or improperly assemble may well contain important biological information valuable to others. © The Author(s) 2019. Published by Oxford University Press on behalf of Nucleic Acids Research.

April 21, 2020  |  

Chlorella vulgaris genome assembly and annotation reveals the molecular basis for metabolic acclimation to high light conditions.

Chlorella vulgaris is a fast-growing fresh-water microalga cultivated at the industrial scale for applications ranging from food to biofuel production. To advance our understanding of its biology and to establish genetic tools for biotechnological manipulation, we sequenced the nuclear and organelle genomes of Chlorella vulgaris 211/11P by combining next generation sequencing and optical mapping of isolated DNA molecules. This hybrid approach allowed the nuclear genome to be assembled into 14 pseudo-molecules with an N50 of 2.8 Mb and 98.9% of the genome scaffolded. The integration of RNA-seq data obtained at two different growth irradiances (high light, HL, versus low light, LL) enabled the identification of 10,724 nuclear genes, coding for 11,082 transcripts. Moreover, 121 and 48 genes were found in the chloroplast and mitochondrial genomes, respectively. Functional annotation and expression analysis of nuclear, chloroplast and mitochondrial genome sequences revealed peculiar features of Chlorella vulgaris. Evidence of horizontal gene transfers from the chloroplast to the mitochondrial genome was observed. Furthermore, comparative transcriptomic analyses of LL vs. HL provide insights into the molecular basis for the metabolic rearrangement in HL vs. LL conditions that leads to enhanced de novo fatty acid biosynthesis and triacylglycerol accumulation. The occurrence of a cytosolic fatty acid biosynthetic pathway can be predicted, and its upregulation upon HL exposure is observed, consistent with the increased lipid amount under HL. These data provide a rich genetic resource for future genome editing studies, and potential targets for biotechnological manipulation of Chlorella vulgaris or other microalgae species to improve biomass and lipid productivity. This article is protected by copyright. All rights reserved.
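The N50 quoted in this abstract is a standard assembly contiguity metric: the length of the shortest sequence in the smallest set of longest sequences that together cover at least half of the total assembly length. A minimal sketch of that calculation follows; the function name and the example lengths are hypothetical and are not taken from the Chlorella vulgaris assembly.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths.

    Sequences are sorted from longest to shortest; the N50 is the length
    at which the running total first reaches half of the summed length.
    """
    if not lengths:
        raise ValueError("need at least one sequence length")
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


# Hypothetical scaffold lengths (arbitrary units), for illustration only.
example_lengths = [5, 4, 3, 2, 1]
print(n50(example_lengths))  # prints 4: 5 + 4 = 9 covers half of the total 15
```

Because N50 depends only on the length distribution, it says nothing about assembly correctness, which is why it is usually reported alongside other measures such as the fraction of the genome scaffolded.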

April 21, 2020  |  

RNA sequencing: the teenage years.

Over the past decade, RNA sequencing (RNA-seq) has become an indispensable tool for transcriptome-wide analysis of differential gene expression and differential splicing of mRNAs. However, as next-generation sequencing technologies have developed, so too has RNA-seq. Now, RNA-seq methods are available for studying many different aspects of RNA biology, including single-cell gene expression, translation (the translatome) and RNA structure (the structurome). Exciting new applications are being explored, such as spatial transcriptomics (spatialomics). Together with new long-read and direct RNA-seq technologies and better computational tools for data analysis, innovations in RNA-seq are contributing to a fuller understanding of RNA biology, from questions such as when and where transcription occurs to the folding and intermolecular interactions that govern RNA function.

April 21, 2020  |  

Large Fragment Deletions Induced by Cas9 Cleavage While Not in BEs System in Rabbit

CRISPR-Cas9 and base editor (BE) systems are poised to become the gene editing tools of choice in clinical contexts. However, large fragment deletions have been found in Cas9-mutated cells, without validation at the animal level. By analyzing 16 gene-edited rabbit lines (comprising 112 rabbits) generated using SpCas9, BEs, xCas9 and xCas9-BEs with long-range PCR genotyping and long-read sequencing on the PacBio platform, we show that fragment deletions extending over thousands of bases occur in rabbits mutated with the single-guide RNA/Cas9 and xCas9 systems, whereas few large deletions were found in BE-mutated rabbits. This is the first validation at the animal level that BE systems do not induce large fragment deletions, suggesting that BE systems can be beneficial tools for the further development of highly accurate and safe gene therapy for the clinical treatment of human genetic disorders.

April 21, 2020  |  

Short translational ramp determines efficiency of protein synthesis

It is generally assumed that translation efficiency is governed by translation initiation. However, the efficiency of protein synthesis is regulated by multiple factors including tRNA abundance, codon composition, mRNA motifs and amino-acid sequence 1–4. These factors influence the rate of protein synthesis beyond the initiation phase of translation, typically by modulating the rate of peptide-bond formation and to a lesser extent that of translocation. The slowdown in translation during the early elongation phase, known as the 5′ translational ramp, likely contributes to the efficiency of protein synthesis 5–9. Multiple mechanisms, which could explain the molecular basis for this translational ramp, have been proposed that include tRNA abundance bias 6,9, the rate of translation initiation 10–15, mRNA and ribosome structure 11,12,14,16–18, or retention of initiation factors during early elongation events 19. Here, we show that the amount of synthesized protein (translation efficiency) depends on a short translational ramp that comprises the first 5 codons in mRNA. Using a library of more than 250,000 reporter sequences combined with in vitro and in vivo protein expression assays, we show that differences in the short ramp can lead to 3 to 4 orders of magnitude changes in protein abundance. The observed difference is not dependent on tRNA abundance, efficiency of translation initiation, or overall mRNA structure. Instead, we show that translation is regulated by amino-acid-sequence composition and local mRNA sequence. Single-molecule measurements of translation kinetics indicate substantial pausing of ribosome and abortion of protein synthesis on the 4th or 5th codon for distinct amino acid or nucleotide compositions. Introduction of preferred sequence motifs, only at the exact positions within the mRNA, improves protein synthesis for recombinant proteins, indicating an evolutionarily conserved mechanism for controlling translational efficiency.

April 21, 2020  |  

Membrane proteomic analysis reveals overlapping and independent functions of Streptococcus mutans Ffh, YidC1, and YidC2.

A comparative proteomic analysis was utilized to evaluate similarities and differences in membrane samples derived from the cariogenic bacterium Streptococcus mutans, including the wild-type strain and four mutants devoid of protein translocation machinery components, specifically Δffh, ΔyidC1, ΔyidC2, or Δffh/yidC1. The purpose of this work was to determine the extent to which the encoded proteins operate individually or in concert with one another and to identify the potential substrates of the respective pathways. Ffh is the principal protein component of the signal recognition particle (SRP), while yidC1 and yidC2 are dual paralogs encoding members of the YidC/Oxa/Alb family of membrane-localized chaperone insertases. Our results suggest that the co-translational SRP pathway works in concert with either YidC1 or YidC2 specifically, or with no preference for paralog, in the insertion of most membrane-localized substrates. A few instances were identified in which the SRP pathway alone, or one of the YidCs alone, appeared to be most relevant. These data shed light on underlying reasons for differing phenotypic consequences of ffh, yidC1 or yidC2 deletion. Our data further suggest that many membrane proteins present in a ΔyidC2 background may be non-functional, that ΔyidC1 is better able to adapt physiologically to the loss of this paralog, that shared phenotypic properties of Δffh and ΔyidC2 mutants can stem from impacts on different proteins, and that independent binding to ribosomal proteins is not a primary functional activity of YidC2. Lastly, genomic mutations accumulate in a ΔyidC2 background coincident with phenotypic reversion, including an apparent W138R suppressor mutation within yidC1. © 2019 John Wiley & Sons A/S. Published by John Wiley & Sons Ltd.

April 21, 2020  |  

Insect genomes: progress and challenges.

In the wake of constant improvements in sequencing technologies, numerous insect genomes have been sequenced. Currently, 1219 insect genome-sequencing projects have been registered with the National Center for Biotechnology Information, including 401 that have genome assemblies and 155 with an official gene set of annotated protein-coding genes. Comparative genomics analysis showed that the expansion or contraction of gene families was associated with well-studied physiological traits such as immune system, metabolic detoxification, parasitism and polyphagy in insects. Here, we summarize the progress of insect genome sequencing, with an emphasis on how this impacts research on pest control. We begin with a brief introduction to the basic concepts of genome assembly, annotation and metrics for evaluating the quality of draft assemblies. We then provide an overview of genome information for numerous insect species, highlighting examples from prominent model organisms, agricultural pests and disease vectors. We also introduce the major insect genome databases. The increasing availability of insect genomic resources is beneficial for developing alternative pest control methods. However, many opportunities remain for developing data-mining tools that make maximal use of the available insect genome resources. Although rapid progress has been achieved, many challenges remain in the field of insect genomics. © 2019 The Royal Entomological Society.
