June 1, 2021

Evaluating the potential of new sequencing technologies for genotyping and variation discovery in human data.

A first look at Pacific Biosciences RS data. Pacific Biosciences technology offers a fundamentally new data type with the potential to overcome these limitations by producing significantly longer reads (now averaging >1 kb), enabling more unique seeds for reference alignment. In addition, the lack of amplification in the library construction step avoids a common source of base composition bias. With these potential advantages in mind, we evaluate the utility of the Pacific Biosciences RS platform for human medical resequencing projects by assessing the quality of the raw sequencing data, as well as its use for SNP discovery and genotyping with the Genome Analysis Toolkit (GATK).


June 1, 2021

SMRT Sequencing of whole mitochondrial genomes and its utility in association studies of metabolic disease.

In this study, we demonstrate the utility of Single-Molecule, Real-Time (SMRT) Sequencing to detect variants and to recapitulate whole mitochondrial genomes in an association study of metabolic syndrome (MetS), using samples from a well-studied cohort from Micronesia. The Micronesian island of Kosrae is a rare genetic isolate that offers significant advantages for genetic studies of human disease. Kosrae suffers from some of the highest rates of MetS (41%), obesity (52%), and diabetes (17%) globally and has a homogeneous environment, making this an excellent population in which to study these significant health problems. We are conducting family-based association analyses aimed at identifying specific mitochondrial variants that contribute to obesity and other co-morbid conditions. We sequenced whole mitochondrial genomes from 10 Kosraen individuals who represent more than 25% of the mitochondrial genetic diversity of the entire Kosraen population. Using Pacific Biosciences C2 chemistry, SMRTbell libraries were constructed from pooled, full-length, unsheared 5 kb PCR amplicons tiling the entire 16.6 kb mtDNA genome. Average read lengths for each sample were between 2,500 and 3,000 bp, with 5% of reads between 6,000 and 8,000 bases, depending on movie lengths. The data generated in this study serve as proof of principle that SMRT Sequencing data can be used to identify high-quality variants and complete mitochondrial genome sequences. These data will be leveraged to identify causative variants for metabolic syndrome and associated disorders.


June 1, 2021

T-cell receptor profiling using PacBio sequencing of SMARTer libraries

T-cells play a central part in the immune response in humans and related species. T-cell receptors (TCRs), heterodimers located on the T-cell surface, specifically bind foreign antigens displayed on the MHC complex of antigen-presenting cells. The wide spectrum of potential antigens is addressed by the diversity of TCRs created by V(D)J recombination. Profiling this repertoire of TCRs could be useful for applications including, but not limited to, diagnosis, monitoring response to treatments, and examining T-cell development and diversification.


June 1, 2021

Beyond Contiguity: Evaluating the accuracy of de novo genome assemblies

HiFi reads (>99% accurate, 15-20 kb) from the PacBio Sequel II System consistently provide complete and contiguous genome assemblies. In addition to completeness and contiguity, accuracy is of critical importance, as assembly errors complicate downstream analysis, particularly by disrupting gene frames. Metrics used to assess assembly accuracy include: 1) in-frame gene count, 2) k-mer consistency, and 3) concordance to a benchmark, where discordances are interpreted as assembly errors. Genome in a Bottle (GIAB) provides a benchmark for the human genome with an estimated accuracy of 99.9999% (Q60). Concordance for human HiFi assemblies exceeds Q50, which provides excellent genomes for downstream analysis but presents a challenge: any new benchmark must significantly exceed Q50, or the discordance will represent the error rate of the benchmark rather than of the assembly. To establish benchmarks for Oryza sativa and Drosophila melanogaster, we collected draft references, Illumina short reads, and PacBio HiFi reads. For each species, the benchmark was defined as regions of normal coverage that are not within 5 bp of a small variant or 50 bp of a structural variant. For both species, the benchmark regions span around 60% of the genome and HiFi assemblies achieve Q50 accuracy, which is notably more accurate than assemblies with other technologies and meets typical standards for a finished, reference-grade assembly. Here we present a protocol to generate benchmarks for any sample that rival the GIAB benchmark in accuracy. These benchmarks allow the comparison and improvement of genome assemblies and highlight the superior accuracy of assemblies generated with PacBio HiFi reads.
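The Q values quoted above follow the standard Phred convention, Q = -10 * log10(error rate), so Q50 is one error per 100,000 bases and Q60 is one per 1,000,000 (99.9999% accuracy). The short Python sketch below illustrates that conversion. The function names are hypothetical and chosen only for this example; they do not come from the study or from any PacBio tool.

```python
import math


def phred_to_accuracy(q: float) -> float:
    """Convert a Phred-scaled quality (e.g. 60) to per-base accuracy."""
    return 1.0 - 10 ** (-q / 10.0)


def error_rate_to_phred(error_rate: float) -> float:
    """Convert a per-base error rate (0 < rate <= 1) to a Phred-scaled quality."""
    if not 0.0 < error_rate <= 1.0:
        raise ValueError("error_rate must be in the interval (0, 1]")
    return -10.0 * math.log10(error_rate)


def expected_errors(q: float, length_bp: int) -> float:
    """Expected number of erroneous bases in a sequence of length_bp at quality q."""
    return length_bp * 10 ** (-q / 10.0)


if __name__ == "__main__":
    for q in (50, 60):
        print(f"Q{q}: accuracy = {phred_to_accuracy(q):.6%}, "
              f"expected errors per Mb = {expected_errors(q, 1_000_000):.1f}")
```

Seen this way, the challenge described in the abstract is easy to state: a benchmark at Q60 contributes about one error per megabase, while a Q50 assembly contributes about ten, so a benchmark that is not clearly better than Q50 would account for much of the observed discordance.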


April 21, 2020

Chlorella vulgaris genome assembly and annotation reveals the molecular basis for metabolic acclimation to high light conditions.

Chlorella vulgaris is a fast-growing fresh-water microalga cultivated at the industrial scale for applications ranging from food to biofuel production. To advance our understanding of its biology and to establish genetic tools for biotechnological manipulation, we sequenced the nuclear and organelle genomes of Chlorella vulgaris 211/11P by combining next-generation sequencing and optical mapping of isolated DNA molecules. This hybrid approach allowed us to assemble the nuclear genome into 14 pseudo-molecules with an N50 of 2.8 Mb and 98.9% of the genome scaffolded. The integration of RNA-seq data obtained at two different growth irradiances (high light, HL, versus low light, LL) enabled the identification of 10,724 nuclear genes, coding for 11,082 transcripts. Moreover, 121 and 48 genes were found in the chloroplast and mitochondrial genomes, respectively. Functional annotation and expression analysis of nuclear, chloroplast and mitochondrial genome sequences revealed peculiar features of Chlorella vulgaris. Evidence of horizontal gene transfers from the chloroplast to the mitochondrial genome was observed. Furthermore, comparative transcriptomic analyses of LL vs. HL provide insights into the molecular basis for metabolic rearrangement in HL vs. LL conditions leading to enhanced de novo fatty acid biosynthesis and triacylglycerol accumulation. The occurrence of a cytosolic fatty acid biosynthetic pathway can be predicted and its upregulation upon HL exposure is observed, consistent with increased lipid amount under HL. These data provide a rich genetic resource for future genome editing studies, and potential targets for biotechnological manipulation of Chlorella vulgaris or other microalgae species to improve biomass and lipid productivity.
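The N50 reported above is a standard contiguity statistic: the length L such that sequences of length L or longer together contain at least half of the total assembled bases. The Python sketch below shows one common way to compute it. The function name and the toy lengths are hypothetical, used only for illustration; they are not data from the Chlorella vulgaris assembly.

```python
from typing import Iterable


def n50(lengths: Iterable[int]) -> int:
    """Return the N50 of a collection of sequence lengths.

    Sequences are visited from longest to shortest, and the length at which
    the running total first reaches half of the summed length is returned.
    """
    ordered = sorted(lengths, reverse=True)
    if not ordered:
        raise ValueError("at least one sequence length is required")
    total = sum(ordered)
    running = 0
    for length in ordered:
        running += length
        if running * 2 >= total:
            return length
    raise AssertionError("unreachable for non-empty input")


if __name__ == "__main__":
    toy_lengths = [2, 2, 2, 3, 3, 4, 8, 8]  # hypothetical lengths, arbitrary units
    print(n50(toy_lengths))
```

Because N50 depends only on the length distribution, it says nothing about correctness; it is usually read alongside measures such as the fraction of the genome scaffolded.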


April 21, 2020

The Chinese chestnut genome: a reference for species restoration

Forest tree species are increasingly subject to severe mortalities from exotic pests, diseases, and invasive organisms, accelerated by climate change. Forest health issues are threatening multiple species and ecosystem sustainability globally. While sources of resistance may be available in related species, or among surviving trees, introgression of resistance genes into threatened tree species in reasonable time frames requires genome-wide breeding tools. Asian species of chestnut (Castanea spp.) are being employed as donors of disease resistance genes to restore native chestnut species in North America and Europe. To aid in the restoration of threatened chestnut species, we present the assembly of a reference genome with chromosome-scale sequences for Chinese chestnut (C. mollissima), the disease-resistance donor for American chestnut restoration. We also demonstrate the value of the genome as a platform for research and species restoration, including new insights into the evolution of blight resistance in Asian chestnut species, the locations in the genome of ecologically important signatures of selection differentiating American chestnut from Chinese chestnut, the identification of candidate genes for disease resistance, and preliminary comparisons of genome organization with related species.


April 21, 2020

Extended haplotype phasing of de novo genome assemblies with FALCON-Phase

Haplotype-resolved genome assemblies are important for understanding how combinations of variants impact phenotypes. These assemblies can be created in various ways, such as by using tissues that contain single-haplotype (haploid) genomes or by co-sequencing parental genomes, but these approaches can be impractical in many situations. We present FALCON-Phase, which integrates long-read sequencing data and ultra-long-range Hi-C chromatin interaction data of a diploid individual to create high-quality, phased diploid genome assemblies. The method was evaluated by application to three datasets (human, cattle, and zebra finch) for which high-quality, fully haplotype-resolved assemblies were available for benchmarking. Phasing algorithm accuracy was affected by the heterozygosity of the individual sequenced, with higher accuracy for cattle and zebra finch (>97%) compared to human (82%). In addition, scaffolding with the same Hi-C chromatin contact data resulted in phased chromosome-scale scaffolds.


April 21, 2020

High satellite repeat turnover in great apes studied with short- and long-read technologies.

Satellite repeats are a structural component of centromeres and telomeres, and in some instances their divergence is known to drive speciation. Due to their highly repetitive nature, satellite sequences have been understudied and underrepresented in genome assemblies. To investigate their turnover in great apes, we studied satellite repeats of unit sizes up to 50 bp in human, chimpanzee, bonobo, gorilla, and Sumatran and Bornean orangutans, using unassembled short and long sequencing reads. The density of satellite repeats, as identified from accurate short reads (Illumina), varied greatly among great ape genomes. These were dominated by a handful of abundant repeated motifs, frequently shared among species, which formed two groups: (1) the (AATGG)n repeat (critical for heat shock response) and its derivatives; and (2) subtelomeric 32-mers involved in telomeric metabolism. Using the densities of abundant repeats, individuals could be classified into species. However, clustering did not reproduce the accepted species phylogeny, suggesting rapid repeat evolution. Several abundant repeats were enriched in males vs. females; using Y chromosome assemblies or Fluorescent In Situ Hybridization, we validated their location on the Y. Finally, applying a novel computational tool, we identified many satellite repeats completely embedded within long Oxford Nanopore and Pacific Biosciences reads. Such repeats were up to 59 kb in length and consisted of perfect repeats interspersed with other similar sequences. Our results based on sequencing reads generated with three different technologies provide the first detailed characterization of great ape satellite repeats, and open new avenues for exploring their functions. © The Author(s) 2019. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.


April 21, 2020

A comprehensive evaluation of long read error correction methods

Motivation: Third-generation sequencing technologies can sequence long reads, which is advancing the frontiers of genomics research. However, their high error rates prohibit accurate and efficient downstream analysis. This difficulty has motivated the development of many long read error correction tools, which tackle this problem through sampling redundancy and/or leveraging accurate short reads of the same biological samples. Existing studies to assess these tools use simulated data sets, and are not sufficiently comprehensive in the range of software covered or the diversity of evaluation measures used. Results: In this paper, we present a categorization and review of long read error correction methods, and provide a comprehensive evaluation of the corresponding long read error correction tools. Leveraging recent real sequencing data, we establish benchmark data sets and set up evaluation criteria for a comparative assessment which includes quality of error correction as well as run-time and memory usage. We study how trimming and long read sequencing depth affect error correction in terms of length distribution and genome coverage post-correction, and the impact of error correction performance on an important application of long reads, genome assembly. We provide guidelines for practitioners for choosing among the available error correction tools and identify directions for future research.


April 21, 2020

Genome-wide selection footprints and deleterious variations in young Asian allotetraploid rapeseed.

Brassica napus (AACC, 2n = 38) is an important oilseed crop grown worldwide. However, little is known about the population evolution of this species, the genomic differences between its major genetic groups, such as European and Asian rapeseed, and the impacts of historical large-scale introgression events on this young tetraploid. In this study, we report the de novo assembly of the genome sequences of an Asian rapeseed (B. napus), Ningyou 7, and its four progenitors and compare these genomes with other available genomic data from diverse European and Asian cultivars. Our results show that Asian rapeseed originally derived from European rapeseed but subsequently diverged significantly, with rapid genome differentiation after hybridization and intensive local selective breeding. The first historical introgression of B. rapa dramatically broadened the allelic pool but decreased the deleterious variations of Asian rapeseed. The second historical introgression of the double-low traits of European rapeseed (canola) has reshaped Asian rapeseed into two groups (double-low and double-high), accompanied by an increase in genetic load in the double-low group. This study demonstrates distinctive genomic footprints and deleterious SNP (single nucleotide polymorphism) variants for local adaptation by recent intra- and interspecies introgression events and provides novel insights for understanding the rapid genome evolution of a young allopolyploid crop. © 2019 The Authors. Plant Biotechnology Journal published by Society for Experimental Biology and The Association of Applied Biologists and John Wiley & Sons Ltd.


April 21, 2020

Early Sex-chromosome Evolution in the Diploid Dioecious Plant Mercurialis annua.

Suppressed recombination allows divergence between homologous sex chromosomes and the functionality of their genes. Here, we reveal patterns of the earliest stages of sex-chromosome evolution in the diploid dioecious herb Mercurialis annua on the basis of cytological analysis, de novo genome assembly and annotation, genetic mapping, exome resequencing of natural populations, and transcriptome analysis. The genome assembly contained 34,105 expressed genes, of which 10,076 were assigned to linkage groups. Genetic mapping and exome resequencing of individuals across the species range both identified the largest linkage group, LG1, as the sex chromosome. Although the sex chromosomes of M. annua are karyotypically homomorphic, we estimate that about a third of the Y chromosome has ceased recombining, containing 568 transcripts and spanning 22.3 cM in the corresponding female map. Nevertheless, we found limited evidence for Y-chromosome degeneration in terms of gene loss and pseudogenization, and most X- and Y-linked genes appear to have diverged in the period subsequent to speciation between M. annua and its sister species M. huetii, which shares the same sex-determining region. Taken together, our results suggest that the M. annua Y chromosome has at least two evolutionary strata: a small old stratum shared with M. huetii, and a more recent larger stratum that is probably unique to M. annua and that stopped recombining about one million years ago. Patterns of gene expression within the non-recombining region are consistent with the idea that sexually antagonistic selection may have played a role in favoring suppressed recombination. Copyright © 2019, Genetics.


April 21, 2020

A microbial factory for defensive kahalalides in a tripartite marine symbiosis.

Chemical defense against predators is widespread in natural ecosystems. Occasionally, taxonomically distant organisms share the same defense chemical. Here, we describe an unusual tripartite marine symbiosis, in which an intracellular bacterial symbiont (“Candidatus Endobryopsis kahalalidefaciens”) uses a diverse array of biosynthetic enzymes to convert simple substrates into a library of complex molecules (the kahalalides) for chemical defense of the host, the alga Bryopsis sp., against predation. The kahalalides are subsequently hijacked by a third partner, the herbivorous mollusk Elysia rufescens, and employed similarly for defense. “Ca. E. kahalalidefaciens” has lost many essential traits for free living and acts as a factory for kahalalide production. This interaction between a bacterium, an alga, and an animal highlights the importance of chemical defense in the evolution of complex symbioses. Copyright © 2019 The Authors, some rights reserved; exclusive licensee American Association for the Advancement of Science. No claim to original U.S. Government Works.


April 21, 2020

Rapid transcriptional responses to serum exposure are associated with sensitivity and resistance to antibody-mediated complement killing in invasive Salmonella Typhimurium ST313

Background: Salmonella Typhimurium ST313 exhibits signatures of adaptation to invasive human infection, including higher resistance to humoral immune responses than gastrointestinal isolates. Full resistance to antibody-mediated complement killing (serum resistance) among nontyphoidal Salmonellae is uncommon, but selection of highly resistant strains could compromise vaccine-induced antibody immunity. Here, we address the hypothesis that serum resistance is due to a distinct genotype or transcriptome response in S. Typhimurium ST313.

