June 1, 2021  |  

SMRT Sequencing of whole mitochondrial genomes and its utility in association studies of metabolic disease.

In this study we demonstrate the utility of Single-Molecule, Real-Time (SMRT) Sequencing to detect variants and to recapitulate whole mitochondrial genomes in an association study of metabolic syndrome (MetS), using samples from a well-studied cohort from Micronesia. The Micronesian island of Kosrae is a rare genetic isolate that offers significant advantages for genetic studies of human disease. Kosrae suffers from some of the highest rates of MetS (41%), obesity (52%), and diabetes (17%) globally and has a homogeneous environment, making this an excellent population in which to study these significant health problems. We are conducting family-based association analyses aimed at identifying specific mitochondrial variants that contribute to obesity and other co-morbid conditions. We sequenced whole mitochondrial genomes from 10 Kosraen individuals who represent more than 25% of the mitochondrial genetic diversity of the entire Kosraen population. Using Pacific Biosciences C2 chemistry, SMRTbell libraries were constructed from pooled, full-length, unsheared 5 kb PCR amplicons tiling the entire 16.6 kb mtDNA genome. Average read lengths for each sample were between 2,500 and 3,000 bp, with 5% of reads between 6,000 and 8,000 bases, depending on movie lengths. The data generated in this study serve as proof of principle that SMRT Sequencing data can be used to identify high-quality variants and complete mitochondrial genome sequences. These data will be leveraged to identify causative variants for metabolic syndrome and associated disorders.


June 1, 2021  |  

Harnessing kinetic information in Single-Molecule, Real-Time Sequencing.

Single-Molecule, Real-Time (SMRT) DNA sequencing is unique in that nucleotide incorporation events are monitored in real time, yielding a wealth of kinetic information in addition to the primary DNA sequence. The observed dynamics of the DNA polymerase add a further dimension of sequence-dependent information and can be used to learn more about the molecule under study. First, the primary sequence itself can be determined more accurately. The kinetic data can be used to corroborate or overturn consensus calls and even enable calling bases in problematic sequence contexts. Second, using the kinetic information, we can detect and discriminate numerous chemical base modifications as a by-product of ordinary sequencing. Examples of applying these capabilities include (i) the characterization of the epigenome of microorganisms by directly sequencing the three common prokaryotic epigenetic base modifications of 4-methylcytosine, 5-methylcytosine and 6-methyladenine; (ii) the characterization of known and novel methyltransferase activities; (iii) the direct sequencing and differentiation of the four eukaryotic epigenetic forms of cytosine (5-methyl, 5-hydroxymethyl, 5-formyl, and 5-carboxylcytosine) with first applications to map them with single base-pair and DNA strand resolution across mammalian genomes; (iv) the direct sequencing and identification of numerous modified DNA bases arising from DNA damage; and (v) an exploration of the mitochondrial genome for known and novel base modifications. We will show our progress towards a generic, open-source algorithm for exploiting kinetic information for any of these purposes.


June 1, 2021  |  

Draft genome of horseweed illuminates expansion of gene families that might endow herbicide resistance.

Conyza canadensis (horseweed), a member of the Compositae (Asteraceae) family, was the first broadleaf weed to evolve resistance to glyphosate. Horseweed, one of the most problematic weeds in the world, is a true diploid (2n = 2x = 18) with the smallest genome of any known agricultural weed (335 Mb). Thus, it is an appropriate candidate to help us understand the genetic and genomic basis of weediness. We undertook a draft de novo genome assembly of horseweed by combining data from multiple sequencing platforms (454 GS-FLX, Illumina HiSeq 2000 and PacBio RS) using various libraries with different insert sizes (~350 bp, ~600 bp, ~3 kb and ~10 kb) of a Tennessee-accessed, glyphosate-resistant horseweed biotype. From 116.3 Gb (~350× coverage) of data, the genome was assembled into 13,966 scaffolds with an N50 of 33,561 bp. The assembly covered 92.3% of the genome, including the complete chloroplast genome (~153 kb) and a nearly complete mitochondrial genome (~450 kb in 120 scaffolds). The nuclear genome contains 44,592 protein-coding genes. Genome re-sequencing of seven additional horseweed biotypes was performed. These sequence data were assembled and used to analyze genome variation. Simple sequence repeats and single nucleotide polymorphisms were surveyed. Genomic patterns were detected that were associated with glyphosate-resistant or -susceptible biotypes. The draft genome will be useful for better understanding weediness and the evolution of herbicide resistance, and for devising new management strategies. The genome will also be useful as another reference genome in the Compositae. To our knowledge, this paper represents the first published draft genome of an agricultural weed.
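
The ~350× coverage figure quoted above follows from the stated data volume and genome size. A minimal sketch of that arithmetic is given here; the helper name fold_coverage is hypothetical and is not taken from the study.

```python
def fold_coverage(total_bases: float, genome_size: float) -> float:
    """Average sequencing depth: total sequenced bases divided by genome size."""
    if genome_size <= 0:
        raise ValueError("genome_size must be positive")
    return total_bases / genome_size


# Figures quoted in the abstract: 116.3 Gb of data and a 335 Mb genome.
coverage = fold_coverage(116.3e9, 335e6)
print(f"{coverage:.0f}x")  # about 347x, in line with the quoted ~350x
```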


June 1, 2021  |  

An update on goat genomics

Goats are specialized in dairy, meat and fiber production, being adapted to a wide range of environmental conditions and having a large economic impact in developing countries. In recent years, there have been dramatic advances in the knowledge of the structure and diversity of the goat genome/transcriptome and in the development of genomic tools, rapidly narrowing the gap between goat and related species such as cattle and sheep. Major advances are: 1) publication of a de novo goat genome reference sequence; 2) development of whole-genome high-density RH maps; and 3) design of a commercial 50K SNP array. Moreover, there are currently several projects aiming at improving current genomic tools and resources. An improved assembly of the goat genome using PacBio reads is being produced, and the design of new SNP arrays is being studied to accommodate the specific needs of this species in the context of very large-scale genotyping projects (i.e., breed characterization at an international scale and genomic selection) and parentage analysis. As in other species, the focus has now turned to the identification of causative mutations underlying the phenotypic variation of traits. In addition, since 2014, the ADAPTmap project (www.goatadaptmap.org) has gathered data to explore the diversity of caprine populations at a worldwide scale by using a wide variety of approaches and data.


June 1, 2021  |  

Mitochondrial DNA sequencing using PacBio SMRT technology

Mitochondrial DNA (mtDNA) is a compact, double-stranded circular genome of 16,569 bp with a cytosine-rich light (L) strand and a guanine-rich heavy (H) strand. mtDNA mutations have been increasingly recognized as important contributors to an array of human diseases such as Parkinson’s disease, Alzheimer’s disease, colorectal cancer and Kearns–Sayre syndrome. mtDNA mutations can affect all of the 1,000-10,000 copies of the mitochondrial genome present in a cell (homoplasmic mutation) or only a subset of copies (heteroplasmic mutation). The ratio of normal to mutant mtDNAs within cells is a significant factor in whether mutations will result in disease, as well as in the clinical presentation, penetrance, and severity of the phenotype. Over time, heteroplasmic mutations can become homoplasmic due to differential replication and random assortment. Full characterization of the mitochondrial genome would involve detection of not only homoplasmic but also heteroplasmic mutations, as well as complete phasing. Previously, we sequenced human mtDNA on the PacBio RS II System with two partially overlapping amplicons. Here, we present amplification-free, full-length sequencing of linearized mtDNA using the Sequel System. Full-length sequencing allows variant phasing along the entire mitochondrial genome, identification of heteroplasmic variants, and detection of epigenetic modifications that are lost in amplicon-based methods.
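
The distinction between homoplasmic and heteroplasmic mutations can be illustrated with a short calculation. The sketch below assumes that each full-length read represents one mtDNA molecule, so the share of reads carrying a variant approximates the share of mutant copies. The function names and the 0.99 cutoff are illustrative assumptions, not part of the method described above.

```python
def heteroplasmy_fraction(mutant_reads: int, total_reads: int) -> float:
    """Fraction of reads (a proxy for mtDNA copies) carrying the variant."""
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    if not 0 <= mutant_reads <= total_reads:
        raise ValueError("mutant_reads must lie between 0 and total_reads")
    return mutant_reads / total_reads


def classify_variant(fraction: float, homoplasmy_cutoff: float = 0.99) -> str:
    """Label a variant by its mutant fraction.

    The cutoff is an illustrative assumption; a real analysis would choose
    thresholds from the error profile of the data.
    """
    if fraction >= homoplasmy_cutoff:
        return "homoplasmic"
    if fraction > 0:
        return "heteroplasmic"
    return "not detected"


# Hypothetical counts: 30 of 100 full-length reads carry the variant.
fraction = heteroplasmy_fraction(30, 100)
print(fraction, classify_variant(fraction))
```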


June 1, 2021  |  

High-quality de novo genome assembly and intra-individual mitochondrial instability in the critically endangered kakapo

The kakapo (Strigops habroptila) is a large, flightless parrot endemic to New Zealand. It is highly endangered with only ~150 individuals remaining, and intensive conservation efforts are underway to save this iconic species from extinction. These include genetic studies to understand critical genes relevant to fertility, adaptation and disease resistance, and genetic diversity across the remaining population for future breeding program decisions. To aid with these efforts, we have generated a high-quality de novo genome assembly using PacBio long-read sequencing. Using the new diploid-aware FALCON-Unzip assembler, the resulting genome of 1.06 Gb has a contig N50 of 5.6 Mb (largest contig 29.3 Mb), >350-times more contiguous compared to a recent short-read assembly of a closely related parrot (kea) species. We highlight the benefits of the higher contiguity and greater completeness of the kakapo genome assembly through examples of fully resolved genes important in wildlife conservation (contrasted with fragmented and incomplete gene resolution in short-read assemblies), in some cases even providing sequence for regions orthologous to gaps of missing sequence in the chicken reference genome. We also highlight the complete resolution of the kakapo mitochondrial genome, fully containing the mitochondrial control region which is missing from the previous dedicated kakapo mitochondrial genome NCBI entry. For this region, we observed a marked heterogeneity in the number of tandem repeats in different mtDNA molecules from a single bird tissue, highlighting the enhanced molecular resolution uniquely afforded by long-read, single-molecule PacBio sequencing.


June 1, 2021  |  

High-throughput SMRT Sequencing of clinically relevant targets

Targeted sequencing with Sanger as well as short-read-based high-throughput sequencing methods is standard practice in clinical genetic testing. However, many applications beyond SNP detection have remained somewhat obstructed due to technological challenges. With the advent of long reads and high consensus accuracy, SMRT Sequencing overcomes many of the technical hurdles faced by Sanger and NGS approaches, opening a broad range of untapped clinical sequencing opportunities. Flexible multiplexing options, a highly adaptable sample preparation method, and two well-developed, newly improved analysis methods that generate highly accurate sequencing results make SMRT Sequencing an adept method for clinical-grade targeted sequencing. The Circular Consensus Sequencing (CCS) analysis pipeline produces QV 30 data from each single intra-molecular multi-pass polymerase read, making it a reliable solution for detecting minor variant alleles with frequencies as low as 1%. Long Amplicon Analysis (LAA) makes use of insert-spanning full-length subreads originating from multiple individual copies of the target to generate highly accurate and phased consensus sequences (>QV 50), offering a unique advantage for imputation-free allele segregation and haplotype phasing. Here we present workflows and results for a range of SMRT Sequencing clinical applications. Specifically, we illustrate how the flexible multiplexing options, simple sample preparation methods and new developments in data analysis tools offered by PacBio in support of Sequel System 5.1 can come together in a variety of experimental designs to enable applications as diverse as high-throughput HLA typing, mitochondrial DNA sequencing and viral vector integrity profiling of recombinant adeno-associated viral genomes (rAAV).
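
The QV figures above are Phred-scaled quality values, which relate to error probability by QV = -10 * log10(p). A minimal sketch of the conversion follows; the function names are hypothetical and do not belong to any PacBio tool.

```python
import math


def qv_to_error_probability(qv: float) -> float:
    """Convert a Phred-scaled quality value to an error probability."""
    return 10 ** (-qv / 10)


def error_probability_to_qv(p: float) -> float:
    """Convert an error probability (0 < p <= 1) to a Phred-scaled quality value."""
    if not 0 < p <= 1:
        raise ValueError("p must be in (0, 1]")
    return -10 * math.log10(p)


# QV 30 corresponds to roughly 1 error in 1,000 bases,
# QV 50 to roughly 1 error in 100,000 bases.
for qv in (30, 50):
    print(qv, qv_to_error_probability(qv))
```

On this scale, a per-base error rate of about 0.1% at QV 30 sits an order of magnitude below the 1% minor-allele frequency mentioned above.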


April 21, 2020  |  

Chlorella vulgaris genome assembly and annotation reveals the molecular basis for metabolic acclimation to high light conditions.

Chlorella vulgaris is a fast-growing fresh-water microalga cultivated at the industrial scale for applications ranging from food to biofuel production. To advance our understanding of its biology and to establish genetic tools for biotechnological manipulation, we sequenced the nuclear and organelle genomes of Chlorella vulgaris 211/11P by combining next-generation sequencing and optical mapping of isolated DNA molecules. This hybrid approach allowed us to assemble the nuclear genome into 14 pseudo-molecules with an N50 of 2.8 Mb and 98.9% of the genome scaffolded. The integration of RNA-seq data obtained at two different growth irradiances (high light, HL, versus low light, LL) enabled us to identify 10,724 nuclear genes coding for 11,082 transcripts. Moreover, 121 and 48 genes were found in the chloroplast and mitochondrial genomes, respectively. Functional annotation and expression analysis of nuclear, chloroplast and mitochondrial genome sequences revealed peculiar features of Chlorella vulgaris. Evidence of horizontal gene transfers from the chloroplast to the mitochondrial genome was observed. Furthermore, comparative transcriptomic analyses of LL vs. HL provide insights into the molecular basis for metabolic rearrangement in HL vs. LL conditions leading to enhanced de novo fatty acid biosynthesis and triacylglycerol accumulation. The occurrence of a cytosolic fatty acid biosynthetic pathway can be predicted and its upregulation upon HL exposure is observed, consistent with increased lipid amount under HL. These data provide a rich genetic resource for future genome editing studies, and potential targets for biotechnological manipulation of Chlorella vulgaris or other microalgae species to improve biomass and lipid productivity.


April 21, 2020  |  

The Genome of the Zebra Mussel, Dreissena polymorpha: A Resource for Invasive Species Research

The zebra mussel, Dreissena polymorpha, continues to spread from its native range in Eurasia to Europe and North America, causing billions of dollars in damage and dramatically altering invaded aquatic ecosystems. Despite these impacts, there are few genomic resources for Dreissena or related bivalves, with nearly 450 million years of divergence between zebra mussels and their closest sequenced relative. Although the D. polymorpha genome is highly repetitive, we have used a combination of long-read sequencing and Hi-C-based scaffolding to generate the highest quality molluscan assembly to date. Through comparative analysis and transcriptomics experiments we have gained insights into processes that likely control the invasive success of zebra mussels, including shell formation, synthesis of byssal threads, and thermal tolerance. We identified multiple intact Steamer-Like Elements, a retrotransposon that has been linked to transmissible cancer in marine clams. We also found that D. polymorpha has an unusual 67 kb mitochondrial genome containing numerous tandem repeats, making it the largest observed in Eumetazoa. Together these findings create a rich resource for invasive species research and control efforts.


April 21, 2020  |  

Evidence of extensive intraspecific noncoding reshuffling in a 169-kb mitochondrial genome of a basidiomycetous fungus

Comparative genomics of fungal mitochondrial genomes (mitogenomes) has revealed a remarkable pattern of rearrangement between and within major phyla owing to horizontal gene transfer (HGT) and recombination. The role of recombination was exemplified at a finer evolutionary time scale in the basidiomycete group of fungi, as they display a diversity of mitochondrial DNA (mtDNA) inheritance patterns. Here, we assembled mitogenomes of six species from the Hymenochaetales order of basidiomycetes and examined 59 mitogenomes from two genetic lineages of Pyrrhoderma noxium. Gene order is largely colinear, while intergenic regions are major determinants of mitogenome size variation. Substantial sequence divergence was found in shared introns, consistent with the high HGT frequency observed in yeasts, but we also identified a rare case where an intron was retained in five species since speciation. In contrast to the hyperdiversity observed in nuclear genomes of P. noxium, mitogenomes’ intraspecific polymorphisms at protein-coding sequences are extremely low. Phylogeny based on introns revealed turnover as well as exchange of introns between two lineages. Strikingly, some strains harbor a mosaic origin of introns from both lineages. Analysis of intergenic sequence indicated substantial differences between and within lineages, and an expansion may be ongoing as a result of exchange between distal intergenes. These findings suggest that the evolution of mtDNAs is usually lineage specific but chimeric mitotypes are frequently observed, thus capturing the possible evolutionary processes shaping mitogenomes in a basidiomycete. The large mitogenome sizes reported in various basidiomycetes appear to be a result of interspecific reshuffling of intergenes.


April 21, 2020  |  

Insect genomes: progress and challenges.

In the wake of constant improvements in sequencing technologies, numerous insect genomes have been sequenced. Currently, 1219 insect genome-sequencing projects have been registered with the National Center for Biotechnology Information, including 401 that have genome assemblies and 155 with an official gene set of annotated protein-coding genes. Comparative genomics analysis showed that the expansion or contraction of gene families was associated with well-studied physiological traits such as immune system, metabolic detoxification, parasitism and polyphagy in insects. Here, we summarize the progress of insect genome sequencing, with an emphasis on how this impacts research on pest control. We begin with a brief introduction to the basic concepts of genome assembly, annotation and metrics for evaluating the quality of draft assemblies. We then provide an overview of genome information for numerous insect species, highlighting examples from prominent model organisms, agricultural pests and disease vectors. We also introduce the major insect genome databases. The increasing availability of insect genomic resources is beneficial for developing alternative pest control methods. However, many opportunities remain for developing data-mining tools that make maximal use of the available insect genome resources. Although rapid progress has been achieved, many challenges remain in the field of insect genomics. © 2019 The Royal Entomological Society.


April 21, 2020  |  

Morphological and genomic characterisation of the hybrid schistosome infecting humans in Europe reveals a complex admixture between Schistosoma haematobium and Schistosoma bovis parasites

Schistosomes cause schistosomiasis, the world's second most important parasitic disease after malaria. A peculiar feature of schistosomes is their ability to produce viable and fertile hybrids. Originally only present in the tropics, schistosomiasis is now also endemic in Europe. Based on two genetic markers, the European species had been identified as a hybrid between the ruminant-infective Schistosoma bovis and the human-infective Schistosoma haematobium. Here we describe for the first time the genomic composition of the European schistosome hybrid (77% of S. haematobium and 23% of S. bovis origins), its morphometric parameters and its compatibility with the European vector snail and intermediate host. Compatibility is a key parameter for the parasite's life cycle progression. We also show that egg morphology (a classical diagnostic parameter) does not allow for differential diagnosis while genetic tests do so. Additionally, we performed genome assembly improvement and annotation of S. bovis, the parental species for which no satisfactory genome assembly was available. For the first time since the discovery of hybrid schistosomes, these results reveal at the whole genomic level a complex admixture of parental genomes, highlighting (i) the high permeability of schistosomes to other species' alleles, and (ii) the importance of hybrid formation for pushing species boundaries not only conceptually but also geographically.


April 21, 2020  |  

Resequencing the Genome of Malassezia restricta Strain KCTC 27527.

The draft genome sequence of Malassezia restricta KCTC 27527, a clinical isolate from a patient with dandruff, was previously reported. Using the PacBio Sequel platform, we completed and reannotated the genome of M. restricta KCTC 27527 for a better understanding of the genome of this fungus. Copyright © 2019 Cho et al.


April 21, 2020  |  

A draft nuclear-genome assembly of the acoel flatworm Praesagittifera naikaiensis.

Acoels are primitive bilaterians with very simple soft bodies, in which many organs, including the gut, are not developed. They provide platforms for studying molecular and developmental mechanisms involved in the formation of the basic bilaterian body plan, whole-body regeneration, and symbiosis with photosynthetic microalgae. Because genomic information is essential for future research on acoel biology, we sequenced and assembled the nuclear genome of an acoel, Praesagittifera naikaiensis. To avoid sequence contamination derived from symbiotic microalgae, DNA was extracted from embryos that were free of algae. More than 290x sequencing coverage was achieved using a combination of Illumina (paired-end and mate-pair libraries) and PacBio sequencing. RNA sequencing and Iso-Seq data from embryos, larvae, and adults were also obtained. First, a preliminary ~17-kilobase pair (kb) mitochondrial genome was assembled, which was deleted from the nuclear sequence assembly. As a result, a draft nuclear genome assembly was ~656 Mb in length, with a scaffold N50 of 117 kb and a contig N50 of 57 kb. Although ~70% of the assembled sequences were likely composed of repetitive sequences that include DNA transposons and retrotransposons, the draft genome was estimated to contain 22,143 protein-coding genes, ~99% of which were substantiated by corresponding transcripts. We could not find horizontally transferred microalgal genes in the acoel genome. Benchmarking Universal Single-Copy Orthologs analyses indicated that 77% of the conserved single-copy genes were complete. Pfam domain analyses provided a basic set of gene families for transcription factors and signaling molecules. Our present sequencing and assembly of the P. naikaiensis nuclear genome are comparable to those of other metazoan genomes, providing basic information for future studies of genic and genomic attributes of this animal group. Such studies may shed light on the origins and evolution of simple bilaterians.
© The Author(s) 2019. Published by Oxford University Press.
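
Several of the abstracts in this listing report scaffold or contig N50 values. As a reminder of what the metric means, here is a minimal sketch of one common definition: the length of the shortest sequence in the smallest set of longest sequences that together cover at least half of the assembly. The function name and the example lengths are made up for illustration and are not from any of the assemblies above.

```python
def n50(lengths: list[int]) -> int:
    """Return the N50 of a collection of contig or scaffold lengths."""
    if not lengths:
        raise ValueError("lengths must not be empty")
    total = sum(lengths)
    running = 0
    # Walk from the longest sequence down until half the assembly is covered.
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length
    raise AssertionError("unreachable for non-empty positive lengths")


# Hypothetical toy assembly: total length 300, so half is 150.
# 80 + 70 = 150 reaches the halfway point, giving an N50 of 70.
print(n50([80, 70, 50, 40, 30, 20, 10]))
```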

