Interested in learning about pangenomes? Explore this guide to see how they provide a more complete picture of the core genes of a given species and how that picture can improve biological understanding.
Learn how Single Molecule, Real-Time (SMRT) Sequencing and the Sequel IIe System will accelerate your research by delivering highly accurate long reads to provide the most comprehensive view of genomes, transcriptomes and epigenomes.
Most of the base pairs that differ between two human genomes are in intermediate-sized structural variants (50 bp to 5 kb), which are too small to detect with array CGH but too large to reliably discover with short-read NGS. PacBio Single Molecule, Real-Time (SMRT) Sequencing fills this technology gap. SMRT Sequencing detects tens of thousands of structural variants in a human genome, approximately five times the sensitivity of short-read NGS. To discover variants using SMRT Sequencing, we have developed pbsv, which is available in version 5 of the PacBio SMRT Link software suite. The pbsv algorithm applies a sequence of stages:…
This webinar highlights global initiatives currently underway to use Single Molecule, Real-Time (SMRT) Sequencing to de novo assemble genomes of individuals representing multiple ethnic populations, thereby extending the diversity of available human reference genomes. In their presentations, Tina Graves-Lindsay from Washington University and Adam Ameur from Uppsala University spoke about diploid assemblies, discovering novel sequence and improving diversity of the current human reference genome. Finally, Paul Peluso of PacBio presented data from the recent effort to sequence a Puerto Rican genome and shared a SMRT Sequencing technology roadmap showing the next several upgrades for the Sequel System.
Rapidly evolving RNA viruses prevail within a host as a collection of closely related variants, referred to as viral quasispecies. Advances in high-throughput sequencing (HTS) technologies have facilitated the assessment of the genetic diversity of such virus populations at an unprecedented level of detail. However, analysis of HTS data from virus populations is challenging due to short, error-prone reads. To account for uncertainties originating from these limitations, several computational and statistical methods have been developed for studying the genetic heterogeneity of virus populations. Here, we review methods for the analysis of HTS reads, including approaches to local diversity…
In contrast to other available next-generation sequencing platforms, PacBio single-molecule, real-time (SMRT) sequencing has the advantage of generating long reads albeit with a relatively higher error rate in unprocessed data. Using this platform, we longitudinally sampled and sequenced the hepatitis C virus (HCV) envelope genome region (1,680 nucleotides [nt]) from individuals belonging to a cluster of sexually transmitted cases. All five subjects were coinfected with HIV-1 and a closely related strain of HCV genotype 4d. In total, 50 samples were analyzed by using SMRT sequencing. By using 7 passes of circular consensus sequencing, the error rate was reduced to 0.37%,…
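The idea that repeated passes over the same molecule lower the error rate can be illustrated with a toy example. The sketch below is not the method used in the study or in any PacBio software: it assumes the passes are already aligned and of equal length, and it uses a plain per-position majority vote, whereas real circular consensus software is more sophisticated. The function names and the example sequences are hypothetical.

```python
from collections import Counter


def majority_consensus(passes):
    """Return the per-position majority base across pre-aligned passes.

    Assumption for this sketch: every pass has the same length and is
    already aligned, so column i of each pass covers the same base.
    """
    if not passes:
        raise ValueError("need at least one pass")
    length = len(passes[0])
    if any(len(p) != length for p in passes):
        raise ValueError("this sketch assumes equal-length, aligned passes")
    consensus = []
    for column in zip(*passes):
        # most_common(1) yields the base seen most often in this column
        base, _count = Counter(column).most_common(1)[0]
        consensus.append(base)
    return "".join(consensus)


def error_rate(seq, truth):
    """Fraction of positions at which seq differs from truth."""
    mismatches = sum(a != b for a, b in zip(seq, truth))
    return mismatches / len(truth)


# Hypothetical 10-base template and 7 noisy passes over it.
truth = "ACGTACGTAC"
passes = [
    "ACGTACGTAC",
    "ACGAACGTAC",  # error at position 3
    "ACGTACGTCC",  # error at position 8
    "TCGTACGTAC",  # error at position 0
    "ACGTACCTAC",  # error at position 6
    "ACGTACGTAC",
    "ACGTATGTAC",  # error at position 5
]

consensus = majority_consensus(passes)
print("single-pass error rate:", error_rate(passes[1], truth))
print("consensus error rate:  ", error_rate(consensus, truth))
```

Because the errors in this toy data fall at different positions in different passes, each one is outvoted by the other six passes. Errors that recur at the same position across passes would not be corrected by a simple vote.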
Complete and accurate reference genomes and annotations provide fundamental tools for characterization of genetic and functional variation. These resources facilitate the determination of biological processes and support translation of research findings into improved and sustainable agricultural technologies. Many reference genomes for crop plants have been generated over the past decade, but these genomes are often fragmented and missing complex repeat regions. Here we report the assembly and annotation of a reference genome of maize, a genetic and agricultural model species, using single-molecule real-time sequencing and high-resolution optical mapping. Relative to the previous reference genome, our assembly features a 52-fold increase…
The fungal genus Aspergillus is highly interesting, containing everything from industrial cell factories and model organisms to human pathogens. In particular, this group has a prolific production of bioactive secondary metabolites (SMs). In this work, four diverse Aspergillus species (A. campestris, A. novofumigatus, A. ochraceoroseus, and A. steynii) have been whole-genome PacBio sequenced to provide genetic references in three Aspergillus sections. A. taichungensis and A. candidus also were sequenced for SM elucidation. Thirteen Aspergillus genomes were analyzed with comparative genomics to determine phylogeny and genetic diversity, showing that each presented genome contains 15-27% genes not found in other sequenced Aspergilli. In particular, A. novofumigatus was compared with the pathogenic species A. fumigatus. This suggests that A. novofumigatus can produce most of…
Whole-genome sequence (WGS) data are commonly used to design diagnostic targets for the identification of bacterial pathogens. To do this effectively, genomics databases must be comprehensive to identify the strict core genome that is specific to the target pathogen. As additional genomes are analyzed, the core genome size is reduced and there is erosion of the target-specific regions due to commonality with related species, potentially resulting in the identification of false positives and/or false negatives. A comparative analysis of 1,130 Burkholderia genomes identified unique markers for many named species, including the human pathogens B. pseudomallei and B. mallei. Due to core genome reduction…
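The shrinking of the core genome as more genomes are analyzed follows from treating the strict core as the set of genes present in every genome. The sketch below is only an illustration under that assumption. It is not the pipeline used in the Burkholderia analysis, and the gene names and the `core_genome_sizes` helper are hypothetical.

```python
def core_genome_sizes(genomes):
    """Track the strict core genome as genomes are added one at a time.

    Each genome is modeled as a set of gene identifiers. The strict core
    is the intersection of all genomes seen so far, so its size can stay
    the same or shrink with each addition, but never grow.
    """
    core = None
    sizes = []
    for genes in genomes:
        core = set(genes) if core is None else core & set(genes)
        sizes.append(len(core))
    return sizes, (core if core is not None else set())


# Hypothetical gene content for four genomes of one species.
genomes = [
    {"geneA", "geneB", "geneC", "geneD"},
    {"geneA", "geneB", "geneC"},
    {"geneA", "geneC", "geneD"},
    {"geneA", "geneC", "geneE"},
]

sizes, core = core_genome_sizes(genomes)
print("core size after each genome:", sizes)
print("final strict core:", sorted(core))
```

In this toy data the core drops from four genes to two as genomes are added, which mirrors why a diagnostic target chosen from a small database may stop being universal, or stop being specific, once more genomes are included.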