The incomplete identification of structural variants (SVs) from whole-genome sequencing data limits studies of human genetic diversity and disease association. Here, we apply a suite of long-read, short-read, strand-specific sequencing technologies, optical mapping, and variant discovery algorithms to comprehensively analyze three trios to define the full spectrum of human genetic variation in a haplotype-resolved manner. We identify 818,054 indel variants (
Members of the genus Juglans are monoecious wind-pollinated trees in the family Juglandaceae with highly heterozygous genomes, which greatly complicate genome sequence assembly. The genomes of interspecific hybrids are usually composed of the haploid genomes of the parental species. We exploited this attribute of interspecific hybrids to avoid heterozygosity and sequenced an interspecific hybrid Juglans microcarpa × J. regia using a novel combination of single-molecule sequencing and optical genome mapping technologies. The resulting assemblies of both genomes were remarkably complete, including chromosome termini and centromere regions. Chromosome termini consisted of arrays of telomeric repeats about 8 kb long and heterochromatic subtelomeric regions about 10 kb long.…
Allen Van Deynze from UC Davis presents the genome sequencing and assembly project for spinach, an organism with a 980 Mb genome. Results indicate a high-accuracy assembly with a significantly higher contig N50 than a previous short-read assembly. The PacBio assembly has also made it possible to fill gaps in the prior assembly.
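For readers unfamiliar with the N50 statistic mentioned in this summary, the following is a minimal sketch of how a contig N50 is commonly computed: sort contig lengths from longest to shortest and report the length at which the running total first reaches half of the assembly size. The function name `n50` and the example lengths are hypothetical illustrations, not values from the spinach project.

```python
def n50(lengths):
    """Return the N50 of a collection of contig lengths.

    N50 is the length L such that contigs of length >= L together
    cover at least half of the total assembled bases.
    """
    if not lengths:
        raise ValueError("need at least one contig length")
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


# Hypothetical contig lengths (arbitrary units), for illustration only.
example_contigs = [80, 70, 50, 40, 30, 20, 10]
print(n50(example_contigs))
```

A higher N50 means that more of the assembly sits in long contigs, which is why it is often quoted when comparing long-read and short-read assemblies.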
In this webinar, Kristin Mars, Sequencing Specialist, PacBio, presents an introduction to PacBio’s technology and its applications, followed by a panel discussion among sequencing experts. The panel discussion addresses what long reads are and how they are useful, what differentiates PacBio long-read sequencing from other technologies, and which applications PacBio offers and how they can benefit scientific research.
Lizzie Wilbanks, formerly from UC Davis, discusses how long reads from SMRT Sequencing allow accurate assembly of members of the complex pink berry salt marsh community.
Paul Hagerman, MD/PhD, a professor in the biochemistry and molecular medicine department at UC Davis discusses the use of PacBio SMRT sequencing technology for the fragile X gene. Hagerman says the PacBio RS is able to sequence through more than a kilobase of the CGG trinucleotide repeat element underlying Fragile X Syndrome — something no other sequencing platform has achieved. He also plans to use the data to study methylation of this gene, which tends to occur in cases where there are more than 200 copies of the CGG element.
Bart Weimer, a professor at the University of California, Davis, who is leading the 100K Foodborne Pathogen Genome Project, talks about using PacBio sequencing to produce long reads for microbial genomes as well as to study how bacteria use epigenetics to regulate gene expression.
UC Davis’s Bart Weimer describes foodborne pathogens and their proclivity for rapid genome rearrangement. The 100K Pathogen Genome Project he leads is using PacBio long-read sequencing to close genomes and analyze methylation; Weimer reports that his team has already discovered new epigenetic modifications in Salmonella and Listeria with the technology.
Simon Chan of UC Davis discusses how PacBio long-read sequencing revealed higher-order repeats in the centromeres of switchgrass, which would have remained hidden with only the much shorter Sanger reads.
Grant Cramer from the University of Nevada, Reno, and Dario Cantu from the University of California, Davis, discuss past challenges with sequencing Clone 8 of Cabernet Sauvignon (Vitis vinifera). An assembly of the genome was attempted with approximately 110x Illumina reads and 5x PacBio reads. The PacBio SMRT Sequencing reads made major improvements in the assembly compared with the results from Illumina reads alone. However, the assembly results were still unsatisfactory, so an additional 100-fold SMRT Sequencing coverage was generated. An update on the current sequencing results and the status of the assembly is presented.
In this ASHG 2016 virtual poster, Flora Tassone from UC Davis describes her study of the molecular mechanisms linked to fragile X syndrome and associated disorders, such as FXTAS. She is using SMRT Sequencing to resolve the FMR1 gene in premutation carriers because it’s the only technology that can generate full-length transcripts with the causative CGG repeat expansion. Plus: direct confirmation of predicted isoform configurations.
Recent improvements in sequencing chemistry and instrument performance combine to create a new PacBio data type, Single Molecule High-Fidelity reads (HiFi reads). Increased read length and improvements in library construction enable average read lengths of 10-20 kb with average sequence identity greater than 99% from raw single-molecule reads. The resulting reads have accuracy comparable to short-read NGS but are 50-100 times longer. Here we benchmark the performance of this data type by sequencing and genotyping the Genome in a Bottle (GIAB) HG0002 human reference sample from the National Institute of Standards and Technology (NIST). We…
High-quality insect genomes are essential resources to understand insect biology and to combat them as disease vectors and agricultural pests. It is desirable to sequence a single individual for a reference genome to avoid complications from multiple alleles during de novo assembly. However, the small body size of many insects poses a challenge for the use of long-read sequencing technologies which often have high DNA-input requirements. The previously described PacBio Low DNA Input Protocol starts with ~100 ng of DNA and allows for high-quality assemblies of single mosquitoes among others and represents a significant step in reducing such requirements. Here,…
Alleles of the FMR1 gene with more than 200 CGG repeats generally undergo methylation-coupled gene silencing, resulting in fragile X syndrome, the leading heritable form of cognitive impairment. Smaller expansions (55-200 CGG repeats) result in elevated levels of FMR1 mRNA, which is directly responsible for the late-onset neurodegenerative disorder, fragile X-associated tremor/ataxia syndrome (FXTAS). For mechanistic studies and genetic counseling, it is important to know with precision the number of CGG repeats; however, no existing DNA sequencing method is capable of sequencing through more than ~100 CGG repeats, thus limiting the ability to precisely characterize the disease-causing alleles. The recent…
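The repeat-count ranges stated in this abstract (more than 200 CGG repeats, and 55-200 CGG repeats) can be illustrated with a minimal sketch that counts the longest uninterrupted CGG run in a read and places it in one of those ranges. The function names and the toy sequence are hypothetical, and the sketch is not the authors' method: it assumes an error-free sequence and ignores interruptions within the repeat tract.

```python
import re


def longest_cgg_run(sequence):
    """Return the number of repeat units in the longest uninterrupted CGG run."""
    runs = re.findall(r"(?:CGG)+", sequence.upper())
    return max((len(run) // 3 for run in runs), default=0)


def classify_fmr1_allele(cgg_repeats):
    """Place a repeat count in the ranges given in the abstract above."""
    if cgg_repeats > 200:
        return "more than 200 repeats"
    if cgg_repeats >= 55:
        return "55-200 repeats"
    return "fewer than 55 repeats"


# Toy sequence for illustration only: 60 CGG units between arbitrary flanks.
toy_read = "ATGC" + "CGG" * 60 + "TTAA"
count = longest_cgg_run(toy_read)
print(count, classify_fmr1_allele(count))
```

Counting an exact run like this only works when a single read spans the whole repeat tract, which is the limitation the abstract describes for alleles beyond roughly 100 repeats.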