Error correction Archives - Page 3 of 9

February 5, 2021 |

Webinar: Sequence with Confidence – Introducing the Sequel II System

In this webinar, Jonas Korlach, Chief Scientific Officer, PacBio, provides an overview of the features and advantages of the new Sequel II System. Kiran Garimella, Senior Computational Scientist, Broad…

February 5, 2021 |

User Group Meeting: Unbiased characterization of metagenome composition and function using HiFi sequencing on the PacBio Sequel II System

In this PacBio User Group Meeting presentation, PacBio scientist Meredith Ashby shared several examples of analysis — from full-length 16S sequencing to shotgun sequencing — showing how SMRT Sequencing enables…

February 5, 2021 |

Webinar: Unbiased, efficient characterization of metagenome functions with PacBio HiFi sequencing

Understanding interactions among plants and the complex communities of organisms living on, in and around them requires more than one experimental approach. A new method for de novo metagenome assembly,…

February 5, 2021 |

Webinar: Bioinformatics lunch & learn – Better assemblies of bacterial genomes and plasmids with the new microbial assembly pipeline in SMRT Link v8.0

Microbial Assembly is our latest pipeline, specifically designed to assemble bacterial genomes (between 2 and 10 Mb) and plasmids. This pipeline includes the implementation of a new, circular-aware read alignment…

February 5, 2021 |

Webinar: Sequencing 101 – How long-read sequencing improves access to genetic information

In this webinar, Kristin Mars, Sequencing Specialist, PacBio, presents an introduction to PacBio’s technology and its applications followed by a panel discussion among sequencing experts. The panel discussion addresses such…

February 5, 2021 |

AGBT Presentation: Generating high quality human reference assemblies with PacBio sequencing

Tina Graves-Lindsay from the McDonnell Genome Institute reports at AGBT 2020 on how her team is using PacBio sequencing to produce reference-grade human genome assemblies. With highly accurate HiFi reads,…

February 5, 2021 |

Webinar: Bioinformatics lunch & learn – HiFi assembly

The release of the PacBio Sequel II System in 2019 brought dramatic throughput improvements and protocols for producing a new data type, highly accurate long reads or HiFi reads. PacBio…

February 5, 2021 |

Webinar: Long HiFi reads for high-quality genome assemblies

In this LabRoots webinar, Jonas Korlach, CSO of PacBio, provides an introduction to PacBio HiFi sequence reads, which are both long (up to 25 kb currently) and accurate (>99%)…

February 5, 2021 |

Webinar: SMRT Sequencing applications for human genomics and medicine

In this webinar, Adam Ameur of SciLifeLab at Uppsala University shares how he uses Single Molecule, Real-Time (SMRT) Sequencing applications for medical diagnostics and human genetics research, including sequencing of…

February 5, 2021 |

Video Poster: Improving long-read assembly of microbial genomes and plasmids

Complete, high-quality microbial genomes are very valuable across a broad array of fields, from environmental studies, to human microbiome health, food pathogen surveillance, etc. Long-read sequencing enables accurate resolution of…

April 21, 2020 |

Improved assembly and variant detection of a haploid human genome using single-molecule, high-fidelity long reads.

The sequence and assembly of human genomes using long-read sequencing technologies has revolutionized our understanding of structural variation and genome organization. We compared the accuracy, continuity, and gene annotation of genome assemblies generated from either high-fidelity (HiFi) or continuous long-read (CLR) datasets from the same complete hydatidiform mole human genome. We find that the HiFi sequence data assemble an additional 10% of duplicated regions and more accurately represent the structure of tandem repeats, as validated with orthogonal analyses. As a result, an additional 5 Mbp of pericentromeric sequences are recovered in the HiFi assembly, resulting in a 2.5-fold increase in the NG50 within 1 Mbp of the centromere (HiFi 480.6 kbp, CLR 191.5 kbp). Additionally, the HiFi genome assembly was generated in significantly less time with fewer computational resources than the CLR assembly. Although the HiFi assembly has significantly improved continuity and accuracy in many complex regions of the genome, it still falls short of the assembly of centromeric DNA and the largest regions of segmental duplication using existing assemblers. Despite these shortcomings, our results suggest that HiFi may be the most effective standalone technology for de novo assembly of human genomes. © 2019 John Wiley & Sons Ltd/University College London.
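
The NG50 comparison in the abstract above can be illustrated with a minimal Python sketch. It assumes the common definition of NG50 (the contig length at which contigs of that length or longer cover at least half of the estimated genome size); the function name `ng50` and the toy contig lengths in the usage note are hypothetical and are not taken from the study. Only the two reported NG50 values (HiFi 480.6 kbp, CLR 191.5 kbp) come from the abstract.

```python
def ng50(contig_lengths, genome_size):
    """Return the NG50 of an assembly under the common definition.

    Contigs are visited from longest to shortest; the NG50 is the length of
    the contig at which the running total first reaches half of genome_size.
    Returns 0 if the contigs never cover half of the genome.
    """
    half = genome_size / 2
    covered = 0
    for length in sorted(contig_lengths, reverse=True):
        covered += length
        if covered >= half:
            return length
    return 0


# Reported pericentromeric NG50 values from the abstract, in kbp.
hifi_ng50_kbp = 480.6
clr_ng50_kbp = 191.5

# Ratio behind the "2.5-fold increase" quoted in the abstract.
fold_increase = hifi_ng50_kbp / clr_ng50_kbp
print(f"HiFi / CLR NG50 ratio: {fold_increase:.2f}")
```

As a usage example with made-up numbers, `ng50([40, 30, 20, 10], 200)` walks the contigs from longest to shortest and only reaches half of the 200-unit genome at the last contig, so it returns 10. Unlike N50, the result depends on the assumed genome size rather than on the total assembly length.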

April 21, 2020 |

Tandem repeats lead to sequence assembly errors and impose multi-level challenges for genome and protein databases.

The widespread occurrence of repetitive stretches of DNA in genomes of organisms across the tree of life imposes fundamental challenges for sequencing, genome assembly, and automated annotation of genes and proteins. This multi-level problem can lead to errors in genome and protein databases that are often not recognized or acknowledged. As a consequence, end users working with sequences with repetitive regions are faced with ‘ready-to-use’ deposited data whose trustworthiness is difficult to determine, let alone to quantify. Here, we provide a review of the problems associated with tandem repeat sequences that originate from different stages during the sequencing-assembly-annotation-deposition workflow, and that may proliferate in public database repositories affecting all downstream analyses. As a case study, we provide examples of the Atlantic cod genome, whose sequencing and assembly were hindered by a particularly high prevalence of tandem repeats. We complement this case study with examples from other species, where mis-annotations and sequencing errors have propagated into protein databases. With this review, we aim to raise the awareness level within the community of database users, and alert scientists working in the underlying workflow of database creation that the data they omit or improperly assemble may well contain important biological information valuable to others. © The Author(s) 2019. Published by Oxford University Press on behalf of Nucleic Acids Research.

April 21, 2020 |

A robust benchmark for germline structural variant detection

New technologies and analysis methods are enabling genomic structural variants (SVs) to be detected with ever-increasing accuracy, resolution, and comprehensiveness. Translating these methods to routine research and clinical practice requires robust benchmark sets. We developed the first benchmark set for identification of both false negative and false positive germline SVs, which complements recent efforts emphasizing increasingly comprehensive characterization of SVs. To create this benchmark for a broadly consented son in a Personal Genome Project trio with broadly available cells and DNA, the Genome in a Bottle (GIAB) Consortium integrated 19 sequence-resolved variant calling methods, both alignment- and de novo assembly-based, from short-, linked-, and long-read sequencing, as well as optical and electronic mapping. The final benchmark set contains 12745 isolated, sequence-resolved insertion and deletion calls ≥50 base pairs (bp) discovered by at least 2 technologies or 5 callsets, genotyped as heterozygous or homozygous variants by long reads. The Tier 1 benchmark regions, for which any extra calls are putative false positives, cover 2.66 Gbp and 9641 SVs supported by at least one diploid assembly. Support for SVs was assessed using svviz with short-, linked-, and long-read sequence data. In general, there was strong support from multiple technologies for the benchmark SVs, with 90 % of the Tier 1 SVs having support in reads from more than one technology. The Mendelian genotype error rate was 0.3 %, and genotype concordance with manual curation was >98.7 %. We demonstrate the utility of the benchmark set by showing it reliably identifies both false negatives and false positives in high-quality SV callsets from short-, linked-, and long-read sequencing and optical mapping.

April 21, 2020 |

Plasmid-encoded tet(X) genes that confer high-level tigecycline resistance in Escherichia coli.

Tigecycline is one of the last-resort antibiotics to treat complicated infections caused by both multidrug-resistant Gram-negative and Gram-positive bacteria1. Tigecycline resistance has sporadically occurred in recent years, primarily due to chromosome-encoding mechanisms, such as overexpression of efflux pumps and ribosome protection2,3. Here, we report the emergence of the plasmid-mediated mobile tigecycline resistance mechanism Tet(X4) in Escherichia coli isolates from China, which is capable of degrading all tetracyclines, including tigecycline and the US FDA newly approved eravacycline. The tet(X4)-harbouring IncQ1 plasmid is highly transferable, and can be successfully mobilized and stabilized in recipient clinical and laboratory strains of Enterobacteriaceae bacteria. It is noteworthy that tet(X4)-positive E. coli strains, including isolates co-harbouring mcr-1, have been widely detected in pigs, chickens, soil and dust samples in China. In vivo murine models demonstrated that the presence of Tet(X4) led to tigecycline treatment failure. Consequently, the emergence of plasmid-mediated Tet(X4) challenges the clinical efficacy of the entire family of tetracycline antibiotics. Importantly, our study raises concern that the plasmid-mediated tigecycline resistance may further spread into various ecological niches and into clinical high-risk pathogens. Collective efforts are in urgent need to preserve the potency of these essential antibiotics.

April 21, 2020 |

Variant Phasing and Haplotypic Expression from Single-molecule Long-read Sequencing in Maize

Haplotype phasing of genetic variants is important for interpretation of the maize genome, population genetic analysis, and functional genomic analysis of allelic activity. Accordingly, accurate methods for phasing full-length isoforms are essential for functional genomics study. In this study, we performed an isoform-level phasing study in maize, using two inbred lines and their reciprocal crosses, based on single-molecule full-length cDNA sequencing. To phase and analyze full-length transcripts between hybrids and parents, we developed a tool called IsoPhase. Using this tool, we validated the majority of SNPs called against matching short read data and identified cases of allele-specific, gene-level, and isoform-level expression. Our results revealed that maize parental and hybrid lines exhibit different splicing activities. After phasing 6,847 genes in two reciprocal hybrids using embryo, endosperm and root tissues, we annotated the SNPs and identified large-effect genes. In addition, based on single-molecule sequencing, we identified parent-of-origin isoforms in maize hybrids, different novel isoforms between maize parent and hybrid lines, and imprinted genes from different tissues. Finally, we characterized variation in cis- and trans-regulatory effects. Our study provides measures of haplotypic expression that could increase power and accuracy in studies of allelic expression.
