July 7, 2019

LRCstats, a tool for evaluating long reads correction methods.

Authors: La, Sean and Haghshenas, Ehsan and Chauve, Cedric

Third-generation sequencing (TGS) platforms that generate long reads, such as PacBio and Oxford Nanopore technologies, have had a dramatic impact on genomics research. However, despite recent improvements, TGS reads suffer from high error rates, and the development of read correction methods is an active field of research. This motivates the need to develop tools that can evaluate the accuracy of noisy long reads correction tools.

We introduce LRCstats, a tool that measures the accuracy of long reads correction tools. LRCstats takes advantage of long reads simulators that provide each simulated read with an alignment to the reference genome segment it originates from, and does not rely on a step of mapping corrected reads onto the reference genome. This allows the accuracy of the correction to be measured in a way that is consistent with the actual errors introduced in the simulation process used to generate noisy reads. We illustrate the usefulness of LRCstats by analyzing the accuracy of four hybrid correction methods for PacBio long reads over three datasets.

https://github.com/cchauve/[email protected] or [email protected]

Supplementary data are available at Bioinformatics online.

© The Author 2017. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: [email protected]

Journal: Bioinformatics
DOI: 10.1093/bioinformatics/btx489
Year: 2017

