April 10, 2026  |  Human genetics research

Genome-wide classification of tumor-derived reads from bulk long-read sequencing

Authors: Toby M. Baker (1), Nedas Matulionis (1), Cassidy Andrasz (1,2), Dan Gerke (1), Natalia Garcia-Dutton (1,2), David Atkinson (1), Kami Chiotti (1), Selina Wu (1,2), Suchita Lulla (1,2), Jieun Oh (1,2), Helena Winata (1,2), Rong-Rong Huang (3), Jenny Lester (4), Beth Y. Karlan (4), Paul T. Spellman (1,2)

Affiliations: 1) Division of Hematology Oncology, Department of Medicine, David Geffen School of Medicine, University of California Los Angeles; 2) Department of Genetics, David Geffen School of Medicine, University of California Los Angeles; 3) Department of Pathology and Laboratory Medicine, David Geffen School of Medicine, University of California Los Angeles; 4) Department of Obstetrics and Gynecology, David Geffen School of Medicine, University of California Los Angeles

DNA extracted from tissue samples typically derives from a complex mixture of cell types. Without single-cell analysis, it has been generally impossible to determine the cell type of origin for most molecules. One clear example of this is in the complex milieu of a human neoplasm. Here, we develop ROCIT (https://github.com/tobybaker/rocit), a transformer-based model to classify the tumor or non-tumor origin of individual reads from bulk tumor samples sequenced with long-read whole-genome sequencing. Using somatic mutations to derive training data, ROCIT uses read-level methylation patterns to accurately classify reads from anywhere in the genome without requiring the adjacent normal tissue or the explicit identification of tumor differentially methylated regions. We apply ROCIT to a cohort of prostate and ovarian tumors and demonstrate high classification accuracy across the entire genome. We then demonstrate the potential of ROCIT predictions to improve somatic variant calling. ROCIT represents a major step forward in the analysis of bulk tumors with long reads, enabling the accurate and sensitive identification of reads with specific cell types of origin genome-wide.

Journal: bioRxiv
DOI: 10.64898/2026.03.03.709085
Year: 2026
