Unlearning's Blind Spots: Over-Unlearning and Prototypical Relearning Attack

ArXiv cs.AI, 2026-06-01. Authors: SeungBum Ha, Saerom Park, Sung Whan Yoon

Abstract

arXiv:2506.01318v4

Machine unlearning (MU) aims to expunge a designated forget set from a trained model without costly retraining, yet existing techniques overlook two critical blind spots: "over-unlearning," which degrades retained data near the forget set, and post-hoc "relearning" attacks, which aim to resurrect the forgotten knowledge. Focusing on class-level unlearning, we first derive an over-unlearning metric, OU@epsilon, which quantifies collateral damage in regions proximal to the forget set, where over-unlearning mainly occurs. Next, we expose an unforeseen relearning threat to MU, the Prototypical Relearning Attack, which exploits the per-class prototype of the forget class using just a few samples and easily restores pre-unlearning performance.
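The abstract does not spell out how the attack is carried out, so the following is only a minimal sketch of the general idea of prototype-based relearning, not the authors' procedure. It assumes a linear classification head over fixed feature vectors, models "unlearning" by simply zeroing the forget-class row of that head, and uses synthetic Gaussian features; the function names (`prototype_relearn`, `class_accuracy`, `demo`) and all parameters are hypothetical.

```python
import numpy as np

def prototype_relearn(head_weights, few_shot_features, forget_class):
    """Return a copy of the head whose forget-class row is reset to a prototype.

    Assumption: the prototype is the mean feature vector of a few
    forget-class samples, rescaled to the average norm of the retained rows.
    """
    prototype = few_shot_features.mean(axis=0)
    retained = np.delete(head_weights, forget_class, axis=0)
    target_norm = np.linalg.norm(retained, axis=1).mean()
    new_head = head_weights.copy()
    new_head[forget_class] = prototype / np.linalg.norm(prototype) * target_norm
    return new_head

def class_accuracy(head_weights, features, label):
    """Fraction of `features` that the linear head assigns to `label`."""
    preds = (features @ head_weights.T).argmax(axis=1)
    return float((preds == label).mean())

def demo(seed=0, num_classes=5, dim=16, forget_class=0, shots=5):
    """Toy experiment: forget-class accuracy before and after the reset."""
    rng = np.random.default_rng(seed)
    means = 5.0 * np.eye(num_classes, dim)  # synthetic, well-separated classes

    def sample(c, n):
        return means[c] + 0.5 * rng.standard_normal((n, dim))

    # Stand-in for an unlearned model: the forget-class row is zeroed out.
    unlearned = np.eye(num_classes, dim)
    unlearned[forget_class] = 0.0

    test_forget = sample(forget_class, 200)
    before = class_accuracy(unlearned, test_forget, forget_class)
    attacked = prototype_relearn(unlearned, sample(forget_class, shots), forget_class)
    after = class_accuracy(attacked, test_forget, forget_class)
    return before, after

if __name__ == "__main__":
    before, after = demo()
    print(f"forget-class accuracy before: {before:.2f}, after: {after:.2f}")
```

In this toy setting the feature extractor is left untouched, which is why a handful of samples is enough to rebuild a usable class direction; whether and how strongly this carries over to real unlearned networks is what the paper itself evaluates.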
