MAAT: Multi-phase Adapter-Aware Targeted Unlearning

arXiv cs.CL | 2026-06-01 | Authors: Suryash Yagnik, Shubham Gaur, Saksham Thakur, Vinija Jain, Aman Chadha, Amitava Das

Abstract

arXiv:2605.30514v1 (announce type: cross). Machine unlearning evaluation is structurally skewed: Why-type questions, which probe causal and relational knowledge, make up less than 0.06% of CounterFact, 0.6% of ZSRE, and less than 1.3% of TOFU, MUSE, and WMDP-Cyber. With so few Why-type questions, a method that fails on causal knowledge can still score highly in aggregate, and the failure goes undetected without balanced evaluation. We present 5WBENCH, a balanced 5,000-sample benchmark with 1,000 examples per 5W category (Who, What, When, Where, Why), making causal unlearning failures quantifiable for the first time. Using 5WBENCH, we show that no existing baseline simultaneously achieves high forgetting and high retention on Why-type questions: aggressive forgetting degrades retained knowledge, while conservative methods fail to forget causal facts. Why-type difficulty stems from multi-hop reasoning chains (44% of Why entries vs.
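The skew argument in the abstract can be illustrated with a small arithmetic sketch. The 0.6% Why share (the ZSRE figure) and the 1,000-per-category split (5WBENCH) come from the abstract. Everything else is an assumption made for illustration: the 10,000-question pool size, the per-category forget rates, and the helper name `aggregate_forget_rate` are hypothetical and are not results or code from the paper.

```python
CATEGORIES = ["Who", "What", "When", "Where", "Why"]


def aggregate_forget_rate(counts, forget_rate):
    """Count-weighted mean forget rate across question categories."""
    total = sum(counts.values())
    forgotten = sum(counts[c] * forget_rate[c] for c in counts)
    return forgotten / total


# Hypothetical method: forgets factual questions well, causal ones poorly.
# These rates are invented for illustration only.
forget_rate = {"Who": 0.95, "What": 0.95, "When": 0.95, "Where": 0.95, "Why": 0.10}

# Skewed pool: 60 of 10,000 questions (0.6%) are Why-type.
# The pool size of 10,000 is an assumption.
skewed_counts = {"Who": 2485, "What": 2485, "When": 2485, "Where": 2485, "Why": 60}

# Balanced pool in the style of 5WBENCH: 1,000 examples per category.
balanced_counts = {c: 1000 for c in CATEGORIES}

skewed_score = aggregate_forget_rate(skewed_counts, forget_rate)
balanced_score = aggregate_forget_rate(balanced_counts, forget_rate)

print(f"skewed aggregate:   {skewed_score:.4f}")    # 0.9449
print(f"balanced aggregate: {balanced_score:.4f}")  # 0.7800
print(f"Why-only:           {forget_rate['Why']:.4f}")
```

Under these assumed rates, the skewed aggregate stays near 0.94 even though the method forgets only 10% of Why-type facts. The balanced split lowers the aggregate to 0.78, and reporting the Why category on its own exposes the failure directly.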
