On the Hidden Costs of Counterfactual Knowledge Training in LLM Unlearning
Event type: PRODUCT_LAUNCH | Date: 2026-05-27 | Impact: MEDIUM
arXiv:2605.27083v1
Abstract: Counterfactual tuning (CFT) has emerged as a promising paradigm for Large Language Model (LLM) unlearning by training models to generate alternative fictitious knowledge in place of undesired content. However, in this work, we find that this paradigm still underperforms other paradigms in some aspects, and identify two previously overlooked pitfalls underlying this gap: (1)
Source: arXiv cs.CL, 2026-05-27