SAEmnesia: Erasing Concepts in Diffusion Models with Supervised Sparse Autoencoders 文章

ArXiv CS.CV2026-06-01NEWSen作者: Enrico Cassano, Riccardo Renzulli, Marco Nurisso, Mirko Zaffaroni, Alan Perotti, Marco Grangetto

查看原文 →

关系图谱

详细信息

来源站点: ArXiv CS.CV
作者: Enrico Cassano, Riccardo Renzulli, Marco Nurisso, Mirko Zaffaroni, Alan Perotti, Marco Grangetto
文章类型: NEWS
语言: en
发布日期: 2026-06-01

原文

摘要

arXiv:2509.21379v3 Announce Type: replace Abstract: Concept unlearning in diffusion models is hampered by feature splitting, where concepts are distributed across many latent features, making their removal challenging and computationally expensive. We introduce SAEmnesia, a supervised sparse autoencoder framework that overcomes this by enforcing one-to-one concept-neuron mappings. By systematically labeling concepts during training, our method achieves feature centralization, binding each concept to a single, interpretable neuron. This enables highly targeted and efficient concept erasure. Compared to the state-of-the-art sparse autoencoder-based unlearning approach, SAEmnesia reduces hyperparameter search by 96.67% and achieves a 9.22% improvement on the UnlearnCanvas benchmark for objects. Our method also shows superior scalability in sequential unlearning, improving accuracy by 28.

SAEmnesia: Erasing Concepts in Diffusion Models with Supervised Sparse Autoencoders 文章

详细信息

摘要

相关事件

相关公司

相关人物

相关产品查看全部 (2)

相关技术查看全部 (9)