Towards Localized and Disentangled Knowledge Editing for Multimodal Large Language Models

arXiv cs.CL, 2026-05-29. Authors: Leijiang Gu, Zhen Zeng, Feng Li, Xinjian Gao, Zenglin Shi

Abstract

arXiv:2605.29826v1

Existing methods in Multimodal Knowledge Editing (MKE) have advanced the ability to correct outdated or inaccurate knowledge in Multimodal Large Language Models (MLLMs). However, they exhibit a critical limitation: while effectively modifying target factual pairs, they fail to generalize edits to logically related queries and often cause unintended alterations to unrelated but visually or semantically linked information. We identify and formalize two underlying failure modes causing this issue: Causal Misalignment, which confines edits to the specific sample, and Feature Entanglement, which causes unintended alterations to coupled but irrelevant information. To address these issues, we propose Localized and Disentangled Knowledge Editing (LDKE), a new framework that achieves precise and generalized editing by localizing fact-specific model layers and disentangling target-relevant inputs from irrelevant ones.