Mitigating Hallucinations in Large Vision-Language Models via Causal Route Gating

ArXiv CS.CV | 2026-05-26 | NEWS | en | Authors: Zhe Cheng, Wenyu Chen, Fode Zhang, Dehuan Shen

Details

Source site: ArXiv CS.CV
Authors: Zhe Cheng, Wenyu Chen, Fode Zhang, Dehuan Shen
Article type: NEWS
Language: en
Publication date: 2026-05-26

Abstract

arXiv:2605.24024v1 (announce type: new)

Large vision-language models (LVLMs) often hallucinate content that is fluent yet unsupported by the image, limiting their reliability in real-world deployment. We show that a key failure mode arises from route competition: even when visual tokens receive attention, the final token decision can be dominated by the textual pathway, causing the decoder to follow linguistic priors over visual evidence. To mitigate this, we propose a training-free, decision-aligned intervention that decomposes each attention head into a visual route and a text route, and estimates their token-level effects using an efficient one-forward/one-gradient approximation. These estimates reveal route conflict within heads and identify prior-dominant ones, enabling selective suppression of only the text route while keeping the visual route intact.
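The abstract does not give the exact estimator or selection rule, so the following is only a toy sketch of the general idea, not the authors' implementation. It assumes a single attention head at one query position, splits the head output into the part contributed by visual keys and the part contributed by text keys, scores each route with a first-order (gradient times activation) estimate of its effect on the decided token's logit, and scales down only the text route when the head looks prior-dominant. The function names, the `text_scale` parameter, and the dominance criterion (text route raises the logit while the visual route lowers it) are all hypothetical. In a real model the `grad` vector would come from one backward pass; here it is simply passed in.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def route_outputs(scores, values, is_visual):
    """Split one head's output into (visual_route, text_route) vectors.

    scores: attention logits for each key position
    values: value vector for each key position
    is_visual: True where the key position holds an image token
    """
    weights = softmax(scores)
    dim = len(values[0])
    vis = [0.0] * dim
    txt = [0.0] * dim
    for w, v, flag in zip(weights, values, is_visual):
        target = vis if flag else txt
        for i in range(dim):
            target[i] += w * v[i]
    return vis, txt


def route_effects(vis, txt, grad):
    """First-order effect of each route on the target logit.

    grad is d(logit)/d(head output), assumed to be obtained from a single
    backward pass; the estimate is grad . route_contribution.
    """
    return dot(grad, vis), dot(grad, txt)


def gate_head(scores, values, is_visual, grad, text_scale=0.0):
    """Suppress the text route only if the head looks prior-dominant.

    Hypothetical criterion: the text route pushes the logit up while the
    visual route pushes it down (a route conflict won by the text side).
    Returns (gated head output, prior_dominant flag).
    """
    vis, txt = route_outputs(scores, values, is_visual)
    e_vis, e_txt = route_effects(vis, txt, grad)
    prior_dominant = e_txt > 0 and e_vis < 0
    scale = text_scale if prior_dominant else 1.0
    out = [v + scale * t for v, t in zip(vis, txt)]
    return out, prior_dominant
```

With uniform attention over two visual and two text keys, a head whose text values favor the token while its visual values oppose it is flagged and its output falls back to the visual route alone; a head whose routes agree is left unchanged.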
