Correcting Visual Blur Induced by Attention Distraction to Reduce Hallucinations: Algorithm and Theory

arXiv cs.CV | 2026-06-04 | Authors: Quanjiang Li, Zhiming Liu, Wei Luo, Tingjin Luo, Chenping Hou

Abstract

arXiv:2605.24602v2

Multimodal large language models (MLLMs) frequently suffer from object hallucinations, yet the visual perceptual mechanism underlying this failure remains poorly understood. In this work, we show that hallucinations are strongly associated with an attention distraction phenomenon that resembles human behavior: people whose focus is divided experience degraded visual clarity and produce inaccurate descriptions. In models, the same mechanism manifests in two ways: spatial inconsistency across the heads of multi-head attention, and temporal fading of attention to image tokens during decoding. We further provide theoretical insights showing that attention dispersion increases model complexity and degrades classification generalization.
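The two model-side symptoms named in the abstract can be made concrete with a small diagnostic. The sketch below is illustrative only and is not the authors' method: the function names `image_attention_mass` and `head_disagreement`, the nested-list input layout, and the choice of a generalized Jensen-Shannon divergence as the measure of spatial inconsistency are all assumptions introduced here.

```python
import math

# Assumed (hypothetical) input layout: attn[t][h] is a list of attention
# weights over all tokens at decoding step t for head h; image_idx lists
# the positions of the image tokens.

def image_attention_mass(attn, image_idx):
    """Per-step attention mass on image tokens, averaged over heads.
    A downward trend over steps would indicate temporal fading."""
    out = []
    for step in attn:
        per_head = [sum(head[i] for i in image_idx) for head in step]
        out.append(sum(per_head) / len(per_head))
    return out

def _entropy(p):
    # Shannon entropy in nats; zero-probability entries contribute nothing.
    return -sum(x * math.log(x) for x in p if x > 0)

def head_disagreement(attn, image_idx):
    """Per-step generalized Jensen-Shannon divergence between the heads'
    attention distributions restricted to image tokens. It is 0 when all
    heads agree and grows as heads attend to different regions."""
    out = []
    for step in attn:
        dists = []
        for head in step:
            w = [head[i] for i in image_idx]
            total = sum(w)
            if total > 0:  # skip heads that ignore the image entirely
                dists.append([x / total for x in w])
        if not dists:
            out.append(0.0)
            continue
        mean = [sum(col) / len(dists) for col in zip(*dists)]
        avg_entropy = sum(_entropy(d) for d in dists) / len(dists)
        out.append(_entropy(mean) - avg_entropy)
    return out

# Toy input: 3 decoding steps, 2 heads, 4 tokens (the first two are image tokens).
toy_attn = [
    [[0.4, 0.4, 0.1, 0.1], [0.4, 0.4, 0.1, 0.1]],  # heads agree, image-focused
    [[0.8, 0.0, 0.1, 0.1], [0.0, 0.8, 0.1, 0.1]],  # heads disagree spatially
    [[0.1, 0.1, 0.4, 0.4], [0.1, 0.1, 0.4, 0.4]],  # attention to image has faded
]
mass = image_attention_mass(toy_attn, [0, 1])
disagreement = head_disagreement(toy_attn, [0, 1])
```

On the toy input, the image attention mass drops at the last step while head disagreement peaks at the middle step, which separates the temporal and spatial effects. In practice the weights would come from a model's attention outputs rather than hand-written lists.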
