Event: Are VLMs Seeing or Just Saying? Uncovering the Illusion of Visual Re-examination

PRODUCT_LAUNCH | 2026-05-28 | Impact: MEDIUM

arXiv:2605.15864v2 | Announce Type: replace

Abstract: Vision-Language Models (VLMs) often produce self-reflective statements like "let me check the figure again" during reasoning. Do such statements trigger genuine visual re-examination, or are they merely learned textual patterns? We investigate this via VisualSwap, an image-swap probing framework: after a model reasons over an image, we replace it with a visually s