StemBind: When MLLMs Get Lost Between Rules and Instances in Abstract Visual Reasoning
Event: PRODUCT_LAUNCH | Date: 2026-06-02 | Impact: MEDIUM
arXiv:2606.00148v1 | Announce Type: new
Abstract: Multimodal large language models (MLLMs) often know the rule but pick the wrong answer: on abstract visual reasoning (AVR) tasks, a model can describe what it sees and name the underlying pattern, yet still fail to choose the matching candidate. Existing AVR benchmarks cannot detect this because they collapse perception, rule induction, and answer selection into