Vision-Language Models Mistake Head Orientation for Gaze Direction: Nonverbal Conversation Cues
Event: SHUTDOWN | 2026-06-02 | Impact: LOW
arXiv:2506.05412v4 Announce Type: replace
Abstract: Where someone looks is a nonverbal communication cue that children and adults readily use. How well can Vision-Language Models (VLMs) infer gaze targets? To construct evaluation stimuli, we captured 1,360 real-world photos of scenes in which a person gazes at one of several objects on a table. Importantly, we also controlled the gazer's head orienta