Why Far Looks Up: Probing Spatial Representation in Vision-Language Models
Event: PRODUCT_LAUNCH · 2026-05-29 · Impact: MEDIUM
arXiv:2605.30161v1 · Announce Type: new
Abstract: Vision-language models (VLMs) achieve strong performance on spatial reasoning benchmarks, yet it remains unclear whether this reflects structured 3D understanding or reliance on statistical shortcuts in natural images. We introduce a representation-level analysis framework that constructs minimal contrastive pairs to measure how spatial axes are organized and disentangled w
Related coverage
Why Far Looks Up: Probing Spatial Representation in Vision-Language Models
ArXiv CS.CV · 2026-05-29