Why Far Looks Up: Probing Spatial Representation in Vision-Language Models
Event: PRODUCT_LAUNCH | 2026-05-29 | Impact: MEDIUM
arXiv:2605.30161v1 | Announce Type: new
Abstract: Vision-language models (VLMs) achieve strong performance on spatial reasoning benchmarks, yet it remains unclear whether this reflects structured 3D understanding or reliance on statistical shortcuts in natural images. We introduce a representation-level analysis framework that constructs minimal contrastive pairs to measure how spatial axes are organized and disentangled w
ArXiv CS.CV | 2026-05-29