PoseRefer: Pathway-Local Parameters for Semantically Grounded Reference Resolution
Event type: PRODUCT_LAUNCH | Date: 2026-05-26 | Impact: MEDIUM
arXiv:2605.24622v1. Announce type: cross.
Abstract: A robot resolving "put the cup on that one" must fuse gesture, language, and scene geometry, yet 3D grounding benchmarks only partially capture this regime: descriptions are written post-hoc, gestures are templated, or pointing is staged for the camera. MM-Conv captures natural co-speech gesture from dyadic VR interaction alongside full-body motion capture and 3
Source: arXiv cs.CV, 2026-05-26