Where to Look: Can Foundation Models Reach a Target Viewpoint Through Active Exploration? (Event)

OPEN_SOURCE · 2026-06-02 · Impact: MEDIUM

arXiv:2606.01247v1 (announce type: new)

Abstract: Humans can reproduce the viewpoint specified by a target image through active head and body motion, yet spatial intelligence in foundation models has largely been studied as passive understanding of pre-collected observations. We introduce Target Viewpoint Reproduction (TVR) -- an active task where an agent adjusts its viewpoint in a 3D environment until its
