Adaptive Dense Evidence Refinement for Video Relational Reasoning for VRR-QA Challenge

Event: PRODUCT_LAUNCH | Date: 2026-06-02 | Impact: MEDIUM

arXiv:2606.01104v1 | Announce Type: new

Abstract: VRR-QA evaluates whether video-language systems can infer spatial, temporal, viewpoint, depth, and visibility relations that are not always resolved by a single frame. We present an inference-only system built around adaptive test-time computation. The system first answers each question with a direct video-language model pass, then uses multiple lightweight view