WBench: A Comprehensive Multi-turn Benchmark for Interactive Video World Model Evaluation
Event: PRODUCT_LAUNCH | Date: 2026-05-26 | Impact: MEDIUM
arXiv:2605.25874v1 | Announce Type: new
Abstract: Interactive world models are advancing rapidly, yet existing benchmarks cover only part of the required competencies, leaving no unified standard for systematic evaluation. To fill this gap, we introduce WBench, a comprehensive multi-turn benchmark for interactive world model evaluation along five dimensions, namely video quality, setting adherence, interacti