OmniGameArena: A Unified UE5 Benchmark for VLM Game Agents with Improvement Dynamics

Event: PRODUCT_LAUNCH | Date: 2026-06-09 | Impact: MEDIUM

arXiv:2606.09826v1 | Announce Type: new

Abstract: Vision-language model (VLM) agents are increasingly deployed in interactive game environments. Yet game benchmarks for VLM agents typically report a single first-attempt score per (agent, game) pair, focus on single-agent Solo play, and lack unified protocols for evaluating heterogeneous agent classes (commercial VLMs, open-weight VLMs, and specialized game polici
