MobiBench: Multi-Branch, Modular Benchmark for Mobile GUI Agents

Event: PRODUCT_LAUNCH | 2026-06-02 | Impact: MEDIUM

arXiv:2512.12634v4 (Announce Type: replace)

Abstract: Mobile GUI agents, AI agents capable of interacting with mobile applications on behalf of users, have the potential to transform human-computer interaction. However, current evaluation practices for GUI agents face two fundamental limitations. First, they rely either on single-path offline benchmarks or on online live benchmarks. Offline benchmarks using static, single-path annotate