SimuWoB: Simulating Real-World Mobile Apps for Fast and Faithful GUI Agent Benchmarking (Event)

OPEN_SOURCE | 2026-05-26 | Impact: MEDIUM

arXiv:2605.25160v1 | Announce Type: new

Abstract: Mobile GUI agents powered by large language models have progressed rapidly, creating an urgent need for realistic and comprehensive evaluation. Existing benchmarks prioritize reproducibility but are often limited to open-source apps or file-operation tasks because of the difficulty of constructing rewards on real applications, leaving a gap between benchmark settings an