MacArena: Benchmarking Computer Use Agents on an Online macOS Environment · Event
PRODUCT_LAUNCH · 2026-06-08 · Impact: MEDIUM
arXiv:2606.06560v1 · Announce Type: cross
Abstract: Computer-use agents (CUAs) operate graphical user interfaces (GUIs) through vision and control primitives, and their capabilities have advanced rapidly, driven in part by standardized online evaluation benchmarks such as OSWorld, which serve both as evaluation tools and as training environments for reinforcement learning. However, macOS remains underserved in this landscape…
Related coverage
MacArena: Benchmarking Computer Use Agents on an Online macOS Environment
ArXiv CS.AI · 2026-06-08