VeRO: A Harness for Agents to Optimize Agents

ArXiv CS.CL, 2026-06-03
Authors: Varun Ursekar, Apaar Shanker, Veronica Chatrath, Yuan Xue, Samuel Marc Denton

Abstract

arXiv:2602.22480v4

An important emerging application of coding agents is agent harness optimization: the iterative improvement of a target agent by editing and evaluating its code. Despite its relevance, the community lacks a systematic understanding of coding agent performance on this task. Harness optimization differs from conventional software engineering: agent harnesses interleave deterministic code with stochastic LLM completions, requiring structured capture of both intermediate execution traces and downstream outcomes. To address these challenges, we introduce (1) VeRO (Versioning, Rewards, and Observations), an outer harness that provides versioned snapshots, budget-controlled evaluation, and structured execution traces of target harnesses, and (2) VeRO-Bench, a benchmark suite of target agents and tasks with reference evaluation procedures.
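
To make the three ingredients named in the abstract concrete (versioned snapshots, budget-controlled evaluation, structured execution traces), here is a minimal toy sketch in Python. It is not VeRO's actual API: every name in it (ToyOuterHarness, TraceEvent, EvalResult, BudgetExceeded, toy_agent) is hypothetical, the "target agent" is a deterministic stand-in rather than an LLM-backed harness, and the reward is assumed to be plain exact-match accuracy.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple


@dataclass
class TraceEvent:
    """One structured record of an intermediate step (hypothetical schema)."""
    step: str
    detail: str


@dataclass
class EvalResult:
    """Reward and trace for one evaluated snapshot."""
    version: int
    reward: float
    trace: List[TraceEvent]


class BudgetExceeded(RuntimeError):
    """Raised when the evaluation budget has been used up."""


@dataclass
class ToyOuterHarness:
    """Illustrative outer harness; not the interface described in the paper."""
    budget: int  # number of evaluations the optimizer may still spend
    snapshots: Dict[int, str] = field(default_factory=dict)
    results: List[EvalResult] = field(default_factory=list)

    def commit(self, config: str) -> int:
        """Store a new version of the target agent's configuration."""
        version = len(self.snapshots) + 1
        self.snapshots[version] = config
        return version

    def evaluate(
        self,
        version: int,
        agent: Callable[[str, str], str],
        tasks: List[Tuple[str, str]],
    ) -> EvalResult:
        """Run one snapshot on all tasks, charging one unit of budget."""
        if self.budget <= 0:
            raise BudgetExceeded("evaluation budget exhausted")
        self.budget -= 1
        config = self.snapshots[version]
        trace: List[TraceEvent] = []
        correct = 0
        for task, expected in tasks:
            output = agent(config, task)
            trace.append(TraceEvent(step=task, detail=output))
            correct += int(output == expected)
        result = EvalResult(version, correct / len(tasks), trace)
        self.results.append(result)
        return result


def toy_agent(config: str, task: str) -> str:
    """Deterministic stand-in for a target agent (assumption for this sketch)."""
    return task.upper() if config == "upper" else task


tasks = [("a", "A"), ("b", "B")]
harness = ToyOuterHarness(budget=2)
v1 = harness.commit("identity")
v2 = harness.commit("upper")
r1 = harness.evaluate(v1, toy_agent, tasks)
r2 = harness.evaluate(v2, toy_agent, tasks)
best = max(harness.results, key=lambda r: r.reward)
print(best.version, best.reward)
```

In this sketch an optimizer would compare rewards across versions and inspect the trace of a failing run before proposing the next edit; a third call to evaluate would raise BudgetExceeded because both budget units have been spent.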
