RECAP: Regression Evaluation for Continual Adaptation of Prompts

Source: arXiv cs.CL | Date: 2026-06-08 | Authors: Harsh Deshpande, Kushal Chawla, Sangwoo Cho, William Campbell

Abstract

arXiv:2606.06698v1 (announce type: cross)

Production agentic systems routinely face evolving constraints and must comply from the very next interaction. Scenarios such as a tool-call notification that changes a compliance threshold, or a policy update that adds disclosure requirements, fit this description and leave almost no room for error in production. This proactive adaptation setting is common in deployment but absent from current benchmarks, which assume either static constraint sets or reactive protocols with evaluation feedback. We introduce RECAP, a benchmark that measures continual-learning phenomena (forgetting, regression, forward transfer) at the constraint level under a strictly proactive adapt-then-test protocol: prompt optimization methods receive only the constraint specification and must generalize before seeing any test data.
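The adapt-then-test protocol described in the abstract can be illustrated with a minimal Python sketch. Everything here is an assumption made for illustration, not the benchmark's actual code or metric definitions: the names `adapt_then_test`, `compliance`, and `regressions`, the toy model and optimizer, and the choice to define regression as a per-constraint drop in compliance rate between consecutive steps. The only property taken from the abstract is that the optimizer receives the constraint specifications alone and is tested afterwards, with scores kept per constraint.

```python
from typing import Callable, Dict, List, Tuple

Check = Callable[[str], bool]   # hypothetical: True if an output complies
Spec = Tuple[str, str, Check]   # (constraint name, specification text, checker)


def compliance(model, prompt: str, inputs: List[str], check: Check) -> float:
    """Fraction of test inputs whose output satisfies one constraint."""
    outputs = [model(prompt, x) for x in inputs]
    return sum(check(o) for o in outputs) / len(outputs)


def adapt_then_test(prompt, stream: List[Spec], optimizer, model, test_inputs):
    """Introduce constraints one at a time; adapt first, then evaluate."""
    active: List[Spec] = []
    history: List[Dict[str, float]] = []
    for spec in stream:
        active.append(spec)
        # Adapt: the optimizer sees only specification text,
        # never test inputs, outputs, or evaluation feedback.
        prompt = optimizer(prompt, [text for _, text, _ in active])
        # Test: score every active constraint separately.
        history.append(
            {name: compliance(model, prompt, test_inputs, chk)
             for name, _, chk in active}
        )
    return prompt, history


def regressions(history: List[Dict[str, float]]) -> List[Dict[str, float]]:
    """Assumed metric: per-constraint compliance drop between consecutive steps."""
    return [
        {name: max(0.0, prev[name] - cur[name]) for name in prev}
        for prev, cur in zip(history, history[1:])
    ]


# Toy stand-ins (hypothetical): the "model" appends the prompt to its input,
# and the "optimizer" keeps only the newest specification, dropping older ones.
def toy_model(prompt: str, x: str) -> str:
    return f"{x} {prompt}"


def toy_optimizer(prompt: str, spec_texts: List[str]) -> str:
    return spec_texts[-1]


stream = [
    ("disclosure", "DISCLOSURE", lambda out: "DISCLOSURE" in out),
    ("threshold", "LIMIT=500", lambda out: "LIMIT=500" in out),
]
final_prompt, history = adapt_then_test(
    "", stream, toy_optimizer, toy_model, ["a", "b"]
)
drops = regressions(history)
```

In this toy run the naive optimizer satisfies each new constraint but discards the earlier one, so the per-constraint history exposes a drop that a single aggregate score over the newest constraint would miss.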
