ReflexGrad: Within-Episode Failure Recovery in LLM Agents via Progress-Gated Dual-Process Routing

arXiv cs.AI | 2026-05-28 | Authors: Ankush Kadu, Aswanth Krishnan

Abstract

arXiv:2511.14584v3. We present ReflexGrad, a dual-process architecture for within-episode failure recovery in LLM agents without demonstrations. When agents commit to a wrong approach early and exhaust the step budget, the post-failure trajectory contains the information needed to escape, but no published architecture acts on it within a single episode. ReflexGrad routes between a fast process (TextGrad-style continuous refinement every $k{=}3$ steps) and a slow process (Reflexion-style causal diagnosis when $m{=}5$ consecutive low-progress scores fire a routing gate). A deterministic priority merge keeps the natural-language policy coherent, and each slow activation emits three observable artifacts: a reproducible trigger, a causal diagnostic, and a verified fix. On ALFWorld (134 tasks, $n{=}10$ seeds, no demonstrations), ReflexGrad lifts Qwen-3-8B from $35.1\%$ to $75.4\%$ ($+40.3$pp), beating compute-matched 1-shot LATS by $+2.7$pp ($p{\approx}0.
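The routing rule described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. Only the values $k{=}3$ and $m{=}5$ come from the abstract; the class name `ProgressGate`, the `threshold` that defines a "low" progress score, the streak reset after a slow activation, and the choice to let the slow process take precedence when both conditions hold are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class ProgressGate:
    """Hypothetical progress-gated router between a fast and a slow process."""

    k: int = 3               # fast refinement period (from the abstract)
    m: int = 5               # consecutive low scores that fire the gate (from the abstract)
    threshold: float = 0.5   # assumed cut-off for a "low-progress" score
    step: int = 0
    low_streak: int = 0

    def route(self, progress: float) -> str:
        """Return 'slow', 'fast', or 'act' for the current step."""
        self.step += 1
        # Track consecutive low-progress scores; any good score resets the streak.
        self.low_streak = self.low_streak + 1 if progress < self.threshold else 0

        # Assumption: the slow process has priority when both conditions hold.
        if self.low_streak >= self.m:
            self.low_streak = 0  # assumption: start a fresh streak after diagnosis
            return "slow"
        if self.step % self.k == 0:
            return "fast"
        return "act"


gate = ProgressGate()
stalled = [gate.route(0.1) for _ in range(6)]
print(stalled)  # ['act', 'act', 'fast', 'act', 'slow', 'fast']
```

In this sketch a stalled agent receives a fast refinement at step 3 and a slow diagnosis at step 5, the first point at which five consecutive low scores have accumulated. How ReflexGrad actually scores progress and merges the two processes' outputs is not specified in the abstract.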
