Agentic Monte Carlo: Simulating Reinforcement Learning for Black-Box Agents

ArXiv CS.AI | 2026-06-06 | NEWS | en | Authors: Dae Yon Hwang, Raunaq Suri, Valentin Villecroze, Anthony L. Caterini, Jesse C. Cresswell, Noël Vouitsis, Brendan Leigh Ross


Details

Source site: ArXiv CS.AI
Authors: Dae Yon Hwang, Raunaq Suri, Valentin Villecroze, Anthony L. Caterini, Jesse C. Cresswell, Noël Vouitsis, Brendan Leigh Ross
Article type: NEWS
Language: en
Publication date: 2026-06-06

Original text

Abstract

arXiv:2606.05296v1 (announce type: cross)

LLM agents operate in two distinct regimes: open-weight agents amenable to reinforcement learning (RL) and black-box agents whose behaviour must be controlled purely at test time. Although black-box agents are often backed by state-of-the-art proprietary LLMs, API-only access precludes parameter-level optimization, rendering most RL methods inapplicable. To address this limitation, we turn to a known equivalence between RL and Bayesian inference. We propose Agentic Monte Carlo (AMC) to directly sample from the optimal policy of a black-box agent rather than training it through RL. The optimal policy is a posterior over trajectories whose prior we define as the fixed black-box LLM agent. We employ Sequential Monte Carlo to sample from this posterior by learning a value function to steer the agent while leaving the underlying black-box model unchanged.
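The abstract gives no algorithmic detail, so the following is only a generic illustration of the idea it describes, not the authors' AMC procedure. It is a minimal Sequential Monte Carlo sketch under stated assumptions: the target is taken to be the standard KL-regularized posterior (prior times exp(reward / beta)), the black-box agent is stood in for by a hypothetical `toy_propose` function that can only be sampled from, and the "learned" value function is replaced by a hand-written `toy_value`. All names and parameters (`smc_steer`, `beta`, `ess_threshold`) are invented for illustration.

```python
import math
import random
from typing import Callable, List, Sequence

Trajectory = List[str]


def normalize(log_weights: Sequence[float]) -> List[float]:
    """Turn log-weights into normalized weights (max-subtraction for stability)."""
    m = max(log_weights)
    exps = [math.exp(lw - m) for lw in log_weights]
    total = sum(exps)
    return [e / total for e in exps]


def effective_sample_size(weights: Sequence[float]) -> float:
    """ESS of normalized weights: 1 / sum(w_i^2)."""
    return 1.0 / sum(w * w for w in weights)


def smc_steer(
    propose: Callable[[Trajectory, random.Random], str],  # black-box prior: sample only
    value: Callable[[Trajectory], float],                 # assumed value estimate
    num_particles: int,
    horizon: int,
    beta: float = 1.0,
    ess_threshold: float = 0.5,
    seed: int = 0,
) -> List[Trajectory]:
    """Sample trajectories approximately from prior(traj) * exp(value(traj) / beta)."""
    rng = random.Random(seed)
    particles: List[Trajectory] = [[] for _ in range(num_particles)]
    log_w = [0.0] * num_particles
    for _ in range(horizon):
        for i, traj in enumerate(particles):
            before = value(traj)
            traj.append(propose(traj, rng))  # the prior itself is never modified
            # Incremental weight: change in value, scaled by the temperature.
            log_w[i] += (value(traj) - before) / beta
        weights = normalize(log_w)
        # Resample only when the weights have degenerated.
        if effective_sample_size(weights) < ess_threshold * num_particles:
            idx = rng.choices(range(num_particles), weights=weights, k=num_particles)
            particles = [list(particles[j]) for j in idx]
            log_w = [0.0] * num_particles
    # Final resampling so the returned particles are unweighted.
    weights = normalize(log_w)
    idx = rng.choices(range(num_particles), weights=weights, k=num_particles)
    return [list(particles[j]) for j in idx]


def toy_propose(traj: Trajectory, rng: random.Random) -> str:
    """Hypothetical stand-in for a black-box agent: picks an action uniformly."""
    return rng.choice(["good", "bad"])


def toy_value(traj: Trajectory) -> float:
    """Hypothetical value function: counts the 'good' actions taken so far."""
    return float(traj.count("good"))


if __name__ == "__main__":
    samples = smc_steer(toy_propose, toy_value, num_particles=200, horizon=5, beta=0.5)
    frac = sum(t.count("good") for t in samples) / (200 * 5)
    print(f"fraction of 'good' actions after steering: {frac:.2f}")
```

In this toy setting the unsteered prior chooses "good" half the time, while the steered samples favour it much more often, even though `toy_propose` is only ever called, never changed. In a real setting the proposal would be an API call to the agent and the value function would have to be learned, both of which are outside what this sketch covers.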
