Learning When Not to Act: Mitigating Tool Abuse in Agentic Reinforcement Learning (Event)

Name: Learning When Not to Act: Mitigating Tool Abuse in Agentic Reinforcement Learning
Start: 2026-06-02

PRODUCT_LAUNCH 2026-06-02 Impact: MEDIUM

arXiv:2606.02132v1
Announce Type: new
Abstract: Agentic reinforcement learning can induce tool abuse, where models overuse external tools even for queries solvable by internal reasoning. Existing approaches mitigate this issue with uniform tool-use penalties or hard limits, which reduce tool frequency but may also suppress useful tool-assisted exploration. We propose EAPO, an Efficient Agentic Policy Optimization

Artificial Intelligence
