Tool Calling is Linearly Readable and Steerable in Language Models (Event)

PRODUCT_LAUNCH | 2026-05-27 | Impact: MEDIUM

arXiv:2605.07990v2. Announce Type: replace.

Abstract: When a tool-calling agent picks the wrong tool, the failure is invisible until execution: the email gets sent, the meeting gets missed. As agents take on consequential actions, one bad tool call can do real damage. We currently have no way to look inside the model and catch the mistake before it happens; this paper shows that we can. Inside the model, the choice of tool is carr