SPADE-Bench: Evaluating Spontaneous Strategic Deception in Agents via Plan-Action Divergence · Event

PRODUCT_LAUNCH · 2026-06-02 · Impact: MEDIUM

arXiv:2606.02380v1 (announce type: new)

Abstract: As LLM-based agents expand their operational scope, reliability becomes a prerequisite for real-world deployment. However, in practical applications, human users cannot monitor every immediate behavior; instead, the execution process often remains a black box, leaving users dependent solely on the agent's self-reported updates. This opacity creates a criti

SPADE-Bench: Evaluating Spontaneous Strategic Deception in Agents via Plan-Action Divergence · Related technologies