Agent-ToM: Learning to Monitor Autonomous LLM Agents via Theory-of-Mind Reasoning (Event)

Name: Agent-ToM: Learning to Monitor Autonomous LLM Agents via Theory-of-Mind Reasoning
Start: 2026-05-26

PRODUCT_LAUNCH · 2026-05-26 · Impact: MEDIUM

arXiv:2605.24216v1 · Announce Type: cross

Abstract: Monitoring autonomous large language model (LLM) agents for covert malicious behavior is challenging due to delayed, context-dependent, and long-horizon attack patterns. Agents may pursue hidden objectives while maintaining superficially benign behavior, making detection difficult even with full trajectory access. Prior monitoring approaches improve scaffolding or

Artificial Intelligence
