Red-Teaming Agent Execution Contexts: Open-World Security Evaluation on OpenClaw

ArXiv CS.AI | 2026-06-16 | NEWS | en | Authors: Hongwei Yao, Yiming Liu, Yiling He, Bingrun Yang

Details

Source: ArXiv CS.AI
Authors: Hongwei Yao, Yiming Liu, Yiling He, Bingrun Yang
Article type: NEWS
Language: en
Published: 2026-06-16

Abstract

arXiv:2605.11047v2 (announce type: replace-cross)

Agentic language-model systems increasingly rely on mutable execution contexts, including files, memory, tools, skills, and auxiliary artifacts, creating security risks beyond explicit user prompts. This paper presents DeepTrap, an automated framework for discovering contextual vulnerabilities in OpenClaw. DeepTrap formulates adversarial context manipulation as a black-box trajectory-level optimization problem that balances risk realization, benign-task preservation, and stealth. It combines risk-conditioned evaluation, multi-objective trajectory scoring, reward-guided beam search, and reflection-based deep probing to identify high-value compromised contexts. We construct a 42-case benchmark spanning six vulnerability classes and seven operational scenarios, and evaluate nine target models using attack and utility grading scores.
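The abstract names multi-objective trajectory scoring and reward-guided beam search but gives no formulas, so the following is only a minimal sketch of how those two ideas could fit together, not DeepTrap's actual implementation. Every name here (`Scores`, `combined_score`, `beam_search`, the toy `propose` and `evaluate` functions, and the edit labels) is hypothetical, and the weighted sum with weights 0.5/0.3/0.2 is an assumption chosen purely for illustration. Risk-conditioned evaluation and reflection-based deep probing are not modeled.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

# A context is modeled (hypothetically) as the tuple of edits applied so far.
Context = Tuple[str, ...]


@dataclass(frozen=True)
class Scores:
    risk: float     # how fully the targeted risk was realized, in [0, 1]
    utility: float  # how well the benign task was still completed, in [0, 1]
    stealth: float  # how inconspicuous the manipulation is, in [0, 1]


def combined_score(s: Scores, weights=(0.5, 0.3, 0.2)) -> float:
    """Assumed scalarization: a weighted sum of the three objectives."""
    w_risk, w_util, w_stealth = weights
    return w_risk * s.risk + w_util * s.utility + w_stealth * s.stealth


def beam_search(
    initial: Context,
    propose: Callable[[Context], List[Context]],
    evaluate: Callable[[Context], Scores],
    beam_width: int = 2,
    depth: int = 3,
) -> Tuple[float, Context]:
    """Keep the top `beam_width` contexts per step; return the best one seen."""
    beam = [(combined_score(evaluate(initial)), initial)]
    best = beam[0]
    for _ in range(depth):
        candidates = [
            (combined_score(evaluate(nxt)), nxt)
            for _, ctx in beam
            for nxt in propose(ctx)
        ]
        if not candidates:
            break
        candidates.sort(key=lambda pair: pair[0], reverse=True)
        beam = candidates[:beam_width]
        if beam[0][0] > best[0]:
            best = beam[0]
    return best


# Toy stand-ins for the black-box target; a real evaluation would run the agent.
EDIT_POOL = ("memory_note", "skill_patch", "tool_alias")  # hypothetical labels


def toy_propose(ctx: Context) -> List[Context]:
    """Extend the context with each edit that has not been applied yet."""
    return [ctx + (edit,) for edit in EDIT_POOL if edit not in ctx]


def toy_evaluate(ctx: Context) -> Scores:
    """More edits raise risk but cost some utility and stealth."""
    n = len(ctx)
    return Scores(
        risk=min(1.0, 0.4 * n),
        utility=max(0.0, 1.0 - 0.1 * n),
        stealth=max(0.0, 1.0 - 0.3 * n),
    )


best_score, best_context = beam_search((), toy_propose, toy_evaluate)
```

In this toy setup the search only ever sees scores returned by `toy_evaluate`, which mirrors the black-box framing in the abstract: the optimizer needs no access to the target model's internals, only a way to score each candidate context.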
