LivePI: More Realistic Benchmarking of Agents Against Indirect Prompt Injection

ArXiv CS.AI | 2026-05-26 | Authors: Lei Zhao, Abhay Bhaskar, Edgar Dobriban

Abstract

arXiv:2605.17986v2. AI agents such as OpenClaw are increasingly deployed in local workflows with access to external tools. This creates indirect prompt-injection (IPI) risk: an agent may execute harmful instructions embedded in untrusted inputs such as email, downloaded files, webpages, repositories, or group-chat messages. Existing evaluations are often small, purely simulated, or focused on a narrow set of channels. We introduce LivePI (Live Prompt Injection), a structured benchmark for IPI risk in a production-like but test-controlled environment. LivePI covers seven input surfaces, twelve attack/rendering families, and five malicious goals: protected-information exfiltration, unauthorized security-control changes, unsafe code retrieval or execution, inbox-summary exfiltration, and cryptocurrency transfer.
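
The three axes named in the abstract (input surface, attack/rendering family, malicious goal) can be pictured as a small data structure. The following is a minimal sketch only, not the benchmark's actual schema: the names `MaliciousGoal`, `EXAMPLE_UNTRUSTED_INPUTS`, `IPICase`, and `max_cells` are hypothetical, the abstract does not enumerate the seven surfaces or the twelve families, and it does not say whether every combination is tested.

```python
from dataclasses import dataclass
from enum import Enum


class MaliciousGoal(Enum):
    """The five malicious goals named in the abstract."""
    PROTECTED_INFO_EXFILTRATION = "protected-information exfiltration"
    SECURITY_CONTROL_CHANGE = "unauthorized security-control change"
    UNSAFE_CODE = "unsafe code retrieval or execution"
    INBOX_SUMMARY_EXFILTRATION = "inbox-summary exfiltration"
    CRYPTO_TRANSFER = "cryptocurrency transfer"


# Untrusted inputs the abstract gives as examples. Assumption: these may or
# may not map one-to-one onto the seven input surfaces, which are not listed.
EXAMPLE_UNTRUSTED_INPUTS = (
    "email",
    "downloaded file",
    "webpage",
    "repository",
    "group-chat message",
)


@dataclass(frozen=True)
class IPICase:
    """One hypothetical test case: where the injection arrives, how, and why."""
    surface: str
    attack_family: str  # placeholder label; the twelve families are not named
    goal: MaliciousGoal


def max_cells(n_surfaces: int = 7, n_families: int = 12, n_goals: int = 5) -> int:
    """Upper bound on distinct (surface, family, goal) cells if fully crossed."""
    return n_surfaces * n_families * n_goals


case = IPICase("email", "family-01", MaliciousGoal.CRYPTO_TRANSFER)
print(case.goal.value, max_cells())
```

With the counts from the abstract, a full cross-product would contain at most 7 × 12 × 5 = 420 cells; the actual number of LivePI test cases may differ.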
