Prompt Injection as Role Confusion (Event)

PRODUCT_LAUNCH | 2026-06-01 | Impact: MEDIUM

arXiv:2603.12277v5 Announce Type: replace

Abstract: LLMs see the world as a single stream of text, partitioned into labeled roles. We trace prompt injection to role confusion: models perceive the source of text from how it sounds, not from its labeled role. A command hidden in a webpage hijacks an agent simply because of how it sounds, despite its label. We design role probes to measure how LLMs internally perceive "who is speaking," and find that injected t