Diagnosing Live Within-Policy Instruction Conflicts in LLM Agents with Witnessed Resolution Profiles

ArXiv CS.AI | 2026-05-28 | Authors: Lu Yan, Xuan Chen, Xiangyu Zhang

Abstract

arXiv:2605.27784v1. LLM agents are governed by long-lived natural-language prompt policies, but individually reasonable standing rules can interact in uninspected ways. We study live intra-policy rule-conflict diagnosis: finding rule pairs inside a single prompt policy that can co-govern a realistic state, and measuring how models resolve that pressure in responses or tool actions. We introduce WIRE, a Witnessed Intra-policy Rule Evaluation pipeline. WIRE extracts source-grounded rules, encodes them as PyRule clauses, uses satisfiability checks to retain same-surface hard-collision candidates, realizes those candidates as concrete co-governance witnesses, and judges model outputs against the original source-rule text.
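
The satisfiability step in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not describe the PyRule format, so the `Rule` class, the boolean state variables, and the example rules below are all hypothetical, and a brute-force search over assignments stands in for whatever solver WIRE actually uses. The sketch keeps a rule pair only if both rules act on the same surface and action with opposite obligations, and if some state triggers both rules at once; that state serves as the co-governance witness.

```python
from dataclasses import dataclass
from itertools import combinations, product
from typing import Callable, Dict, List, Optional, Tuple

State = Dict[str, bool]


@dataclass(frozen=True)
class Rule:
    """Hypothetical stand-in for an encoded rule clause (not the real PyRule format)."""
    name: str
    surface: str                      # where the rule acts, e.g. "reply" or a tool name
    action: str                       # the behavior being governed
    required: bool                    # True = must do the action, False = must not
    applies: Callable[[State], bool]  # trigger condition over the state


def find_witness(a: Rule, b: Rule, variables: List[str]) -> Optional[State]:
    """Return the first state in which both rules fire, or None if none exists."""
    for values in product([False, True], repeat=len(variables)):
        state = dict(zip(variables, values))
        if a.applies(state) and b.applies(state):
            return state
    return None


def hard_collisions(
    rules: List[Rule], variables: List[str]
) -> List[Tuple[str, str, State]]:
    """Keep pairs that govern the same surface/action, disagree, and can co-fire."""
    found = []
    for a, b in combinations(rules, 2):
        same_target = a.surface == b.surface and a.action == b.action
        if not same_target or a.required == b.required:
            continue  # different surfaces, or the rules agree: no hard collision
        witness = find_witness(a, b, variables)
        if witness is not None:
            found.append((a.name, b.name, witness))
    return found


# Illustrative policy (assumed for this example only).
VARIABLES = ["asks_about_order", "verified"]
RULES = [
    # "Include the order ID whenever the user asks about an order."
    Rule("R1", "reply", "include_order_id", True,
         lambda s: s["asks_about_order"]),
    # "Never reveal the order ID to unverified users."
    Rule("R2", "reply", "include_order_id", False,
         lambda s: not s["verified"]),
    # "Do not volunteer the order ID to verified users who did not ask."
    Rule("R3", "reply", "include_order_id", False,
         lambda s: s["verified"] and not s["asks_about_order"]),
]

collisions = hard_collisions(RULES, VARIABLES)
for first, second, witness in collisions:
    print(first, second, witness)
```

In this toy policy only R1 and R2 collide: an unverified user asking about an order triggers both. R1 and R3 disagree on the obligation but can never fire together, so the satisfiability check discards that pair. Brute force is exponential in the number of state variables, so a real pipeline would presumably hand the same question to a SAT or SMT solver.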
