Localizing Prompt Ambiguity in Large Language Models with Probe-Targeted Attribution · Event

PRODUCT_LAUNCH · 2026-06-05 · Impact: MEDIUM

arXiv:2606.05486v1 · Announce Type: new

Abstract: Prompt ambiguity is a common source of failure in large language models, but is difficult to localize because it is a latent property of the prompt, while existing attribution methods are designed to explain observable outputs such as logits or generated tokens. We introduce PRIG, a gradient attribution method that uses a probe logit to attribute latent ambiguity
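
The general idea of targeting a probe logit instead of an output logit can be illustrated with a toy sketch. This is not the PRIG algorithm itself, since the abstract is truncated here and gives no details. Everything below is an assumption made for illustration: the two-layer toy "model", the mean pooling, the hand-set weights, the token embeddings, and the choice of gradient-times-input as the attribution rule. The names `forward`, `token_gradients`, and `attribute` are hypothetical.

```python
import math

def forward(embeddings, W, w_probe, b_probe):
    """Toy 'model': mean-pool token embeddings, hidden = tanh(W @ pooled),
    then a linear probe reads a single logit off the hidden state."""
    n, d = len(embeddings), len(embeddings[0])
    pooled = [sum(e[j] for e in embeddings) / n for j in range(d)]
    hidden = [math.tanh(sum(row[j] * pooled[j] for j in range(d))) for row in W]
    logit = sum(w * h for w, h in zip(w_probe, hidden)) + b_probe
    return logit, hidden

def token_gradients(embeddings, W, w_probe, b_probe):
    """Analytic gradient of the probe logit w.r.t. each token embedding."""
    n, d = len(embeddings), len(embeddings[0])
    _, hidden = forward(embeddings, W, w_probe, b_probe)
    # Chain rule: d logit / d pooled_j = sum_i w_probe[i] * (1 - hidden[i]^2) * W[i][j]
    g_pooled = [
        sum(w_probe[i] * (1.0 - hidden[i] ** 2) * W[i][j] for i in range(len(W)))
        for j in range(d)
    ]
    # Mean pooling passes the same gradient to every token, scaled by 1/n.
    return [[g / n for g in g_pooled] for _ in range(n)]

def attribute(embeddings, W, w_probe, b_probe):
    """Gradient-times-input score per token, with the probe logit as the target."""
    grads = token_gradients(embeddings, W, w_probe, b_probe)
    return [sum(g * x for g, x in zip(gr, e)) for gr, e in zip(grads, embeddings)]

# Hypothetical prompt and made-up 2-d embeddings (illustration only).
tokens = ["put", "it", "there"]
embeddings = [[0.1, 0.0], [0.9, 0.2], [0.2, 0.1]]
W = [[1.0, 0.0], [0.0, 1.0]]   # assumed hidden-layer weights
w_probe = [1.0, 0.0]           # assumed linear "ambiguity" probe
b_probe = 0.0

scores = attribute(embeddings, W, w_probe, b_probe)
top_token = tokens[max(range(len(scores)), key=lambda i: scores[i])]
print(list(zip(tokens, scores)), "top:", top_token)
```

The point of the sketch is only the choice of target: the gradient is taken with respect to a scalar read out by a probe on an internal representation, so the scores describe what drives that latent signal and not what drives a generated token. In a real setting the gradient would come from automatic differentiation through the actual network, and the probe would be trained on labeled examples.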
