Abstract
arXiv:2605.27288v1
Large language models (LLMs) are known to abandon their initial stance and conform when users push back. While prior research largely attributes this behavior to sycophancy learned during reinforcement learning from human feedback, we hypothesize that conformity is also driven by a model's epistemic uncertainty at inference time. In this paper, we introduce MUSE, a two-stage evaluation framework to disentangle the mechanisms driving LLM conformity. Specifically, MUSE maps a model's epistemic uncertainty when responding to a query against its likelihood of yielding to user pushback in a subsequent turn. We demonstrate that the mechanisms driving conformity extend beyond sycophancy alone.
Related events
It's Not Always Sycophancy: Measuring LLM Conformity as a Function of Epistemic Uncertainty
2026-05-27 | PRODUCT_LAUNCH | Impact: MEDIUM