Faithful or Fabricated? A Causal Framework for Rationalization Bias in LLM Judges

Event: PRODUCT_LAUNCH | Date: 2026-05-26 | Impact: MEDIUM

arXiv:2605.23970v1 | Announce Type: new

Abstract: Large language models (LLMs) are increasingly used as automatic judges for summarization and dialogue evaluation. Prior work has documented biases such as position, verbosity, and style preferences, but largely focuses on outcomes, leaving judge explanations underexplored. We instead ask whether LLM judges are cue-invariant, i.e., whether their rankings and explanati