AMEL: Accumulated Message Effects on LLM Judgments [Event]

PRODUCT_LAUNCH | 2026-05-26 | Impact: MEDIUM

arXiv:2605.22714v2. Announce type: replace-cross.

Abstract: Large language models are routinely used as automated evaluators to review code, moderate content, or score outputs, often with many items passing through a single conversation. We ask whether the polarity of prior conversation history biases subsequent judgments, an effect we call the accumulated message effect on LLM judgments (AMEL). Across 75,898 API calls to 11 models from 4 providers (