The Necessity of Setting Temperature in LLM-as-a-Judge · Event

PRODUCT_LAUNCH · 2026-06-08 · Impact: MEDIUM

arXiv:2603.28304v2 (announce type: replace)

Abstract: Using large language models (LLMs) as judges for evaluating model outputs has emerged as an important paradigm for automated evaluation. However, the decoding temperature in LLM-as-a-judge settings is still largely chosen empirically, with limited systematic evidence on its impact. To address this gap, we conduct a systematic study of how temperature affects judgment behavior acr
