Evaluating and Calibrating LLM Confidence on Questions with Multiple Correct Answers · Event

PRODUCT_LAUNCH · 2026-06-03 · Impact: MEDIUM

arXiv:2602.07842v2 · Announce Type: replace · Abstract: Confidence calibration is essential for making large language models (LLMs) reliable, yet existing training-free methods have been primarily studied under single-answer question answering. In this paper, we show that these methods break down in the presence of multiple valid answers, where disagreement among equally correct responses leads to systematic undere
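The effect the abstract describes can be illustrated with a small sketch. It assumes one common training-free estimator, agreement among sampled responses, which the abstract does not name; the sample data, the function names `agreement_confidence` and `pooled_confidence`, and the idea of pooling votes over a known valid-answer set are all hypothetical and are not presented as the paper's method.

```python
from collections import Counter


def agreement_confidence(samples):
    """Return the most frequent answer and the fraction of samples that match it."""
    counts = Counter(samples)
    answer, votes = counts.most_common(1)[0]
    return answer, votes / len(samples)


def pooled_confidence(samples, valid_answers):
    """Fraction of samples that land on any answer in a known valid set.

    Assumption for illustration only: the valid set is available,
    which is normally true only at evaluation time.
    """
    hits = sum(1 for s in samples if s in valid_answers)
    return hits / len(samples)


# Hypothetical single-answer question: "What is the capital of France?"
single = ["Paris"] * 9 + ["Lyon"]

# Hypothetical multi-answer question: "Name an inner planet of the Solar System."
multi = ["Mars"] * 3 + ["Venus"] * 3 + ["Earth"] * 3 + ["Jupiter"]
inner_planets = {"Mercury", "Venus", "Earth", "Mars"}

single_answer, single_conf = agreement_confidence(single)
multi_answer, multi_conf = agreement_confidence(multi)
multi_pooled = pooled_confidence(multi, inner_planets)

print(single_answer, single_conf)  # agreement is high when one answer dominates
print(multi_answer, multi_conf)    # agreement is split across correct answers
print(multi_pooled)                # share of samples that are correct at all
```

In both toy cases nine of ten samples are correct, yet the agreement score for the multi-answer question is much lower because the correct votes are spread over three different answers.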

Evaluating and Calibrating LLM Confidence on Questions with Multiple Correct Answers · Related companies

Pon (COMPANY)
Dow (COMPANY)
ISC (COMPANY)
arXiv (NONPROFIT)
GLE (NONPROFIT)
Span (NONPROFIT)
ACT (NONPROFIT)
Actua (NONPROFIT)
Ratio (RESEARCH_INSTITUTE)