Evaluating and Calibrating LLM Confidence on Questions with Multiple Correct Answers
Event: PRODUCT_LAUNCH | 2026-06-03 | Impact: MEDIUM
arXiv:2602.07842v2 Announce Type: replace
Abstract: Confidence calibration is essential for making large language models (LLMs) reliable, yet existing training-free methods have been primarily studied under single-answer question answering. In this paper, we show that these methods break down in the presence of multiple valid answers, where disagreement among equally correct responses leads to systematic undere
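To make the effect of disagreement among equally correct responses concrete, here is a minimal sketch. It assumes a sampling-agreement confidence estimate (the fraction of sampled answers that match the most frequent one), which is one common training-free approach; the abstract does not say which methods the paper evaluates. The function name `agreement_confidence` and all sample answers are hypothetical and are not data from the paper.

```python
from collections import Counter


def agreement_confidence(samples):
    """Return (most frequent answer, fraction of samples agreeing with it)."""
    counts = Counter(samples)
    answer, freq = counts.most_common(1)[0]
    return answer, freq / len(samples)


# Hypothetical single-answer question: samples concentrate on one answer.
single = ["Paris"] * 9 + ["Lyon"]

# Hypothetical multi-answer question: three answers, all assumed valid,
# so the samples spread across them even though none is wrong.
valid_multi = {"red", "green", "blue"}
multi = ["red"] * 4 + ["green"] * 3 + ["blue"] * 3

single_answer, single_conf = agreement_confidence(single)
multi_answer, multi_conf = agreement_confidence(multi)

# Fraction of multi-answer samples that are actually correct.
multi_accuracy = sum(s in valid_multi for s in multi) / len(multi)

print(f"single-answer confidence: {single_conf:.2f}")
print(f"multi-answer confidence:  {multi_conf:.2f}")
print(f"multi-answer accuracy:    {multi_accuracy:.2f}")
```

In this toy setup every multi-answer sample is correct, yet the agreement score for the top answer is far below the score on the single-answer question, because the samples are split among valid alternatives.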
Related coverage (1)
Evaluating and Calibrating LLM Confidence on Questions with Multiple Correct Answers
ArXiv CS.CL, 2026-06-03