Event: Evaluating and Calibrating LLM Confidence on Questions with Multiple Correct Answers

Name: Evaluating and Calibrating LLM Confidence on Questions with Multiple Correct Answers
Start: 2026-06-03

PRODUCT_LAUNCH · 2026-06-03 · Impact: MEDIUM

arXiv:2602.07842v2 · Announce Type: replace

Abstract: Confidence calibration is essential for making large language models (LLMs) reliable, yet existing training-free methods have been primarily studied under single-answer question answering. In this paper, we show that these methods break down in the presence of multiple valid answers, where disagreement among equally correct responses leads to systematic undere
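The effect described in the abstract can be illustrated with a minimal sketch. It assumes one common training-free confidence estimate, agreement among sampled answers (the fraction of samples matching the majority answer); the abstract does not say which methods the paper studies, so this choice is an assumption, and the function name `agreement_confidence` and the sample data are hypothetical.

```python
from collections import Counter


def agreement_confidence(samples):
    """Return (majority answer, fraction of samples that agree with it)."""
    counts = Counter(samples)
    answer, votes = counts.most_common(1)[0]
    return answer, votes / len(samples)


# Hypothetical single-answer question: the sampled answers mostly agree.
single = ["Paris"] * 9 + ["Lyon"]

# Hypothetical multi-answer question: assume every sampled answer is valid,
# but the samples are spread across the valid answers.
multi = ["red", "green", "blue"] * 3 + ["red"]

single_answer, single_conf = agreement_confidence(single)
multi_answer, multi_conf = agreement_confidence(multi)

print(single_answer, single_conf)
print(multi_answer, multi_conf)
```

In the second case every sample is correct by assumption, yet the agreement score is low because the samples disagree with each other, which is the kind of mismatch the abstract points to.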

Artificial Intelligence

Relationship Graph

Evaluating and Calibrating LLM Confidence on Questions with Multiple Correct Answers · Related Companies

Pon · COMPANY

Dow · COMPANY

ISC · COMPANY

Abstract

arXiv · NONPROFIT

GLE · NONPROFIT

Span · NONPROFIT

ACT · NONPROFIT

Actua · NONPROFIT

Ratio · RESEARCH_INSTITUTE