Large Language Models Are Overconfident in Their Own Responses

arXiv cs.CL | 2026-06-03 | Authors: Mario Sanz-Guerrero, Manuel Mager, Katharina von der Wense

Abstract

arXiv:2606.03437v1

Prior work has shown that instruction-tuned large language models (LLMs) are less well calibrated than their base pre-trained counterparts. However, little is known about how the widely used chat template affects the calibration of conversational LLMs. In this work, we investigate the mechanisms driving this miscalibration by decoupling the effects of the post-training algorithm and the chat format. We find that, while instruction tuning fundamentally harms calibration, the chat template aggravates the issue through an "ownership bias": models are significantly more confident in their own answers than in identical answers provided by a user. Extensive experiments across six recent open-weight LLMs, three benchmarks, and three confidence elicitation methods show that models assign up to 26% higher confidence to their own responses.
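The abstract does not specify the metrics or elicitation procedures used, so the following is only a minimal sketch of how such a comparison could be quantified, not the authors' method. It assumes that each answer has already been scored twice, once when it appears in the assistant turn and once when the identical answer is attributed to the user, and it reports the gap as an absolute difference in mean confidence (the abstract does not say whether its 26% figure is absolute or relative). The function names, the binned expected calibration error, and the toy numbers are all hypothetical illustrations.

```python
from statistics import mean


def ownership_gap(own_conf, user_conf):
    """Mean paired difference: confidence in own answer minus confidence
    in the identical answer when it is attributed to the user."""
    if len(own_conf) != len(user_conf):
        raise ValueError("paired confidence lists must have equal length")
    return mean(o - u for o, u in zip(own_conf, user_conf))


def expected_calibration_error(confidences, correct, n_bins=10):
    """Binned ECE: sample-weighted mean of |accuracy - mean confidence|
    over equal-width confidence bins (an assumed metric, not from the abstract)."""
    n = len(confidences)
    bins = [[] for _ in range(n_bins)]
    for c, y in zip(confidences, correct):
        idx = min(int(c * n_bins), n_bins - 1)  # confidence 1.0 goes in the last bin
        bins[idx].append((c, y))
    ece = 0.0
    for b in bins:
        if b:
            avg_conf = mean(c for c, _ in b)
            accuracy = mean(y for _, y in b)
            ece += len(b) / n * abs(accuracy - avg_conf)
    return ece


# Toy, made-up confidences for four identical answers under the two framings.
own = [0.9, 0.8, 0.95, 0.7]    # answer placed in the assistant turn
user = [0.7, 0.6, 0.8, 0.6]    # same answer placed in the user turn
correct = [1, 0, 1, 1]         # 1 if the answer is actually correct

gap = ownership_gap(own, user)
ece_own = expected_calibration_error(own, correct)
ece_user = expected_calibration_error(user, correct)
print(f"ownership gap: {gap:.4f}")
print(f"ECE (own): {ece_own:.4f}, ECE (user-attributed): {ece_user:.4f}")
```

A positive gap on paired, identical answers would be consistent with the ownership bias described above, since the only thing that differs between the two conditions is who is presented as the author of the answer.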
