CaliDist: Calibrating Large Language Models via Behavioral Robustness to Distraction

arXiv cs.CL | 2026-06-05 | Authors: Mohammad Anas Jawad, Cornelia Caragea

Abstract

arXiv:2606.05799v1 (announce type: cross). Existing calibration methods for Large Language Models (LLMs) often overlook a critical dimension of trustworthiness: a model's behavioral robustness to irrelevant or misleading information. In this paper, we argue that a model's true confidence should reflect its stability under cognitive pressure. We introduce CaliDist, a novel post-hoc calibration approach that directly measures and penalizes a model's susceptibility to distraction. CaliDist quantifies how an LLM's predictions and uncertainty change when its input prompt is perturbed with semantic distractors. This signal of stability (or lack thereof) is then used to adaptively scale the model's initial confidence score.
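The abstract does not give the scaling rule, so the following is only a rough sketch of the general idea, not the authors' formulation. It assumes the model exposes a probability distribution over a fixed set of answer options for the clean prompt and for each distractor-perturbed prompt. The function name `distraction_scaled_confidence`, the exponent `alpha`, and the particular stability score (answer agreement multiplied by a normalized entropy-increase penalty) are all hypothetical choices made for illustration.

```python
import math
from typing import Sequence


def entropy(probs: Sequence[float]) -> float:
    """Shannon entropy (in nats) of a discrete distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def distraction_scaled_confidence(
    clean_probs: Sequence[float],
    distracted_probs: Sequence[Sequence[float]],
    alpha: float = 1.0,  # hypothetical knob: how strongly instability is penalized
) -> float:
    """Scale the clean-prompt confidence by a stability score in [0, 1].

    Illustrative assumption only; the paper's actual scoring rule is not
    stated in the abstract.
    """
    n_options = len(clean_probs)
    pred = max(range(n_options), key=lambda i: clean_probs[i])
    confidence = clean_probs[pred]
    if not distracted_probs:
        return confidence  # nothing to compare against, leave confidence as-is

    # Prediction stability: fraction of distracted prompts that keep the same answer.
    agree = sum(
        1
        for probs in distracted_probs
        if max(range(n_options), key=lambda i: probs[i]) == pred
    ) / len(distracted_probs)

    # Uncertainty stability: penalize the mean entropy increase under distraction,
    # normalized by the maximum possible entropy log(n_options).
    max_h = math.log(n_options) if n_options > 1 else 1.0
    base_h = entropy(clean_probs)
    mean_increase = sum(
        max(0.0, entropy(p) - base_h) for p in distracted_probs
    ) / len(distracted_probs)
    uncertainty_stability = 1.0 - min(1.0, mean_increase / max_h)

    stability = agree * uncertainty_stability
    return confidence * stability ** alpha


if __name__ == "__main__":
    clean = [0.9, 0.05, 0.05]
    distracted = [[0.9, 0.05, 0.05], [0.4, 0.5, 0.1]]  # second distractor flips the answer
    print(distraction_scaled_confidence(clean, distracted))
```

Under these assumptions, a model whose answer and uncertainty are unchanged by every distractor keeps its original confidence, while one whose answer flips under every distractor has its confidence driven to zero.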
