Diagnosing Multi-step Reasoning Failures in Black-box LLMs via Stepwise Confidence Attribution

ArXiv CS.AI | 2026-06-09 | NEWS | en | Authors: Xiaoou Liu, Tiejin Chen, Dengjia Zhang, Yaqing Wang, Lu Cheng, Hua Wei

Details

Source site: ArXiv CS.AI
Authors: Xiaoou Liu, Tiejin Chen, Dengjia Zhang, Yaqing Wang, Lu Cheng, Hua Wei
Article type: NEWS
Language: en
Publication date: 2026-06-09

Abstract

arXiv:2605.19228v2 (announce type: replace-cross)

Large Language Models have achieved strong performance on reasoning tasks with objective answers by generating step-by-step solutions, but diagnosing where a multi-step reasoning trace might fail remains difficult. Confidence estimation offers a diagnostic signal, yet existing methods are restricted to final answers or require internal model access. In this paper, we introduce Stepwise Confidence Attribution (SCA), a framework for closed-source LLMs that assigns step-level confidence based only on generated reasoning traces. SCA applies the Information Bottleneck principle: steps aligning with consensus structures across correct solutions receive high confidence, while deviations are flagged as potentially erroneous.
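The abstract gives no implementation details, so the following is only a rough illustration of the general idea of scoring each step by its agreement with other solutions. It is not the paper's Information Bottleneck formulation: it swaps in a plain token-overlap (Jaccard) measure, and the names `stepwise_confidence` and `jaccard`, the whitespace tokenization, and the toy traces are all assumptions made for this sketch.

```python
from typing import List, Set


def _tokens(step: str) -> Set[str]:
    # Assumption: naive lowercase whitespace tokenization stands in for
    # whatever step representation a real system would use.
    return set(step.lower().split())


def jaccard(a: Set[str], b: Set[str]) -> float:
    # Jaccard similarity between two token sets; two empty sets count as identical.
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def stepwise_confidence(
    trace: List[str], reference_traces: List[List[str]]
) -> List[float]:
    """Hypothetical consensus score per step (NOT the paper's SCA method).

    For each step in `trace`, find its best-matching step in every
    reference trace and average those best-match similarities.
    """
    scores = []
    for step in trace:
        toks = _tokens(step)
        best_per_ref = [
            max((jaccard(toks, _tokens(ref_step)) for ref_step in ref), default=0.0)
            for ref in reference_traces
        ]
        # With no reference traces there is no consensus to compare against.
        scores.append(sum(best_per_ref) / len(best_per_ref) if best_per_ref else 0.0)
    return scores


# Toy data (invented for illustration): two agreeing reference solutions
# and one candidate trace that deviates from the second step onward.
references = [
    ["x = 2 + 3", "x = 5", "answer is 10"],
    ["x = 2 + 3", "x = 5", "answer is 10"],
]
candidate = ["x = 2 + 3", "x = 6", "answer is 12"]
scores = stepwise_confidence(candidate, references)
```

In this toy setup the first step matches the references exactly and scores highest, while the deviating steps score lower and would be the ones flagged for inspection. A black-box setting fits this shape because the function needs only generated text, not logits or hidden states.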
