Faithfulness as Information Flow: Evaluating and Training Faithful Chain-of-Thought Reasoning

arXiv cs.CL, 2026-05-26
Authors: Jinghan Jia, Joe Benton, Eric Easley

Abstract

arXiv:2605.24286v1 (announce type: cross)

Chain-of-thought (CoT) reasoning is useful for monitoring language models only when the reasoning trace faithfully reflects the computation that produces the final answer. However, models can rely on prompt-to-answer shortcuts that bypass the CoT, making the visible reasoning trace misleading even when it appears plausible. We study CoT faithfulness through a structural information-flow perspective: faithful reasoning should route answer-relevant information through the mediated path from prompt to CoT to answer, rather than through a direct prompt-to-answer shortcut. This perspective yields a task-agnostic framework based on three complementary properties (sufficiency, completeness, and necessity), which we instantiate with entropy-based, masked-KL, and gradient-based diagnostics.
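
To make the mediated-path idea concrete, the toy sketch below compares answer distributions under different masking conditions using entropy and KL divergence. It is only an illustration under stated assumptions, not the paper's diagnostics: the abstract does not give the exact definitions, so the masking conditions, the variable names (`p_full`, `p_cot_only`, `p_prompt_only`), and the probability values are all hypothetical.

```python
import math


def entropy(p):
    """Shannon entropy (in nats) of a discrete distribution."""
    return -sum(x * math.log(x) for x in p if x > 0)


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) in nats; q is floored at eps to avoid log(0)."""
    return sum(x * math.log(x / max(y, eps)) for x, y in zip(p, q) if x > 0)


# Hypothetical answer distributions over three candidate answers.
# In practice these would come from a model's answer-token probabilities;
# the numbers here are made up for illustration.
p_full = [0.90, 0.05, 0.05]         # answer given prompt + CoT
p_cot_only = [0.85, 0.10, 0.05]     # assumed condition: prompt masked, CoT visible
p_prompt_only = [0.40, 0.30, 0.30]  # assumed condition: CoT masked, prompt visible

# A small shift when only the CoT is visible suggests the CoT carries the
# answer-relevant information (the mediated path).
cot_only_shift = kl_divergence(p_full, p_cot_only)

# A large shift when the CoT is masked suggests the answer depends on the CoT
# rather than on a direct prompt-to-answer shortcut.
prompt_only_shift = kl_divergence(p_full, p_prompt_only)

print(f"H(answer | prompt, CoT)   = {entropy(p_full):.4f} nats")
print(f"KL(full || CoT only)      = {cot_only_shift:.4f} nats")
print(f"KL(full || prompt only)   = {prompt_only_shift:.4f} nats")
```

In this made-up example the answer distribution moves far more when the CoT is masked than when the prompt is masked, which is the pattern one would expect from reasoning that routes information through the CoT. A shortcut-reliant model would show the opposite pattern.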
