From Brewing to Resolution: Tracing the Internal Lifecycle of Code Reasoning in LLMs

ArXiv CS.AI | 2026-06-17 | NEWS | en | Authors: Siyue Chen, Yifu Guo, Yuquan Lu, Zishan Xu, Jiaye Lin, Jianbo Lin, Siyu Zhang, Cheng Yang, Junxin Li, Yujia Li, Yu Huo, Ruixuan Wang

Details

Source site: ArXiv CS.AI
Authors: Siyue Chen, Yifu Guo, Yuquan Lu, Zishan Xu, Jiaye Lin, Jianbo Lin, Siyu Zhang, Cheng Yang, Junxin Li, Yujia Li, Yu Huo, Ruixuan Wang
Article type: NEWS
Language: en
Publication date: 2026-06-17

Original text

Abstract

arXiv:2606.17648v1 (announce type: new)

Standard accuracy metrics cannot explain why LLMs handle variable tracking but fail on semantically equivalent loops. We study an internal lifecycle of code reasoning in which models first brew the answer, making it linearly recoverable many layers before it becomes self-decodable, and then diverge into one of four resolution outcomes: Resolved, Overprocessed, Misresolved, or Unresolved. Understanding this lifecycle matters because similar task accuracies can mask fundamentally different failure modes that surface-level evaluation cannot detect. We introduce a dual diagnostic framework that pairs layer-wise linear probing with Context-Stripped Decoding (CSD), and we apply it to six code-reasoning task families across 16 models spanning the Qwen, Llama, and DeepSeek architectures. All four outcomes carry substantial mass in every task family: the overall Resolved rate is only 41.5%, and multiple tasks fall below 30%.
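The layer-wise linear probing half of the framework can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the hidden states are synthetic (random noise plus a label direction whose strength grows with depth), the probe is a plain ridge-regularized least-squares classifier, and the helper names and the 0.9 threshold are hypothetical. It is not the authors' implementation, and it does not cover CSD or the four-outcome classification, which the abstract does not define in detail.

```python
import numpy as np


def make_synthetic_states(n_layers=8, n_samples=400, dim=16, seed=0):
    """Build fake per-layer hidden states (assumption: signal grows with depth)."""
    rng = np.random.default_rng(seed)
    labels = rng.integers(0, 2, size=n_samples)
    direction = rng.normal(size=dim)
    direction /= np.linalg.norm(direction)
    states = []
    for layer in range(n_layers):
        # Signal strength is 0 at the first layer and 3 at the last layer.
        strength = 3.0 * layer / (n_layers - 1)
        noise = rng.normal(size=(n_samples, dim))
        signal = np.outer(2 * labels - 1, direction) * strength
        states.append(noise + signal)
    return np.stack(states), labels


def probe_accuracy(x_train, y_train, x_test, y_test, ridge=1e-2):
    """Fit a ridge-regularized linear probe and return held-out accuracy."""
    # Append a constant column so the probe can learn a bias term.
    xb_train = np.hstack([x_train, np.ones((len(x_train), 1))])
    xb_test = np.hstack([x_test, np.ones((len(x_test), 1))])
    targets = 2.0 * y_train - 1.0  # map {0, 1} labels to {-1, +1}
    gram = xb_train.T @ xb_train + ridge * np.eye(xb_train.shape[1])
    weights = np.linalg.solve(gram, xb_train.T @ targets)
    preds = (xb_test @ weights > 0).astype(int)
    return float((preds == y_test).mean())


def layerwise_probe(states, labels, train_frac=0.5):
    """Train one independent probe per layer on a fixed train/test split."""
    split = int(len(labels) * train_frac)
    return [
        probe_accuracy(layer[:split], labels[:split], layer[split:], labels[split:])
        for layer in states
    ]


def first_layer_above(accuracies, threshold=0.9):
    """Return the first layer index whose probe accuracy meets the threshold."""
    for index, accuracy in enumerate(accuracies):
        if accuracy >= threshold:
            return index
    return None


states, labels = make_synthetic_states()
accuracies = layerwise_probe(states, labels)
recoverable_at = first_layer_above(accuracies)
print("per-layer probe accuracy:", [round(a, 3) for a in accuracies])
print("first layer at or above threshold:", recoverable_at)
```

With real models, the synthetic array would be replaced by per-layer activations collected at the answer position, and the layer returned by `first_layer_above` would be compared against the layer at which the model can decode the answer on its own.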
