Detecting Unfaithful Chain-of-Thought via Circuit-Guided Internal-External Discrepancy

ArXiv CS.AI · 2026-05-26 · Authors: Xu Shen, Zhen Tan, Song Wang, Pingjun Hong, Rui Miao, Xin Wang, Tianlong Chen
