Do Models Know Why They Changed Their Mind? Interpretability and Faithfulness of Chain-of-Thought Under Knowledge Conflict

ArXiv CS.CL | 2026-05-28 | Author: Pruthvinath Jeripity Venkata
