Imbuing Large Language Models with Bidirectional Logic for Robust Chain Repair 文章

ArXiv CS.CL2026-06-04NEWSen作者: Zehua Cheng, Wei Dai, Jiahao Sun, Thomas Lukasiewicz

摘要

arXiv:2606.05030v1 Announce Type: new Abstract: Autoregressive chain-of-thought (CoT) reasoning in large language models (LLMs) is fundamentally forward-directed: each step conditions only on prior tokens. This unidirectional inductive bias renders even capable models susceptible to error snowballing, wherein a single logical or arithmetic mistake in an early step irreversibly corrupts the entire reasoning chain. We introduce Teleological Reasoning Infilling (\TRI{}), a training framework that endows decoder-only transformers with a native \emph{goal-conditioned bridging} capability. The key insight is to reframe erroneous reasoning segments as fill-in-the-middle (FIM) tasks: given a verified prefix premise $P$, a verified downstream milestone $S$, and the original query $Q$, the model must synthesise the logical bridge $M$ that connects $P$ to $S$ rigorously and completely.