Selective Latent Thinking: Adaptive Compression of LLM Reasoning Chains 文章

ArXiv CS.CL2026-05-26NEWSen作者: Hui Xie, Jie Liu, Ziyue Qiao, Joaquin Vanschore

摘要

arXiv:2605.25745v1 Announce Type: new Abstract: Explicit chain-of-thought (CoT) reasoning substantially improves the reasoning ability of large language models (LLMs), but incurs high inference cost due to lengthy autoregressive traces. Existing latent reasoning methods offer a promising alternative, yet they often treat reasoning as uniformly compressible, causing precision-critical intermediate steps to be overly compressed and thereby degrading reasoning accuracy. In this work, we propose Selective Latent Thinking (SLT), a framework that selectively compresses redundant reasoning spans into latent representations while preserving precision-critical spans as explicit CoT within the same reasoning trajectory. Specifically, SLT first uses a lightweight decoder to anticipate a short upcoming reasoning span, and then applies confidence-based gating to determine the longest span that can be reliably compressed.

Selective Latent Thinking: Adaptive Compression of LLM Reasoning Chains 文章

摘要

相关事件查看全部 (1)

相关公司

相关人物

相关产品

相关技术查看全部 (6)