摘要
arXiv:2606.03768v1 Announce Type: new Abstract: Extended chain-of-thought (CoT) traces improve LLM reasoning but incur substantial computational and memory costs. While existing CoT compression methods mitigate this by condensing thought steps into compact representations via memory tokens and retaining only these representations at inference time, the loss of fine-grained information makes subsequent steps more error-prone. To alleviate this, we propose \textbf{HybridThinker}, where in addition to preserved these representations, thought steps are also temporarily retained to provide fine-grained details. However, we observe that naively keeping thought steps accessible to subsequent steps \emph{during training} lets the model bypass memory tokens by retrieving information directly from these steps, leaving the model's ability to compress and retrieve information through memory tokens insufficiently trained.
相关事件查看全部 (1)
相关公司
暂无数据
相关人 物
暂无数据