Why Linear Recurrent Memory Works in Partially Observable Reinforcement Learning


Details

Source: ArXiv CS.AI
Authors: Yike Zhao, Onno Eberhard, Malek Khammassi, Ali H. Sayed, Michael Muehlebach
Article type: NEWS
Language: en
Published: 2026-06-01

Abstract

arXiv:2605.31261v1 (announce type: cross). The family of linear recurrent neural networks has shown strong performance as recurrent memory units in partially observable reinforcement learning. We provide a theoretical justification for their empirical effectiveness by constructing and studying two linear filters: (i) the first exactly reproduces the pre-softmax logits of the belief vector in a hidden Markov model (HMM) under a deterministic transition matrix, thereby serving as a sufficient statistic for optimal policy learning; (ii) the second achieves vanishing state-decoding error under a nearly deterministic transition matrix, thus reducing state ambiguity to near zero. The results extend to action-controlled HMMs, where the corresponding linear filters become time-varying with action-dependent dynamics.
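
To make claim (i) concrete, here is a minimal numerical sketch. It is an illustration under stated assumptions, not the authors' construction: the deterministic transition is assumed to be a permutation of the states, the emission probabilities are assumed strictly positive, and all names (`perm`, `emit`, `exact_belief`, `linear_logits`) are hypothetical. Under those assumptions the log-belief obeys a linear recurrence driven by the log-emission column of each observation, and applying softmax to its state recovers the Bayes-filter belief.

```python
import numpy as np

rng = np.random.default_rng(0)
n_states, n_obs, n_steps = 4, 3, 20

# Assumption: the deterministic transition is a permutation, s -> perm[s].
perm = rng.permutation(n_states)
P = np.zeros((n_states, n_states))
P[perm, np.arange(n_states)] = 1.0  # P[s_next, s] = 1 iff s_next == perm[s]

# Assumption: strictly positive emissions, emit[s, o] = Pr(o | s).
emit = rng.uniform(0.1, 1.0, size=(n_states, n_obs))
emit /= emit.sum(axis=1, keepdims=True)

prior = np.full(n_states, 1.0 / n_states)

# Sample an observation sequence from the HMM.
s = rng.choice(n_states, p=prior)
obs = []
for _ in range(n_steps):
    s = perm[s]
    obs.append(rng.choice(n_obs, p=emit[s]))


def softmax(x):
    z = np.exp(x - x.max())  # subtract the max for numerical stability
    return z / z.sum()


def exact_belief(P, emit, prior, obs):
    """Standard Bayes filter: predict, reweight by the emission, normalize."""
    b = prior.copy()
    for o in obs:
        b = P @ b
        b = b * emit[:, o]
        b /= b.sum()
    return b


def linear_logits(P, log_emit, log_prior, obs):
    """Linear recurrence on unnormalized log-beliefs (no nonlinearity)."""
    h = log_prior.copy()
    for o in obs:
        h = P @ h + log_emit[:, o]
    return h


b = exact_belief(P, emit, prior, obs)
h = linear_logits(P, np.log(emit), np.log(prior), obs)
print("max abs difference:", np.abs(softmax(h) - b).max())
```

The recurrence stays linear because a permutation only reorders the entries of the belief, so taking the logarithm commutes with the prediction step. A deterministic map that merges states would sum probabilities inside the logarithm and is not covered by this sketch.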
