Language Models Need Sleep (event)
PRODUCT_LAUNCH | 2026-05-26 | Impact: MEDIUM
arXiv:2605.26099v1 (announce type: new)
Abstract: Transformer-based large language models are increasingly used for long-horizon tasks; however, their attention mechanism scales poorly with context length. To handle this, we study a sleep-like consolidation mechanism in which a model periodically converts recent context into persistent fast weights before clearing its key-value cache. During sleep, the model performs $N$ offline recurrent passes over the accumulated con
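The consolidation loop described in the abstract can be illustrated with a toy sketch. Everything here is an assumption made for illustration: the class name `SleepMemory`, its methods, and the delta-rule update are hypothetical stand-ins, since the abstract does not specify how the fast weights are computed. The sketch only shows the overall shape of the idea: accumulate context in a cache, run N offline passes that write it into a persistent matrix, then clear the cache.

```python
class SleepMemory:
    """Toy model: a key-value cache plus a persistent fast-weight matrix.

    Hypothetical illustration only; the update rule below is a plain
    delta rule, not the method from the paper.
    """

    def __init__(self, dim):
        self.dim = dim
        self.kv_cache = []  # recent context as (key, value) pairs
        self.fast_weights = [[0.0] * dim for _ in range(dim)]

    def observe(self, key, value):
        # "Awake" phase: new context is appended to the cache.
        self.kv_cache.append((list(key), list(value)))

    def recall(self, query):
        # Read from the persistent fast weights only: W @ query.
        return [sum(w * q for w, q in zip(row, query))
                for row in self.fast_weights]

    def sleep(self, n_passes=1, lr=1.0):
        # "Sleep" phase: n_passes offline passes over the cached context,
        # each applying W += lr * (value - W @ key) key^T (assumed rule).
        for _ in range(n_passes):
            for key, value in self.kv_cache:
                pred = self.recall(key)
                for i in range(self.dim):
                    err = value[i] - pred[i]
                    for j in range(self.dim):
                        self.fast_weights[i][j] += lr * err * key[j]
        # The cache is cleared once its contents are consolidated.
        self.kv_cache.clear()


mem = SleepMemory(dim=2)
mem.observe([1.0, 0.0], [2.0, 3.0])
mem.observe([0.0, 1.0], [5.0, 7.0])
mem.sleep(n_passes=1, lr=1.0)
print(mem.recall([1.0, 0.0]))  # [2.0, 3.0]
print(len(mem.kv_cache))       # 0
```

With orthonormal keys and a learning rate of 1, a single pass stores each value exactly, so the cached pairs remain retrievable after the cache is emptied. Non-orthogonal keys would generally need more passes or a smaller learning rate in this toy setting.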
Related coverage (1)
Do Language Models Need Sleep? Offline Recurrence for Improved Online Inference
ArXiv CS.CL | 2026-05-28