Event: Language Models Need Sleep

PRODUCT_LAUNCH | 2026-05-26 | Impact: MEDIUM

arXiv:2605.26099v1. Announce Type: new.

Abstract: Transformer-based large language models are increasingly used for long-horizon tasks; however, their attention mechanism scales poorly with context length. To handle this, we study a sleep-like consolidation mechanism in which a model periodically converts recent context into persistent fast weights before clearing its key-value cache. During sleep, the model performs $N$ offline recurrent passes over the accumulated con
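The consolidate-then-clear loop described in the abstract can be illustrated with a toy sketch. This is not the paper's method: the class `SleepyMemory`, its method names, the learning rate, and the use of a simple delta-rule update for the fast weights are all assumptions made for illustration. It only shows the general shape of the idea: buffer recent key-value pairs, run `n_passes` offline passes that write them into a persistent weight matrix, then empty the cache.

```python
from typing import List, Tuple

Vector = List[float]
Matrix = List[List[float]]


def matvec(w: Matrix, x: Vector) -> Vector:
    """Multiply matrix w by vector x."""
    return [sum(w_ij * x_j for w_ij, x_j in zip(row, x)) for row in w]


class SleepyMemory:
    """Hypothetical toy model: a key-value cache plus persistent fast weights."""

    def __init__(self, dim: int, lr: float = 0.5) -> None:
        # Fast weights persist across sleep cycles; they start at zero.
        self.fast_weights: Matrix = [[0.0] * dim for _ in range(dim)]
        # The cache holds recent (key, value) pairs until the next sleep.
        self.kv_cache: List[Tuple[Vector, Vector]] = []
        self.lr = lr  # assumed learning rate for the delta-rule update

    def observe(self, key: Vector, value: Vector) -> None:
        """Awake phase: append a new pair to the cache."""
        self.kv_cache.append((key, value))

    def sleep(self, n_passes: int) -> None:
        """Offline phase: n_passes over the cache, then clear it.

        Assumption: each step applies the delta rule
        W += lr * (value - W @ key) outer key.
        """
        for _ in range(n_passes):
            for key, value in self.kv_cache:
                pred = matvec(self.fast_weights, key)
                for i, row in enumerate(self.fast_weights):
                    err = value[i] - pred[i]
                    for j in range(len(row)):
                        row[j] += self.lr * err * key[j]
        self.kv_cache.clear()

    def recall(self, key: Vector) -> Vector:
        """Read from the fast weights only; the cache is not consulted."""
        return matvec(self.fast_weights, key)


if __name__ == "__main__":
    mem = SleepyMemory(dim=2)
    mem.observe([1.0, 0.0], [2.0, 3.0])
    mem.observe([0.0, 1.0], [5.0, 7.0])
    mem.sleep(n_passes=20)
    print(len(mem.kv_cache))       # cache is empty after sleep
    print(mem.recall([1.0, 0.0]))  # close to [2.0, 3.0]
```

In this sketch, memory use after sleep depends only on the size of the weight matrix, not on how many pairs were observed, which is the motivation the abstract gives for consolidating before clearing the cache. With the orthonormal keys used here, the recall error halves on every pass; other keys would interfere with each other and recall would be approximate.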