Learning to Remember, Learn, and Forget in Attention-Based Models (Event)

PRODUCT_LAUNCH | 2026-06-02 | Impact: MEDIUM

arXiv:2602.09075v3 | Announce Type: replace-cross

Abstract: In-Context Learning (ICL) in transformers acts as an online associative memory and is believed to underpin their high performance on complex sequence processing tasks. However, in gated linear attention models, this memory has a fixed capacity and is prone to interference, especially for long sequences. We propose Palimpsa, a self-attention model that views ICL as a contin
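The abstract's point about fixed capacity and interference in gated linear attention can be illustrated with a toy associative memory. What follows is a minimal sketch of a generic gated linear attention state update, not of Palimpsa itself. It assumes a single scalar forget gate and a matrix-valued state updated as S <- gate * S + outer(key, value), with readout S^T q. The function names (`write`, `read`, `store_all`) and all numbers are hypothetical and chosen only for illustration.

```python
import math

def write(state, key, value, gate=1.0):
    """One recurrent step: S <- gate * S + outer(key, value)."""
    return [[gate * s + k * v for s, v in zip(row, value)]
            for row, k in zip(state, key)]

def read(state, query):
    """Readout o = S^T q, a linear combination of the stored values."""
    return [sum(q * row[j] for q, row in zip(query, state))
            for j in range(len(state[0]))]

def store_all(pairs, dim_k, dim_v, gate=1.0):
    """Write (key, value) pairs in order into a zero-initialised state."""
    state = [[0.0] * dim_v for _ in range(dim_k)]
    for key, value in pairs:
        state = write(state, key, value, gate)
    return state

# Two orthonormal keys fit exactly into a 2-dimensional key space.
e0, e1 = [1.0, 0.0], [0.0, 1.0]
# A third unit-norm key cannot be orthogonal to both, so it must overlap.
mixed = [1 / math.sqrt(2), 1 / math.sqrt(2)]

base_pairs = [(e0, [1.0, 0.0]), (e1, [0.0, 1.0])]

within_capacity = store_all(base_pairs, 2, 2)
over_capacity = store_all(base_pairs + [(mixed, [5.0, 5.0])], 2, 2)
decayed = store_all(base_pairs, 2, 2, gate=0.5)

print(read(within_capacity, e0))  # exact recall of the first value
print(read(over_capacity, e0))    # first value corrupted by the third pair
print(read(decayed, e0))          # first value attenuated by the gate
```

With as many orthogonal keys as key dimensions, recall is exact. A third key necessarily overlaps the first two, so its value leaks into their readouts. With a gate below 1, older entries decay each time a new one is written, which is the forgetting side of the same trade-off.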