Trading Complexity for Expressivity Through Structured Generalized Linear Token Mixing

Event: PRODUCT_LAUNCH | Date: 2026-06-01 | Impact: MEDIUM

arXiv:2605.31367v1 | Announce Type: cross

Abstract: Token mixing layers play a key role in how language models learn and generate long-range dependencies. Their efficiency depends on a trade-off between decoding speed and memory requirements, including cache size. For causal generation, this paper explores new trade-offs through a unified framework that separates two crucial f