Task Structure Reverses Layerwise State Encoding in Sequence Models 事件

PRODUCT_LAUNCH2026-06-02影响: MEDIUM

Task Structure Reverses Layerwise State Encoding in Sequence Models arXiv:2606.00926v1 Announce Type: cross Abstract: Mechanistic studies of sequence models often treat layerwise state encodings as architectural traits: recurrent models concentrate readable state, attention-based models distribute it. We find that the same architecture reverses this profile when the task changes. Across Transformers, Mamba, Mamba-2, LSTMs, and GRUs, Parity is concentrated late in Mamba and the recurrent baselin