Task Structure Reverses Layerwise State Encoding in Sequence Models
Event: PRODUCT_LAUNCH · 2026-06-02 · Impact: MEDIUM
arXiv:2606.00926v1 · Announce Type: cross
Abstract: Mechanistic studies of sequence models often treat layerwise state encodings as architectural traits: recurrent models concentrate readable state, attention-based models distribute it. We find that the same architecture reverses this profile when the task changes. Across Transformers, Mamba, Mamba-2, LSTMs, and GRUs, Parity is concentrated late in Mamba and the recurrent baselin
Related coverage
Task Structure Reverses Layerwise State Encoding in Sequence Models
ArXiv CS.CL · 2026-06-02