Task Structure Reverses Layerwise State Encoding in Sequence Models
Event: PRODUCT_LAUNCH, 2026-06-02, Impact: MEDIUM
arXiv:2606.00926v1, Announce Type: cross
Abstract: Mechanistic studies of sequence models often treat layerwise state encodings as architectural traits: recurrent models concentrate readable state, attention-based models distribute it. We find that the same architecture reverses this profile when the task changes. Across Transformers, Mamba, Mamba-2, LSTMs, and GRUs, Parity is concentrated late in Mamba and the recurrent baselin
arXiv cs.CL, 2026-06-02