Where does Absolute Position come from in decoder-only Transformers?
Event type: PRODUCT_LAUNCH | Date: 2026-06-05 | Impact: MEDIUM
arXiv:2606.06160v1. Announce Type: cross.
Abstract: RoPE-trained transformers distinguish absolute position in their attention patterns, even though RoPE encodes only relative offsets in the inner product. We trace this leakage to two architectural components. The causal mask is responsible for the first: its per-query softmax denominator depends on the absolute query position by construction. The residual stream supplies the se
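The causal-mask claim in the abstract can be illustrated with a small sketch. It is not the paper's code: the function names and the linear decay in `relative_score` are hypothetical choices made only for illustration. The logits below depend purely on the relative offset between query and key, as with RoPE in the inner product, yet the per-query softmax denominator still changes with the absolute query position, because a query at position i can attend to exactly i + 1 keys.

```python
import math


def relative_score(offset: int) -> float:
    """Hypothetical attention logit that depends only on the offset (i - j)."""
    return -0.1 * offset  # arbitrary decay, assumed for illustration


def causal_softmax_denominator(query_pos: int) -> float:
    """Sum of exp(logit) over the keys j = 0..query_pos allowed by the causal mask."""
    return sum(
        math.exp(relative_score(query_pos - j)) for j in range(query_pos + 1)
    )


def attention_weight(query_pos: int, offset: int) -> float:
    """Softmax weight that the query at query_pos puts on the key `offset` steps back."""
    return math.exp(relative_score(offset)) / causal_softmax_denominator(query_pos)


# The denominator grows with the absolute query position...
denominators = [causal_softmax_denominator(i) for i in range(6)]
# ...so the weight on the same relative offset (here 0, the token itself) shrinks.
self_weights = [attention_weight(i, 0) for i in range(6)]

for pos, (z, w) in enumerate(zip(denominators, self_weights)):
    print(f"query position {pos}: denominator = {z:.4f}, weight at offset 0 = {w:.4f}")
```

In this toy setting the normalized weight at a fixed offset is a function of the query position alone, which is one way a model could read absolute position out of purely relative logits.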
Source: ArXiv CS.CL, 2026-06-05