Attention Projection Mixing with Exogenous Anchors

Event: PRODUCT_LAUNCH
Date: 2026-05-28
Impact: MEDIUM

arXiv:2601.08131v4
Announce Type: replace

Abstract: Cross-layer reuse of early attention projections can improve optimization and data efficiency, but it creates a structural conflict: the first layer must simultaneously act as a stable, reusable anchor for all deeper layers and as an effective computational block. We demonstrate that this tension constrains the performance of internal-anchor designs. We propose ExoFormer, which resolves the co