摘要
arXiv:2605.27465v1 Announce Type: new Abstract: The quadratic cost of self-attention in Vision Transformers (ViTs) constitutes a fundamental bottleneck for practical deployment, motivating a vibrant line of research on token reduction. Among existing approaches, token merging (ToMe) has emerged as an elegant training-free solution; yet its design rests on an unspoken premise of token equality, which contravenes the well-documented non-uniformity of self-attention and leads to information loss in high-salience tokens under aggressive compression. We address this limitation with AdaMerge, a token-merging framework based on two complementary mechanisms. First, salience-weighted similarity leverages column-wise feature-affinity centrality as a token-importance proxy and incorporates the resulting salience scores into the bipartite matching score, ensuring that pivotal tokens contribute more strongly to the merged representation.
相关事件查看全部 (1)
相关公司
暂无数据
相关人 物
暂无数据
相关产品
暂无数据