ToaSt: Token Channel Selection and Structured Pruning for Efficient ViT

ArXiv CS.CV | 2026-06-16 | NEWS | en | Authors: Hyunchan Moon, Cheonjun Park, Steven L. Waslander

Details

Source site: ArXiv CS.CV
Authors: Hyunchan Moon, Cheonjun Park, Steven L. Waslander
Article type: NEWS
Language: en
Publication date: 2026-06-16

Abstract

arXiv:2602.15720v3 (announce type: replace)

Vision Transformers (ViTs) have achieved remarkable success across various vision tasks, yet their deployment is often hindered by prohibitive computational costs. Structured weight pruning and token compression have emerged as promising solutions, but the former requires prolonged retraining and the latter introduces inter-layer dependencies that complicate optimization. We propose ToaSt, a decoupled framework that applies specialized strategies to distinct ViT components. For Multi-Head Self-Attention modules, we apply coupled head-wise structured pruning, leveraging the characteristics of the attention operation to enhance robustness. For Feed-Forward Networks, which account for over 60% of FLOPs, we introduce Token Channel Selection (TCS), a training-free method that filters redundant noise channels at inference time.
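The "over 60% of FLOPs" figure for Feed-Forward Networks can be sanity-checked with a rough per-block count. The sketch below is not taken from the paper. It assumes a standard ViT-Base style encoder block (197 tokens, embedding width 768, MLP expansion ratio 4), counts only the matrix multiplications, and ignores LayerNorm, softmax, biases, and activations. The function names `vit_block_macs` and `ffn_share` are hypothetical helpers introduced for illustration.

```python
def vit_block_macs(num_tokens: int, dim: int, mlp_ratio: float = 4.0) -> dict:
    """Approximate multiply-accumulate counts for one standard ViT encoder block.

    Assumption: only matrix multiplications are counted.
    """
    hidden = int(dim * mlp_ratio)
    qkv = 3 * num_tokens * dim * dim          # Q, K, V linear projections
    attn = 2 * num_tokens * num_tokens * dim  # QK^T and attention @ V, summed over heads
    proj = num_tokens * dim * dim             # attention output projection
    ffn = 2 * num_tokens * dim * hidden       # two FFN linear layers
    return {"mhsa": qkv + attn + proj, "ffn": ffn}


def ffn_share(num_tokens: int, dim: int, mlp_ratio: float = 4.0) -> float:
    """Fraction of the block's matrix-multiply cost spent in the FFN."""
    macs = vit_block_macs(num_tokens, dim, mlp_ratio)
    return macs["ffn"] / (macs["mhsa"] + macs["ffn"])


# Assumed configuration: 196 patches + 1 class token, width 768, MLP ratio 4.
share = ffn_share(num_tokens=197, dim=768)
print(f"FFN share of block compute: {share:.3f}")  # prints 0.639
```

Under these assumptions the ratio simplifies to 8D / (12D + 2N) for width D and token count N, so the FFN share shrinks as the token count grows and the quadratic attention term takes over.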
