$A^2$: Smaller Self-Supervised ViTs Localize Better than Larger Ones
Event: PRODUCT_LAUNCH | 2026-06-03 | Impact: MEDIUM
arXiv:2606.03148v1 | Announce Type: new
Abstract: Robust visual classification often depends on localizing the main foreground objects in an image while ignoring contextual distractors. Surprisingly, we find that the attention maps of smaller self-supervised ViTs localize foreground objects better than those of larger ViTs. However, we still need large ViTs, because they extract richer representations from each patch. To get the
arXiv cs.CV, 2026-06-03