Muon in Vision Transformers: Optimizer-Recipe Interactions and Gradient Spectra
PRODUCT_LAUNCH | 2026-05-26 | Impact: MEDIUM
arXiv:2605.24770v1 Announce Type: cross
Abstract: Muon is a recently developed matrix-aware optimizer that has shown strong results in transformer training, but its behavior in vision transformers (ViTs) is not yet well understood. We study Muon for ViT training, largely on ImageNet-100 and Pl@ntNet-300K, comparing against AdamW under standard vision recipes involving mixup, cutmix, smoothing, and random augmentatio
ArXiv CS.CV, 2026-05-26