DynMuon: A Dynamic Spectral Shaping View of Muon

Event: PRODUCT_LAUNCH, 2026-06-02, Impact: MEDIUM

arXiv:2605.17109v3 Announce Type: replace-cross

Abstract: In recent years, Muon has emerged as the dominant method for training large language models, and transformers more broadly. Its essential difference from standard gradient descent methods is that it replaces the usual update matrix $M=U\Sigma V^\top$ with its polar factor $UV^\top$. In this work, we consider a class of Muon-like updates, where we replace the update $M$ with $U\Sigm
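
The polar-factor replacement described in the abstract can be illustrated with a minimal NumPy sketch. It computes $UV^\top$ from an explicit thin SVD of a small random matrix, purely for clarity. The function name `polar_factor` and the test matrix are hypothetical choices for this illustration, and the sketch makes no claim about how Muon or DynMuon compute the factor in practice.

```python
import numpy as np


def polar_factor(M: np.ndarray) -> np.ndarray:
    """Return U @ V^T, where M = U diag(S) V^T is the thin SVD of M.

    Hypothetical helper for illustration only; it uses an explicit SVD,
    which is an assumption of this sketch rather than the paper's method.
    """
    U, _S, Vt = np.linalg.svd(M, full_matrices=False)
    return U @ Vt


# Illustrative update matrix (random, full rank with probability 1).
rng = np.random.default_rng(0)
M = rng.standard_normal((4, 3))

P = polar_factor(M)

# Dropping Sigma sets every singular value of the update to 1,
# so the columns of P are orthonormal.
singular_values = np.linalg.svd(P, compute_uv=False)
```

In this sketch `P` has the same shape as `M`, and `singular_values` should all be numerically equal to 1, which is the sense in which the polar factor flattens the spectrum of the update.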