Spectral Scaling Laws of Muon (Event)

PRODUCT_LAUNCH · 2026-06-04 · Impact: MEDIUM

arXiv:2606.04058v1 · Announce Type: cross

Abstract: Orthonormalized update rules have rapidly become a leading choice of optimizer for training large language models, with recent open-source state-of-the-art models adopting Muon. To keep these updates tractable, Muon performs the orthonormalization with the Newton-Schulz (NS) iteration. Since NS is only approximate, directions with small singular values fail to be orthonormalized. In Muon, NS is applied to the momen
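The abstract's remark that directions with small singular values fail to be orthonormalized can be illustrated with a small NumPy sketch. It uses the classical cubic Newton-Schulz iteration, which is an assumption made for illustration and not necessarily the polynomial or step count used in Muon. The names `newton_schulz` and `demo`, the 4x4 test matrix, and the chosen singular values are all hypothetical.

```python
import numpy as np


def newton_schulz(G, steps=5):
    """Approximate U V^T for G = U S V^T with the classical cubic iteration.

    Illustrative assumption: this is the textbook cubic Newton-Schulz map,
    not necessarily the variant used in Muon.
    """
    # The Frobenius norm bounds the spectral norm, so after scaling every
    # singular value lies in (0, 1], inside the iteration's convergence region.
    X = G / (np.linalg.norm(G) + 1e-12)
    for _ in range(steps):
        # Each singular value s is mapped to 1.5*s - 0.5*s**3.
        # For small s this is roughly 1.5*s, so growth toward 1 is slow.
        X = 1.5 * X - 0.5 * X @ X.T @ X
    return X


def demo(steps=5):
    """Return (input, output) singular values for a hypothetical test matrix."""
    rng = np.random.default_rng(0)
    U, _ = np.linalg.qr(rng.standard_normal((4, 4)))
    V, _ = np.linalg.qr(rng.standard_normal((4, 4)))
    s_in = np.array([1.0, 0.5, 1e-2, 1e-4])
    G = U @ np.diag(s_in) @ V.T
    s_out = np.linalg.svd(newton_schulz(G, steps), compute_uv=False)
    return s_in, s_out


if __name__ == "__main__":
    s_in, s_out = demo(steps=5)
    for a, b in zip(s_in, s_out):
        print(f"input singular value {a:.0e} -> after NS {b:.4f}")
```

With a handful of steps, the two large singular values end up close to 1 while the two small ones stay far below it. Running `demo` with many more steps pushes all of them to 1, which is why the approximation error is tied to the iteration budget.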
