Feature Learning Dynamics in Infinite-Depth Neural Networks

ArXiv CS.AI | 2026-05-28 | Authors: Zihan Yao, Ruoyu Wu, Tianxiang Gao

Abstract

arXiv:2512.21075v3 (announce type: replace-cross)

Deep neural networks have achieved remarkable success in practice, yet a mechanistic understanding of how features evolve during training remains incomplete, especially in the large-depth limit. For ResNets under depth-$\mu$P scaling, prior work treats the layer index $\ell$ as a continuous time $t_\ell = \ell/L$, yielding SDE descriptions of the training dynamics. A key unresolved issue is that backpropagation reuses each forward weight matrix $W_\ell$ through its transpose $W_\ell^\top$, creating correlations between forward features and backward gradients whose behavior and role in feature learning remain unclear. We study this reused-weight forward-backward coupling in one-layer ResNets under depth-$\mu$P. Using conditional Gaussian representations, we explicitly separate the coupling terms induced by weight reuse from decoupled Gaussian fluctuations before taking any network limit.
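To make the weight-reuse point concrete, here is a minimal NumPy sketch of a residual network whose backward pass applies the transpose of the same matrices used in the forward pass. It is an illustration only, not the authors' setup: the residual update $h_\ell = h_{\ell-1} + \frac{1}{\sqrt{L n}} W_\ell \tanh(h_{\ell-1})$, the choice of $\tanh$, and the function names `layer_times`, `forward`, and `backward` are all assumptions made for this example.

```python
import numpy as np


def layer_times(L):
    # Continuous "time" t_l = l / L attached to layers l = 1, ..., L.
    return np.arange(1, L + 1) / L


def forward(x, weights):
    # Assumed residual update (illustrative, not taken from the paper):
    #   h_l = h_{l-1} + W_l @ tanh(h_{l-1}) / sqrt(L * n)
    L = len(weights)
    n = x.shape[0]
    scale = 1.0 / np.sqrt(L * n)
    hs = [x]
    for W in weights:
        h = hs[-1]
        hs.append(h + scale * (W @ np.tanh(h)))
    return hs


def backward(hs, weights, g_out):
    # Backpropagate g_out = dLoss/dh_L down to dLoss/dh_0.
    # Each step reuses the forward matrix W_l through its transpose:
    #   g_{l-1} = g_l + tanh'(h_{l-1}) * (W_l.T @ g_l) / sqrt(L * n)
    L = len(weights)
    n = g_out.shape[0]
    scale = 1.0 / np.sqrt(L * n)
    g = g_out
    for W, h in zip(reversed(weights), reversed(hs[:-1])):
        dphi = 1.0 - np.tanh(h) ** 2
        g = g + scale * dphi * (W.T @ g)
    return g


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n, L = 8, 16
    weights = [rng.standard_normal((n, n)) for _ in range(L)]
    x = rng.standard_normal(n)
    hs = forward(x, weights)
    g0 = backward(hs, weights, np.ones(n))
    print(layer_times(L)[:4], g0.shape)
```

Because `backward` reads the same `weights` list as `forward`, the features `hs` and the gradients `g` are statistically dependent through each $W_\ell$. Replacing `W.T` with the transpose of an independent random matrix would give a decoupled variant for comparison.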
