Two Speeds of Learning: A Representation-Readout Decomposition of Grokking and Double Descent · Event

PRODUCT_LAUNCH · 2026-05-27 · Impact: MEDIUM

arXiv:2605.27078v1 · Announce Type: cross

Abstract: Training loss and accuracy are the standard signals used to monitor generalization during deep neural network training. Two well-documented phenomena complicate this picture: in grokking, train loss falls rapidly while test performance improves abruptly only after a long delay; in epoch-wise double descent, train loss decreases monotonically while test

Two Speeds of Learning: A Representation-Readout Decomposition of Grokking and Double Descent · Related companies

arXiv · NONPROFIT
ISES · NONPROFIT
Framework · COMPANY
EARN · NONPROFIT
ACT · NONPROFIT
Ratio · RESEARCH_INSTITUTE