Inverse Depth Scaling From Most Layers Being Similar (Event)
PRODUCT_LAUNCH | 2026-06-02 | Impact: MEDIUM
arXiv:2602.05970v2. Announce Type: replace-cross.
Abstract: Neural scaling laws relate loss to model size in large language models (LLMs), yet depth and width may contribute to performance differently, requiring more detailed studies. Here, we quantify how depth affects loss via analysis of LLMs and toy residual networks. We find loss scales inversely proportional to depth in LLMs, probably due to functionally similar layers reducing error thro
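To make the phrase "loss scales inversely proportional to depth" concrete, here is a minimal sketch that fits loss ≈ a / depth + b by ordinary least squares. The functional form with an offset b, the function name `fit_inverse_depth`, and the data points are all assumptions for illustration; none of them come from the paper, and the data is synthetic.

```python
def fit_inverse_depth(depths, losses):
    """Least-squares fit of loss ~ a / depth + b.

    The form a / depth + b is an assumed illustration, not the paper's model.
    """
    # Regress loss on x = 1 / depth, which makes the fit a plain linear one.
    xs = [1.0 / d for d in depths]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(losses) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, losses))
    a = sxy / sxx            # slope: coefficient of 1 / depth
    b = mean_y - a * mean_x  # intercept: loss floor as depth grows
    return a, b


# Synthetic, made-up data generated from a = 4.0, b = 2.0 (illustration only).
depths = [4, 8, 16, 32, 64]
losses = [4.0 / d + 2.0 for d in depths]
a, b = fit_inverse_depth(depths, losses)
```

On real measurements the fit would be noisy, and whether an offset term is appropriate depends on the setup; the sketch only shows what an inverse-depth trend looks like when regressed against 1 / depth.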
Related coverage (1)
Inverse Depth Scaling From Most Layers Being Similar
ArXiv CS.AI | 2026-06-02