Depth Registers Unlock W4A4 on SwiGLU: A Reader/Generator Decomposition
Event: PRODUCT_LAUNCH | 2026-05-26 | Impact: MEDIUM
arXiv:2604.18128v2
Announce Type: replace
Abstract: We study post-training W4A4 quantization in a controlled 300M-parameter SwiGLU decoder-only language model trained on 5B tokens of FineWeb-Edu, and ask which input-activation sites dominate the error. Naive round-to-nearest W4A4 collapses validation perplexity from FP16 23.6 to 1727. A simple residual-axis training-time intervention -- Depth Registers with a register-magni
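To make the "naive round-to-nearest W4A4" baseline concrete, here is a minimal NumPy sketch of 4-bit fake quantization applied to both the weights and the input activations of one linear layer. The excerpt does not state the quantizer granularity or whether it is symmetric, so the symmetric per-token / per-output-channel scaling below is an assumption, and the names `rtn_fake_quant` and `w4a4_linear` are hypothetical. It does not implement Depth Registers; it only illustrates how a single large activation can consume the 4-bit range.

```python
import numpy as np

def rtn_fake_quant(x, bits=4, axis=-1):
    """Symmetric round-to-nearest fake quantization along `axis` (assumed scheme)."""
    qmax = 2 ** (bits - 1) - 1                      # 7 for 4 bits
    scale = np.max(np.abs(x), axis=axis, keepdims=True) / qmax
    scale = np.where(scale == 0, 1.0, scale)        # avoid division by zero on all-zero rows
    q = np.clip(np.round(x / scale), -qmax - 1, qmax)
    return q * scale                                # dequantize back to float

def w4a4_linear(x, w):
    """Linear layer with 4-bit activations (per token) and 4-bit weights (per output channel).

    x has shape (tokens, in_features); w has shape (out_features, in_features).
    """
    return rtn_fake_quant(x, axis=-1) @ rtn_fake_quant(w, axis=-1).T

# One activation row with a single large-magnitude entry: the scale is set by
# the outlier, so the small entries all round to zero.
outlier_row = np.array([[0.1, 0.2, -0.3, 100.0]])
collapsed = rtn_fake_quant(outlier_row)
```

In this toy row the per-token scale is 100/7, so every small entry falls below half a quantization step and is lost. This is one common mechanism by which activation outliers hurt W4A4; whether it is the dominant one in the paper's model is what the abstract says the authors investigate.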
Source: arXiv cs.CL, 2026-05-26