PowLU: An Activation Function for Stable Pre-Training of LLMs
REGULATION | 2026-05-26 | Impact: MEDIUM
arXiv:2605.25704v1 Announce Type: new
Abstract: In contemporary large language models (LLMs), the swish-gated linear unit (SwiGLU) activation function is widely adopted to regulate the information flow and introduce non-linearity. For large positive inputs, SwiGLU approximates the quadratic function $x^2$, providing strong nonlinearity and expressive capacity. However, this property also causes numerical instability as the input or m
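The quadratic behaviour described in the abstract can be checked numerically. The sketch below is an illustration only, not code from the paper: it uses a scalar SwiGLU with the projection weights omitted and the gate and value both set to the same input x, which is a simplifying assumption. The helper names `swish` and `swiglu_scalar` are hypothetical, and PowLU itself is not implemented because the truncated abstract does not define it. The comparison against the float16 maximum (65504) is one possible illustration of how quadratic growth can strain low-precision arithmetic, not necessarily the instability the authors analyse.

```python
import math

FP16_MAX = 65504.0  # largest finite value representable in IEEE 754 half precision


def swish(x: float) -> float:
    """Swish (SiLU): x * sigmoid(x), written to avoid overflow in exp."""
    if x >= 0:
        return x / (1.0 + math.exp(-x))
    e = math.exp(x)  # safe: x < 0, so exp(x) is in (0, 1)
    return x * e / (1.0 + e)


def swiglu_scalar(gate: float, value: float) -> float:
    """Scalar SwiGLU with weights omitted (assumption): swish(gate) * value."""
    return swish(gate) * value


# Compare SwiGLU(x, x) with x^2 as x grows.
for x in (1.0, 10.0, 100.0, 256.0):
    y = swiglu_scalar(x, x)
    print(f"x={x:6.1f}  swiglu={y:.4e}  x^2={x * x:.4e}  "
          f"ratio={y / (x * x):.6f}  exceeds_fp16={y > FP16_MAX}")
```

For large positive x the sigmoid factor approaches 1, so the ratio to x^2 tends to 1; for small x the sigmoid is noticeably below 1 and the output stays well under x^2.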
ArXiv CS.CL | 2026-05-26