Confidence-Adaptive SwiGLU for Mixture-of-Experts (Event)

Name: Confidence-Adaptive SwiGLU for Mixture-of-Experts
Start: 2026-06-02

Type: PRODUCT_LAUNCH
Date: 2026-06-02
Impact: MEDIUM

arXiv:2606.00761v1
Announce Type: cross

Abstract: SwiGLU has become a standard gated activation in modern Transformer MLPs, yet its gate sharpness -- the smoothness and selectivity of the gating function -- is typically fixed throughout training. In this work, we propose Confidence-Aware SwiGLU ($\kappa$-SwiGLU), a variant of SwiGLU for Mixture-of-Experts (MoE) models that adjusts expert gate sharpness according to token-level routing confidence
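The abstract excerpt above is cut off before it states how $\kappa$-SwiGLU maps routing confidence to gate sharpness, so the following is only a rough sketch of the general idea, not the paper's formulation. It assumes that sharpness is the slope `beta` in Swish, `x * sigmoid(beta * x)`, that confidence is the largest softmax router probability for a token, and that `beta` is a linear interpolation between two bounds. The names `confidence_to_beta`, `beta_min`, and `beta_max` are hypothetical and do not come from the source.

```python
import math


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def softmax(logits):
    # Softmax over router logits, shifted by the max for stability.
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def swish(x, beta):
    # Swish with slope beta; beta = 1 gives the usual SiLU used in SwiGLU.
    return x * sigmoid(beta * x)


def swiglu(gate_pre, value_pre, beta=1.0):
    # Element-wise SwiGLU on pre-activations: Swish_beta(gate) * value.
    return [swish(g, beta) * v for g, v in zip(gate_pre, value_pre)]


def confidence_to_beta(router_logits, beta_min=0.5, beta_max=2.0):
    # ASSUMPTION (not from the paper): confidence is the top router
    # probability, and beta is interpolated linearly between two bounds.
    confidence = max(softmax(router_logits))
    return beta_min + (beta_max - beta_min) * confidence


# Usage: a token the router is unsure about gets a softer gate than a
# token routed with high confidence.
unsure_beta = confidence_to_beta([0.0, 0.0, 0.0, 0.0])
confident_beta = confidence_to_beta([10.0, 0.0, 0.0, 0.0])
expert_out = swiglu([1.0, -1.0], [2.0, 2.0], beta=confident_beta)
```

Under these assumptions a larger `beta` pushes the gate toward a hard ReLU-like selection, while a smaller `beta` keeps it smooth. Whether the paper uses this direction, this confidence measure, or a learned mapping is not recoverable from the excerpt.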

Category: Artificial Intelligence
