Understanding Safety-Sensitive Expert Behavior in Mixture-of-Experts LLMs

Event: PRODUCT_LAUNCH | 2026-05-29 | Impact: MEDIUM

arXiv:2605.29708v1 (Announce Type: new)

Abstract: Mixture-of-Experts (MoE) LLMs rely on sparse, router-driven expert activation, yet how safety alignment interacts with routed expert specialization remains underexplored. A common intuition is that safety behavior may be controlled by routing harmful requests to distinct refusal-oriented experts. In this work, we provide empirical evidence for a different picture: routing pa