Dense2MoE: Pushing the Pareto Frontier of On-Device LLMs via Unified Pruning and Upcycling (Event)

PRODUCT_LAUNCH | 2026-05-27 | Impact: MEDIUM

arXiv:2605.26496v1
Announce Type: cross

Abstract: The Mixture of Experts (MoE) architecture is highly promising for resource-constrained on-device deployments, yet training these models from scratch incurs prohibitive costs. Current methods attempt to alleviate this by upcycling dense models into MoEs; however, they often introduce parameter redundancy that degrades inference efficiency. Alternatively, standard l