Dense2MoE: Pushing the Pareto Frontier of On-Device LLMs via Unified Pruning and Upcycling
Event: PRODUCT_LAUNCH, 2026-05-27. Impact: MEDIUM
arXiv:2605.26496v1. Announce Type: cross.
Abstract: The Mixture of Experts (MoE) architecture is highly promising for resource-constrained, on-device deployments, yet training these models from scratch incurs prohibitive costs. Current methods attempt to alleviate this by upcycling dense models into MoEs; however, they often introduce parameter redundancy that degrades inference efficiency. Alternatively, standard l