Grouter: Decoupling Routing from Representation for Accelerated MoE Training

ArXiv CS.AI · 2026-05-26 · Authors: Yuqi Xu, Rizhen Hu, Zihan Liu, Mou Sun, Kun Yuan
