Pruning and Distilling Mixture-of-Experts into Dense Language Models

ArXiv CS.CL · 2026-05-28 · NEWS · en · Authors: Junhyuck Kim, Jihun Yun, Haechan Kim, Gyeongman Kim, Joonghyun Bae, Jaewoong Cho
