DTop-p MoE: Sparsity-Controlled Dynamic Top-p MoE for Foundation Model Pre-training
ArXiv CS.AI, 2026-06-01
Authors: Can Jin, Hongwu Peng, Mingcan Xiang, Qixin Zhang, Xiangchi Yuan, Amit Hasan, Ohiremen Dibua, Yifan Gong, Yan Kang, Dimitris N. Metaxas