Reasoning-preserved Efficient Distillation of Large Language Models via Activation-aware Initialization

ArXiv CS.CL · 2026-05-29 · Authors: Junlin He, Yihong Tang, Tong Nie, Guilong Li, Binyu Yang, Jinxiao Du, Lijun Sun, Wei Ma
