Distillation of Large Language Models via Concrete Score Matching
Event: PRODUCT_LAUNCH | 2026-06-02 | Impact: MEDIUM
arXiv:2509.25837v3 (announce type: replace-cross)
Abstract: Large language models (LLMs) deliver remarkable performance but are costly to deploy, motivating knowledge distillation (KD) for efficient inference. Existing KD objectives typically match student and teacher probabilities via softmax, which blurs valuable logit information. While direct logit distillation (DLD) mitigates softmax smoothing, it fails to account for logit sh
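The contrast the abstract draws between softmax-based matching and direct logit matching can be illustrated with a small sketch. This is not the paper's method: the function names (`softmax_kd_loss`, `direct_logit_loss`), the choice of KL divergence and mean squared error, and the toy logits are all assumptions made for illustration. It relies only on the standard fact that adding a constant to every logit leaves the softmax output unchanged.

```python
import math

def softmax(logits):
    # Subtract the max for numerical stability; the result is unchanged.
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def kl_divergence(p, q):
    # KL(p || q) for two discrete distributions with matching support.
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def softmax_kd_loss(teacher_logits, student_logits):
    # Hypothetical softmax-based KD objective: KL between the two
    # softmax distributions (temperature fixed at 1 for simplicity).
    return kl_divergence(softmax(teacher_logits), softmax(student_logits))

def direct_logit_loss(teacher_logits, student_logits):
    # Hypothetical direct logit objective: mean squared error on raw logits.
    n = len(teacher_logits)
    return sum((t - s) ** 2 for t, s in zip(teacher_logits, student_logits)) / n

# Toy example (assumed values): the student equals the teacher plus a constant.
teacher = [2.0, 1.0, -1.0]
student_shifted = [z + 5.0 for z in teacher]

kd = softmax_kd_loss(teacher, student_shifted)     # shift is invisible to softmax
dld = direct_logit_loss(teacher, student_shifted)  # shift is penalized in full

print(f"softmax KD loss:   {kd:.6f}")
print(f"direct logit loss: {dld:.6f}")
```

In this toy case the two students produce identical token probabilities, yet the raw-logit objective still reports a large error, while the softmax objective cannot distinguish them at all.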
Related coverage (1)
Distillation of Large Language Models via Concrete Score Matching
ArXiv CS.AI, 2026-06-02