Event: Distillation of Large Language Models via Concrete Score Matching
Type: PRODUCT_LAUNCH | Date: 2026-06-02 | Impact: MEDIUM
arXiv:2509.25837v3 (announce type: replace-cross)
Abstract: Large language models (LLMs) deliver remarkable performance but are costly to deploy, motivating knowledge distillation (KD) for efficient inference. Existing KD objectives typically match student and teacher probabilities via softmax, which blurs valuable logit information. While direct logit distillation (DLD) mitigates softmax smoothing, it fails to account for logit sh
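To make the two baseline objectives named in the abstract concrete, here is a minimal sketch that contrasts a softmax-based KD loss (KL divergence between teacher and student probabilities) with a direct logit loss (mean squared error on raw logits). It is not the paper's concrete score matching objective. The function names, the temperature parameter, and the toy logits are all assumptions made for illustration.

```python
import math

def softmax(logits, temperature=1.0):
    # Subtract the max logit first for numerical stability.
    m = max(logits)
    exps = [math.exp((z - m) / temperature) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def softmax_kd_loss(teacher_logits, student_logits, temperature=1.0):
    # Illustrative softmax-based KD: KL(teacher || student) over probabilities.
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def direct_logit_loss(teacher_logits, student_logits):
    # Illustrative direct logit distillation: mean squared error on raw logits.
    diffs = [(t - s) ** 2 for t, s in zip(teacher_logits, student_logits)]
    return sum(diffs) / len(diffs)

# Toy logits over a three-token vocabulary (hypothetical values).
teacher = [4.0, 1.0, -2.0]
student = [3.0, 0.5, -1.0]
# The same student with a constant added to every logit.
shifted = [s + 5.0 for s in student]

print("softmax KD, student:", softmax_kd_loss(teacher, student))
print("softmax KD, shifted:", softmax_kd_loss(teacher, shifted))
print("logit MSE,  student:", direct_logit_loss(teacher, student))
print("logit MSE,  shifted:", direct_logit_loss(teacher, shifted))
```

Because softmax is unchanged when a constant is added to every logit, the softmax-based loss treats `student` and `shifted` identically, while the squared error on raw logits does not.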