Distillation of Large Language Models via Concrete Score Matching · Event

PRODUCT_LAUNCH · 2026-06-02 · Impact: MEDIUM

arXiv:2509.25837v3 · Announce Type: replace-cross

Abstract: Large language models (LLMs) deliver remarkable performance but are costly to deploy, motivating knowledge distillation (KD) for efficient inference. Existing KD objectives typically match student and teacher probabilities via softmax, which blurs valuable logit information. While direct logit distillation (DLD) mitigates softmax smoothing, it fails to account for logit sh
