Distillation of Large Language Models via Concrete Score Matching: Event

Name: Distillation of Large Language Models via Concrete Score Matching
Start: 2026-06-02

PRODUCT_LAUNCH · 2026-06-02 · Impact: MEDIUM

arXiv:2509.25837v3 (Announce Type: replace-cross)

Abstract: Large language models (LLMs) deliver remarkable performance but are costly to deploy, motivating knowledge distillation (KD) for efficient inference. Existing KD objectives typically match student and teacher probabilities via softmax, which blurs valuable logit information. While direct logit distillation (DLD) mitigates softmax smoothing, it fails to account for logit sh
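
The contrast the abstract draws between softmax-based KD and direct logit distillation can be illustrated with a small sketch. It relies on one well-known property: softmax is unchanged when a constant is added to every logit, so a probability-matching loss cannot see such an offset, while a loss on raw logits can. The function names and the toy logits below are hypothetical, and modelling DLD as a mean squared error on logits is an assumption made here for illustration; this is not the paper's concrete score matching objective.

```python
import math

def softmax(logits):
    # Subtract the max for numerical stability; this does not change the result.
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def kl_divergence(p, q):
    # KL(p || q) for two discrete distributions over the same vocabulary.
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def softmax_kd_loss(teacher_logits, student_logits):
    # Probability-matching KD: compare the softmax outputs of teacher and student.
    return kl_divergence(softmax(teacher_logits), softmax(student_logits))

def direct_logit_loss(teacher_logits, student_logits):
    # Assumed form of direct logit distillation: mean squared error on raw logits.
    n = len(teacher_logits)
    return sum((t - s) ** 2 for t, s in zip(teacher_logits, student_logits)) / n

# Toy logits over a three-token vocabulary (hypothetical values).
teacher = [2.0, 0.5, -1.0]
student = [1.5, 0.7, -0.8]
shifted_student = [s + 3.0 for s in student]  # same logits plus a constant offset

print("softmax KD loss:", softmax_kd_loss(teacher, student))
print("softmax KD loss, shifted student:", softmax_kd_loss(teacher, shifted_student))
print("logit MSE loss:", direct_logit_loss(teacher, student))
print("logit MSE loss, shifted student:", direct_logit_loss(teacher, shifted_student))
```

Under these assumptions the two softmax KD losses agree up to floating-point error, while the logit MSE grows with the offset even though the shifted student predicts the same token distribution.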

Artificial Intelligence
