Compress-Distill: Reasoning Trace Compression for Efficient Knowledge Distillation (Event)

Name: Compress-Distill: Reasoning Trace Compression for Efficient Knowledge Distillation
Start: 2026-06-05

PRODUCT_LAUNCH · 2026-06-05 · Impact: MEDIUM

arXiv:2606.05988v1. Announce Type: cross.

Abstract: Reasoning models produce long chain-of-thought traces that are costly to distill and encourage verbose student outputs. We study post-hoc compression of such traces before knowledge distillation. Two teachers, Qwen3.5-397B-A17B and gpt-oss-120B, generate about 283k correct traces each; two instruction-tuned models then compress them to 8.6-21.0% of their original

Artificial Intelligence
