Compress-Distill: Reasoning Trace Compression for Efficient Knowledge Distillation
Event: PRODUCT_LAUNCH · 2026-06-05 · Impact: MEDIUM
arXiv:2606.05988v1 · Announce Type: cross
Abstract: Reasoning models produce long chain-of-thought traces that are costly to distill and encourage verbose student outputs. We study post-hoc compression of such traces before knowledge distillation. Two teachers, Qwen3.5-397B-A17B and gpt-oss-120B, generate about 283k correct traces each; two instruction-tuned models then compress them to 8.6-21.0% of their original
Related coverage
Compress-Distill: Reasoning Trace Compression for Efficient Knowledge Distillation
ArXiv CS.CL · 2026-06-05