Counteraction-Aware Multi-Teacher On-Policy Distillation for General Capability Recovery with Domain Preservation
Event: PRODUCT_LAUNCH | 2026-05-27 | Impact: MEDIUM
arXiv:2605.27115v1 Announce Type: new
Abstract: Domain specialization can improve LLM behavior in vertical domains, but often weakens the general capabilities inherited from the original model. Recent Multi-Teacher On-Policy Distillation (MOPD) pipelines recover model capabilities by supervising student-generated trajectories with teacher feedback, but typically assume teacher-align