Curriculum Learning for Safety Alignment
Event: PRODUCT_LAUNCH | 2026-05-27 | Impact: MEDIUM
arXiv:2605.26315v1 (announce type: cross)
Abstract: Direct Preference Optimisation (DPO) is widely used for safety alignment in large language models. However, prior work shows it is brittle and exhibits poor out-of-distribution (OOD) generalisation. In this paper, we investigate whether Curriculum Learning can improve the robustness of DPO-based safety alignment. We propose Staged-Competence, a curriculum-based framework that organises preference data by …
Related coverage (1)
Curriculum Learning for Safety Alignment
ArXiv CS.AI, 2026-05-27