Curriculum Learning for Safety Alignment Event

Name: Curriculum Learning for Safety Alignment
Start: 2026-05-27

PRODUCT_LAUNCH · 2026-05-27 · Impact: MEDIUM

Curriculum Learning for Safety Alignment
arXiv:2605.26315v1
Announce Type: cross
Abstract: Direct Preference Optimisation (DPO) is widely used for safety alignment in large language models. However, prior work shows it is brittle and exhibits poor out-of-distribution (OOD) generalisation. In this paper, we investigate whether Curriculum Learning can improve the robustness of DPO-based safety alignment. We propose Staged-Competence, a curriculum-based framework that organises preference data by

Artificial Intelligence

Relationship Graph

Curriculum Learning for Safety Alignment · Related Companies

Abstract

arXiv · NONPROFIT

ISES · NONPROFIT

IREC · NONPROFIT

Framework · COMPANY

EARN · NONPROFIT

Anis · NONPROFIT

ACT · NONPROFIT

Ratio · RESEARCH_INSTITUTE

near · COMPANY