Emergent alignment and the projectability of ethical personas (Event)

Name: Emergent alignment and the projectability of ethical personas
Start: 2026-06-09

Type: PRODUCT_LAUNCH
Date: 2026-06-09
Impact: MEDIUM

arXiv:2606.09475v1. Announce Type: new.

Abstract: Work on "emergent misalignment" shows that finetuning LLMs on narrow tasks can induce broadly misaligned behavior. This supports the "persona selection" (PSM) hypothesis: during pre-training, LLMs learn to simulate different characters and perspectives, which can be elicited and refined during post-training. This paper investigates the converse phenomenon, "emergent alignment", and uses

Artificial Intelligence
