Alignment Makes Language Models Normative, Not Descriptive
Event: PRODUCT_LAUNCH, 2026-05-27, Impact: MEDIUM
arXiv:2603.17218v2
Announce Type: replace
Abstract: Post-training alignment optimizes language models to match human preference signals, but this objective is not equivalent to modeling observed human behavior. We compare 120 base-aligned model pairs on more than 10,000 real human decisions in multi-round strategic games: bargaining, persuasion, negotiation, and repeated matrix games. In these settings, base models outperform their ali
Related coverage: ArXiv CS.CL, 2026-05-27