Value Entanglement: Conflation Between Different Kinds of Good In (Some) Large Language Models

Event: ACQUISITION | Date: 2026-06-04 | Impact: HIGH

arXiv:2602.19101v2 (announce type: replace)

Abstract: Value alignment of Large Language Models (LLMs) requires us to empirically measure these models' actual, acquired representation of value. One characteristic of value representation in humans is that they distinguish among different kinds of value. We investigate whether LLMs likewise distinguish three different kinds of good: moral, grammatic