Hidden Consensus: Preference-Validity Compression in Human Feedback
Event: PRODUCT_LAUNCH | 2026-06-10 | Impact: MEDIUM
arXiv:2606.10569v1 | Announce Type: new
Abstract: Standard RLHF pipelines often reduce heterogeneous human judgments into a single scalar reward target. We argue that this reduction can mis-measure alignment in structurally plural societies, where disagreement may reflect culturally, historically, linguistically, regionally, or normatively grounded interpretations rather than annotation noise. We call this failure Preference-Valid
arXiv cs.CL | 2026-06-10
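
The abstract's concern about reducing heterogeneous judgments to a single scalar reward target can be illustrated with a toy calculation. The sketch below is not the paper's method: the group names, the 1-5 rating scale, and all three helper functions are hypothetical, chosen only to show how a pooled average can look like mild indifference while two annotator groups in fact hold opposite, internally consistent views.

```python
from statistics import mean, pstdev

# Hypothetical ratings (1-5 scale) of one model response from two annotator
# groups. The data and group names are invented for illustration only.
ratings_by_group = {
    "group_a": [5, 5, 4, 5],
    "group_b": [1, 2, 1, 1],
}


def scalar_reward(groups):
    """Pool every rating and average it into one number (assumed reduction)."""
    pooled = [r for ratings in groups.values() for r in ratings]
    return mean(pooled)


def group_means(groups):
    """Keep one mean per group instead of collapsing across groups."""
    return {name: mean(ratings) for name, ratings in groups.items()}


def between_group_spread(groups):
    """Population standard deviation of the per-group means."""
    return pstdev([mean(ratings) for ratings in groups.values()])


if __name__ == "__main__":
    print("pooled scalar reward:", scalar_reward(ratings_by_group))
    print("per-group means:", group_means(ratings_by_group))
    print("between-group spread:", between_group_spread(ratings_by_group))
```

With these invented numbers the pooled reward sits at the midpoint of the scale, while the per-group means stay far apart. Reporting the between-group spread next to the scalar is one simple way to flag that the disagreement is structured rather than noise; whether the paper proposes anything similar is not stated in the excerpt above.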