From Self to Other: Evaluating Demographic Perspective-Taking in LLM Hate Speech Annotation

arXiv cs.CL, 2026-06-05. Authors: Paloma Piot, Javier Parapar

Abstract

arXiv:2606.06266v1 (announce type: new)

Hate speech detection is inherently subjective: people from different demographic groups perceive the same content very differently. Collecting enough annotations from multiple demographic groups is costly and difficult to scale. Persona-conditioned Large Language Models (models prompted to adopt a specific demographic identity) have been proposed as a way to simulate diverse perspectives at scale. But do they actually reflect how different groups disagree? We evaluate three aspects of human social judgement: (i) whether personas from different groups disagree in human-like ways (inter-group disagreement), (ii) whether they become more sensitive when content targets their own identity (in-group sensitivity), and (iii) whether they can accurately predict how another group would react (vicarious prediction).
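
The three aspects named in the abstract can be made concrete with a small sketch. The abstract does not say how the authors measure them, so everything below is an assumption for illustration only: the toy ratings, the group and post names, and the three helper functions are hypothetical, and the simple mean-difference formulas are stand-ins rather than the paper's metrics.

```python
from statistics import mean

# Hypothetical toy data (not from the paper):
# ratings[group][item] = hatefulness score in [0, 1] given by that persona group.
ratings = {
    "group_a": {"post1": 0.9, "post2": 0.2},
    "group_b": {"post1": 0.4, "post2": 0.7},
}
# Hypothetical mapping from each item to the group it targets.
targets = {"post1": "group_a", "post2": "group_b"}
# Hypothetical vicarious predictions:
# vicarious[(rater, other)][item] = rater's guess of how `other` would score the item.
vicarious = {
    ("group_a", "group_b"): {"post1": 0.5, "post2": 0.6},
}


def inter_group_disagreement(ratings, g1, g2):
    """Mean absolute score gap between two groups over their shared items."""
    items = ratings[g1].keys() & ratings[g2].keys()
    return mean(abs(ratings[g1][i] - ratings[g2][i]) for i in items)


def in_group_sensitivity(ratings, targets, group):
    """Mean score on items targeting the group minus mean score on all other items."""
    own = [s for i, s in ratings[group].items() if targets[i] == group]
    other = [s for i, s in ratings[group].items() if targets[i] != group]
    return mean(own) - mean(other)


def vicarious_error(vicarious, ratings, rater, other):
    """Mean absolute error of rater's predictions against the other group's own scores."""
    preds = vicarious[(rater, other)]
    return mean(abs(preds[i] - ratings[other][i]) for i in preds)


if __name__ == "__main__":
    print(inter_group_disagreement(ratings, "group_a", "group_b"))
    print(in_group_sensitivity(ratings, targets, "group_a"))
    print(vicarious_error(vicarious, ratings, "group_a", "group_b"))
```

Under these assumptions, the same three quantities would be computed once for human annotators and once for persona-conditioned model outputs, and then compared to see whether the model's pattern of disagreement resembles the human one.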
