Do Gender Cues Affect LLM Value Trade-offs? Evidence from a Controlled Decision Benchmark

arXiv cs.CL | 2026-06-02 | Authors: Yangyang Liu, Dong Yu, Pengyuan Liu

Abstract

arXiv:2606.02214v1 (announce type: new)

Large language models are increasingly used in value-sensitive decision settings, where irrelevant demographic cues should not alter judgments. We construct the Realistic Value Decision Benchmark (RVDB), a controlled benchmark that varies only the role-gender configuration while holding the scenario, ordered value pair, roles, candidate decisions, Value Distance, and Decision Severity fixed. Using a position-balanced evaluation across seven models, we test whether models preserve decision invariance under gender perturbations and whether their self-attributions reflect observed behavioral changes. We find that explicit gender cues induce bounded but systematic decision flips, including under an explicit gender-attribution prompt that asks models to report whether gender influenced their choice.
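The abstract does not say how decision flips are counted, so the following is only a minimal sketch of one plausible reading, not the authors' RVDB code. It assumes a hypothetical `choose(scenario, options, gender_config)` callable standing in for a model query, treats a decision as valid only when the model picks the same option under both option orders (one way to read "position-balanced"), and counts a flip whenever a gendered variant yields a different valid decision than the baseline. All function names, configuration labels, and toy data are illustrative assumptions.

```python
def position_balanced_choice(choose, scenario, options, gender_config):
    """Ask for a decision with both option orders.

    Returns the chosen option if it is the same in both orders,
    otherwise None (the answer depended on position, so it is discarded).
    """
    a, b = options
    first = choose(scenario, (a, b), gender_config)
    second = choose(scenario, (b, a), gender_config)
    return first if first == second else None


def flip_rate(choose, items, baseline, variants):
    """Fraction of (item, variant) pairs whose decision differs from the baseline.

    Pairs where either the baseline or the variant decision is
    position-dependent are skipped rather than counted.
    """
    flips = 0
    total = 0
    for scenario, options in items:
        base = position_balanced_choice(choose, scenario, options, baseline)
        if base is None:
            continue
        for variant in variants:
            alt = position_balanced_choice(choose, scenario, options, variant)
            if alt is None:
                continue
            total += 1
            if alt != base:
                flips += 1
    return flips / total if total else 0.0


# Toy stand-in for a model (hypothetical): order-independent, and it changes
# its decision on scenario "s2" only under the "female-male" configuration.
def toy_choose(scenario, options, gender_config):
    if scenario == "s2" and gender_config == "female-male":
        return max(options)
    return min(options)


# Toy stand-in for a position-biased model: always picks the first option shown.
def first_option_choose(scenario, options, gender_config):
    return options[0]


ITEMS = [
    ("s1", ("report", "stay silent")),
    ("s2", ("refund", "refuse")),
]
VARIANTS = ["male-female", "female-male"]

if __name__ == "__main__":
    print(flip_rate(toy_choose, ITEMS, "neutral", VARIANTS))
```

In this toy setup only one of the four (item, variant) pairs changes its decision. A model that always picks the first listed option contributes no valid pairs at all, which is the kind of position artifact a balanced protocol is meant to screen out.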
