Do Gender Cues Affect LLM Value Trade-offs? Evidence from a Controlled Decision Benchmark

arXiv cs.CL | 2026-06-02 | Authors: Yangyang Liu, Dong Yu, Pengyuan Liu

Abstract

arXiv:2606.02214v1

Large language models are increasingly used in value-sensitive decision settings, where irrelevant demographic cues should not alter judgments. We construct the Realistic Value Decision Benchmark (RVDB), a controlled benchmark that varies only the role-gender configuration while holding the scenario, ordered value pair, roles, candidate decisions, Value Distance, and Decision Severity fixed. Using a position-balanced evaluation across seven models, we test whether models preserve decision invariance under gender perturbations and whether their self-attributions reflect observed behavioral changes. We find that explicit gender cues induce bounded but systematic decision flips, including under an explicit gender-attribution prompt that asks models to report whether gender influenced their choice.
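As a rough illustration of what a position-balanced invariance check could look like, the sketch below computes a flip rate over toy records. It is not the authors' implementation: the record layout, the function names `position_balanced_choice` and `flip_rate`, the configuration labels, and the rule that a decision only counts when it is identical under both option orders are all assumptions made for this example.

```python
from collections import defaultdict


def position_balanced_choice(choice_ab, choice_ba):
    """Return the decision if it is identical under both option orders, else None.

    Assumption: choices are recorded by decision label, not by position.
    """
    return choice_ab if choice_ab == choice_ba else None


def flip_rate(records):
    """Fraction of scenarios whose stable decision differs across gender configs.

    records: iterable of (scenario_id, gender_config, choice_ab, choice_ba).
    Scenarios with fewer than two position-stable configs are not scored.
    """
    stable = defaultdict(list)
    for scenario_id, _config, choice_ab, choice_ba in records:
        decision = position_balanced_choice(choice_ab, choice_ba)
        if decision is not None:
            stable[scenario_id].append(decision)

    scored = [d for d in stable.values() if len(d) >= 2]
    if not scored:
        return 0.0
    flipped = sum(1 for d in scored if len(set(d)) > 1)
    return flipped / len(scored)


# Hypothetical toy data: labels and configurations are invented for illustration.
records = [
    ("s1", "config_1", "d1", "d1"),
    ("s1", "config_2", "d2", "d2"),  # stable, but differs from config_1: a flip
    ("s2", "config_1", "d1", "d1"),
    ("s2", "config_2", "d1", "d1"),  # invariant across configs
    ("s3", "config_1", "d1", "d2"),  # order-dependent, so discarded
    ("s3", "config_2", "d1", "d1"),  # only one stable config left: not scored
]

rate = flip_rate(records)
```

Discarding order-dependent answers before comparing configurations is one way to keep position bias from being counted as a gender effect. The benchmark itself may handle this differently.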
