Stability vs. Manipulability: Evaluating Robustness Under Post-Decision Interaction in LLM Judges (Event)

Name: Stability vs. Manipulability: Evaluating Robustness Under Post-Decision Interaction in LLM Judges
Start: 2026-06-05

PRODUCT_LAUNCH, 2026-06-05, Impact: MEDIUM

arXiv:2606.05384v1
Announce Type: cross
Abstract: LLM-as-judge evaluation is widely used in benchmarking pipelines, where model outputs are compared and ranked using automated evaluators. These pipelines typically assume that judgments are stable properties of fixed inputs. We show that this assumption does not hold under interaction. We study post-decision manipulability: the extent to which an ev

Artificial Intelligence
