LLM Judges Inconsistently Disagree Across Safety Criteria and Harm Categories (Event)

Name: LLM Judges Inconsistently Disagree Across Safety Criteria and Harm Categories
Start: 2026-06-01

Type: PRODUCT_LAUNCH
Date: 2026-06-01
Impact: MEDIUM

arXiv:2605.31381v1
Announce Type: new

Abstract: We evaluate the consistency of automated judges in conducting a multi-dimensional safety evaluation in a reference-free setup. Our results indicate that Large Language Models are unreliable judges in identifying safety issues related to machine-generated advice in regulated domains such as finance, although they are more reliable at identifying more overt forms of unsafe

Category: Artificial Intelligence
