SciIntegrity-Bench: A Benchmark for Evaluating Academic Integrity in AI Scientist Systems (Event)

Name: SciIntegrity-Bench: A Benchmark for Evaluating Academic Integrity in AI Scientist Systems
Start: 2026-06-04

BREAKTHROUGH · 2026-06-04 · Impact: HIGH

SciIntegrity-Bench: A Benchmark for Evaluating Academic Integrity in AI Scientist Systems
arXiv:2605.10246v2
Announce Type: replace

Abstract: AI scientist systems are increasingly deployed for autonomous research, yet their academic integrity has never been systematically evaluated. We introduce SCIINTEGRITY-BENCH, the first benchmark designed around a dilemmatic evaluation paradigm: each of its 33 scenarios across 11 trap categories is constructed so that honest acknowledgment of failure is th

Artificial Intelligence
