SciIntegrity-Bench: A Benchmark for Evaluating Academic Integrity in AI Scientist Systems (Event)
BREAKTHROUGH 2026-06-04 Impact: HIGH
arXiv:2605.10246v2 Announce Type: replace
Abstract: AI scientist systems are increasingly deployed for autonomous research, yet their academic integrity has never been systematically evaluated. We introduce SCIINTEGRITY-BENCH, the first benchmark designed around a dilemmatic evaluation paradigm: each of its 33 scenarios across 11 trap categories is constructed so that honest acknowledgment of failure is th