The Trust Paradox: How CS Researchers Engage LLM Leaderboards
Event: PRODUCT_LAUNCH | 2026-05-29 | Impact: MEDIUM
arXiv:2605.28966v1 (Announce Type: new)
Abstract: Large language model (LLM) leaderboards rank AI models using standardized benchmarks and have become highly visible across computer science, despite known limitations in their reliability and robustness. Yet how they shape researchers' actual practice remains empirically uncharted. We address this gap through semi-structured interviews with eight researchers across four computer science