Knowledge Index of Noah's Ark: Event

PRODUCT_LAUNCH · 2026-06-04 · Impact: MEDIUM

Knowledge Index of Noah's Ark · arXiv:2606.05104v1 · Announce Type: new

Abstract: Knowledge benchmarks for LLMs face three issues: scaling-driven designs that do not operationalize disciplinary representativeness; flat-payment annotation that permits lazy consensus; and unaudited ranking instability under bounded test budgets. We introduce KINA, an 899-item benchmark across 261 fine-grained disciplines, with two formal results. First, we cast representativeness as a coverage-style objective over ex

Knowledge Index of Noah's Ark · Related People

No data available