Beyond Questions: Evaluating What Large Language Models (Actually) Know

Event type: PRODUCT_LAUNCH; Date: 2026-05-27; Impact: MEDIUM

arXiv:2605.26937v1 (Announce Type: new)

Abstract: Parametric knowledge in large language models (LLMs) is a cornerstone of their success, yet remains poorly understood. Existing knowledge benchmarks typically rely on predefined questions (e.g., "What is the birth date of M.L. King?"), evaluating only knowledge that benchmark designers explicitly choose to query, a problematic availability bias. In this paper, we introduce o