Uncovering Competency Gaps in Large Language Models and Their Benchmarks Event

Name: Uncovering Competency Gaps in Large Language Models and Their Benchmarks
Start: 2026-06-02

PRODUCT_LAUNCH | 2026-06-02 | Impact: MEDIUM

arXiv:2512.20638v2
Announce Type: replace

Abstract: The evaluation of large language models relies heavily on standardized benchmarks. These benchmarks provide useful aggregated metrics, but can obscure (i) particular sub-areas where the models are weak ("model gaps") and (ii) imbalanced coverage in the benchmarks themselves ("benchmark gaps"). To automatically uncover both types of gaps, we propose a simple new method usi

Artificial Intelligence
