Can LLM Teams Play What? Where? When? 文章

ArXiv CS.CL2026-06-01NEWSen作者: Anastasia Kotelnikova, Viktor Byzov, Maria Dolzhenkova, Evgeny Kotelnikov

摘要

arXiv:2605.30459v1 Announce Type: new Abstract: Large language models (LLMs) remain limited on tasks requiring indirect reasoning, cultural knowledge, and coordinated hypothesis testing. We investigate whether team-based interaction improves LLM performance in What? Where? When? (ChGK), a quiz game designed to reward collective reasoning. We introduce three team strategies: Voting, Silent Team (the captain observes final answers), and Talkative Team (the captain observes both answers and rationales). To minimize data leakage, we evaluate these strategies on a dataset consisting of 572 ChGK questions released in 2025. Using six recent large-scale open models, we show that team-based strategies outperform single-model baselines, yielding gains of up to 20 percentage points in accuracy. The best team achieves 44.23% accuracy, and approaches human team performance on questions with available human statistics.

相关事件查看全部 (1)

Can LLM Teams Play What? Where? When?
2026-06-01PRODUCT_LAUNCH影响: MEDIUM

相关公司

暂无数据

相关人物

暂无数据