Global PIQA: Evaluating Commonsense Reasoning Across 100+ Languages and Cultures (Event)

PRODUCT_LAUNCH | 2026-06-02 | Impact: MEDIUM

arXiv:2510.24081v2. Announce Type: replace.

Abstract: To date, there exist almost no culturally-specific evaluation benchmarks for large language models (LLMs) that cover a large number of languages and cultures. In this paper, we present Global PIQA, a participatory commonsense reasoning benchmark for over 100 languages, constructed by hand by over 350 researchers from over 65 countries around the world. The 141 lan