Global PIQA: Evaluating Commonsense Reasoning Across 100+ Languages and Cultures · Event
PRODUCT_LAUNCH · 2026-06-02 · Impact: MEDIUM
arXiv:2510.24081v2 · Announce Type: replace
Abstract: To date, there exist almost no culturally-specific evaluation benchmarks for large language models (LLMs) that cover a large number of languages and cultures. In this paper, we present Global PIQA, a participatory commonsense reasoning benchmark for over 100 languages, constructed by hand by over 350 researchers from over 65 countries around the world. The 141 lan
Related coverage
Global PIQA: Evaluating Commonsense Reasoning Across 100+ Languages and Cultures
ArXiv CS.CL · 2026-06-02