XLGoBench: Detecting cross-lingual skill gaps with algorithmic tasks · Event

BREAKTHROUGH · 2026-06-01 · Impact: HIGH

arXiv:2605.30788v1 · Announce Type: new

Abstract: We introduce a set of synthetic algorithmic tasks to detect cross-lingual gaps in the abilities of large language models. Our benchmark is commensurate across languages, since it requires models to perform the same underlying task in different languages; scalable, since each task can be generated at varying levels of complexity, allowing it to be adapted to models with different c
