UrduMMLU: A Massive Multitask Benchmark for Urdu Language Understanding (Event)

Name: UrduMMLU: A Massive Multitask Benchmark for Urdu Language Understanding
Start: 2026-06-08

OPEN_SOURCE | 2026-06-08 | Impact: MEDIUM

UrduMMLU: A Massive Multitask Benchmark for Urdu Language Understanding
arXiv:2606.07167v1
Announce Type: new
Abstract: Meaningful multilingual evaluation must test models in the target language and educational context. Urdu, spoken by more than 230 million people, lacks a broad MMLU-style benchmark built from native educational sources. We introduce UrduMMLU, a benchmark of 26,431 Urdu MCQs across 26 subjects and five domains, collected from native Urdu MCQ banks and public examination PDFs. U

Artificial Intelligence
