Thaka at KSAA-2026 Task 2: Regularized Fine-Tuning for Arabic Speech Diacritization

Event: PRODUCT_LAUNCH | Date: 2026-05-26 | Impact: MEDIUM

arXiv:2605.25928v1 | Announce Type: new

Abstract: We describe the winning system for Task 2 of the KSAA-2026 Shared Task on Arabic Speech Dictation with Automatic Diacritization. The task requires producing fully diacritized Arabic text from speech audio and undiacritized transcripts, with only 2,327 training samples available and no external data permitted. Our system fine-tunes CATT-Whisper, a character-level mu