Scaling Conversational Hungarian ASR: The BEA-Dialogue+ Corpus
Event: PRODUCT_LAUNCH | Date: 2026-06-01 | Impact: MEDIUM
arXiv:2605.31469v1. Announce type: new.
Abstract: Conversational automatic speech recognition in Hungarian is constrained by the limited amount of publicly available dialogue-style training data. The BEA-Dialogue corpus addresses this need, but its strictly speaker-disjoint train/dev/eval split reduces the usable material to only 85 hours. In this paper, we introduce BEA-Dialogue+, an expanded version of the corpus that relaxes the sp