FSA-GRPO: Teaching Auditory LLMs to Use Few-shot Demonstrations

ArXiv CS.AI | 2026-06-03 | NEWS | en | Authors: Haolong Zheng, Siyin Wang, Xulin Fan, Zengrui Jin, Mark Hasegawa-Johnson

Details

Source site: ArXiv CS.AI
Authors: Haolong Zheng, Siyin Wang, Xulin Fan, Zengrui Jin, Mark Hasegawa-Johnson
Article type: NEWS
Language: en
Publication date: 2026-06-03

Abstract

arXiv:2606.02615v1 (announce type: cross)

Few-shot prompting provides an effective way to adapt auditory large language models to low-resource tasks such as children's speech recognition. However, most auditory large language models are not explicitly trained to perform inference in this demonstration-conditioned format, which limits how much they can benefit from few-shot prompting. To address this limitation, we introduce Few-Shot Aware GRPO (FSA-GRPO), an RL-based post-training recipe that uses a specially designed reward to encourage the model to leverage few-shot demonstrations, thereby strengthening its few-shot adaptation ability. Notably, training with only high-resource adult ASR data improves the model's general few-shot adaptation ability, yielding gains not only in children's speech recognition but also in speech translation and audio understanding. We further study data selection and auxiliary reward weighting to identify an effective training recipe.
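The abstract does not spell out the reward, so the following is only a minimal sketch of how a demonstration-aware reward could feed a GRPO-style update, not the authors' method. It assumes an ASR setting scored by word error rate (WER), and it assumes a hypothetical auxiliary term that rewards a few-shot output for beating the same model's zero-shot output. The names `few_shot_reward` and `aux_weight` are illustrative and do not come from the paper. Only the last function, which standardizes rewards within a group of sampled outputs, reflects the usual group-relative advantage used in GRPO.

```python
from statistics import mean, pstdev


def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref, hyp = reference.split(), hypothesis.split()
    if not ref:
        return 0.0 if not hyp else 1.0
    prev = list(range(len(hyp) + 1))  # distances against an empty reference
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            curr.append(min(
                prev[j] + 1,             # deletion
                curr[j - 1] + 1,         # insertion
                prev[j - 1] + (r != h),  # substitution (or match, cost 0)
            ))
        prev = curr
    return prev[-1] / len(ref)


def few_shot_reward(reference: str, few_shot_hyp: str, zero_shot_hyp: str,
                    aux_weight: float = 0.5) -> float:
    """Hypothetical reward (an assumption, not the paper's definition).

    Accuracy term: negative WER of the few-shot output.
    Auxiliary term: WER improvement over the zero-shot output, scaled by aux_weight.
    """
    wer_fs = word_error_rate(reference, few_shot_hyp)
    wer_zs = word_error_rate(reference, zero_shot_hyp)
    return -wer_fs + aux_weight * (wer_zs - wer_fs)


def group_relative_advantages(rewards: list[float], eps: float = 1e-8) -> list[float]:
    """Standardize rewards within one group of samples for the same prompt.

    Uses the population standard deviation; implementations differ on this detail.
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)
    return [(r - mu) / (sigma + eps) for r in rewards]
```

In this sketch, several outputs sampled for one few-shot prompt would each be scored with `few_shot_reward`, and `group_relative_advantages` would turn those scores into the per-sample weights of the policy-gradient step. Varying `aux_weight` is one way to picture the auxiliary reward weighting that the abstract says the authors study.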
