Acoustic Cue Alignment in Audio Language Models for Speech Emotion Recognition

Event: PRODUCT_LAUNCH | Date: 2026-06-08 | Impact: MEDIUM

arXiv:2606.07309v1 | Announce Type: cross

Abstract: Instruction-following audio language models (ALMs) can be augmented with explicit acoustic cues, yet it remains unclear whether such cues are used in a grounded way when the raw audio is already available. We study this question in speech emotion recognition (SER) by deriving six interpretable acoustic concept tokens from the standardised eGeMAPS paralinguistic featur