On the Fallacy of Global Token Perplexity in Spoken Language Model Evaluation

Event: PRODUCT_LAUNCH | 2026-05-28 | Impact: MEDIUM

arXiv:2601.06329v2 (Announce Type: replace)

Abstract: Generative spoken language models pretrained on large-scale raw audio can continue a speech prompt with appropriate content while preserving attributes like speaker and emotion, serving as foundation models for spoken dialogue. In prior literature, these models are often evaluated using "global token perplexity", which directly applies the text perplexity formulati