MELD: Mel-Spectrogram-Based Speech Language Modeling with Discrete Latent Variables

Event: PRODUCT_LAUNCH, 2026-05-29. Impact: MEDIUM

arXiv:2605.29859v1 (Announce Type: cross)

Abstract: Recent speech language models rely on encoders that are optimized separately from autoregressive models. Since these encoders are unaware of the downstream objectives, the extracted representations may not be optimal for downstream tasks. To address this limitation, we introduce a discrete latent variable model on mel spectrograms that jointly optimizes the encod