Locality Matters for Training-Free Audio Token Compression in Audio-Language Models

Event: PRODUCT_LAUNCH | 2026-05-26 | Impact: MEDIUM

arXiv:2605.25179v1. Announce Type: new.

Abstract: Audio-language models (ALMs) are increasingly used for audio captioning, question answering, and open-ended audio understanding, but their inference cost remains high when audio inputs are represented as long prefix-token sequences. These audio prefixes consume context budget, increase memory usage, and make deployment harder in resource-constrained or latency-sens