LoSATok: Low-dimensional Semantic-Acoustic Tokenizer for Cross-Domain Audio Understanding and Generation · Event

PRODUCT_LAUNCH · 2026-05-28 · Impact: MEDIUM

arXiv:2605.27840v1 · Announce Type: cross

Abstract: Audio tokenizers are fundamental to unifying audio understanding and generation. Understanding requires high-level semantics, while generation demands semantic and acoustic details. Existing unified tokenizers jointly encode both in high-dimensional continuous latents, which increases the modeling burden of Diffusion Transformers (DiTs) for g

LoSATok: Low-dimensional Semantic-Acoustic Tokenizer for Cross-Domain Audio Understanding and Generation · Related Companies

arXiv · NONPROFIT
Pact · NONPROFIT
Ension · COMPANY
EAT · NONPROFIT
ANDI · NONPROFIT
Tempora · RESEARCH_INSTITUTE
ACT · NONPROFIT
Ratio · RESEARCH_INSTITUTE
deta · COMPANY