Diffusing in the Right Space: A Systematic Study of Latent Diffusability

ArXiv CS.CV | 2026-06-03 | Authors: Tianxiong Zhong, Xingye Tian, Xuebo Wang, Xin Tao, Pengfei Wan

Abstract

arXiv:2606.03578v1

Latent diffusion models leverage visual tokenizers to compress images into latent spaces for efficient generative modeling. However, better reconstruction quality of a tokenizer does not necessarily translate into better generation quality, suggesting that latent representations should be evaluated not only by fidelity but also by their diffusability. Recent studies have proposed diverse explanations for diffusion-friendly latent spaces, including semantic separability, affine equivariance, distribution uniformity, spatial structure, spectral smoothness, and manifold continuity. Yet these properties are often validated on a limited set of tokenizers, leaving it unclear which factors are most predictive of downstream generation quality and whether such conclusions hold beyond the specific settings in which they are introduced.
