Beyond Compression: Quantifying Spectral Accessibility in Vision Representations

ArXiv CS.CV | 2026-06-03 | Authors: Akayou A. Kitessa, Yijun Zhao

Abstract

arXiv:2606.03795v1

Vision-language models map visual features into a shared embedding space through learned projection layers, yet it remains unclear how these transformations alter the structure of visual information. This study examines how representations change in terms of spatial-frequency accessibility, measured as the linear recoverability of band-limited Fourier energy from model representations. To isolate effects beyond dimensionality reduction, we introduce Residual Spectral Loss (RSL), which evaluates changes relative to a dimension-matched random projection baseline. To reduce confounding effects from optimization, the analysis uses pretrained models with all parameters frozen. The experimental results show consistent frequency-dependent changes in accessibility across CLIP and DINOv2 on the ImageNet and MS-COCO datasets.
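
The abstract does not give a formula for RSL, so the following is only a minimal sketch of how such a measurement could be set up, not the authors' implementation. It assumes that "linear recoverability" is the held-out R² of a ridge probe, that frequency bands are equal-width radial bins of the 2D power spectrum, and that RSL is the baseline R² minus the R² of the learned projection (the sign convention is a guess). The function names `band_energy`, `linear_r2`, and `residual_spectral_loss` are hypothetical.

```python
import numpy as np


def band_energy(images, n_bands=4):
    """Per-image Fourier energy in n_bands equal-width radial frequency bands.

    images: array of shape (n, h, w), grayscale. Band layout is an assumption.
    """
    n, h, w = images.shape
    power = np.abs(np.fft.fftshift(np.fft.fft2(images), axes=(-2, -1))) ** 2
    fy = np.fft.fftshift(np.fft.fftfreq(h))[:, None]
    fx = np.fft.fftshift(np.fft.fftfreq(w))[None, :]
    radius = np.sqrt(fy ** 2 + fx ** 2)
    # Small epsilon so the highest frequency falls inside the last band.
    edges = np.linspace(0.0, radius.max() + 1e-9, n_bands + 1)
    out = np.empty((n, n_bands))
    for b in range(n_bands):
        mask = (radius >= edges[b]) & (radius < edges[b + 1])
        out[:, b] = power[:, mask].sum(axis=1)
    return out


def linear_r2(feats, target, ridge=1e-3, train_frac=0.8):
    """Held-out R^2 of a ridge-regression probe from feats to a scalar target."""
    n = feats.shape[0]
    n_train = int(train_frac * n)
    mu = feats[:n_train].mean(axis=0)
    sd = feats[:n_train].std(axis=0) + 1e-8
    X = np.hstack([(feats - mu) / sd, np.ones((n, 1))])  # append bias column
    Xtr, Xte = X[:n_train], X[n_train:]
    ytr, yte = target[:n_train], target[n_train:]
    w = np.linalg.solve(Xtr.T @ Xtr + ridge * np.eye(X.shape[1]), Xtr.T @ ytr)
    resid = yte - Xte @ w
    return 1.0 - (resid ** 2).sum() / ((yte - yte.mean()) ** 2).sum()


def residual_spectral_loss(feats, projected, images, n_bands=4, seed=0):
    """Assumed RSL: R^2 under a random projection minus R^2 under the learned one.

    feats:     pre-projection features, shape (n, d_in)
    projected: features after the learned projection, shape (n, d_out)
    Positive values mean the learned projection loses more band energy
    information than dimensionality reduction alone would explain.
    """
    rng = np.random.default_rng(seed)
    d_in, d_out = feats.shape[1], projected.shape[1]
    # Dimension-matched Gaussian random projection as the baseline.
    R = rng.standard_normal((d_in, d_out)) / np.sqrt(d_out)
    baseline = feats @ R
    energy = np.log1p(band_energy(images, n_bands))  # log scale is an assumption
    rsl = np.empty(n_bands)
    for b in range(n_bands):
        rsl[b] = linear_r2(baseline, energy[:, b]) - linear_r2(projected, energy[:, b])
    return rsl
```

In practice `feats` and `projected` would come from a frozen encoder (for example the pre- and post-projection outputs of a CLIP vision tower), and the result is one RSL value per frequency band, which is what makes frequency-dependent effects visible.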
