摘要
arXiv:2510.11296v3 Announce Type: replace Abstract: Recent approaches for vision-language models (VLMs) have shown remarkable success in achieving fast downstream adaptation. When applied to real-world downstream tasks, VLMs inevitably encounter both the in-distribution (ID) data and out-of-distribution (OOD) data. The OOD datasets often include both covariate shifts (e.g., known classes with changes in image styles) and semantic shifts (e.g., test-time unseen classes). This highlights the importance of improving VLMs' generalization ability to covariate-shifted OOD data, while effectively detecting open-set semantic-shifted OOD classes. In this paper, inspired by the substantial energy change observed in closed-set data when re-aligning vision-language modalities (specifically by directly reducing the maximum cosine similarity to a low value), we introduce a novel OOD score, named {\Delta}Energy.
相关事件查看全部 (1)
相关人物
暂无数据