Do Image-Text Metrics Respect Semantic Invariances?
Event: PRODUCT_LAUNCH | 2026-05-26 | Impact: MEDIUM
arXiv:2605.24702v1 Announce Type: new
Abstract: Reference-free image-to-text evaluators are now standard for scoring image-caption alignment, yet it is unclear whether they respect semantic invariances. We present an invariance probe on five popular evaluators (CLIPScore, PAC-S, UMIC, FLEUR, and a deterministic LLM judge) under semantics-preserving perturbations along three axes -- spatial (flips, context-preserving repositioning, light rotati