Prototypicality Bias Reveals Blindspots in Multimodal Evaluation Metrics

ArXiv CS.CV, 2026-06-02. Authors: Subhadeep Roy, Gagan Bhatia, Steffen Eger

Abstract

arXiv:2601.04946v3

Automatic metrics are widely used to evaluate text-to-image models, often replacing human judgment in benchmarking, model selection, and large-scale data filtering. Yet they may reward images that look plausible or prototypical rather than images that faithfully satisfy the prompt. We identify prototypicality bias as a systematic blindspot in multimodal evaluation: metrics can prefer a semantically incorrect but visually or socially prototypical image over a correct but less prototypical one. We introduce PROTOBIAS, a controlled diagnostic benchmark across Animals, Objects, and Demography, where semantically correct images are contrasted with plausible prototypical adversaries containing a single controlled semantic violation.
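The paired setup described in the abstract can be illustrated with a minimal sketch. It assumes a metric is any function that scores a prompt against an image, and it counts how often that metric ranks the prototypical adversary above the correct image. The names `Pair`, `adversary_preference_rate`, `toy_score`, and the toy scores are hypothetical and chosen only for illustration; the abstract does not specify how the benchmark aggregates its results.

```python
from dataclasses import dataclass
from typing import Callable, Iterable


@dataclass(frozen=True)
class Pair:
    """One hypothetical benchmark item: a prompt and two candidate images."""
    prompt: str
    correct_image: str    # identifier of the semantically correct image
    adversary_image: str  # identifier of the prototypical but incorrect image


def adversary_preference_rate(
    pairs: Iterable[Pair],
    score: Callable[[str, str], float],
) -> float:
    """Fraction of pairs where the metric scores the adversary strictly higher.

    `score(prompt, image)` stands in for any automatic text-image metric.
    This aggregate is an assumption of the sketch, not the paper's definition.
    """
    items = list(pairs)
    if not items:
        raise ValueError("at least one pair is required")
    adversary_wins = sum(
        1
        for p in items
        if score(p.prompt, p.adversary_image) > score(p.prompt, p.correct_image)
    )
    return adversary_wins / len(items)


# Made-up scores, used only to exercise the function.
_TOY_SCORES = {
    ("a blue banana", "blue_banana.png"): 0.31,
    ("a blue banana", "yellow_banana.png"): 0.35,
    ("a cat with three legs", "three_leg_cat.png"): 0.40,
    ("a cat with three legs", "four_leg_cat.png"): 0.22,
}


def toy_score(prompt: str, image: str) -> float:
    """Look up a made-up score for a (prompt, image) combination."""
    return _TOY_SCORES[(prompt, image)]


toy_pairs = [
    Pair("a blue banana", "blue_banana.png", "yellow_banana.png"),
    Pair("a cat with three legs", "three_leg_cat.png", "four_leg_cat.png"),
]

print(adversary_preference_rate(toy_pairs, toy_score))
```

Under this assumed definition, a rate near zero would mean the metric rarely falls for the adversary, while a rate well above zero would indicate the kind of blindspot the abstract describes.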
