Prototypicality Bias Reveals Blindspots in Multimodal Evaluation Metrics

ArXiv CS.CV | 2026-06-02 | Authors: Subhadeep Roy, Gagan Bhatia, Steffen Eger

Abstract

arXiv:2601.04946v3

Automatic metrics are widely used to evaluate text-to-image models, often replacing human judgment in benchmarking, model selection, and large-scale data filtering. Yet they may reward images that look plausible or prototypical rather than images that faithfully satisfy the prompt. We identify prototypicality bias as a systematic blindspot in multimodal evaluation: metrics can prefer a semantically incorrect but visually or socially prototypical image over a correct but less prototypical one. We introduce PROTOBIAS, a controlled diagnostic benchmark across Animals, Objects, and Demography, where semantically correct images are contrasted with plausible prototypical adversaries containing a single controlled semantic violation.
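The paired setup described in the abstract can be illustrated with a minimal sketch. It assumes a metric has already produced one score for the correct image and one for the prototypical adversary of each prompt, and it reports how often the adversary is scored higher. The names `Pair` and `adversary_preference_rate`, the prompts, and the scores are all hypothetical and invented for illustration; they are not taken from PROTOBIAS, and the paper may define its measures differently.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class Pair:
    """One contrast pair (hypothetical structure, not the benchmark's format)."""
    prompt: str
    correct_score: float    # metric score for the semantically correct image
    adversary_score: float  # metric score for the prototypical adversary


def adversary_preference_rate(pairs: Sequence[Pair]) -> float:
    """Fraction of pairs where the metric scores the adversary strictly higher.

    Ties count as not preferring the adversary.
    """
    if not pairs:
        raise ValueError("need at least one pair")
    wins = sum(1 for p in pairs if p.adversary_score > p.correct_score)
    return wins / len(pairs)


# Made-up scores for illustration only; not results from any real metric.
pairs = [
    Pair("hypothetical prompt A", 0.31, 0.34),  # adversary preferred
    Pair("hypothetical prompt B", 0.29, 0.27),  # correct image preferred
    Pair("hypothetical prompt C", 0.25, 0.30),  # adversary preferred
    Pair("hypothetical prompt D", 0.28, 0.28),  # tie
]

rate = adversary_preference_rate(pairs)
print(f"adversary preferred in {rate:.0%} of pairs")
```

Under this sketch, a rate near zero would suggest the metric tracks prompt faithfulness on these pairs, while a high rate would be the kind of blindspot the abstract describes.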
