Bias Leaves a Gradient Trail: Label-Free Bias Identification via Gradient Probes on Concept Decompositions

arXiv cs.CV | 2026-05-28 | Authors: Thomas Vitry, Kieran Edgeworth, Stefan Wermter, Jae Hee Lee

Abstract

arXiv:2605.28780v1 (announce type: new)

Vision classifiers can exploit spurious correlations, achieving high in-distribution accuracy yet failing under distribution shift. Existing approaches to bias mitigation and analysis often depend on curated datasets, spurious-attribute or group labels, or retraining, which may be infeasible once a model is deployed or the relevant bias is unknown. We present a bias-label-free, post-hoc method for identifying spurious concepts in frozen vision models, relying only on standard class labels from a held-out audit dataset. For each target class, we collect patches from inputs predicted as that class and apply non-negative matrix factorization to intermediate activations to obtain a bank of interpretable concept vectors.
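The concept-decomposition step described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: it assumes the patch activations are already extracted and non-negative (for example, post-ReLU), uses synthetic data in place of a real model, and swaps in a plain multiplicative-update NMF. The names `nmf_concepts`, `acts_all`, `preds`, and `target_class` are hypothetical.

```python
import numpy as np


def nmf_concepts(activations, n_concepts, n_iter=200, seed=0, eps=1e-9):
    """Factor non-negative activations A (patches x channels) as A ~ U @ W.

    Rows of W play the role of concept vectors; U holds per-patch
    concept coefficients. Uses multiplicative updates (an assumption,
    the abstract does not name a solver).
    """
    A = np.asarray(activations, dtype=np.float64)
    if A.min() < 0:
        raise ValueError("NMF needs non-negative activations")
    rng = np.random.default_rng(seed)
    n_patches, n_channels = A.shape
    U = rng.random((n_patches, n_concepts)) + eps
    W = rng.random((n_concepts, n_channels)) + eps
    for _ in range(n_iter):
        # Each update keeps the factors non-negative.
        U *= (A @ W.T) / (U @ W @ W.T + eps)
        W *= (U.T @ A) / (U.T @ U @ W + eps)
    return U, W


# Synthetic stand-in for patch activations from a frozen model (assumption).
rng = np.random.default_rng(1)
acts_all = rng.random((400, 3)) @ rng.random((3, 16))  # 400 patches, 16 channels
preds = rng.integers(0, 2, size=400)                   # hypothetical predicted class per patch

# Keep only patches from inputs predicted as the target class.
target_class = 1
class_acts = acts_all[preds == target_class]

U, W = nmf_concepts(class_acts, n_concepts=3)
rel_err = np.linalg.norm(class_acts - U @ W) / np.linalg.norm(class_acts)
print(W.shape, rel_err)
```

Each row of `W` is one candidate concept direction in activation space for the chosen class. How the method then probes these concepts with gradients is not covered by the text above, so the sketch stops at the decomposition.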
