Multimodal Concept Bottleneck Models (Article)

ArXiv CS.CV | 2026-06-19 | NEWS | en | Authors: Tongqing Shi, Ge Yan, Tuomas Oikarinen, Tsui-Wei Weng

Details

Source site
ArXiv CS.CV
Authors
Tongqing Shi, Ge Yan, Tuomas Oikarinen, Tsui-Wei Weng
Article type
NEWS
Language
en
Publication date
2026-06-19

Abstract

arXiv:2606.19882v1 (announce type: new)

Concept Bottleneck Models (CBMs) enhance the interpretability of deep learning networks by aligning the features extracted from images with natural concepts. However, existing CBMs have two limitations: they cannot generalize beyond a fixed set of predefined classes, and they risk non-concept information leakage, where predictive signals outside the intended concepts are inadvertently exploited. In this paper, we propose the Multimodal Concept Bottleneck Model (MM-CBM) to address these issues and extend CBMs to CLIP. MM-CBM uses dual Concept Bottleneck Layers (CBLs) to align both the image and text embeddings into interpretable features. This allows us to perform new vision tasks, such as zero-shot classification or image retrieval, in an interpretable way. Compared to existing methods, MM-CBM achieves up to 51.26% accuracy improvement on average across four standard benchmarks.
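
The abstract does not spell out how the dual CBLs are built or trained, so the following is only a minimal sketch of the general idea, not the authors' implementation. It assumes each CBL is a plain linear projection, uses random arrays as stand-ins for CLIP image and text embeddings and for trained CBL weights, and scores classes by cosine similarity in the shared concept space. All names here (`concept_layer`, `zero_shot_classify`, `w_img`, `w_txt`) are hypothetical.

```python
import numpy as np


def concept_layer(embeddings, weights):
    """Hypothetical linear CBL: project embeddings onto concept activations."""
    return embeddings @ weights.T


def l2_normalize(x, eps=1e-12):
    """Scale each row to unit length so dot products become cosine similarities."""
    return x / (np.linalg.norm(x, axis=-1, keepdims=True) + eps)


def zero_shot_classify(image_emb, class_text_emb, w_img, w_txt):
    """Compare images and class prompts in concept space instead of raw embedding space."""
    img_concepts = l2_normalize(concept_layer(image_emb, w_img))
    txt_concepts = l2_normalize(concept_layer(class_text_emb, w_txt))
    # scores[i, j] is the cosine similarity between image i and class j.
    scores = img_concepts @ txt_concepts.T
    # contributions[i, j, k] is the share of scores[i, j] attributable to
    # concept k, which is what makes the prediction inspectable.
    contributions = img_concepts[:, None, :] * txt_concepts[None, :, :]
    return scores.argmax(axis=1), scores, contributions


# Assumption: random stand-ins for CLIP embeddings and trained CBL weights.
rng = np.random.default_rng(0)
embed_dim, n_concepts, n_classes, n_images = 512, 16, 4, 3
w_img = rng.normal(size=(n_concepts, embed_dim))
w_txt = rng.normal(size=(n_concepts, embed_dim))
image_emb = rng.normal(size=(n_images, embed_dim))
class_text_emb = rng.normal(size=(n_classes, embed_dim))

pred, scores, contributions = zero_shot_classify(
    image_emb, class_text_emb, w_img, w_txt
)
print(pred.shape, scores.shape, contributions.shape)
```

In this sketch, summing `contributions` over the concept axis recovers `scores` exactly, so each class score decomposes into per-concept terms. Because class prompts pass through the text-side layer at inference time, new class names can be scored without retraining a fixed classification head, which is one plausible reading of how a text-side bottleneck would support zero-shot use.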
