When Interpretability Becomes a Liability: Adversarial Attacks on CBM Concept Layers (Event)

Name: When Interpretability Becomes a Liability: Adversarial Attacks on CBM Concept Layers
Start: 2026-05-26

PRODUCT_LAUNCH | 2026-05-26 | Impact: MEDIUM

arXiv:2605.25304v1 | Announce Type: cross | Abstract: Concept Bottleneck Models (CBMs) have emerged as a cornerstone approach for interpretable machine learning, providing human-understandable intermediate representations through explicit concept activations. However, this interpretability fundamentally introduces a critical, previously unexplored attack surface: the concept bottleneck layer itself. We present a co

Artificial Intelligence
