High-Entropy Tokens as Multimodal Failure Points in Vision-Language Models · Event

PRODUCT_LAUNCH · 2026-05-26 · Impact: MEDIUM

arXiv:2512.21815v3 · Announce Type: replace

Abstract: Vision-language models (VLMs) achieve remarkable performance but remain vulnerable to adversarial attacks. Entropy, as a measure of model uncertainty, is highly correlated with VLM reliability. While prior entropy-based attacks maximize uncertainty at all decoding steps, implicitly assuming that every token equally contributes to model instability, we reveal that a smal
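
The abstract treats entropy as a per-decoding-step measure of model uncertainty. As a minimal sketch of that quantity only (not the paper's attack or its selection criterion, which the truncated abstract does not spell out), the following computes the Shannon entropy of the next-token distribution at each step and ranks steps by it. The function names, the toy logits, and the top-k selection are all assumptions made for illustration.

```python
import math
from typing import List, Sequence


def softmax(logits: Sequence[float]) -> List[float]:
    """Numerically stable softmax over one decoding step's logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def token_entropy(logits: Sequence[float]) -> float:
    """Shannon entropy (in nats) of the next-token distribution."""
    probs = softmax(logits)
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def rank_steps_by_entropy(step_logits: Sequence[Sequence[float]], k: int) -> List[int]:
    """Return the indices of the k decoding steps with the highest entropy."""
    entropies = [token_entropy(step) for step in step_logits]
    order = sorted(range(len(entropies)), key=lambda i: entropies[i], reverse=True)
    return order[:k]


# Hypothetical logits for three decoding steps over a 4-token vocabulary.
toy_logits = [
    [10.0, 0.0, 0.0, 0.0],  # confident step: low entropy
    [0.0, 0.0, 0.0, 0.0],   # uniform step: maximum entropy, ln(4)
    [2.0, 1.0, 0.0, 0.0],   # intermediate uncertainty
]
top_steps = rank_steps_by_entropy(toy_logits, k=1)
```

In a real VLM the logits would come from the decoder at each generated position; here they are made up so the ranking can be checked by hand.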
