Investigating Adversarial Robustness of Multi-modal Large Language Models (Event)

PRODUCT_LAUNCH | 2026-06-03 | Impact: MEDIUM

arXiv:2606.03713v1 (announce type: new)

Abstract: Multi-modal Large Language Models (MLLMs) achieve strong performance on vision-language tasks, but incorporating visual inputs through a vision encoder (e.g., CLIP) substantially expands the attack surface, making these models vulnerable to visual adversarial perturbations. Prior defenses typically preserve compatibility with pretrained MLLMs by enforcing strict alignment to