Investigating Adversarial Robustness of Multi-modal Large Language Models 文章

ArXiv CS.CV2026-06-03NEWSen作者: Hashmat Shadab Malik, Muzammal Naseer, Salman Khan

摘要

arXiv:2606.03713v1 Announce Type: new Abstract: Multi-modal Large Language Models (MLLMs) achieve strong performance on vision-language tasks, but incorporating visual inputs through a vision encoder (e.g., CLIP) substantially expands the attack surface, making these models vulnerable to visual adversarial perturbations. Prior defenses typically preserve compatibility with pretrained MLLMs by enforcing strict alignment to CLIP's original embedding space during adversarial fine-tuning; while practical, this constraint fundamentally limits achievable robustness. We present a systematic investigation of adversarial robustness in MLLMs. We first introduce a diagnostic CLIP-alignment protocol that predicts, prior to full MLLM training, which robust vision encoders will transfer effectively to the multimodal setting, revealing that large-scale multimodal adversarial pretraining, rather than unimodal scale alone, is the critical factor for strong robustness transfer.

相关公司

暂无数据

相关人物

暂无数据