Unveiling the Fragility of Vision-Language Models: Multi-Modal Adversarial Synergy via Texture-Constrained Perturbations and Cross-Modal Optimization
ArXiv CS.CV · 2026-05-27 · Authors: Xiang Fang, Wanlong Fang, Changshuo Wang