Jailbreaking Multimodal Large Language Models using Multi-Clip Video (Event)

PRODUCT_LAUNCH | 2026-06-02 | Impact: MEDIUM

arXiv:2606.02111v1 | Announce Type: new

Abstract: As multimodal large language models (MLLMs) have advanced to process video inputs, concerns have emerged about their potential for malicious misuse. Prior jailbreak studies have shown that safety alignment in MLLMs can be bypassed through visual inputs, yet it remains unclear which properties of video inputs induce this vulnerability. To address this gap, we introduce Multi-Clip