BusterX++: Towards Unified Cross-Modal AI-Generated Content Detection and Explanation with MLLM

ArXiv CS.CV | 2026-06-17 | NEWS | en | Authors: Haiquan Wen, Tianxiao Li, Zhenglin Huang, Yiwei He, Guangliang Cheng

Details

Source: ArXiv CS.CV
Authors: Haiquan Wen, Tianxiao Li, Zhenglin Huang, Yiwei He, Guangliang Cheng
Article type: NEWS
Language: en
Published: 2026-06-17

Abstract

arXiv:2507.14632v4 (announce type: replace)

The rapid advancement of generative AI has substantially improved image and video synthesis, amplifying the risk of multimodal visual misinformation. Recent MLLMs have shown promise for transparent AI-generated content detection through reasoning and explanation, yet existing approaches largely treat image and video forensics as isolated tasks, leaving cross-modal synergies underexplored. To address this, we present BusterX++, a unified MLLM for joint image and video detection with interpretable reasoning. We also introduce GenBuster-Bench++, a meticulously curated, difficulty-aligned benchmark containing balanced image and video samples spanning recent generation models and diverse real-world scenarios. Using this controlled setting, we revisit the widely adopted SFT → RL post-training paradigm.
