BusterX++: Towards Unified Cross-Modal AI-Generated Content Detection and Explanation with MLLM


Details

Source site
ArXiv CS.CV
Authors
Haiquan Wen, Tianxiao Li, Zhenglin Huang, Yiwei He, Guangliang Cheng
Article type
NEWS
Language
en
Publication date
2026-06-17

Abstract

arXiv:2507.14632v4 (announce type: replace)

The rapid advancement of generative AI has substantially improved image and video synthesis, amplifying the risk of multimodal visual misinformation. Recent multimodal large language models (MLLMs) have shown promise for transparent AI-generated content detection through reasoning and explanation, yet existing approaches largely treat image and video forensics as isolated tasks, leaving cross-modal synergies underexplored. To address this, we present BusterX++, a unified MLLM for joint image and video detection with interpretable reasoning. We also introduce GenBuster-Bench++, a meticulously curated, difficulty-aligned benchmark containing balanced image and video samples spanning recent generation models and diverse real-world scenarios. Using this controlled setting, we revisit the widely adopted SFT → RL post-training paradigm.
