M2TR: Multi-modal Multi-scale Transformers for Deepfake Detection (Paper)

2022, cited by 317

Generative Adversarial Networks and Image Synthesis; Digital Media Forensic Detection; Advanced Image Processing Techniques

Abstract

The widespread dissemination of Deepfakes demands effective approaches that can detect perceptually convincing forged images. In this paper, we aim to capture the subtle manipulation artifacts at different scales using transformer models. In particular, we introduce a Multi-modal Multi-scale TRansformer (M2TR), which operates on patches of different sizes to detect local inconsistencies in images at different spatial levels. M2TR further learns to detect forgery artifacts in the frequency domain to complement RGB information through a carefully designed cross-modality fusion block. In addition, to stimulate Deepfake detection research, we introduce a high-quality Deepfake dataset, SR-DF, which consists of 4,000 Deepfake videos generated by state-of-the-art face-swapping and facial reenactment methods. We conduct extensive experiments to verify the effectiveness of the proposed method, which outperforms state-of-the-art Deepfake detection methods by clear margins.
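The two input views named in the abstract, patches of different sizes and a frequency-domain representation, can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the helper names `split_into_patches` and `log_amplitude_spectrum`, the patch sizes 8, 16 and 32, the 64x64 random stand-in image, and the choice of a plain 2D FFT log-amplitude are all assumptions made for illustration, and the transformer and the cross-modality fusion block are not modeled at all.

```python
import numpy as np


def split_into_patches(image: np.ndarray, patch: int) -> np.ndarray:
    """Split an (H, W, C) image into non-overlapping patch x patch tiles.

    Returns an array of shape (num_patches, patch * patch * C), one
    flattened tile per row, in row-major tile order.
    """
    h, w, c = image.shape
    if h % patch or w % patch:
        raise ValueError("image height and width must be divisible by patch")
    tiles = image.reshape(h // patch, patch, w // patch, patch, c)
    tiles = tiles.transpose(0, 2, 1, 3, 4)  # (tile_row, tile_col, py, px, c)
    return tiles.reshape(-1, patch * patch * c)


def log_amplitude_spectrum(image: np.ndarray) -> np.ndarray:
    """Per-channel 2D FFT log-amplitude with the zero frequency centered."""
    spectrum = np.fft.fft2(image, axes=(0, 1))
    spectrum = np.fft.fftshift(spectrum, axes=(0, 1))
    return np.log1p(np.abs(spectrum))


# Hypothetical input: a random array standing in for a cropped face image.
rng = np.random.default_rng(0)
face = rng.random((64, 64, 3))

# Spatial view: the same image tokenized at several (assumed) patch sizes.
multi_scale = {p: split_into_patches(face, p) for p in (8, 16, 32)}

# Frequency view: one possible frequency-domain representation of the image.
freq = log_amplitude_spectrum(face)

for p, tokens in multi_scale.items():
    print(p, tokens.shape)
print("freq", freq.shape)
```

Smaller patches give many short tokens that localize fine artifacts, while larger patches give a few long tokens that cover wider regions; a model in the spirit of the abstract would consume several such token sets together with a frequency-domain input.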

Authors (6)

Ser-Nam Lim

Yu-Gang Jiang

Jingjing Chen

Wenhao Ouyang
