BadBlocks: Low-Cost and Stealthy Backdoor Attacks Tailored for Text-to-Image Diffusion Models

ArXiv CS.CV | 2026-05-29 | Authors: Jia Wu, Yu Pan, Junjun Yang, Yi Du

Abstract

arXiv:2508.03221v5

Despite the remarkable progress of diffusion models in image generation, recent studies reveal their vulnerability to backdoor attacks via covert visual or textual triggers. Although evolving defense mechanisms can detect most existing threats through visual inspection or feature analysis, we introduce BadBlocks, a novel, lightweight, and highly covert attack that challenges these safeguards. By selectively poisoning specific blocks within the UNet architecture while keeping other components intact, BadBlocks requires only 30% of the computational resources and 20% of the GPU time of conventional attacks, effectively democratizing backdoor injection on consumer-grade GPUs. Empirical evaluations demonstrate that BadBlocks achieves a high attack success rate with negligible perceptual quality loss, while successfully bypassing state-of-the-art defenses, particularly attention-based detection frameworks.
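The block-selective idea described in the abstract can be illustrated with a small, framework-free sketch. Everything in it is an assumption made for illustration: the block names, the parameter counts, and the helpers `select_trainable` and `trainable_fraction` are hypothetical and are not taken from the paper, and the abstract does not state which UNet blocks BadBlocks targets. The sketch only shows how restricting updates to a subset of blocks shrinks the share of parameters that are modified, which is one plausible reason such an approach would be cheaper than updating the whole model.

```python
from typing import Dict, Iterable

# Hypothetical block names and parameter counts (illustrative only, not from the paper).
UNET_BLOCKS: Dict[str, int] = {
    "down_0": 10_000_000,
    "down_1": 20_000_000,
    "mid": 25_000_000,
    "up_0": 30_000_000,
    "up_1": 15_000_000,
}


def select_trainable(blocks: Dict[str, int], targets: Iterable[str]) -> Dict[str, bool]:
    """Return a per-block flag: True if the block is updated, False if frozen."""
    target_set = set(targets)
    unknown = target_set - set(blocks)
    if unknown:
        raise KeyError(f"unknown blocks: {sorted(unknown)}")
    return {name: name in target_set for name in blocks}


def trainable_fraction(blocks: Dict[str, int], flags: Dict[str, bool]) -> float:
    """Fraction of all parameters that belong to blocks flagged as trainable."""
    total = sum(blocks.values())
    trainable = sum(count for name, count in blocks.items() if flags[name])
    return trainable / total


if __name__ == "__main__":
    flags = select_trainable(UNET_BLOCKS, ["mid"])
    print(flags)
    print(f"trainable share: {trainable_fraction(UNET_BLOCKS, flags):.2%}")
```

In a deep-learning framework the same flags would typically be applied by disabling gradients on the frozen blocks (for example, with `requires_grad_(False)` on a PyTorch module), so that the optimizer touches only the selected ones. The trainable share computed here is a parameter count, not a measurement of compute or GPU time, and should not be read as reproducing the 30% and 20% figures reported in the abstract.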