Diagnosing and Correcting Concept Omission in Multimodal Diffusion Transformers

Event: PRODUCT_LAUNCH | 2026-05-29 | Impact: MEDIUM

arXiv:2605.14270v2 | Announce Type: replace

Abstract: Multimodal Diffusion Transformers (MM-DiTs) have achieved remarkable progress in text-to-image generation, yet they frequently suffer from concept omission, where specified objects or attributes fail to emerge in the generated image. By performing linear probing on text tokens, we demonstrate that text embeddings can distinguish a characteristic 'omission signal' r