Do Understanding and Generation Fight? A Diagnostic Study of DPO for Unified Multimodal Models
Event: PRODUCT_LAUNCH, 2026-05-26, Impact: MEDIUM
arXiv:2603.17044v2 (announce type: replace-cross)
Abstract: Unified multimodal models share a language model backbone for both understanding and generating images. Can DPO align both capabilities simultaneously? We present the first systematic study of this question, applying DPO to Janus-Pro at 1B and 7B parameters under seven training strategies and two post-hoc methods. The central finding is negativ
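
For readers unfamiliar with the objective the abstract refers to, the following is a minimal sketch of the standard DPO loss for a single preference pair. It is not the paper's training code and does not reproduce any of its seven strategies. The function names, argument names, and the default beta=0.1 are illustrative assumptions; the inputs are assumed to be summed sequence log-probabilities from the trained policy and a frozen reference model.

```python
import math


def softplus(x: float) -> float:
    """Numerically stable log(1 + exp(x))."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def dpo_loss(policy_chosen_logp: float,
             policy_rejected_logp: float,
             ref_chosen_logp: float,
             ref_rejected_logp: float,
             beta: float = 0.1) -> float:
    """Standard DPO loss for one (chosen, rejected) pair.

    All names and the beta default are hypothetical, chosen for
    illustration only.
    """
    # Implicit rewards: how far the policy has moved from the reference
    # on each response, measured in log-probability.
    chosen_margin = policy_chosen_logp - ref_chosen_logp
    rejected_margin = policy_rejected_logp - ref_rejected_logp
    logits = beta * (chosen_margin - rejected_margin)
    # -log(sigmoid(z)) equals softplus(-z).
    return softplus(-logits)
```

When the policy matches the reference, the loss is log 2; it falls below that as the policy raises the chosen response relative to the rejected one. In a unified model, understanding pairs (text answers) and generation pairs (image tokens) would both pass through a loss of this form on the shared backbone, which is the setting the abstract examines.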