Event: Representation Forcing for Bottleneck-Free Unified Multimodal Models

PRODUCT_LAUNCH | 2026-06-01 | Impact: MEDIUM

arXiv:2605.31604v1 (Announce Type: new)

Abstract: Unified multimodal models (UMMs) aim to handle perception and generation in a single model. Yet existing UMMs still rely on a frozen, separately pretrained VAE for image generation, imposing a structural bottleneck. Naively removing it introduces a quality gap, as the model must learn both high-level structure and low-level details from raw pixels. In this paper, we propose Repre