MaskAlign: Token-Subset Representation Alignment for Efficient Diffusion Training | Event

PRODUCT_LAUNCH | 2026-06-09 | Impact: MEDIUM

arXiv:2606.08788v1 | Announce Type: new

Abstract: Representation alignment with pretrained vision models has recently shown strong potential for accelerating diffusion transformer training. By aligning intermediate diffusion features with clean-image representations from self-supervised vision encoders, existing methods improve convergence and generation quality. However, such alignment also introduces a non-trivial
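
The alignment idea described in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the abstract is cut off before the method is described. The sketch assumes the diffusion features have already been projected to the encoder's dimension, and it uses mean negative cosine similarity, a common choice for this kind of objective that the source does not confirm. The function names and the `token_indices` argument are hypothetical; the latter only gestures at the "token-subset" wording in the title.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity between two equal-length vectors (lists of floats)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # treat a zero vector as uninformative
    return dot / (norm_u * norm_v)


def alignment_loss(diffusion_tokens, encoder_tokens, token_indices=None):
    """Mean negative cosine similarity over the selected tokens.

    diffusion_tokens: per-token features from an intermediate diffusion layer,
        assumed to be already projected to the encoder dimension.
    encoder_tokens: per-token features of the clean image from a frozen
        self-supervised encoder.
    token_indices: hypothetical optional subset of token positions to align;
        None means all tokens are aligned.
    """
    if len(diffusion_tokens) != len(encoder_tokens):
        raise ValueError("both inputs must have the same number of tokens")
    if token_indices is None:
        token_indices = range(len(diffusion_tokens))
    token_indices = list(token_indices)
    if not token_indices:
        raise ValueError("token_indices must select at least one token")
    total = sum(
        cosine_similarity(diffusion_tokens[i], encoder_tokens[i])
        for i in token_indices
    )
    return -total / len(token_indices)
```

In this sketch, restricting `token_indices` reduces the number of per-token comparisons, which is one plausible way an alignment term could be made cheaper. Whether MaskAlign selects tokens this way is not stated in the available text.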