IRPO: Boosting Image Restoration via Post-training GRPO

ArXiv CS.CV | 2026-05-28 | Authors: Haoxuan Xu, Yi Liu, Tianfu Li, Ruolin Shen, Boyuan Jiang, Jinlong Peng, Donghao Luo, Xiaobin Hu, Shuicheng Yan, Haoang Li

Abstract

arXiv:2512.00814v3 (announce type: replace)

Post-training has become effective for high-level generation, but its role in low-level vision remains underexplored. Existing image restoration methods often rely on fixed pixel-wise fitting to ground-truth images, which can lead to over-smoothing and weak generalization. We propose IRPO, a GRPO-based post-training framework for deterministic restoration models. IRPO is built around two axes: data formulation and reward modeling. For data formulation, we select the 30% of samples that underperformed in the pre-training stage, which improves both accuracy and training efficiency. For reward modeling, we combine fidelity-oriented and quality-aware feedback through three components: a General Reward for structural fidelity, an Expert Reward that uses a Vision-Language Model as a coarse visual-quality judge, and a Restoration Reward for task-specific low-level cues.
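
The two axes described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The function names, the use of a per-sample quality score such as PSNR to rank samples, the weighted sum used to merge the three rewards, and the weights themselves are all assumptions made for illustration. The abstract does not say how samples are ranked, how the rewards are combined, or how candidate outputs are produced for a deterministic model. The only part taken from GRPO as generally described is the group-relative advantage, which normalizes each candidate's reward by the mean and standard deviation of its group.

```python
from statistics import mean, pstdev


def select_hard_samples(scores, fraction=0.3):
    """Return indices of the lowest-scoring `fraction` of samples.

    `scores` holds one quality score per training sample (assumed here
    to be something like PSNR of the pre-trained model; higher is better).
    """
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")
    k = max(1, round(len(scores) * fraction))
    # Sort sample indices from worst to best score and keep the worst k.
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    return order[:k]


def combined_reward(general, expert, restoration, weights=(1.0, 1.0, 1.0)):
    """Merge the three reward components with a weighted sum.

    The weighted sum and the default weights are hypothetical; the
    abstract only names the three components.
    """
    w_general, w_expert, w_restoration = weights
    return w_general * general + w_expert * expert + w_restoration * restoration


def group_relative_advantages(rewards, eps=1e-8):
    """GRPO-style advantages: standardize rewards within one group."""
    mu = mean(rewards)
    sigma = pstdev(rewards)
    # eps avoids division by zero when all candidates score the same.
    return [(r - mu) / (sigma + eps) for r in rewards]


if __name__ == "__main__":
    # Hypothetical per-sample scores for ten training images.
    scores = [30.0, 22.5, 28.0, 19.0, 35.0, 25.0, 31.0, 27.0, 33.0, 24.0]
    print("hard sample indices:", select_hard_samples(scores))

    # Hypothetical (general, expert, restoration) rewards for a group
    # of three candidate restorations of the same input.
    group = [(0.8, 0.5, 0.6), (0.7, 0.9, 0.7), (0.6, 0.4, 0.5)]
    rewards = [combined_reward(g, e, r) for g, e, r in group]
    print("advantages:", group_relative_advantages(rewards))
```

In this sketch, candidates whose combined reward is above the group mean receive a positive advantage and the rest a negative one, so the policy update favors the better restorations within each group without needing a separate value model.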
