Diff-Instruct with Diffused Reward: Towards Principled One-step Generator RL

arXiv cs.CV · 2026-05-27 · Authors: Junyi Wu, Weijian Luo, Haoyang Zheng, Ruizhe Zhang, Guang Lin
