Beyond the Proxy: Trajectory-Distilled Guidance for Offline GFlowNet Training

ArXiv CS.AI | 2026-05-26 | NEWS | en | Authors: Ruishuo Chen, Xun Wang, Rui Hu, Zhuoran Li, Longbo Huang

Details

Source site
ArXiv CS.AI
Authors
Ruishuo Chen, Xun Wang, Rui Hu, Zhuoran Li, Longbo Huang
Article type
NEWS
Language
en
Publication date
2026-05-26

Abstract

arXiv:2505.20110v3. Announce type: replace-cross.

Generative Flow Networks (GFlowNets) excel at sampling diverse, high-reward objects. In many practical applications where active reward queries are infeasible, these models must be trained on static offline datasets. Prevailing training methods typically rely on a proxy model to provide reward feedback for trajectories sampled online. However, constructing a reliable proxy is often challenging because of data scarcity or high evaluation costs. Existing proxy-free approaches attempt to address this, but they often impose coarse constraints that limit the model's ability to explore effectively. To overcome these limitations, we propose Trajectory-Distilled GFlowNet (TD-GFN), a novel proxy-free training framework. TD-GFN uses inverse reinforcement learning (IRL) to extract dense, transition-level edge rewards from offline trajectories, providing rich structural guidance for efficient exploration.
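
For background, the sampling goal mentioned in the abstract is usually formalized as below. This is the standard GFlowNet formulation from the general literature, not TD-GFN's own objective, which the abstract does not spell out. The symbols P_F (forward policy), P_B (backward policy), Z (total flow), and R (reward) are notation assumed here for illustration.

```latex
% Target: a terminal object x is sampled with probability proportional to its reward
P_T(x) \propto R(x)

% Trajectory balance condition for a complete trajectory
% \tau = (s_0 \to s_1 \to \dots \to s_n = x)
Z \prod_{t=1}^{n} P_F(s_t \mid s_{t-1}) \;=\; R(x) \prod_{t=1}^{n} P_B(s_{t-1} \mid s_t)
```

In the offline setting, R(x) is known only for objects that appear in the dataset, which is why proxy-based methods fit a reward model to score newly sampled trajectories.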