Inverse Reinforcement Learning without an Optimal Demonstrator: A Feasible Reward Set Approach
Event: PRODUCT_LAUNCH, 2026-06-01. Impact: MEDIUM
arXiv:2605.30903v1. Announce Type: cross.
Abstract: Inverse reinforcement learning (IRL) typically assumes demonstrations from a single optimal demonstrator, but in many applications data come from multiple imperfect demonstrators with heterogeneous suboptimality levels. We study reward learning in this setting through a feasible-reward-set framework: for each demonstrator, we encode its declared subopt
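
The abstract is cut off before it says how the constraint is encoded, so the following is only an illustrative sketch of the general feasible-reward-set idea, not the paper's method. It assumes a small tabular MDP with a known transition model, state-only rewards, and that each demonstrator's declared suboptimality is a bound `eps` on the gap between the optimal value and that demonstrator's policy value under the initial distribution `mu`. Every name here (`is_feasible`, `demos`, `eps`, the toy MDP) is hypothetical.

```python
# Sketch: membership test for a "feasible reward set" under the ASSUMPTION
# that demonstrator i declares a bound eps_i on its value gap:
#     mu . V*_r  -  mu . V^{pi_i}_r  <=  eps_i
# P[s][a][t] is the probability of moving from state s to t under action a.

def policy_value(P, r, policy, gamma, iters=2000):
    """Iterative policy evaluation for a deterministic policy."""
    n = len(r)
    v = [0.0] * n
    for _ in range(iters):
        v = [r[s] + gamma * sum(P[s][policy[s]][t] * v[t] for t in range(n))
             for s in range(n)]
    return v


def optimal_value(P, r, gamma, iters=2000):
    """Value iteration for the optimal value function under reward r."""
    n, n_actions = len(r), len(P[0])
    v = [0.0] * n
    for _ in range(iters):
        v = [max(r[s] + gamma * sum(P[s][a][t] * v[t] for t in range(n))
                 for a in range(n_actions))
             for s in range(n)]
    return v


def is_feasible(P, r, demos, gamma, mu, tol=1e-8):
    """True if reward r satisfies every demonstrator's assumed gap bound."""
    v_star = optimal_value(P, r, gamma)
    best = sum(m * v for m, v in zip(mu, v_star))
    for policy, eps in demos:
        v_pi = policy_value(P, r, policy, gamma)
        gap = best - sum(m * v for m, v in zip(mu, v_pi))
        if gap > eps + tol:
            return False
    return True


# Toy 2-state MDP (hypothetical): action 0 stays, action 1 switches state.
P = [
    [[1.0, 0.0], [0.0, 1.0]],  # from state 0
    [[0.0, 1.0], [1.0, 0.0]],  # from state 1
]
gamma = 0.9
mu = [1.0, 0.0]  # always start in state 0

# Two demonstrators with different declared suboptimality levels.
demos = [
    ([1, 0], 0.0),   # moves to state 1 and stays; claims to be optimal
    ([0, 0], 10.0),  # never moves; declares a large suboptimality
]

reward_state1 = [0.0, 1.0]  # rewards being in state 1
reward_state0 = [1.0, 0.0]  # rewards being in state 0

feasible_1 = is_feasible(P, reward_state1, demos, gamma, mu)
feasible_0 = is_feasible(P, reward_state0, demos, gamma, mu)
```

In this toy setup, `reward_state1` is consistent with both declarations, while `reward_state0` is rejected because the demonstrator claiming optimality would leave the rewarded state. Under the stated assumption, the feasible reward set is the intersection of one such constraint per demonstrator.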