The MAXQ Method for Hierarchical Reinforcement Learning (paper)

1998, cited by 285

Robotics; Reinforcement Learning in Robotics; Evolutionary Algorithms and Applications; Elevator Systems and Control

Abstract

This paper presents a new approach to hierarchical reinforcement learning based on the MAXQ decomposition of the value function. The MAXQ decomposition has both a procedural semantics (as a subroutine hierarchy) and a declarative semantics (as a representation of the value function of a hierarchical policy). MAXQ unifies and extends previous work on hierarchical reinforcement learning by Singh, Kaelbling, and Dayan and Hinton. Conditions under which the MAXQ decomposition can represent the optimal value function are derived. The paper defines a hierarchical Q learning algorithm, proves its convergence, and shows experimentally that it can learn much faster than ordinary "flat" Q learning. Finally, the paper discusses some interesting issues that arise in hierarchical reinforcement learning, including the hierarchical credit assignment problem and non-hierarchical execution of the MAXQ hierarchy.

1 Introduction

Hierarchical approaches to reinforcement learning (RL) problems promise ma...
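As a rough illustration of the value-function decomposition the abstract refers to, the sketch below assumes the commonly stated recursive form: the value of taking child task a inside parent task i is Q(i, s, a) = V(a, s) + C(i, s, a), where C is the "completion" value, and V of a composite task is the maximum of Q over its children. The task names, tables, and numbers are hypothetical and chosen only for the example; they are not taken from the paper.

```python
from collections import defaultdict

# Hypothetical two-level task hierarchy (illustrative names only).
# A task with no children is treated as a primitive action.
CHILDREN = {
    "root": ["go_left", "go_right"],
    "go_left": [],
    "go_right": [],
}

# V_PRIM[(a, s)]: expected immediate reward of primitive action a in state s.
V_PRIM = defaultdict(float)
# C[(i, s, a)]: completion value, i.e. the expected reward for finishing
# task i after child a has terminated, starting from state s.
C = defaultdict(float)


def is_primitive(task):
    return not CHILDREN[task]


def value(task, state):
    """V(task, state): a table lookup for primitives, max over children otherwise."""
    if is_primitive(task):
        return V_PRIM[(task, state)]
    return max(q_value(task, state, child) for child in CHILDREN[task])


def q_value(task, state, child):
    """Q(task, state, child) = V(child, state) + C(task, state, child)."""
    return value(child, state) + C[(task, state, child)]


# Made-up numbers for a single state 0.
V_PRIM[("go_left", 0)] = -1.0
V_PRIM[("go_right", 0)] = -1.0
C[("root", 0, "go_left")] = 5.0
C[("root", 0, "go_right")] = 2.0

root_value = value("root", 0)  # max(-1.0 + 5.0, -1.0 + 2.0)
```

The sketch covers only the declarative reading (how stored values combine into a value for the root task). It does not implement the hierarchical Q learning algorithm or its update rule.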

Author

Thomas G. Dietterich
