Contraction Mappings in the Theory Underlying Dynamic Programming 论文

1967SIAM Review引用 466
Guidance and Control SystemsGame Theory and ApplicationsReinforcement Learning in Robotics

摘要

This memorandum is concerned with the formulation and analysis of a broad class of optimization problems including many, but not all, dynamic programming problems. The formulation generalizes those of several authors and provides further insight into the class of problems which satisfy Bellman's Principle of Optimality. Three widely shared properties of optimization problems are abstracted; they are called 'contraction,' 'monotonicity,' and 'N-stage contraction' properties. Analysis is conducted of those classes of problems satisfying the first property, the first two properties, and the last two properties. In each case, a fixed-point argument guarantees that a functional equation have a unique solution; in the latter two cases that solution is the optimal return function. Policies whose return functions attain or approximate the fixed point are shown to exist, and three techniques for determining such policies are presented. The techniques are successive approximations, mathematical programming, and a generalization of one of Howard's policy improvement routines. Certain issues concerning history-remembering decision procedures and stationary processes are settled. Examples are provided, and apparent integrability issues in some of them are circumvented. (Author)

相关事件

暂无数据

相关文章

暂无数据