Robust Markov Decision Processes (Paper)

2012 · Mathematics of Operations Research · Citations: 300

Risk and Portfolio Optimization; Fuzzy Systems and Optimization; Bayesian Modeling and Causal Inference

Abstract

Markov decision processes (MDPs) are powerful tools for decision making in uncertain dynamic environments. However, the solutions of MDPs are of limited practical use because of their sensitivity to distributional model parameters, which are typically unknown and have to be estimated by the decision maker. To counter the detrimental effects of estimation errors, we consider robust MDPs that offer probabilistic guarantees in view of the unknown parameters. To this end, we assume that an observation history of the MDP is available. Based on this history, we derive a confidence region that contains the unknown parameters with a prespecified probability 1-β. Afterward, we determine a policy that attains the highest worst-case performance over this confidence region. By construction, this policy achieves or exceeds its worst-case performance with a confidence of at least 1-β. Our method involves the solution of tractable conic programs of moderate size.
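The idea of choosing a policy with the highest worst-case performance can be illustrated with a minimal sketch. It is not the method of the paper, which derives a confidence region from an observation history and solves conic programs. The sketch instead assumes that the uncertainty set is a finite list of candidate next-state distributions chosen independently for each state-action pair, and it applies robust value iteration with a discount factor. The function name `robust_value_iteration`, its arguments, and the toy numbers are all hypothetical.

```python
def robust_value_iteration(rewards, candidates, gamma=0.9, tol=1e-10, max_iter=10_000):
    """Max-min value iteration over a finite, per-(state, action) uncertainty set.

    rewards[s][a]    : immediate reward for action a in state s
    candidates[s][a] : list of candidate next-state distributions (assumed set)
    """
    n_states = len(rewards)

    def worst_case_q(s, a, values):
        # The adversary picks the candidate distribution with the lowest
        # expected continuation value.
        worst = min(
            sum(p * v for p, v in zip(dist, values)) for dist in candidates[s][a]
        )
        return rewards[s][a] + gamma * worst

    values = [0.0] * n_states
    for _ in range(max_iter):
        new_values = [
            max(worst_case_q(s, a, values) for a in range(len(rewards[s])))
            for s in range(n_states)
        ]
        delta = max(abs(x - y) for x, y in zip(new_values, values))
        values = new_values
        if delta < tol:
            break

    # Greedy policy with respect to the worst-case action values.
    policy = [
        max(range(len(rewards[s])), key=lambda a: worst_case_q(s, a, values))
        for s in range(n_states)
    ]
    return values, policy


# Toy example (made-up numbers): state 0 has a "safe" action 0 and a "risky"
# action 1 whose transition is uncertain; state 1 is absorbing with reward 0.
rewards = [[1.0, 1.5], [0.0]]
candidates = [
    [[[1.0, 0.0]], [[1.0, 0.0], [0.0, 1.0]]],
    [[[0.0, 1.0]]],
]
values, policy = robust_value_iteration(rewards, candidates, gamma=0.5)
print(values, policy)
```

In this toy instance the risky action looks better under its optimistic candidate, but its worst-case candidate leads to the zero-reward absorbing state, so the max-min policy prefers the safe action in state 0.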

Authors

Berç Rüstem

Daniel Kühn

Wolfram Wiesemann
