
Stochastic Approximation for Risk-aware Markov Decision Processes

2019-05-09
Wenjie Huang, William B. Haskell

Abstract

In this paper, we develop a stochastic approximation type algorithm to solve finite state/action, infinite-horizon, risk-aware Markov decision processes. Our algorithm has two loops. The inner loop computes the risk by solving a stochastic saddle-point problem. We show that several widely investigated risk measures (e.g. conditional value-at-risk, optimized certainty equivalent, and absolute semi-deviation) can be expressed as stochastic saddle-point problems. The outer loop does Q-learning to compute an optimal risk-aware policy. We establish the almost sure convergence and convergence rate of our overall algorithm. For an error tolerance $\epsilon > 0$ and learning rate $k \in (1/2, 1]$, the overall convergence rate of our algorithm is $\Omega\big((\ln(1/\delta\epsilon)/\epsilon^{2})^{1/k} + (\ln(1/\epsilon))^{1/(1-k)}\big)$ with probability at least $1-\delta$.
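
As a rough illustration of how a risk measure can be written as an optimization problem, the sketch below evaluates conditional value-at-risk (CVaR) through the well-known Rockafellar-Uryasev variational form, CVaR_alpha(X) = min over eta of eta + E[(X - eta)_+] / (1 - alpha). This is a pure minimization, so at best a degenerate case of a saddle-point problem. The abstract does not give the paper's own formulation, and this is not the authors' algorithm. The function names, the sample costs, and alpha = 0.8 are all assumptions made for the example.

```python
def cvar_objective(eta, costs, alpha):
    """Rockafellar-Uryasev objective: eta + E[(X - eta)_+] / (1 - alpha).

    `costs` is treated as an equally weighted empirical sample of X
    (an assumption of this sketch).
    """
    n = len(costs)
    expected_excess = sum(max(x - eta, 0.0) for x in costs) / n
    return eta + expected_excess / (1.0 - alpha)


def cvar_variational(costs, alpha):
    """CVaR as the minimum of the objective over eta.

    The objective is convex and piecewise linear in eta with kinks at the
    sample values, so a minimizer can be found among the samples themselves.
    """
    return min(cvar_objective(eta, costs, alpha) for eta in costs)


def cvar_tail_average(costs, alpha):
    """Reference value: mean of the worst (1 - alpha) fraction of costs.

    Assumes (1 - alpha) * len(costs) is a positive integer, up to rounding.
    """
    n = len(costs)
    k = round((1.0 - alpha) * n)
    worst = sorted(costs)[n - k:]
    return sum(worst) / k


# Hypothetical sample: ten equally likely costs 1, 2, ..., 10.
costs = [float(c) for c in range(1, 11)]
alpha = 0.8

print(cvar_variational(costs, alpha))
print(cvar_tail_average(costs, alpha))
```

On this sample the two functions should agree up to floating-point error, since the worst 20% of outcomes are the costs 9 and 10. In a stochastic approximation setting the exact minimization over eta would be replaced by noisy gradient-type updates on sampled costs; that step is omitted here.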

URL

http://arxiv.org/abs/1805.04238

PDF

http://arxiv.org/pdf/1805.04238

