Near-optimal Optimistic Reinforcement Learning using Empirical Bernstein Inequalities

2019-05-27

Aristide Tossou, Debabrota Basu, Christos Dimitrakakis

arXiv_AI Reinforcement_Learning

Abstract

We study model-based reinforcement learning in an unknown finite communicating Markov decision process. We propose a simple algorithm, UCRL-V, that leverages a variance-based confidence interval. We show that UCRL-V achieves the optimal regret $\tilde{\mathcal{O}}(\sqrt{DSAT})$ up to logarithmic factors, closing the gap with the lower bound without additional assumptions on the MDP. Experiments in a variety of environments validate the theoretical bounds and show that UCRL-V outperforms state-of-the-art algorithms.
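
To illustrate what a variance-based confidence interval can look like, here is a minimal Python sketch of a generic empirical Bernstein radius for samples bounded in [0, 1], in the commonly cited Maurer and Pontil form. This is an assumption chosen for illustration: the abstract does not state the exact interval UCRL-V uses, and the function names `empirical_bernstein_radius` and `hoeffding_radius` are hypothetical.

```python
import math


def empirical_bernstein_radius(samples, delta):
    """One-sided empirical Bernstein radius for i.i.d. samples in [0, 1].

    Generic textbook form (assumed here, not taken from the paper):
    sqrt(2 * V * ln(2/delta) / n) + 7 * ln(2/delta) / (3 * (n - 1)),
    where V is the unbiased sample variance.
    """
    n = len(samples)
    if n < 2:
        return 1.0  # no variance estimate yet; fall back to the trivial bound
    mean = sum(samples) / n
    variance = sum((x - mean) ** 2 for x in samples) / (n - 1)
    log_term = math.log(2.0 / delta)
    return math.sqrt(2.0 * variance * log_term / n) + 7.0 * log_term / (3.0 * (n - 1))


def hoeffding_radius(n, delta):
    """One-sided Hoeffding radius for samples in [0, 1], for comparison."""
    return math.sqrt(math.log(1.0 / delta) / (2.0 * n))


if __name__ == "__main__":
    # Low-variance data: the variance term vanishes and the radius shrinks like 1/n.
    low_variance = [0.5] * 1000
    print(empirical_bernstein_radius(low_variance, delta=0.05))
    print(hoeffding_radius(len(low_variance), delta=0.05))
```

When the observed variance is small, the Bernstein radius decays at roughly a 1/n rate instead of the 1/sqrt(n) rate of Hoeffding, which is the general reason variance-aware intervals can tighten optimistic estimates.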

URL

http://arxiv.org/abs/1905.12425

PDF

http://arxiv.org/pdf/1905.12425
