Faster and More Accurate Learning with Meta Trace Adaptation

Abstract

Learning speed and accuracy are of universal interest in reinforcement learning. In this paper, we investigate meta-learning approaches for adapting the trace decay parameter λ used in TD(λ), from the perspective of optimizing a bias-variance tradeoff. We propose an off-policy-applicable method for meta-learning the λ parameters by optimizing a meta-objective with efficient incremental updates. The proposed trust-region-style algorithm is shown, under proper assumptions, to be equivalent to optimizing the bias-variance tradeoff of the overall target for all states. In experiments, we validate the proposed method, MTA, showing that it learns significantly faster and more accurately than the compared methods and baselines.
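
For readers unfamiliar with the role of λ, the following is a minimal sketch of standard tabular TD(λ) with accumulating eligibility traces and a per-state λ. It is not the MTA algorithm from the paper: the λ values here are fixed by hand rather than meta-learned, and the function name `td_lambda`, the trajectory format, and the default step size are assumptions made for illustration only.

```python
def td_lambda(episodes, n_states, lam, alpha=0.1, gamma=1.0):
    """Tabular TD(lambda) policy evaluation with accumulating traces.

    episodes: list of trajectories; each trajectory is a list of
              (state, reward, next_state) tuples, with next_state set to
              None on the terminating transition (hypothetical format).
    lam:      list of per-state trace-decay values in [0, 1].
    """
    v = [0.0] * n_states
    for episode in episodes:
        e = [0.0] * n_states  # eligibility traces, reset each episode
        for s, r, s_next in episode:
            v_next = 0.0 if s_next is None else v[s_next]
            delta = r + gamma * v_next - v[s]  # one-step TD error
            # Decay every trace by gamma * lambda(current state),
            # then mark the current state as eligible.
            for i in range(n_states):
                e[i] *= gamma * lam[s]
            e[s] += 1.0
            # Credit the TD error to all recently visited states.
            for i in range(n_states):
                v[i] += alpha * delta * e[i]
    return v


# Two-step episode: state 0 -> state 1 -> terminal, reward 1 at the end.
episode = [(0, 0.0, 1), (1, 1.0, None)]
print(td_lambda([episode], 2, lam=[1.0, 1.0], alpha=0.5))  # [0.5, 0.5]
print(td_lambda([episode], 2, lam=[0.0, 0.0], alpha=0.5))  # [0.0, 0.5]
```

With λ = 1 the final reward is credited back to state 0 immediately (a Monte Carlo-like update, with lower bias and higher variance), while with λ = 0 only the most recent state is updated (a one-step bootstrapped update, with higher bias and lower variance). Choosing λ between these extremes, possibly per state, is the bias-variance tradeoff the abstract refers to.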

URL

http://arxiv.org/abs/1904.11439

PDF

http://arxiv.org/pdf/1904.11439
