Conservative Agency via Attainable Utility Preservation

2019-02-26

Alexander Matt Turner, Dylan Hadfield-Menell, Prasad Tadepalli

arXiv_AI

Abstract

Reward functions are often misspecified. An agent optimizing an incorrect reward function can change its environment in large, undesirable, and potentially irreversible ways. Work on impact measurement seeks a means of identifying (and thereby avoiding) large changes to the environment. We propose a novel impact measure which induces conservative, effective behavior across a range of situations. The approach attempts to preserve the attainable utility of auxiliary objectives. We evaluate our proposal on an array of benchmark tasks and show that it matches or outperforms relative reachability, the state-of-the-art in impact measurement.
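The abstract does not spell out how attainable utility is "preserved". As a rough illustration only, the sketch below assumes the penalty is the total absolute change in the auxiliary objectives' Q-values when an action is taken instead of a no-op, subtracted from the primary reward with a weight. The function names (`aup_penalty`, `aup_reward`) and the parameters `lam` and `scale` are hypothetical and chosen for this example; they are not taken from the text above.

```python
from typing import Sequence


def aup_penalty(q_aux_action: Sequence[float], q_aux_noop: Sequence[float]) -> float:
    """Sum of absolute changes in attainable utility across auxiliary objectives.

    q_aux_action[i]: assumed Q-value of auxiliary objective i after taking the action.
    q_aux_noop[i]:   assumed Q-value of auxiliary objective i after doing nothing.
    """
    if len(q_aux_action) != len(q_aux_noop):
        raise ValueError("need one Q-value per auxiliary objective in both inputs")
    return sum(abs(qa - qn) for qa, qn in zip(q_aux_action, q_aux_noop))


def aup_reward(
    primary_reward: float,
    q_aux_action: Sequence[float],
    q_aux_noop: Sequence[float],
    lam: float = 1.0,    # hypothetical penalty weight
    scale: float = 1.0,  # hypothetical normalisation constant
) -> float:
    """Primary reward minus a weighted, scaled impact penalty (illustrative form)."""
    if scale <= 0:
        raise ValueError("scale must be positive")
    return primary_reward - (lam / scale) * aup_penalty(q_aux_action, q_aux_noop)


# An action that earns reward 1.0 but halves the attainable utility of the
# second auxiliary objective is penalised relative to doing nothing.
shaped = aup_reward(1.0, q_aux_action=[1.0, 0.5], q_aux_noop=[1.0, 1.0], lam=2.0)
```

Under these assumptions, an action that leaves every auxiliary Q-value unchanged incurs no penalty, so the agent is discouraged only from actions that shift what it could later achieve on the auxiliary objectives, in either direction.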

URL

http://arxiv.org/abs/1902.09725

PDF

http://arxiv.org/pdf/1902.09725
