papers AI Learner
The Github is limit! Click to go to the new site.

Conservative Agency via Attainable Utility Preservation

2019-02-26
Alexander Matt Turner, Dylan Hadfield-Menell, Prasad Tadepalli

Abstract

Reward functions are often misspecified. An agent optimizing an incorrect reward function can change its environment in large, undesirable, and potentially irreversible ways. Work on impact measurement seeks a means of identifying (and thereby avoiding) large changes to the environment. We propose a novel impact measure which induces conservative, effective behavior across a range of situations. The approach attempts to preserve the attainable utility of auxiliary objectives. We evaluate our proposal on an array of benchmark tasks and show that it matches or outperforms relative reachability, the state-of-the-art in impact measurement.

Abstract (translated by Google)
URL

http://arxiv.org/abs/1902.09725

PDF

http://arxiv.org/pdf/1902.09725


Similar Posts

Comments