Abstract
In this article, we consider the inverse optimal control problem given incomplete observations of an optimal trajectory. We hypothesize that the cost function is constructed as a weighted sum of relevant features (or basis functions). We address the problem by introducing the recovery matrix, which establishes a relationship between the available observations of the trajectory and the weights of given candidate features. The rank of the recovery matrix indicates whether a subset of relevant features can be found among the candidate features and whether the corresponding weights can be recovered. Additional observations tend to increase the rank of the recovery matrix, thus enabling cost function recovery. We also show that the recovery matrix can be computed iteratively. Based on the recovery matrix, we establish a methodology for recovering the weights of specified features from incomplete observations of the trajectory, and we develop an efficient algorithm that recovers the feature weights by finding the minimal required observations. We apply the proposed algorithm to learning the cost function of a simulated robot manipulator conducting free-space motions. The results demonstrate the stable, accurate, and robust performance of the proposed approach compared to state-of-the-art techniques.
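The rank condition described above can be illustrated with a small numerical sketch. The abstract does not give the construction of the recovery matrix, so the code below uses a hypothetical stand-in: a matrix `H` whose rows are assumed to be linear constraints that the true weight vector satisfies (`H @ w = 0`), with one row added per observation. The names `recover_weights`, `w_true`, and the random row generator are illustrative assumptions, not the authors' method. The sketch only shows how the rank grows as rows are added and how the weights become recoverable, up to scale, once the null space is one-dimensional.

```python
import numpy as np


def recover_weights(H, tol=1e-8):
    """Return a unit-norm null vector of H if its null space is
    one-dimensional; otherwise return None (weights not yet identifiable)."""
    if H.shape[0] == 0:
        return None
    _, s, vt = np.linalg.svd(H)          # vt has shape (n_features, n_features)
    rank = int(np.sum(s > tol))
    if H.shape[1] - rank != 1:
        return None
    w = vt[-1]                           # right singular vector for the zero singular value
    return w / np.linalg.norm(w)


# Hypothetical ground-truth weights for three candidate features.
w_true = np.array([1.0, 2.0, 0.5])
w_true = w_true / np.linalg.norm(w_true)

rng = np.random.default_rng(0)
H = np.empty((0, 3))
w_hat = None
n_used = 0

# Add one "observation" (one constraint row orthogonal to w_true) at a time
# and stop at the first point where the weights can be recovered.
while w_hat is None:
    r = rng.standard_normal(3)
    row = r - (r @ w_true) * w_true      # project out w_true so that row @ w_true == 0
    H = np.vstack([H, row])
    n_used += 1
    w_hat = recover_weights(H)

# The null vector is defined only up to sign; align it with w_true for comparison.
if w_hat @ w_true < 0:
    w_hat = -w_hat

print("observations used:", n_used)
print("recovered weights:", w_hat)
```

In this toy setting the loop stops as soon as the stacked matrix reaches rank two, which mirrors the idea of searching for the minimal set of observations that makes the weights identifiable.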
URL
http://arxiv.org/abs/1803.07696