
A KL-LUCB Bandit Algorithm for Large-Scale Crowdsourcing

2017-09-11
Bob Mankoff, Robert Nowak, Ervin Tanczos

Abstract

This paper focuses on best-arm identification in multi-armed bandits with bounded rewards. We develop an algorithm that is a fusion of lil-UCB and KL-LUCB, offering the best qualities of the two algorithms in one method. This is achieved by proving a novel anytime confidence bound for the mean of bounded distributions, which is the analogue of the LIL-type bounds recently developed for sub-Gaussian distributions. We corroborate our theoretical results with numerical experiments based on the New Yorker Cartoon Caption Contest.
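To illustrate why KL-based confidence bounds are attractive for bounded rewards, here is a minimal sketch comparing a Bernoulli-KL upper confidence bound with a Hoeffding (sub-Gaussian style) bound. This is not the anytime bound proved in the paper: the threshold `log(1/delta)` is a fixed-sample-size simplification chosen for illustration, and the function names `bernoulli_kl`, `kl_upper_bound`, and `hoeffding_upper_bound` are hypothetical.

```python
import math


def bernoulli_kl(p, q):
    """KL divergence between Bernoulli(p) and Bernoulli(q), clamped away from 0 and 1."""
    eps = 1e-12
    p = min(max(p, eps), 1 - eps)
    q = min(max(q, eps), 1 - eps)
    return p * math.log(p / q) + (1 - p) * math.log((1 - p) / (1 - q))


def kl_upper_bound(mean, n, delta, tol=1e-9):
    """Largest q in [mean, 1] with n * KL(mean, q) <= log(1/delta), found by bisection.

    Assumption: a fixed-n threshold, not the LIL-type anytime threshold from the paper.
    """
    budget = math.log(1.0 / delta) / n
    lo, hi = mean, 1.0
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if bernoulli_kl(mean, mid) <= budget:
            lo = mid  # mid is still inside the confidence region
        else:
            hi = mid  # mid is outside, shrink from above
    return lo


def hoeffding_upper_bound(mean, n, delta):
    """Hoeffding upper bound for rewards in [0, 1], capped at 1."""
    return min(1.0, mean + math.sqrt(math.log(1.0 / delta) / (2 * n)))


if __name__ == "__main__":
    # An arm with a small empirical mean, where the KL bound is much tighter.
    mean, n, delta = 0.05, 100, 0.05
    print("KL bound:       ", kl_upper_bound(mean, n, delta))
    print("Hoeffding bound:", hoeffding_upper_bound(mean, n, delta))
```

By Pinsker's inequality the KL bound is never looser than the Hoeffding bound, and the gap is largest when the empirical mean is close to 0 or 1, which is the regime where KL-based methods tend to save samples.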

URL

https://arxiv.org/abs/1709.03570

PDF

https://arxiv.org/pdf/1709.03570

