Cyclical Stochastic Gradient MCMC for Bayesian Deep Learning

2019-02-11

Ruqi Zhang, Chunyuan Li, Jianyi Zhang, Changyou Chen, Andrew Gordon Wilson

arXiv_AI

arXiv_AI Inference Deep_Learning

Abstract

The posteriors over neural network weights are high dimensional and multimodal. Each mode typically characterizes a meaningfully different representation of the data. We develop Cyclical Stochastic Gradient MCMC (SG-MCMC) to automatically explore such distributions. In particular, we propose a cyclical stepsize schedule, where larger steps discover new modes, and smaller steps characterize each mode. We prove that our proposed learning rate schedule provides faster convergence to samples from a stationary distribution than SG-MCMC with standard decaying schedules. Moreover, we provide extensive experimental results to demonstrate the effectiveness of cyclical SG-MCMC in learning complex multimodal distributions, especially for fully Bayesian inference with modern deep neural networks.
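
The cyclical stepsize idea can be illustrated with a short sketch. The abstract does not give the schedule's formula, so the cosine form below (restart at a large stepsize at the start of each cycle, then decay toward zero within the cycle) is an assumption made for illustration. The function name cyclical_stepsize and its parameters total_iters, num_cycles, and initial_stepsize are hypothetical names, not taken from the source.

```python
import math


def cyclical_stepsize(k, total_iters, num_cycles, initial_stepsize):
    """Return the stepsize for iteration k (1-indexed).

    Assumed cosine form: each cycle starts at initial_stepsize
    (large steps, intended to move between modes) and decays toward
    zero (small steps, intended to characterize the current mode).
    """
    cycle_len = math.ceil(total_iters / num_cycles)
    # Position inside the current cycle, in [0, 1).
    position = ((k - 1) % cycle_len) / cycle_len
    return 0.5 * initial_stepsize * (math.cos(math.pi * position) + 1.0)


if __name__ == "__main__":
    # Illustrative settings only: 100 iterations split into 4 cycles.
    for k in (1, 13, 25, 26):
        print(k, cyclical_stepsize(k, total_iters=100, num_cycles=4,
                                   initial_stepsize=0.1))
```

In this sketch the stepsize decreases monotonically inside a cycle and jumps back to initial_stepsize when the next cycle begins, in contrast to a standard schedule that only decays. How the sampler uses each phase (for example, which iterations are kept as posterior samples) is not specified in the abstract and is left out here.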

URL

http://arxiv.org/abs/1902.03932

PDF

http://arxiv.org/pdf/1902.03932
