
Effective Estimation of Deep Generative Language Models

2019-04-17
Tom Pelsmaeker, Wilker Aziz

Abstract

Advances in variational inference enable parameterisation of probabilistic models by deep neural networks. This combines the statistical transparency of the probabilistic modelling framework with the representational power of deep learning. Yet, it seems difficult to effectively estimate such models in the context of language modelling. Even models based on rather simple generative stories struggle to make use of additional structure due to a problem known as posterior collapse. We concentrate on one such model, namely, a variational auto-encoder, which we argue is an important building block in hierarchical probabilistic models of language. This paper contributes a sober view of the problem, a survey of techniques to address it, novel techniques, and extensions to the model. Our experiments on modelling written English text support a number of recommendations that should help researchers interested in this exciting field.
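The abstract refers to a variational auto-encoder for sentences and to posterior collapse, where the approximate posterior matches the prior and the latent variable carries no information. The sketch below is not the paper's implementation; it is a minimal, generic illustration of the sentence-VAE objective (ELBO) with a diagonal-Gaussian posterior and a standard-normal prior, plus a beta weight on the KL term as one commonly cited mitigation (KL annealing). All names and numbers in it are illustrative assumptions.

```python
# Minimal sketch (assumed, not from the paper): ELBO of a sentence VAE with
# q(z|x) = N(mu, diag(exp(logvar))) and prior p(z) = N(0, I). Posterior
# collapse shows up as the KL term going to ~0, i.e. q(z|x) ~ p(z).
import numpy as np

def gaussian_kl(mu, logvar):
    """KL( N(mu, diag(exp(logvar))) || N(0, I) ), summed over latent dims."""
    return 0.5 * np.sum(np.exp(logvar) + mu**2 - 1.0 - logvar, axis=-1)

def elbo(log_px_given_z, mu, logvar, beta=1.0):
    """Per-sentence ELBO; beta < 1 (KL annealing) is one common mitigation."""
    return log_px_given_z - beta * gaussian_kl(mu, logvar)

# Toy usage: a batch of 2 sentences with a 16-dimensional latent.
rng = np.random.default_rng(0)
mu = rng.normal(size=(2, 16))
logvar = rng.normal(size=(2, 16)) * 0.1
recon_loglik = np.array([-45.2, -38.7])  # hypothetical decoder log-likelihoods
print(elbo(recon_loglik, mu, logvar, beta=0.5))
```

In a full model, `mu` and `logvar` would come from an encoder network over the sentence and `recon_loglik` from a decoder conditioned on a sample of z; monitoring the KL term during training is the usual way to detect collapse.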

URL

http://arxiv.org/abs/1904.08194

PDF

http://arxiv.org/pdf/1904.08194
