Abstract
We revisit skip-gram negative sampling (SGNS), one of the most popular neural-network based approaches to learning distributed word representation. We first point out the ambiguity issue undermining the SGNS model, in the sense that the word vectors can be entirely distorted without changing the objective value. To resolve the issue, we investigate the intrinsic structures in solution that a good word embedding model should deliver. Motivated by this, we rectify the SGNS model with quadratic regularization, and show that this simple modification suffices to structure the solution in the desired manner. A theoretical justification is presented, which provides novel insights into quadratic regularization . Preliminary experiments are also conducted on Google’s analytical reasoning task to support the modified SGNS model.
Abstract (translated by Google)
URL
http://arxiv.org/abs/1804.00306