Abstract
We extend the recent results of (Arora et al., 2019) by a spectral analysis of representations corresponding to kernel and neural embeddings. They showed that in a simple single layer network, the alignment of the labels to the eigenvectors of the corresponding Gram matrix determines both the convergence of the optimization during training as well as the generalization properties. We show quantitatively that kernel and neural representations improve both optimization and generalization. We give results for the Gaussian kernel and approximations by random Fourier features as well as for embeddings produced by two layer networks trained on different tasks.
Abstract (translated by Google)
URL
http://arxiv.org/abs/1905.05095