Abstract
Although neural networks are widely used to classify data or embed it into lower-dimensional spaces, they are often regarded as black boxes with uninterpretable features. Here we propose Graph Spectral Regularization, a method for making hidden layers more interpretable without significantly affecting performance on the network's primary task. Taking inspiration from the spatial organization and localization of neuron activations in biological networks, we use a graph Laplacian penalty that encourages activations to be smooth either on a predetermined graph or on a feature-space graph learned from the data via co-activations of a hidden layer of the neural network. We demonstrate several uses of this regularization, including cluster indication and visualization, on biological and image data sets.
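To make the idea of smoothness on a graph concrete, the following is a minimal NumPy sketch, not the authors' implementation. It assumes the penalty takes the standard quadratic form z^T L z, where z is a vector of hidden-layer activations and L = D - A is the combinatorial Laplacian of a graph over the neurons; the abstract does not state the exact form used. The ring graph and all function names are hypothetical and chosen only for illustration.

```python
import numpy as np


def graph_laplacian(adjacency):
    """Combinatorial Laplacian L = D - A of an undirected weighted graph."""
    adjacency = np.asarray(adjacency, dtype=float)
    degree = np.diag(adjacency.sum(axis=1))
    return degree - adjacency


def laplacian_smoothness_penalty(activations, laplacian):
    """Mean of z^T L z over a batch of activation vectors (one per row).

    For an unweighted graph, z^T L z equals the sum over edges (i, j)
    of (z_i - z_j)^2, so it is small when neighbouring neurons have
    similar activations.
    """
    activations = np.atleast_2d(activations)
    per_example = np.einsum("bi,ij,bj->b", activations, laplacian, activations)
    return per_example.mean()


def ring_adjacency(n):
    """Hypothetical predetermined graph: n neurons arranged on a ring."""
    adjacency = np.zeros((n, n))
    idx = np.arange(n)
    adjacency[idx, (idx + 1) % n] = 1.0
    adjacency[(idx + 1) % n, idx] = 1.0
    return adjacency


L = graph_laplacian(ring_adjacency(8))

# One contiguous block of active neurons: smooth on the ring.
smooth = np.array([1.0, 1.0, 1.0, 1.0, 0.0, 0.0, 0.0, 0.0])
# Alternating activations: maximally rough on the ring.
rough = np.array([1.0, 0.0, 1.0, 0.0, 1.0, 0.0, 1.0, 0.0])

smooth_penalty = laplacian_smoothness_penalty(smooth, L)  # 2.0
rough_penalty = laplacian_smoothness_penalty(rough, L)    # 8.0
```

Under these assumptions, a training objective would add a weighted copy of this penalty to the primary task loss, for example `task_loss + alpha * penalty` with a hypothetical weight `alpha`, so that the optimizer favours localized, contiguous activation patterns on the chosen graph.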
URL
http://arxiv.org/abs/1810.00424