Abstract
This paper introduces Gabor scattering, a feature extractor based on Gabor frames and Mallat’s scattering transform. Using a simple model for audio signals, namely a class of tones consisting of a fundamental frequency, its integer multiples, and a corresponding envelope, we analyse specific properties of Gabor scattering. We show that each layer exhibits invariances to different signal characteristics. Furthermore, we derive deformation stability of the coefficient vector generated by the feature extractor, using a decoupling technique that exploits the contractivity of general scattering networks. Here, we are interested in robustness with respect to changes in spectral shape and frequency modulation. Our findings are illustrated by numerical examples and experiments. In particular, we give numerical evidence that the invariances encoded by the Gabor scattering transform lead to improved generalization compared with standard Mel-spectrogram coefficients, especially when only a restricted amount of data is available.
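The following is a minimal sketch, not the authors' implementation, of the two ingredients named above: a tone built from a fundamental frequency, its multiples, and an envelope, and a two-layer cascade of Gabor (Gaussian-window STFT) modulus transforms. All names and parameter values here (`harmonic_tone`, `gabor_modulus`, the exponential envelope, the 1/n partial amplitudes, window lengths, hop sizes, and the plain time average standing in for the output low-pass filter) are assumptions chosen for illustration.

```python
import numpy as np

def harmonic_tone(f0, n_harmonics, duration, sr, decay=3.0):
    """Tone model: envelope times a sum of partials at multiples of f0.
    The exponential envelope and 1/n amplitudes are illustrative assumptions."""
    t = np.arange(int(duration * sr)) / sr
    envelope = np.exp(-decay * t)
    partials = sum(np.cos(2 * np.pi * n * f0 * t) / n
                   for n in range(1, n_harmonics + 1))
    return envelope * partials

def gabor_modulus(x, win_len, hop):
    """Modulus of a Gaussian-window STFT along the last axis.
    Input shape (..., T); output shape (..., n_freq, n_frames)."""
    n = np.arange(win_len)
    window = np.exp(-0.5 * ((n - (win_len - 1) / 2) / (win_len / 6)) ** 2)
    n_frames = 1 + (x.shape[-1] - win_len) // hop
    # Index matrix: one row of sample positions per frame.
    idx = hop * np.arange(n_frames)[:, None] + np.arange(win_len)[None, :]
    frames = x[..., idx] * window           # (..., n_frames, win_len)
    spec = np.fft.rfft(frames, axis=-1)     # (..., n_frames, n_freq)
    return np.abs(np.swapaxes(spec, -1, -2))

sr = 8000
x = harmonic_tone(f0=440.0, n_harmonics=5, duration=1.0, sr=sr)

# Layer 1: spectrogram-like coefficients of the signal.
layer1 = gabor_modulus(x, win_len=256, hop=64)
# Layer 2: a second Gabor modulus applied along time to every layer-1 channel.
layer2 = gabor_modulus(layer1, win_len=16, hop=4)

# Time averaging as a crude stand-in for the output-generating low-pass filter.
out1 = layer1.mean(axis=-1)
out2 = layer2.mean(axis=-1)

# Frequency (Hz) of the strongest layer-1 channel.
peak_hz = np.argmax(out1) * sr / 256
```

In this sketch the layer-1 output mainly reflects which partials are present, while the layer-2 output describes how each layer-1 channel varies over time, which is where envelope information would show up.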
URL
http://arxiv.org/abs/1706.08818