Abstract
With the aim of constructing a biologically plausible model of machine listening, we study the representation of a multicomponent stationary signal by a wavelet scattering network. First, we show that renormalizing second-order nodes by their first-order parents gives a simple numerical criterion for determining whether two neighboring components will interfere psychoacoustically. Second, we generalize the "one or two components" framework to three or more sine waves, and show that a network of depth $M = \log_2 N$ suffices to characterize the relative amplitudes of the first $N$ terms in a Fourier series, while remaining invariant to frequency transposition and component-wise phase shifts.
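The renormalization idea in the first result can be illustrated with a toy computation. The sketch below is not the paper's implementation: it assumes Gaussian band-pass filters defined in the Fourier domain as stand-ins for the wavelets, and the function names (`gabor_bandpass`, `interference_ratio`), the sampling rate, and the bandwidths are all hypothetical choices made for illustration. It computes a first-order modulus around the lower component, a second-order modulus at the beat frequency, and the ratio of their time averages.

```python
import numpy as np

def gabor_bandpass(freqs, center, sigma):
    """Analytic Gaussian band-pass response (assumed stand-in for a wavelet)."""
    response = np.exp(-0.5 * ((freqs - center) / sigma) ** 2)
    return response * (freqs > 0)  # keep positive frequencies only

def interference_ratio(f1, f2, a2=1.0, sr=8000, sigma1=30.0):
    """Second-order coefficient divided by its first-order parent.

    f1, f2 : component frequencies in Hz (f1 is the first-order center)
    a2     : amplitude of the second component relative to the first
    sr     : sampling rate in Hz (hypothetical value)
    sigma1 : first-order bandwidth in Hz (hypothetical value)
    """
    t = np.arange(sr) / sr  # one second of signal
    x = np.cos(2 * np.pi * f1 * t) + a2 * np.cos(2 * np.pi * f2 * t)
    freqs = np.fft.fftfreq(t.size, d=1.0 / sr)

    # First order: band-pass around f1, then complex modulus.
    u1 = np.abs(np.fft.ifft(np.fft.fft(x) * gabor_bandpass(freqs, f1, sigma1)))
    s1 = u1.mean()

    # Second order: band-pass the envelope at the beat frequency, then modulus.
    beat = abs(f2 - f1)
    u2 = np.abs(np.fft.ifft(np.fft.fft(u1) * gabor_bandpass(freqs, beat, beat / 4)))
    s2 = u2.mean()

    return s2 / s1  # renormalized second-order coefficient

close = interference_ratio(440, 460)  # both components fall in the same band
far = interference_ratio(440, 880)    # second component is outside the band
print(f"close pair: {close:.3f}, distant pair: {far:.3e}")
```

Under these assumptions, the ratio is markedly larger for the close pair, whose components beat within one first-order band, than for the distant pair, whose first-order envelope is essentially constant. Any threshold separating the two regimes would depend on the filter bank actually used.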
URL
https://arxiv.org/abs/1905.08601