Abstract
In recent years, complex valued artificial neural networks have gained increasing interest as they allow neural networks to learn richer representations while potentially incorporating less parameters. Especially in the domain of computer graphics, many traditional operations rely heavily on computations in the complex domain, thus complex valued neural networks apply naturally. In this paper, we perform frame predictions in video sequences using a complex valued gated auto-encoder. First, our method is motivated showing how the Fourier transform can be seen as the basis for translational operations. Then, we present how a complex neural network can learn such transformations and compare its performance and parameter efficiency to a real-valued gated autoencoder. Furthermore, we show how extending both - the real and the complex valued - neural networks by using convolutional units can significantly improve prediction performance and parameter efficiency. The networks are assessed on a moving noise and a bouncing ball dataset.
Abstract (translated by Google)
URL
http://arxiv.org/abs/1903.03336