Abstract
We present a simple regularization technique for Recurrent Neural Networks (RNNs) with Long Short-Term Memory (LSTM) units. Dropout, the most successful technique for regularizing neural networks, does not work well with RNNs and LSTMs. In this paper, we show how to correctly apply dropout to LSTMs, and show that it substantially reduces overfitting on a variety of tasks. These tasks include language modeling, speech recognition, image caption generation, and machine translation.
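The paper's prescription is to apply dropout only to the non-recurrent connections of a stacked LSTM (input-to-layer and layer-to-layer), leaving the recurrent state transition untouched. The sketch below illustrates that idea; the use of PyTorch, the class name, and the hyperparameters are illustrative assumptions, not part of the paper.

```python
import torch
import torch.nn as nn

class RegularizedLSTM(nn.Module):
    """Stacked LSTM with dropout on non-recurrent connections only.

    Dropout is applied to the embedding output and between layers;
    the recurrent h_{t-1} -> h_t path inside each nn.LSTM is left
    undisturbed, matching the regularization scheme the abstract describes.
    Sizes and the dropout rate are illustrative, not the paper's settings.
    """
    def __init__(self, vocab_size=10000, hidden=200, layers=2, p=0.5):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, hidden)
        self.drop = nn.Dropout(p)
        # One single-layer nn.LSTM per depth level, so dropout can be
        # inserted explicitly between layers (a non-recurrent connection).
        self.lstms = nn.ModuleList(
            [nn.LSTM(hidden, hidden, batch_first=True) for _ in range(layers)]
        )
        self.decoder = nn.Linear(hidden, vocab_size)

    def forward(self, tokens):
        x = self.drop(self.embed(tokens))   # dropout on the input connection
        for lstm in self.lstms:
            x, _ = lstm(x)                  # recurrent connections: no dropout
            x = self.drop(x)                # dropout between layers / before decoder
        return self.decoder(x)
```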
URL
https://arxiv.org/abs/1409.2329