Abstract
Although deep learning models have brought tremendous advancements to the field of open-domain dialogue response generation, recent research results have revealed that the trained models have undesirable generation behaviors, such as malicious responses and generic (boring) responses. In this work, we propose a framework named "Negative Training" to minimize such behaviors. Given a trained model, the framework first finds generated samples that exhibit the undesirable behavior, and then uses them to provide negative training signals when fine-tuning the model. Our experiments show that negative training can significantly reduce the hit rate of malicious responses (e.g. from 12.6% to 0%), or discourage frequent responses and improve response diversity (e.g. improve response entropy by over 63%).
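The abstract outlines the core recipe: sample responses from the trained model, flag those that exhibit an undesirable behavior, and fine-tune the model against them with a negative signal that pushes their likelihood down. Below is a minimal sketch of one such fine-tuning step in PyTorch; the model interface, the `is_undesirable` detector, and the `NEG_WEIGHT` scaling factor are illustrative assumptions, not the paper's exact implementation.

```python
# Sketch of a negative-training step: minimize the log-likelihood the model
# assigns to samples flagged as undesirable (e.g. generic or malicious).
# Interfaces and names here are assumptions for illustration.

import torch
import torch.nn.functional as F

NEG_WEIGHT = 0.1  # assumed scaling factor for the negative signal


def is_undesirable(response_tokens, frequent_responses):
    """Placeholder detector: flags generic/frequent responses.

    Any behavior detector (e.g. a malicious-response matcher) returning
    True/False per sample could be plugged in here.
    """
    return tuple(response_tokens) in frequent_responses


def negative_training_step(model, optimizer, contexts, sampled_responses,
                           frequent_responses):
    """One fine-tuning step that lowers the probability of flagged samples.

    Assumes `model(contexts, responses)` returns per-token logits of shape
    (batch, seq_len, vocab_size) for teacher-forced responses.
    """
    logits = model(contexts, sampled_responses)                    # (B, T, V)
    log_probs = F.log_softmax(logits, dim=-1)
    token_ll = log_probs.gather(
        -1, sampled_responses.unsqueeze(-1)).squeeze(-1)           # (B, T)
    seq_ll = token_ll.sum(dim=-1)                                  # (B,)

    # 1.0 for undesirable samples, 0.0 otherwise.
    flags = torch.tensor(
        [float(is_undesirable(r.tolist(), frequent_responses))
         for r in sampled_responses],
        device=seq_ll.device)

    # Minimizing the (positive) log-likelihood of flagged samples pushes
    # their generation probability down -- the "negative" training signal.
    loss = NEG_WEIGHT * (flags * seq_ll).mean()

    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```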
URL
http://arxiv.org/abs/1903.02134