Abstract
Distant supervision for relation extraction heavily suffers from the wrong labeling problem. To alleviate this issue in news data with the timestamp, we take a new factor time into consideration and propose a novel time-aware distant supervision framework (Time-DS). Time-DS is composed of a time series instance-popularity and two strategies. Instance-popularity is to encode the strong relevance of time and true relation mention. Therefore, instance-popularity would be an effective clue to reduce the noises generated through distant supervision labeling. The two strategies, i.e., hard filter and curriculum learning are both ways to implement instance-popularity for better relation extraction in the manner of Time-DS. The curriculum learning is a more sophisticated and flexible way to exploit instance-popularity to eliminate the bad effects of noises, thus get better relation extraction performance. Experiments on our collected multi-source news corpus show that Time-DS achieves significant improvements for relation extraction.
Abstract (translated by Google)
URL
http://arxiv.org/abs/1903.03289