Semantic Matching of Documents from Heterogeneous Collections: A Simple and Transparent Method for Practical Applications

Abstract

We present a very simple, unsupervised method for the pairwise matching of documents from heterogeneous collections. We demonstrate our method with the Concept-Project matching task, which is a binary classification task involving pairs of documents from heterogeneous collections. Although our method only employs standard resources without any domain- or task-specific modifications, it clearly outperforms the more complex system of the original authors. In addition, our method is transparent, because it provides explicit information about how a similarity score was computed, and efficient, because it is based on the aggregation of (pre-computable) word-level similarities.
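
To illustrate what "aggregation of (pre-computable) word-level similarities" could look like in practice, here is a minimal sketch. It is not the authors' implementation: the toy embedding table, the best-match averaging scheme, the `match` function name, and the 0.5 decision threshold are all assumptions made for illustration. The returned evidence list shows one way a score can stay transparent, since each contributing word pair is reported explicitly.

```python
import math

# Toy 3-dimensional word vectors (hypothetical values, for illustration only).
# In practice these would come from a standard pre-trained embedding resource.
EMBEDDINGS = {
    "energy": (0.9, 0.1, 0.0),
    "heat": (0.8, 0.2, 0.1),
    "circuit": (0.1, 0.9, 0.1),
    "battery": (0.5, 0.8, 0.0),
    "plant": (0.0, 0.1, 0.9),
}


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def match(doc_a, doc_b, threshold=0.5):
    """Score two tokenized documents; return (score, label, evidence).

    Assumed aggregation: for each known word in doc_a, take its most similar
    known word in doc_b, then average those best-match similarities.
    """
    words_a = [w for w in doc_a if w in EMBEDDINGS]
    words_b = [w for w in doc_b if w in EMBEDDINGS]
    if not words_a or not words_b:
        return 0.0, False, []

    evidence = []  # (word from doc_a, best partner in doc_b, similarity)
    for wa in words_a:
        best = max(words_b, key=lambda wb: cosine(EMBEDDINGS[wa], EMBEDDINGS[wb]))
        evidence.append((wa, best, cosine(EMBEDDINGS[wa], EMBEDDINGS[best])))

    score = sum(sim for _, _, sim in evidence) / len(evidence)
    return score, score >= threshold, evidence


if __name__ == "__main__":
    score, label, evidence = match(["energy", "circuit"], ["heat", "battery"])
    print(f"score={score:.3f} match={label}")
    for wa, wb, sim in evidence:
        print(f"  {wa} -> {wb}: {sim:.3f}")
```

Because the word-level similarities depend only on the vocabulary, they could be computed once and cached, leaving only the lookup and averaging to be done per document pair.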

URL

http://arxiv.org/abs/1904.12550

PDF

http://arxiv.org/pdf/1904.12550
