Abstract
On visual analytics applications, the concept of putting the user on the loop refers to the ability to replace heuristics by user knowledge on machine learning and data mining tasks. On supervised tasks, the user engagement occurs via the manipulation of the training data. However, on unsupervised tasks, the user involvement is limited to changes in the algorithm parametrization or the input data representation, also known as features. Depending on the application domain, different types of features can be extracted from the raw data. Therefore, the result of unsupervised algorithms heavily depends on the type of employed feature. Since there is no perfect feature extractor, combining different features have been explored in a process called feature fusion. The feature fusion is straightforward when the machine learning or data mining task has a cost function. However, when such a function does not exist, user support for combination needs to be provided otherwise the process is impractical. In this paper, we present a novel feature fusion approach that uses small data samples to allows users not only to effortless control the combination of different feature sets but also to interpret the attained results. The effectiveness of our approach is confirmed by a comprehensive set of qualitative and quantitative tests, opening up different possibilities of user-guided analytical scenarios not covered yet. The ability of our approach to providing real-time feedback for the feature fusion is exploited on the context of unsupervised clustering techniques, where the composed groups reflect the semantics of the feature combination.
Abstract (translated by Google)
URL
https://arxiv.org/abs/1901.05556