Abstract
We propose a quantitative and qualitative analysis of the performance of statistical models for frame semantic structure extraction. We report on a replication study on FrameNet 1.7 data and show that preprocessing toolkits play a major role in argument identification performance, observing gains similar in order of magnitude to those reported by recent models for frame semantic parsing. We report on the robustness of a recent statistical classifier for frame semantic parsing to lexical configurations of predicate-argument structures, relying on an artificially augmented dataset generated using a rule-based algorithm combining valence pattern matching and lexical substitution. We show that syntactic preprocessing plays a major role in the performance of statistical classifiers on argument identification, and discuss the core reasons for the syntactic mismatch between dependency parser output and the FrameNet syntactic formalism. Finally, we suggest new leads for improving statistical models for frame semantic parsing, including joint syntactic-semantic parsing relying on the FrameNet syntactic formalism, latent class inference via split-and-merge algorithms, and neural network architectures relying on rich input representations of words.
URL
http://arxiv.org/abs/1901.07475