Abstract
We provide the first computational treatment of fused-heads constructions (FH), focusing on the numeric fused-heads (NFH). FH constructions are noun phrases (NPs) in which the head noun is missing and is said to be "fused" with its dependent modifier. This missing information is implicit and is important for sentence understanding. The missing references are easily filled in by humans but pose a challenge for computational models. We formulate the handling of FH as a two-stage process: identification of the FH construction and resolution of the missing head. We explore the NFH phenomenon in large corpora of English text and create (1) a dataset and a highly accurate method for NFH identification; (2) a crowd-sourced dataset of 10k examples (1M tokens) for NFH resolution; and (3) a neural baseline for the NFH resolution task. We release our code and dataset in the hope of fostering further research into this challenging problem.
URL
http://arxiv.org/abs/1905.10886