Word Sense Disambiguation: The State of the Art (paper)

2005, cited by 294
Natural Language Processing Techniques, Topic Modeling, Speech and dialogue systems

Abstract

..., ANIMATE, HUMAN, etc. and encode type restrictions on nouns and adjectives and on the arguments of verbs. Subject codes use another set of primitives to classify senses of words by subject (ECONOMICS, ENGINEERING, etc.). Guthrie et al. (1991) demonstrate a typical use of this information: in addition to using the Lesk-based method of counting overlaps between definitions and contexts, they impose a correspondence of subject codes in an iterative process. No quantitative evaluation of this method is available, but Cowie et al. (1992) improve the method using simulated annealing and report results of 47% for sense distinctions and 72% for homographs. The use of LDOCE box codes, however, is problematic: the codes are not systematic (see, for example, Fontenelle, 1990). In later work, Braden-Harder (1993) showed that simply matching box or subject codes is not sufficient for disambiguation. For example, in "I tipped the driver", the codes for several senses of the words in the sentence sati...
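The overlap counting and subject-code matching mentioned in the abstract can be illustrated with a minimal sketch. It is not the procedure of Guthrie et al. (1991) or Cowie et al. (1992): it scores each sense once, with no iteration and no simulated annealing. The sense inventory, the subject labels attached to it, the stopword list, and the names `SENSES`, `lesk_overlap`, and `disambiguate` are all hypothetical and are not taken from LDOCE.

```python
from typing import Optional, Set

# Hypothetical toy sense inventory (NOT LDOCE data): each sense has an id,
# an invented subject code, and a short definition.
SENSES = {
    "bank": [
        {"id": "bank_1", "subject": "ECONOMICS",
         "definition": "an institution that keeps and lends money"},
        {"id": "bank_2", "subject": "GEOGRAPHY",
         "definition": "the land along the side of a river"},
    ],
}

# Small illustrative stopword list (an assumption, not a standard resource).
STOPWORDS = {"a", "an", "the", "of", "and", "that", "to", "in", "on", "along"}


def content_words(text: str) -> Set[str]:
    """Lowercase, split on whitespace, and drop stopwords."""
    return {w for w in text.lower().split() if w not in STOPWORDS}


def lesk_overlap(definition: str, context: str) -> int:
    """Count the content words shared by a definition and a context."""
    return len(content_words(definition) & content_words(context))


def disambiguate(word: str, context: str,
                 context_subject: Optional[str] = None) -> str:
    """Return the id of the sense with the highest score.

    Score = definition/context overlap, plus 1 if the sense's subject code
    equals the (optional) subject assumed for the context. Ties go to the
    sense listed first.
    """
    best_id, best_score = "", -1
    for sense in SENSES[word]:
        score = lesk_overlap(sense["definition"], context)
        if context_subject is not None and sense["subject"] == context_subject:
            score += 1
        if score > best_score:
            best_id, best_score = sense["id"], score
    return best_id


if __name__ == "__main__":
    print(disambiguate("bank", "she deposited money at the bank"))
    print(disambiguate("bank", "they walked beside the river bank"))
    print(disambiguate("bank", "we sat by the bank", "GEOGRAPHY"))
```

In the third call no definition word overlaps with the context, so the choice rests entirely on the subject code. This is the kind of code-only matching that, according to the abstract, Braden-Harder (1993) found insufficient by itself for disambiguation.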