Statistical Techniques for Natural Language Parsing (paper)
Abstract
We review current statistical work on syntactic parsing and then consider part-of-speech tagging, which was the first syntactic problem to be successfully attacked by statistical techniques and also serves as a good warmup for the main topic, statistical parsing. Here we consider both the simplified case in which the input string is viewed as a string of parts of speech, and the more interesting case in which the parser is guided by statistical information about the particular words in the sentence. Finally we anticipate future research directions.
1 Introduction
Syntactic parsing is the process of assigning a "phrase marker" to a sentence, that is, the process that, given a sentence like "The dog ate," produces a structure like that in Figure 1. In this example we adopt the standard abbreviations: np for "noun phrase," vp for "verb phrase," and det for "determiner." It is generally accepted that finding the sort of structure shown in Figure 1 is useful in determining the m...