Automatic Keyphrase Extraction via Topic Decomposition (Paper)
Abstract
Existing graph-based ranking methods for keyphrase extraction compute a single importance score for each word via a single random walk. Motivated by the fact that both documents and words can be represented by a mixture of semantic topics, we propose to decompose the traditional random walk into multiple random walks specific to various topics. We thus build a Topical PageRank (TPR) on the word graph to measure word importance with respect to different topics. After that, given the topic distribution of the document, we further calculate the ranking scores of words and extract the top-ranked ones as keyphrases. Experimental results show that TPR outperforms state-of-the-art keyphrase extraction methods on two datasets under various evaluation metrics.
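The two-stage idea described above (one biased random walk per topic, then a mixture weighted by the document's topic distribution) can be illustrated with a minimal sketch. The abstract does not specify the graph construction, the damping factor, or how the topic distributions are estimated, so everything below is an assumption made for illustration: the co-occurrence window, the use of per-topic word preferences as the teleport distribution, the value 0.85, and the hypothetical function names `build_word_graph`, `topical_pagerank`, and `rank_words`.

```python
from collections import defaultdict


def build_word_graph(tokens, window=2):
    """Undirected co-occurrence graph (assumed construction, not from the abstract)."""
    graph = defaultdict(lambda: defaultdict(float))
    for i, src in enumerate(tokens):
        graph[src]  # touch the key so isolated words still become nodes
        for dst in tokens[i + 1 : i + window]:
            if dst != src:
                graph[src][dst] += 1.0
                graph[dst][src] += 1.0
    return {word: dict(nbrs) for word, nbrs in graph.items()}


def topical_pagerank(graph, topic_pref, damping=0.85, iterations=100):
    """One random walk biased toward a single topic.

    topic_pref maps word -> non-negative preference for this topic and is
    used as the teleport distribution (an assumption of this sketch).
    """
    words = sorted(graph)
    total = sum(topic_pref.get(w, 0.0) for w in words)
    if total <= 0.0:
        raise ValueError("topic has no preference mass on the graph's words")
    pref = {w: topic_pref.get(w, 0.0) / total for w in words}
    out_weight = {w: sum(graph[w].values()) for w in words}

    scores = {w: 1.0 / len(words) for w in words}
    for _ in range(iterations):
        new_scores = {w: (1.0 - damping) * pref[w] for w in words}
        for src in words:
            if out_weight[src] == 0.0:
                # Dangling node: hand its mass back via the topic preference.
                for w in words:
                    new_scores[w] += damping * scores[src] * pref[w]
                continue
            for dst, weight in graph[src].items():
                new_scores[dst] += damping * scores[src] * weight / out_weight[src]
        scores = new_scores
    return scores


def rank_words(graph, word_given_topic, topic_given_doc, damping=0.85):
    """Mix the per-topic scores using the document's topic distribution."""
    final = {w: 0.0 for w in graph}
    for topic, topic_weight in topic_given_doc.items():
        topic_scores = topical_pagerank(graph, word_given_topic[topic], damping)
        for word, score in topic_scores.items():
            final[word] += topic_weight * score
    return sorted(final.items(), key=lambda item: item[1], reverse=True)
```

To try it, build a graph with `build_word_graph(tokens)`, supply a hypothetical `word_given_topic` mapping (topic to word preferences) and a `topic_given_doc` mapping (topic to weight, summing to 1), and call `rank_words`. In practice both mappings would come from a topic model fitted elsewhere. The sketch ranks single words only; assembling ranked words into multi-word keyphrases is left out.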