Latent Dirichlet Allocation (paper)

2014, cited by 291

Details

Publication date
2014-06-23
Publication year
2014

Keywords

Computational and Text Analysis Methods; Complex Network Analysis Techniques; Advanced Text Analysis Techniques

Abstract

Topic modeling, and the Latent Dirichlet Allocation (LDA) model in particular, has recently emerged as an important tool for understanding large datasets, especially user-generated datasets in social studies of the Web. In this work, we investigate the instability of LDA inference and propose a new metric of similarity between topics and a criterion for vocabulary reduction. We show the limitations of the LDA approach for qualitative analysis in social science and sketch some ways to improve it.