Latent Dirichlet Allocation paper
2014, cited by 291
Computational and Text Analysis Methods; Complex Network Analysis Techniques; Advanced Text Analysis Techniques
Details
- Publication date: 2014-06-23
- Publication year: 2014
Keywords
Computational and Text Analysis Methods; Complex Network Analysis Techniques; Advanced Text Analysis Techniques
Abstract
Topic modeling, in particular the Latent Dirichlet Allocation (LDA) model, has recently emerged as an important tool for understanding large datasets, in particular, user-generated datasets in social studies of the Web. In this work, we investigate the instability of LDA inference, propose a new metric of similarity between topics and a criterion of vocabulary reduction. We show the limitations of the LDA approach for the purposes of qualitative analysis in social science and sketch some ways for improvement.