<b>topicmodels</b>: An <i>R</i> Package for Fitting Topic Models

2011, Journal of Statistical Software. Citations: 1072
Topic Modeling; Advanced Text Analysis Techniques; Computational and Text Analysis Methods

Abstract

Topic models allow the probabilistic modeling of term frequency occurrences in documents. The fitted model can be used to estimate the similarity between documents as well as between a set of specified keywords using an additional layer of latent variables which are referred to as topics. The R package <b>topicmodels</b> provides basic infrastructure for fitting topic models based on data structures from the text mining package <b>tm</b>. The package includes interfaces to two algorithms for fitting topic models: the variational expectation-maximization algorithm provided by David M. Blei and co-authors and an algorithm using Gibbs sampling by Xuan-Hieu Phan and co-authors.
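As a rough illustration of the two fitting interfaces mentioned above, the following minimal sketch fits a latent Dirichlet allocation model once with variational EM and once with Gibbs sampling. It assumes the <b>topicmodels</b> package is installed and uses the AssociatedPress document-term matrix that ships with it. The 20-document subset, the number of topics (k = 2), the seed, and the Gibbs burn-in and iteration counts are arbitrary choices made for this example, not values taken from the paper.

```r
library(topicmodels)

# Example corpus bundled with the package: a DocumentTermMatrix (tm format).
data("AssociatedPress", package = "topicmodels")
dtm <- AssociatedPress[1:20, ]  # small subset so the example runs quickly

k <- 2  # number of topics; an arbitrary choice for illustration

# Fit with the variational EM interface.
fit_vem <- LDA(dtm, k = k, method = "VEM",
               control = list(seed = 2011))

# Fit with the Gibbs sampling interface.
# burnin and iter are illustrative, not tuned.
fit_gibbs <- LDA(dtm, k = k, method = "Gibbs",
                 control = list(seed = 2011, burnin = 100, iter = 200))

# Five most probable terms per topic (one column per topic).
top_terms <- terms(fit_vem, 5)
print(top_terms)

# Most likely topic for each document under the Gibbs fit.
doc_topics <- topics(fit_gibbs)
print(doc_topics)

# Posterior topic proportions per document (documents x topics);
# each row is a probability distribution over the k topics.
post <- posterior(fit_vem)
print(round(post$topics, 3))
```

The rows of `post$topics` are the per-document topic distributions, which is the representation one would compare when estimating similarity between documents.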