quanteda: An R package for the quantitative analysis of textual data (Paper)

2018 · The Journal of Open Source Software · Citations: 1309 · Top venue

Advanced Text Analysis Techniques · Topic Modeling · Data Mining Algorithms and Applications

Abstract

quanteda is an R package providing a comprehensive workflow and toolkit for natural language processing tasks such as corpus management, tokenization, analysis, and visualization. It has extensive functions for applying dictionary analysis, exploring texts using keywords-in-context, computing document and feature similarities, and discovering multi-word expressions through collocation scoring. Based entirely on sparse operations, it provides highly efficient methods for compiling document-feature matrices and for manipulating these or using them in further quantitative analysis. Using C++ and multithreading extensively, quanteda is also considerably faster and more efficient than other R and Python packages in processing large textual data.
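To make two of the ideas in the abstract concrete, here is a minimal sketch of a sparse document-feature matrix and a keywords-in-context lookup. It is written in plain Python for illustration only and is not quanteda's API or implementation; the function names `tokenize`, `build_dfm`, and `kwic_sketch`, the toy documents, and the whitespace tokenizer are all assumptions made for this example.

```python
from collections import Counter


def tokenize(text):
    """Lowercase, split on whitespace, and strip surrounding punctuation."""
    tokens = []
    for raw in text.lower().split():
        word = raw.strip(".,;:!?\"'()")
        if word:
            tokens.append(word)
    return tokens


def build_dfm(docs):
    """Return (features, rows), where rows[i] maps feature index -> count.

    Only non-zero counts are stored, which is what makes the matrix sparse.
    """
    vocab = {}  # feature -> column index, in order of first appearance
    rows = []
    for text in docs:
        counts = Counter(tokenize(text))
        row = {}
        for word, n in counts.items():
            idx = vocab.setdefault(word, len(vocab))
            row[idx] = n
        rows.append(row)
    features = sorted(vocab, key=vocab.get)
    return features, rows


def kwic_sketch(tokens, keyword, window=2):
    """Return (left context, keyword, right context) for each match."""
    hits = []
    for i, tok in enumerate(tokens):
        if tok == keyword:
            left = tokens[max(0, i - window):i]
            right = tokens[i + 1:i + 1 + window]
            hits.append((left, tok, right))
    return hits


# Hypothetical toy corpus, used only for this illustration.
docs = ["The cat sat on the mat.", "The dog chased the cat."]
features, rows = build_dfm(docs)
hits = kwic_sketch(tokenize(docs[1]), "cat", window=2)
```

Each row stores only the features that actually occur in that document, so memory grows with the number of non-zero cells rather than with documents times vocabulary size. That is the general property the abstract refers to when it describes sparse operations.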

Authors (6 total)

Stefan Müller

Akitaka Matsuo

Adam Obeng

Paul Nulty
