Comparison of Naive Bayes, Random Forest, Decision Tree, Support Vector Machines, and Logistic Regression Classifiers for Text Reviews Classification (Paper)

2017 | Baltic Journal of Modern Computing | Cited by 311

Topics: Text and Document Classification Technologies; Advanced Text Analysis Techniques; Spam and Phishing Detection

Abstract

Today, highly scalable computing environments make it possible to carry out a variety of data-intensive natural language processing and machine-learning tasks. One of these is text classification, several aspects of which have recently been investigated by many data scientists. The authors of this paper investigate Naive Bayes, Random Forest, Decision Tree, Support Vector Machines, and Logistic Regression classifiers implemented in Apache Spark, an in-memory computing platform. The paper focuses on comparing these classifiers by evaluating classification accuracy with respect to the size of the training data sets and the number of n-grams. The experiments analyzed short texts from Amazon product-review data.

Authors

Virginijus Marcinkevičius

Tomas Pranckevičius
