Detection of Harassment on Web 2.0 (paper)

2009 | Cited by 336
Hate Speech and Cyberbullying Detection | Spam and Phishing Detection | Advanced Malware Detection Techniques

Abstract

Web 2.0 has led to the development and evolution of web-based communities and applications. These communities provide places for information sharing and collaboration. They also open the door to inappropriate online activities such as harassment, in which some users post messages in a virtual community that are intentionally offensive to other members of the community. Detecting online harassment is a new and challenging task, and few systems currently attempt to solve it. In this paper, we use a supervised learning approach for detecting harassment. Our technique employs content features, sentiment features, and contextual features of documents. The experimental results described herein show that our method achieves significant improvements over several baselines, including Term Frequency-Inverse Document Frequency (TFIDF) approaches. Identification of online harassment is feasible when TFIDF is supplemented with sentiment and contextual feature attributes.
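To make the feature combination concrete, here is a minimal Python sketch of how a TFIDF vector for each post might be extended with sentiment and contextual attributes. The abstract does not specify how those attributes are computed, so everything beyond plain TFIDF is an assumption made for illustration: the toy word lists `OFFENSIVE` and `SECOND_PERSON`, the choice of cosine similarity to neighbouring posts as the contextual signal, and all function names are hypothetical and are not taken from the paper.

```python
import math
from collections import Counter

# Hypothetical toy lexicons (assumption: the abstract names no word lists).
OFFENSIVE = {"idiot", "stupid", "loser"}
SECOND_PERSON = {"you", "your", "yourself"}


def tokenize(text):
    """Lowercase, split on whitespace, and strip basic punctuation."""
    tokens = (w.strip(".,!?;:").lower() for w in text.split())
    return [t for t in tokens if t]


def tfidf_vectors(docs):
    """Return one TFIDF vector per tokenized document (content features)."""
    vocab = sorted({w for d in docs for w in d})
    n = len(docs)
    df = Counter(w for d in docs for w in set(d))
    idf = {w: math.log(n / df[w]) + 1.0 for w in vocab}
    vectors = []
    for d in docs:
        tf = Counter(d)
        size = len(d) or 1  # avoid division by zero for empty posts
        vectors.append([tf[w] / size * idf[w] for w in vocab])
    return vectors


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def sentiment_features(tokens):
    """Assumed sentiment attributes: ratios of offensive and second-person words."""
    n = len(tokens) or 1
    return [
        sum(t in OFFENSIVE for t in tokens) / n,
        sum(t in SECOND_PERSON for t in tokens) / n,
    ]


def contextual_features(i, vectors):
    """Assumed contextual attribute: mean similarity of post i to adjacent posts."""
    neighbours = [j for j in (i - 1, i + 1) if 0 <= j < len(vectors)]
    if not neighbours:
        return [0.0]
    sims = [cosine(vectors[i], vectors[j]) for j in neighbours]
    return [sum(sims) / len(sims)]


def build_features(posts):
    """Concatenate content, sentiment, and contextual features for each post."""
    docs = [tokenize(p) for p in posts]
    tfidf = tfidf_vectors(docs)
    return [
        tfidf[i] + sentiment_features(docs[i]) + contextual_features(i, tfidf)
        for i in range(len(docs))
    ]


thread = [
    "Does anyone know a good laptop for travel?",
    "You are a stupid loser, nobody cares what you think.",
    "I like the lighter models for travel.",
]
features = build_features(thread)
```

Each row of `features` is a TFIDF vector followed by three extra values (offensive-word ratio, second-person ratio, neighbour similarity). Rows like these, paired with harassment labels, could then be passed to any supervised classifier; which learner the authors used is not stated in the abstract.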