Data set construction and exploratory experiments for cyberbullying detection
Title | Data set construction and exploratory experiments for cyberbullying detection |
Publication Type | Talk |
Year of Publication | 2014 |
Authors | Van Hee, C, Verhoeven, B, Lefever, E, De Pauw, G, Hoste, V, Daelemans, W |
Conference/Workshop/... | Presented at ATILA 2014, Ghent, Belgium |
Date Published | 11/2014 |
Abstract | In the current era of online interactions, both positive and negative experiences are abundant on the web. As in real life, these negative experiences can have quite an impact on our youngsters. Recent research reports cybervictimization rates among teenagers of between 3% and 24% (Olweus, 2012; Patchin & Hinduja, 2012). In the research project AMiCA (Automatic Monitoring for Cyberspace Applications), we strive to automatically detect harmful content such as cyberbullying on social networks. We collected data from social networking sites and by simulating cyberbullying events with volunteer youngsters. This dataset was annotated for a number of fine-grained categories related to cyberbullying, such as insults and threats. More broadly, the severity of cyberbullying in the post, as well as the author's role in the cyberbullying event (i.e. harasser, victim or bystander), were defined. We present the results of our preliminary experiments, in which we try to determine whether an online utterance is harmful (i.e. contains cyberbullying) or not. Moreover, we explore the feasibility of classifying online posts into four categories (threats, insults, sexual talk and defensive statements). These results have provided insights into the difficulty and learnability of this task. |