Detecting Offensive Language in Social Media to Protect Adolescent Online Safety
Ying Chen, Yilu Zhou, Sencun Zhu, Heng Xu
- pp 71-80
TLDR
This work proposes the Lexical Syntactic Feature (LSF) architecture to detect offensive content and identify potentially offensive users in social media. It incorporates a user's writing style, structure, and specific cyberbullying content as features to predict the user's likelihood of sending offensive content.
Abstract
Since the textual content of online social media is highly unstructured, informal, and often misspelled, existing research on message-level offensive language detection cannot accurately detect offensive content. Meanwhile, user-level offensiveness detection seems a more feasible approach, but it is an under-researched area. To bridge this gap, we propose the Lexical Syntactic Feature (LSF) architecture to detect offensive content and identify potentially offensive users in social media. We distinguish the contributions of pejoratives/profanities and obscenities in determining offensive content, and introduce hand-authored syntactic rules for identifying name-calling harassment. In particular, we incorporate a user's writing style, structure, and specific cyberbullying content as features to predict the user's likelihood of sending offensive content. Experimental results showed that our LSF framework performed significantly better than existing methods in offensive content detection. It achieved a precision of 98.24% and a recall of 94.34% in sentence-level offensiveness detection, and a precision of 77.9% and a recall of 77.8% in user-level offensiveness detection. Meanwhile, the processing speed of LSF is approximately 10 ms per sentence, suggesting the potential for effective deployment in social media.
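The abstract describes scoring sentences by weighting offensive words differently and by applying syntactic rules for name-calling. The sketch below is only a toy illustration of that general idea, not the authors' LSF implementation: the word lists, the weights, the `INTENSIFY` factor, and the function name `sentence_offensiveness` are all hypothetical, and a simple two-token proximity window stands in for the hand-authored syntactic rules, which the abstract does not specify.

```python
import re

# Hypothetical toy lexicons and weights; the source gives no word lists or values.
STRONG_TERMS = {"idiot": 1.0, "moron": 1.0}  # stand-ins for strongly offensive words
WEAK_TERMS = {"stupid": 0.5, "dumb": 0.5}    # stand-ins for weakly offensive words
PERSON_WORDS = {"you", "your", "u"}          # second-person targets hinting at name-calling
INTENSIFY = 2.0                              # assumed boost when a target is nearby


def sentence_offensiveness(sentence: str) -> float:
    """Sum lexicon weights, boosting terms within two tokens of a person word.

    The proximity window is an assumption replacing real syntactic analysis.
    The result is capped at 1.0.
    """
    tokens = re.findall(r"[a-z']+", sentence.lower())
    score = 0.0
    for i, tok in enumerate(tokens):
        weight = STRONG_TERMS.get(tok, WEAK_TERMS.get(tok, 0.0))
        if weight == 0.0:
            continue
        # Look at up to two tokens on each side of the offensive term.
        neighbors = tokens[max(0, i - 2):i] + tokens[i + 1:i + 3]
        if any(n in PERSON_WORDS for n in neighbors):
            weight *= INTENSIFY
        score += weight
    return min(score, 1.0)
```

Under these assumed weights, a weak term aimed at a person ("you stupid fool") scores higher than the same term used about an object ("that was a stupid movie"), which mirrors the distinction between offensive wording and targeted name-calling drawn in the abstract.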
Citations
Proceedings Article
Hateful Symbols or Hateful People? Predictive Features for Hate Speech Detection on Twitter
Zeerak Waseem, Dirk Hovy
TL;DR: A list of criteria founded in critical race theory is provided and used to annotate a publicly available corpus of more than 16k tweets, and a dictionary based on the most indicative words in the data is presented.
Proceedings Article
A Survey on Hate Speech Detection using Natural Language Processing
Anna Schmidt, Michael Wiegand
TL;DR: A survey on hate speech detection describes key areas that have been explored to automatically recognize these types of utterances using natural language processing and discusses limits of those approaches.
Proceedings Article
Abusive Language Detection in Online User Content
TL;DR: A machine-learning-based method for detecting hate speech in online user comments from two domains, which outperforms a state-of-the-art deep learning approach, together with a corpus of user comments annotated for abusive language, the first of its kind.
Journal Article
A Survey on Automatic Detection of Hate Speech in Text
Paula Fortuna, Sérgio Nunes
TL;DR: This survey organizes and describes the current state of the field, providing a structured overview of previous approaches, including core algorithms, methods, and main features used, and provides a unifying definition of hate speech.
Journal Article
Cyber Hate Speech on Twitter: An Application of Machine Classification and Statistical Modeling for Policy and Decision Making
TL;DR: It is demonstrated how the results of the classifier can be robustly utilized in a statistical model used to forecast the likely spread of cyber hate in a sample of Twitter data.