Open Access, Proceedings Article

Detecting Offensive Language in Social Media to Protect Adolescent Online Safety

TLDR
This work proposes the Lexical Syntactic Feature (LSF) architecture to detect offensive content and identify potentially offensive users in social media. It uses a user's writing style, structure, and specific cyberbullying content as features to predict how likely that user is to send offensive content.
Abstract
Since textual content on online social media is highly unstructured, informal, and often misspelled, existing research on message-level offensive language detection cannot accurately detect offensive content. User-level offensiveness detection seems a more feasible approach, but it is an under-researched area. To bridge this gap, we propose the Lexical Syntactic Feature (LSF) architecture to detect offensive content and identify potentially offensive users in social media. We distinguish the contributions of pejoratives/profanities and obscenities in determining offensive content, and introduce hand-authored syntactic rules to identify name-calling harassment. In particular, we incorporate a user's writing style, structure, and specific cyberbullying content as features to predict the user's likelihood of sending offensive content. Experimental results showed that our LSF framework performed significantly better than existing methods in offensive content detection. It achieves a precision of 98.24% and a recall of 94.34% in sentence-level offensiveness detection, and a precision of 77.9% and a recall of 77.8% in user-level offensiveness detection. The processing speed of LSF is approximately 10 ms per sentence, suggesting the potential for effective deployment in social media.
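
The abstract reports precision and recall but no combined score. As a rough aid to comparison, the sketch below derives the F1 score (the harmonic mean of precision and recall) and the single-threaded throughput implied by the reported latency. The F1 values and the throughput figure are computed here from the abstract's numbers and are not reported by the authors; the function names are illustrative only.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall, both given as fractions in [0, 1]."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def sentences_per_second(ms_per_sentence: float) -> float:
    """Throughput implied by a fixed per-sentence latency, assuming no parallelism."""
    return 1000.0 / ms_per_sentence


# Precision, recall, and latency as stated in the abstract.
sentence_f1 = f1_score(0.9824, 0.9434)   # sentence-level detection
user_f1 = f1_score(0.779, 0.778)         # user-level detection
throughput = sentences_per_second(10)    # "approximately 10 ms per sentence"

print(f"sentence-level F1: {sentence_f1:.4f}")
print(f"user-level F1:     {user_f1:.4f}")
print(f"throughput:        {throughput:.0f} sentences/s")
```

Because precision and recall are close to each other at both levels, the derived F1 sits near either value; the gap between the sentence-level and user-level scores is the more informative comparison.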

Citations
Proceedings Article

Hateful Symbols or Hateful People? Predictive Features for Hate Speech Detection on Twitter

TL;DR: A list of criteria founded in critical race theory is provided and used to annotate a publicly available corpus of more than 16k tweets; a dictionary based on the most indicative words in the data is also presented.
Proceedings Article

A Survey on Hate Speech Detection using Natural Language Processing

TL;DR: A survey on hate speech detection describes key areas that have been explored to automatically recognize these types of utterances using natural language processing and discusses limits of those approaches.
Proceedings Article

Abusive Language Detection in Online User Content

TL;DR: A machine-learning-based method to detect hate speech in online user comments from two domains, which outperforms a state-of-the-art deep learning approach, and a corpus of user comments annotated for abusive language, the first of its kind.
Journal Article

A Survey on Automatic Detection of Hate Speech in Text

TL;DR: This survey organizes and describes the current state of the field, providing a structured overview of previous approaches, including core algorithms, methods, and main features used, and provides a unifying definition of hate speech.
Journal Article

Cyber Hate Speech on Twitter: An Application of Machine Classification and Statistical Modeling for Policy and Decision Making

TL;DR: It is demonstrated how the results of the classifier can be robustly utilized in a statistical model used to forecast the likely spread of cyber hate in a sample of Twitter data.