Open Access Journal Article

Hate speech detection: Challenges and solutions.

TLDR
This work identifies and examines challenges faced by online automatic approaches for hate speech detection in text, and proposes a multi-view SVM approach that achieves near state-of-the-art performance, while being simpler and producing more easily interpretable decisions than neural methods.
Abstract
As online content continues to grow, so does the spread of hate speech. We identify and examine challenges faced by online automatic approaches for hate speech detection in text. Among these difficulties are subtleties in language, differing definitions of what constitutes hate speech, and limited availability of data for training and testing these systems. Furthermore, many recent approaches suffer from an interpretability problem: it can be difficult to understand why the systems make the decisions that they do. We propose a multi-view SVM approach that achieves near state-of-the-art performance while being simpler and producing more easily interpretable decisions than neural methods. We also discuss both technical and practical challenges that remain for this task.
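The abstract names a multi-view SVM but does not spell out its construction. The following is a minimal sketch of one common reading of that idea, not the authors' implementation: each feature "view" gets its own linear SVM, and a further linear SVM combines the per-view scores. The class name `MultiViewSVM`, the two n-gram views, the value `C=0.1`, and the toy texts are all assumptions made for illustration.

```python
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC


class MultiViewSVM:
    """Hypothetical sketch: one linear SVM per feature view, plus a linear SVM on top."""

    def __init__(self, C=0.1):
        # Assumed views: word 1-2 grams and character 1-3 grams (TF-IDF weighted).
        self.views = [
            make_pipeline(TfidfVectorizer(analyzer="word", ngram_range=(1, 2)), LinearSVC(C=C)),
            make_pipeline(TfidfVectorizer(analyzer="char", ngram_range=(1, 3)), LinearSVC(C=C)),
        ]
        self.meta = LinearSVC(C=C)

    def _view_scores(self, texts):
        # Each view contributes its signed distance to its own separating hyperplane.
        return np.column_stack([view.decision_function(texts) for view in self.views])

    def fit(self, texts, labels):
        for view in self.views:
            view.fit(texts, labels)
        # Simplification: the meta-classifier is fit on in-sample view scores.
        # Held-out scores (e.g. from cross-validation) would avoid leakage.
        self.meta.fit(self._view_scores(texts), labels)
        return self

    def predict(self, texts):
        return self.meta.predict(self._view_scores(texts))


# Toy placeholder data, invented for this sketch (1 = abusive, 0 = benign).
texts = [
    "those people are vermin and should leave",
    "people like them are disgusting",
    "what a lovely day at the park",
    "thanks for the helpful answer",
]
labels = [1, 1, 0, 0]

model = MultiViewSVM().fit(texts, labels)
predictions = model.predict(texts)
```

Under these assumptions, `model.meta.coef_` holds one weight per view, which is one way such a model can expose how much each feature type contributes to a decision.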


Citations
Book Chapter

What about Hate Speech

TL;DR: In real (offline) life, we are rarely exposed to open insults or hate.
Journal Article

Developing an online hate classifier for multiple social media platforms

TL;DR: While all the models significantly outperform the keyword-based baseline classifier, XGBoost using all features performs the best and feature importance analysis indicates that BERT features are the most impactful for the predictions.
Journal Article

Detecting weak and strong Islamophobic hate speech on social media

TL;DR: Islamophobic hate speech on social media is a growing concern in contemporary Western politics and society; it can inflict considerable harm on the victims who are targeted and create a sense of fear.
Journal Article

A deep neural network based multi-task learning approach to hate speech detection

TL;DR: A deep multi-task learning (MTL) framework is proposed to leverage useful information from multiple related classification tasks in order to improve the performance of the individual task.
Posted Content

Contextualizing Hate Speech Classifiers with Post-hoc Explanation

TL;DR: This work extracts post-hoc explanations from fine-tuned BERT classifiers to detect bias towards identity terms and proposes a novel regularization technique based on these explanations that encourages models to learn from the context of group identifiers in addition to the identifiers themselves.
References
Journal Article

Scikit-learn: Machine Learning in Python

TL;DR: Scikit-learn is a Python module integrating a wide range of state-of-the-art machine learning algorithms for medium-scale supervised and unsupervised problems, focusing on bringing machine learning to non-specialists using a general-purpose high-level language.
Posted Content

BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding

TL;DR: A new language representation model, BERT, designed to pre-train deep bidirectional representations from unlabeled text by jointly conditioning on both left and right context in all layers, which can be fine-tuned with just one additional output layer to create state-of-the-art models for a wide range of tasks.
Proceedings Article

Distributed Representations of Words and Phrases and their Compositionality

TL;DR: This paper presents a simple method for finding phrases in text, and shows that learning good vector representations for millions of phrases is possible and describes a simple alternative to the hierarchical softmax called negative sampling.
Posted Content

Distributed Representations of Words and Phrases and their Compositionality

TL;DR: In this paper, the Skip-gram model is used to learn high-quality distributed vector representations that capture a large number of precise syntactic and semantic word relationships and improve both the quality of the vectors and the training speed.