Open Access · Posted Content

Interpretable Multi-Modal Hate Speech Detection

TLDR
This article proposes a deep neural multi-modal model that detects hate speech by effectively capturing the semantics of the text along with the socio-cultural context in which a particular hate expression is made, and provides interpretable insights into the model's decisions.
Abstract
With the growing role of social media in shaping public opinions and beliefs across the world, increased attention has been paid to identifying and countering the problem of hate speech on social media. Hate speech in online spaces has serious manifestations, including social polarization and hate crimes. While prior works have proposed automated techniques to detect hate speech online, these techniques largely fail to look beyond the textual content. Moreover, few attempts have been made to focus on the interpretability of such models, given the social and legal implications of incorrect predictions. In this work, we propose a deep neural multi-modal model that can: (a) detect hate speech by effectively capturing the semantics of the text along with the socio-cultural context in which a particular hate expression is made, and (b) provide interpretable insights into the decisions of our model. Through a thorough evaluation of different modeling techniques, we demonstrate that our model outperforms existing state-of-the-art hate speech classification approaches. Finally, we show the importance of social and cultural context features in unearthing clusters associated with different categories of hate.
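As a rough illustration of the kind of architecture the abstract describes, the sketch below fuses a neural text encoder with auxiliary socio-cultural context features before classification. The GRU encoder, feature dimensions, and hyperparameters are assumptions for illustration, not the authors' exact design.

```python
# Minimal PyTorch sketch of a multi-modal hate speech classifier:
# a text encoder fused with socio-cultural context features.
# The GRU encoder and all hyperparameters are illustrative assumptions,
# not the paper's exact architecture.
import torch
import torch.nn as nn

class MultiModalHateClassifier(nn.Module):
    def __init__(self, vocab_size, embed_dim=300, hidden_dim=128,
                 context_dim=32, num_classes=2):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, embed_dim, padding_idx=0)
        self.text_encoder = nn.GRU(embed_dim, hidden_dim, batch_first=True)
        # Fuse the text representation with context features by concatenation.
        self.classifier = nn.Sequential(
            nn.Linear(hidden_dim + context_dim, hidden_dim),
            nn.ReLU(),
            nn.Dropout(0.5),
            nn.Linear(hidden_dim, num_classes),
        )

    def forward(self, token_ids, context_features):
        # token_ids: (batch, seq_len); context_features: (batch, context_dim)
        embedded = self.embedding(token_ids)
        _, hidden = self.text_encoder(embedded)  # hidden: (1, batch, hidden_dim)
        fused = torch.cat([hidden.squeeze(0), context_features], dim=1)
        return self.classifier(fused)            # unnormalized class logits

# Toy usage: a batch of 4 posts, 20 tokens each, with 32 context features.
model = MultiModalHateClassifier(vocab_size=10_000)
logits = model(torch.randint(1, 10_000, (4, 20)), torch.randn(4, 32))
print(logits.shape)  # torch.Size([4, 2])
```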


Citations
Posted Content

The Hateful Memes Challenge: Detecting Hate Speech in Multimodal Memes

TL;DR: The authors propose a new challenge set for multimodal classification, focused on detecting hate speech in multimodal memes; difficult examples are added to the dataset to make it hard to rely on unimodal signals.
Proceedings ArticleDOI

Impact of politically biased data on hate speech classification

TL;DR: It is shown that political bias impairs the performance of hate speech classifiers, and that an explainable machine learning model can help visualize such bias within the training data.
Proceedings ArticleDOI

Understanding and Interpreting the Impact of User Context in Hate Speech Detection

TL;DR: This work reveals that user features play a role in the model's decisions, shows how they affect the feature space learned by the model, and demonstrates how such techniques can be combined to better understand the model and detect unintended bias.
Book ChapterDOI

Explainable Abusive Language Classification Leveraging User and Network Data

TL;DR: In this article, the explainable AI framework SHAP (SHapley Additive exPlanations) is applied to alleviate the general issue of missing transparency in deep learning models, making it possible to reliably assess a model's vulnerability to bias and systematic discrimination.
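SHAP is available as the open-source `shap` package; below is a minimal sketch of applying it to a classifier trained on user/network-style features. The random-forest model and synthetic data are assumptions for illustration, not the chapter's actual setup.

```python
# Sketch of explaining a classifier's predictions with SHAP.
# The model and synthetic user/network features are illustrative
# assumptions, not the chapter's actual setup.
import numpy as np
import shap
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))              # e.g. user/network features
y = (X[:, 0] + X[:, 1] > 0).astype(int)    # toy binary labels

model = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)

# TreeExplainer computes exact SHAP values for tree ensembles.
explainer = shap.TreeExplainer(model)
shap_values = explainer.shap_values(X)

# Per-feature contributions for the first sample (the return type
# varies across shap versions: list of per-class arrays vs. one array).
print(shap_values[1][0] if isinstance(shap_values, list) else shap_values[0])
```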
Proceedings ArticleDOI

A Multimodal Deep Framework for Derogatory Social Media Post Identification of a Recognized Person

TL;DR: Social media platforms play a significant role in networking and in influencing the perception of the general population in today's era of digitization, and social network sites have recently been used t...
References
Proceedings ArticleDOI

"Why Should I Trust You?": Explaining the Predictions of Any Classifier

TL;DR: In this article, the authors propose LIME, a method to explain models by presenting representative individual predictions and their explanations in a non-redundant way, framing the task as a submodular optimization problem.
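LIME is available as the open-source `lime` package; the sketch below applies its text explainer to a toy hate-speech classifier. The TF-IDF + logistic-regression pipeline and the toy data are illustrative assumptions; LIME itself is model-agnostic.

```python
# Sketch of LIME explaining a text classifier's prediction.
# The pipeline and toy data are assumptions for illustration.
from lime.lime_text import LimeTextExplainer
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

texts = ["you are awful", "have a great day",
         "awful people like you", "what a great idea"]
labels = [1, 0, 1, 0]  # 1 = hateful (toy labels)

pipeline = make_pipeline(TfidfVectorizer(), LogisticRegression())
pipeline.fit(texts, labels)

explainer = LimeTextExplainer(class_names=["benign", "hateful"])
# LIME perturbs the input text and fits a sparse local surrogate model.
exp = explainer.explain_instance("you are awful people",
                                 pipeline.predict_proba, num_features=4)
print(exp.as_list())  # (word, weight) pairs driving the prediction
```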
Posted Content

Convolutional Neural Networks for Sentence Classification

TL;DR: In this article, CNNs are trained on top of pre-trained word vectors for sentence-level classification tasks, and a simple CNN with little hyperparameter tuning and static vectors is shown to achieve excellent results on multiple benchmarks.
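The TL;DR describes the well-known sentence-classification CNN of Kim (2014); a minimal PyTorch sketch of that architecture follows. Filter sizes and counts follow the paper's common configuration, and randomly initialized embeddings stand in here for the pre-trained word vectors.

```python
# Sketch of a sentence-classification CNN in the style of Kim (2014):
# parallel convolutions over word embeddings, max-over-time pooling,
# then a linear classifier. Hyperparameters are illustrative.
import torch
import torch.nn as nn
import torch.nn.functional as F

class TextCNN(nn.Module):
    def __init__(self, vocab_size, embed_dim=300, num_filters=100,
                 kernel_sizes=(3, 4, 5), num_classes=2):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, embed_dim, padding_idx=0)
        self.convs = nn.ModuleList(
            nn.Conv1d(embed_dim, num_filters, k) for k in kernel_sizes)
        self.dropout = nn.Dropout(0.5)
        self.fc = nn.Linear(num_filters * len(kernel_sizes), num_classes)

    def forward(self, token_ids):
        x = self.embedding(token_ids).transpose(1, 2)  # (batch, embed, seq)
        # Max-over-time pooling for each filter size, then concatenate.
        pooled = [F.relu(conv(x)).max(dim=2).values for conv in self.convs]
        return self.fc(self.dropout(torch.cat(pooled, dim=1)))

model = TextCNN(vocab_size=10_000)
print(model(torch.randint(1, 10_000, (4, 20))).shape)  # torch.Size([4, 2])
```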
Journal ArticleDOI

A Survey of Methods for Explaining Black Box Models

TL;DR: In this paper, the authors provide a classification of the main problems addressed in the literature with respect to the notion of explanation and the type of black box decision support system. Given a problem definition, a black box type, and a desired explanation, this survey should help researchers find the proposals most useful for their own work.
Proceedings ArticleDOI

Supervised learning of universal sentence representations from natural language inference data

TL;DR: This article shows how universal sentence representations trained on the supervised data of the Stanford Natural Language Inference datasets can consistently outperform unsupervised methods like Skip-Thought vectors on a wide range of transfer tasks.
Posted Content

Skip-Thought Vectors

TL;DR: An approach for unsupervised learning of a generic, distributed sentence encoder is described: the continuity of text from books is used to train an encoder-decoder model that tries to reconstruct the sentences surrounding an encoded passage.
Trending Questions (1)
What are the current methods for detecting hate speech?

The current methods for detecting hate speech primarily focus on textual content and lack interpretability.