Memes in the Wild: Assessing the Generalizability of the Hateful Memes Challenge Dataset

Hannah Rose Kirk, +9 more

- 09 Jul 2021 -

arXiv: Computer Vision and Pattern Recog...

TLDR

The authors collect hateful and non-hateful memes from Pinterest to evaluate the out-of-sample performance of models pre-trained on the Facebook dataset. They find that memes in the wild differ in two key aspects: 1) captions must be extracted via OCR, which injects noise and diminishes the performance of multimodal models, and 2) memes in the wild are more diverse than 'traditional memes', including screenshots of conversations or text on a plain background.

Abstract:

Hateful memes pose a unique challenge for current machine learning systems because their message is derived from both text and visual modalities. To this end, Facebook released the Hateful Memes Challenge, a dataset of memes with pre-extracted text captions, but it is unclear whether these synthetic examples generalize to 'memes in the wild'. In this paper, we collect hateful and non-hateful memes from Pinterest to evaluate the out-of-sample performance of models pre-trained on the Facebook dataset. We find that memes in the wild differ in two key aspects: 1) captions must be extracted via OCR, injecting noise and diminishing the performance of multimodal models, and 2) memes in the wild are more diverse than 'traditional memes', including screenshots of conversations or text on a plain background. This paper thus serves as a reality check for the current benchmark of hateful meme detection and its applicability for detecting real-world hate.
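One way to make the "OCR injects noise" point concrete is to compare an OCR-extracted caption against a clean reference caption using a character error rate. The sketch below is illustrative only: the abstract does not say how the authors quantified OCR noise, and the metric choice, the function names `edit_distance` and `char_error_rate`, and the two sample captions are all assumptions made for this example.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: minimum insertions, deletions, substitutions."""
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, ca in enumerate(a, start=1):
        curr = [i]  # turning a[:i] into the empty string takes i deletions
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def char_error_rate(reference: str, hypothesis: str) -> float:
    """Edit distance normalised by the length of the reference caption."""
    if not reference:
        raise ValueError("reference caption must be non-empty")
    return edit_distance(reference, hypothesis) / len(reference)


# Hypothetical captions (not taken from either dataset): a clean,
# pre-extracted caption and a noisy string such as OCR might return.
clean = "when you finally finish the paper"
noisy = "wben you finaly finish tbe paper"
print(f"character error rate: {char_error_rate(clean, noisy):.3f}")
```

A higher rate means the text fed to a multimodal model is further from the caption a human would read, which is one plausible route by which OCR could hurt performance.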

References

Journal Article

Visualizing Data using t-SNE

Laurens van der Maaten, +1 more

- 01 Jan 2008 -

Journal of Machine Learning Research

TL;DR: Presents t-SNE, a technique that visualizes high-dimensional data by giving each datapoint a location in a two- or three-dimensional map. It is a variation of Stochastic Neighbor Embedding that is much easier to optimize and produces significantly better visualizations by reducing the tendency to crowd points together in the center of the map.

Proceedings ArticleDOI

FaceNet: A unified embedding for face recognition and clustering

Florian Schroff, +2 more

TL;DR: A system that directly learns a mapping from face images to a compact Euclidean space where distances directly correspond to a measure of face similarity, and achieves state-of-the-art face recognition performance using only 128 bytes per face.

Gender Shades: Intersectional Accuracy Disparities in Commercial Gender Classification

Joy Buolamwini, +1 more

TL;DR: It is shown that the highest error involves images of dark-skinned women, while the most accurate result is for light-skinned men, in commercial API-based classifiers of gender from facial images, including IBM Watson Visual Recognition.

Proceedings ArticleDOI

An Overview of the Tesseract OCR Engine

Ray Smith

TL;DR: A comprehensive overview of the Tesseract OCR engine, which was the HP Research Prototype in the UNLV Fourth Annual Test of OCR Accuracy.

Proceedings Article

Automated Hate Speech Detection and the Problem of Offensive Language

Thomas Davidson, +3 more

TL;DR: This work used a crowd-sourced hate speech lexicon to collect tweets containing hate speech keywords and labeled a sample of these tweets into three categories: those containing hate speech, those containing only offensive language, and those with neither.
