Posted Content

Surfacing contextual hate speech words within social media

TL;DR: The goal is to advance the domain by providing a high-quality hate speech dataset, along with learned code words that can be fed into existing classification approaches, thus improving the accuracy of automated detection.
Abstract: Social media platforms have recently seen an increase in the occurrence of hate speech discourse, which has led to calls for improved detection methods. Most of these rely on annotated data, keywords, and a classification technique. While this approach provides good coverage, it can fall short when dealing with new terms produced by online extremist communities, which act as original sources of words with alternate hate speech meanings. These code words (which can be both created and adopted words) are designed to evade automatic detection and often have benign meanings in regular discourse. As an example, "skypes", "googles", and "yahoos" are all instances of words which have an alternate meaning that can be used for hate speech. This overlap introduces additional challenges when relying on keywords for both the collection of data that is specific to hate speech, and downstream classification. In this work, we develop a community detection approach for finding extremist hate speech communities and collecting data from their members. We also develop a word embedding model that learns the alternate hate speech meaning of words and demonstrate the candidacy of our code words with several annotation experiments, designed to determine if it is possible to recognize a word as being used for hate speech without knowing its alternate meaning. We report inter-annotator agreement rates of K = 0.871 and K = 0.676 for data drawn from our extremist community and from the keyword approach, respectively, supporting our claim that hate speech detection is a contextual task and does not depend on a fixed list of keywords. Our goal is to advance the domain by providing a high-quality hate speech dataset in addition to learned code words that can be fed into existing classification approaches, thus improving the accuracy of automated detection.
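A minimal sketch of the embedding intuition behind this approach, assuming a tokenized corpus collected from a suspect community and a seed list of known hate speech keywords (all names and data below are placeholders, not the authors' model):

```python
# Hypothetical sketch: surface candidate code words by their embedding
# similarity to known hate speech seed terms. Corpus and seeds are
# placeholders; a real run would use posts collected from the community.
from gensim.models import Word2Vec

community_sentences = [["placeholder", "tokenized", "community", "posts"]]
SEED_WORDS = ["seedword1", "seedword2"]  # known hate speech keywords

# Skip-gram embedding trained only on the community's own text, so that
# community-specific meanings (e.g. "skypes") shape the vector space.
model = Word2Vec(community_sentences, vector_size=100, window=5,
                 min_count=1, sg=1)

candidates = {}
for seed in SEED_WORDS:
    if seed in model.wv:
        for word, score in model.wv.most_similar(seed, topn=20):
            # Benign-looking words close to hate seeds are code word candidates.
            candidates[word] = max(score, candidates.get(word, 0.0))

print(sorted(candidates.items(), key=lambda kv: -kv[1])[:10])
```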
Citations
Journal ArticleDOI
TL;DR: This paper used Perspective, an AI technology developed by Jigsaw (formerly Google Ideas), to measure the toxicity of tweets from prominent drag queens in the United States. It found that Perspective rated a significant number of drag queen Twitter accounts as more toxic than white nationalists, and failed to recognize cases in which words that might be seen as offensive conveyed different meanings in LGBTQ speech.
Abstract: Companies operating internet platforms are developing artificial intelligence tools for content moderation purposes. This paper discusses technologies developed to measure the ‘toxicity’ of text-based content. The research builds upon queer linguistic studies that have indicated the use of ‘mock impoliteness’ as a form of interaction employed by LGBTQ people to cope with hostility. Automated analyses that disregard such a pro-social function may, contrary to their intended design, actually reinforce harmful biases. This paper uses ‘Perspective’, an AI technology developed by Jigsaw (formerly Google Ideas), to measure the levels of toxicity of tweets from prominent drag queens in the United States. The research indicated that Perspective considered a significant number of drag queen Twitter accounts to have higher levels of toxicity than white nationalists. The qualitative analysis revealed that Perspective was not able to properly consider social context when measuring toxicity levels, and failed to recognize cases in which words that might conventionally be seen as offensive conveyed different meanings in LGBTQ speech.

46 citations

Proceedings ArticleDOI
23 May 2021
TL;DR: The authors used unsupervised word embeddings to detect words being used euphemistically and to identify the secret meaning of each word, achieving 30–400% higher detection accuracy on unlabeled euphemisms in a text corpus.
Abstract: Fringe groups and organizations have a long history of using euphemisms—ordinary-sounding words with a secret meaning—to conceal what they are discussing. Nowadays, one common use of euphemisms is to evade content moderation policies enforced by social media platforms. Existing tools for enforcing policy automatically rely on keyword searches for words on a "ban list", but these are notoriously imprecise: even when limited to swearwords, they can still cause embarrassing false positives [1]. When a commonly used ordinary word acquires a euphemistic meaning, adding it to a keyword-based ban list is hopeless: consider "pot" (storage container or marijuana?) or "heater" (household appliance or firearm?). The current generation of social media companies instead hire staff to check posts manually, but this is expensive, inhumane, and not much more effective. It is usually apparent to a human moderator that a word is being used euphemistically, but they may not know what the secret meaning is, and therefore whether the message violates policy. Also, when a euphemism is banned, the group that used it need only invent another one, leaving moderators one step behind. This paper will demonstrate unsupervised algorithms that, by analyzing words in their sentence-level context, can both detect words being used euphemistically and identify the secret meaning of each word. Compared to the existing state of the art, which uses context-free word embeddings, our algorithm for detecting euphemisms achieves 30–400% higher detection accuracy on unlabeled euphemisms in a text corpus. Our algorithm for revealing euphemistic meanings of words is the first of its kind, as far as we are aware. In the arms race between content moderators and policy evaders, our algorithms may help shift the balance in the direction of the moderators.

24 citations
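The sentence-level-context idea can be illustrated with an off-the-shelf masked language model; this is a stand-in sketch, not the paper's algorithm, and the example sentence is invented:

```python
# Sketch: mask a candidate word and let a masked language model propose
# in-context substitutes; substitutes that diverge from the word's
# conventional sense suggest a euphemistic use.
from transformers import pipeline

fill = pipeline("fill-mask", model="bert-base-uncased")

sentence = "He asked me to bring some pot to the party."
masked = sentence.replace("pot", fill.tokenizer.mask_token)

for pred in fill(masked, top_k=5):
    print(pred["token_str"], round(pred["score"], 3))
# Predictions like "beer" or "weed" rather than "pan" or "container"
# hint at the hidden meaning behind the euphemism.
```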

Proceedings ArticleDOI
01 Nov 2020
TL;DR: This work proposes a framework for fortifying existing toxic speech detectors without a large labeled corpus of veiled toxicity: a handful of probing examples is used to surface many more disguised offenses, and the detector's training data is augmented with these discovered examples, making it more robust to veiled toxicity while preserving its utility in detecting overt toxicity.
Abstract: Modern toxic speech detectors are incompetent in recognizing disguised offensive language, such as adversarial attacks that deliberately avoid known toxic lexicons, or manifestations of implicit bias. Building a large annotated dataset for such veiled toxicity can be very expensive. In this work, we propose a framework aimed at fortifying existing toxic speech detectors without a large labeled corpus of veiled toxicity. Just a handful of probing examples are used to surface orders of magnitude more disguised offenses. We augment the toxic speech detector’s training data with these discovered offensive examples, thereby making it more robust to veiled toxicity while preserving its utility in detecting overt toxicity.

22 citations


Cites background from "Surfacing contextual hate speech wo..."

  • ..., codewords (Taylor et al., 2017), novel forms of offense (Jain et al....

    [...]

  • ...…approaches (Waseem and Hovy, 2016; Davidson et al., 2017) and thus are ineffective at detecting forms of veiled toxicity; e.g., codewords (Taylor et al., 2017), novel forms of offense (Jain et al., 2018), and subtle and often unintentional manifestations of social bias such as…...

    [...]
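A hedged sketch of the augmentation loop described in the abstract above, using TF-IDF similarity as a stand-in retrieval step (the paper's actual discovery method may differ; all data below is placeholder):

```python
# Sketch: use a few probing examples of veiled offenses to retrieve
# similar posts from a large unlabeled pool, then append the discovered
# examples to the toxic speech detector's training data.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

probes = ["probing example of a veiled offense", "another probing example"]
unlabeled = ["a large pool", "of unlabeled posts", "would go here"]

vec = TfidfVectorizer().fit(probes + unlabeled)
sims = cosine_similarity(vec.transform(probes), vec.transform(unlabeled))

# Rank unlabeled posts by their best similarity to any probe, keep the top k.
top_k = sims.max(axis=0).argsort()[::-1][:100]
discovered = [unlabeled[j] for j in top_k]
# The discovered examples are then added to the original labeled training set.
```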

Journal Article
TL;DR: The authors examined the effect of hate speech on the formation of public opinion, especially among people who form their opinions from the dominant view. They found that measuring hate speech requires knowing the hate words or hate targets a priori, and that the description of hate speech tends to be wide, sometimes extending to embody words that are insulting of those in power or minority groups, or demeaning of individuals who are particularly visible in society.
Abstract: Given the pronouncement of a new bill by the Nigerian senate to douse the increasing hate speech that results in conflict, occasionally fuelled by social media, this work examines the interplay between hate speech, social media, and conflict in society. The design adopted in the study is Critical Discourse Analysis (CDA). Fifty-three textual documents downloaded from social media, comprising speeches made by some Nigerian personalities, were analyzed. Hinged on the spiral of silence theory, the study considers the effect of hate speech on the formation of public opinion, especially among people who form their opinions from the dominant view. While the study found the existence of hate content on social media, the extant literature shows that measuring hate speech requires knowing the hate words or hate targets a priori, and that the description of hate speech tends to be wide, sometimes extending to embody words that are insulting of those in power or minority groups, or demeaning of individuals who are particularly visible in society. The study also revealed that hate speech may be prone to manipulation at critical times such as election campaigns, and that accusations of promoting hate speech may be traded among political opponents or used by those in power to curb opposition and criticism. This suggests the need for intermediaries to advance the fight against hate speech, given the tendency toward negative opinion formation by those exposed to hate messages and the pain, distress, fear, embarrassment, and isolation that hate speech can provoke in individuals.

2 citations

Proceedings ArticleDOI
01 Dec 2020
TL;DR: The authors investigated the use of embeddings learned with syntactic dependency context (words that have a grammatical relationship with the target word) as opposed to linear context in various forms as features for the classification task on a multi-class dataset.
Abstract: The task of detecting online textual hate speech focuses on various objectives, one of the most important being distinguishing between hate speech and generic offensive language. This distinction is very blurry since the two classes strongly overlap. Using only keywords for identification falls short because of this overlap; hence the context in which each textual instance exists needs to be extracted for increased effectiveness. To achieve this, we investigate the use of embeddings learned with syntactic dependency context (words that have a grammatical relationship with the target word), as opposed to linear context (words that precede and succeed the target word as determined by the window size), in various forms as features for the classification task on a multi-class dataset. Our results and analysis show that for the downstream task of hate speech detection, specifically for distinguishing between hateful and offensive language, the dependency-based embedding performs very comparably to its linear-based counterpart, even outperforming it in some settings. Moreover, compared to the state of the art (BERT), it demonstrates competitive performance, especially when used in an ensemble with linear-based embedding. We also observed that for a specialised task such as hate speech detection, a domain-specific embedding is probably more important than a large out-of-domain embedding with a larger vocabulary size.

2 citations
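A small sketch of the difference between dependency context and linear context, using spaCy to extract (word, head/relation) pairs; these pairs are the kind of input a dependency-based embedding trainer consumes, though this is not the authors' code:

```python
# Sketch: extract syntactic dependency contexts instead of linear
# window contexts. Requires the en_core_web_sm model to be installed.
import spacy

nlp = spacy.load("en_core_web_sm")
doc = nlp("Keyword filters often miss hateful language in context.")

pairs = []
for token in doc:
    if token.dep_ != "ROOT":
        # e.g. ("filters", "miss/nsubj"): the context is a grammatical
        # relation, not just a nearby word.
        pairs.append((token.text, f"{token.head.text}/{token.dep_}"))

print(pairs)
```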

References
Proceedings Article
Tomas Mikolov1, Ilya Sutskever1, Kai Chen1, Greg S. Corrado1, Jeffrey Dean1 
05 Dec 2013
TL;DR: This paper presents a simple method for finding phrases in text, shows that learning good vector representations for millions of phrases is possible, and describes a simple alternative to the hierarchical softmax called negative sampling.
Abstract: The recently introduced continuous Skip-gram model is an efficient method for learning high-quality distributed vector representations that capture a large number of precise syntactic and semantic word relationships. In this paper we present several extensions that improve both the quality of the vectors and the training speed. By subsampling of the frequent words we obtain significant speedup and also learn more regular word representations. We also describe a simple alternative to the hierarchical softmax called negative sampling. An inherent limitation of word representations is their indifference to word order and their inability to represent idiomatic phrases. For example, the meanings of "Canada" and "Air" cannot be easily combined to obtain "Air Canada". Motivated by this example, we present a simple method for finding phrases in text, and show that learning good vector representations for millions of phrases is possible.

24,012 citations


"Surfacing contextual hate speech wo..." refers methods in this paper

  • ...Topical Context is the context used by word embedding approaches like word2vec [9], that utilize a bag-of-words in an effort to rank words by their domain similarity....

    [...]
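Both ideas in this reference, phrase detection and skip-gram training with negative sampling, together with the "rank words by domain similarity" usage quoted above, can be sketched with gensim on a toy placeholder corpus:

```python
# Sketch: learn phrases ("air" + "canada" -> "air_canada"), then train a
# skip-gram model with negative sampling and rank words by similarity.
from gensim.models import Word2Vec
from gensim.models.phrases import Phrases, Phraser

corpus = [["air", "canada", "flies", "to", "toronto"],
          ["air", "canada", "is", "an", "airline"]]

bigram = Phraser(Phrases(corpus, min_count=1, threshold=0.1))
phrased = [bigram[sent] for sent in corpus]  # "air canada" -> "air_canada"

model = Word2Vec(phrased, vector_size=50, min_count=1,
                 sg=1,        # skip-gram
                 negative=5,  # negative sampling instead of
                 hs=0)        # hierarchical softmax

# Topical context in action: rank vocabulary by similarity to a query term.
print(model.wv.most_similar("air_canada", topn=3))
```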

Proceedings Article
11 Nov 1999
TL;DR: This paper describes PageRank, a method for rating Web pages objectively and mechanically, effectively measuring the human interest and attention devoted to them, and shows how to efficiently compute PageRank for large numbers of pages.
Abstract: The importance of a Web page is an inherently subjective matter, which depends on the reader's interests, knowledge, and attitudes. But there is still much that can be said objectively about the relative importance of Web pages. This paper describes PageRank, a method for rating Web pages objectively and mechanically, effectively measuring the human interest and attention devoted to them. We compare PageRank to an idealized random Web surfer. We show how to efficiently compute PageRank for large numbers of pages. And we show how to apply PageRank to search and to user navigation.

14,400 citations


"Surfacing contextual hate speech wo..." refers methods in this paper

  • ...• Using graph expansion and PageRank scores to bootstrap our initial HS seed words....

    [...]

  • ...To further reduce the search space we use PageRank [13] to rank out-of-dictionary words in a graph where some of the vertices are known hate speech keywords....

    [...]

  • ...Concisely, this boosting is done to set known hate speech words as the important “pages" that pass on their weight during the PageRank computation....

    [...]

  • ...For the PageRank scores we set d = 0.85 as it is the standard rate of decay used for the algorithm....

    [...]

  • ...Using cosine similarity scores alone as the edge weight would not allow us to model the idea that hate speech words are the important “pages" in the graph, the key concept behind PageRank....

    [...]
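A hedged sketch of the usage these snippets describe: PageRank with d = 0.85 over a word similarity graph in which known hate speech keywords act as the important "pages". The paper achieves the boosting by reweighting edges; the sketch below uses networkx's personalization vector as a stand-in, and all words and similarity scores are placeholders:

```python
# Sketch: rank out-of-dictionary words with PageRank over a cosine
# similarity graph, concentrating teleport probability on known
# hate speech keywords so weight flows outward from them.
import networkx as nx

SEED_WORDS = {"seedword1", "seedword2"}    # known HS keywords (placeholders)
edges = [("seedword1", "googles", 0.8),    # (word, word, cosine similarity)
         ("seedword2", "skypes", 0.7),
         ("googles", "search", 0.4)]

G = nx.Graph()
G.add_weighted_edges_from(edges)

personalization = {n: (1.0 if n in SEED_WORDS else 0.0) for n in G}
scores = nx.pagerank(G, alpha=0.85, weight="weight",   # alpha is d
                     personalization=personalization)

print(sorted(scores.items(), key=lambda kv: -kv[1]))
```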


Journal ArticleDOI
TL;DR: This paper proposed a new approach based on the skip-gram model, where each word is represented as a bag of character n-grams and a word's vector is the sum of its n-gram vectors, making it possible to train models on large corpora quickly and to compute word representations for words that did not appear in the training data.
Abstract: Continuous word representations, trained on large unlabeled corpora, are useful for many natural language processing tasks. Popular models that learn such representations ignore the morphology of words by assigning a distinct vector to each word. This is a limitation, especially for languages with large vocabularies and many rare words. In this paper, we propose a new approach based on the skip-gram model, where each word is represented as a bag of character n-grams. A vector representation is associated with each character n-gram, with words being represented as the sum of these representations. Our method is fast, allowing models to be trained on large corpora quickly, and allows us to compute word representations for words that did not appear in the training data. We evaluate our word representations on nine different languages, on both word similarity and analogy tasks. By comparing to recently proposed morphological word representations, we show that our vectors achieve state-of-the-art performance on these tasks.

7,537 citations
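The character n-gram idea can be sketched with gensim's FastText implementation: because a word's vector is the sum of its n-gram vectors, even out-of-vocabulary words get a representation. The toy corpus is a placeholder:

```python
# Sketch: subword embeddings compose vectors for unseen words.
from gensim.models import FastText

corpus = [["hate", "speech", "detection", "is", "contextual"],
          ["code", "words", "evade", "keyword", "filters"]]

model = FastText(corpus, vector_size=50, min_count=1,
                 min_n=3, max_n=6)  # character n-gram lengths

# "keywords" never appears in the corpus, but its character n-grams
# overlap with "keyword", so a vector can still be composed for it.
print(model.wv["keywords"][:5])
print(model.wv.similarity("keyword", "keywords"))
```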
