Open Access Proceedings Article

Benchmarking Aggression Identification in Social Media.

TLDR
The Shared Task on Aggression Identification, organised as part of the First Workshop on Trolling, Aggression and Cyberbullying (TRAC-1) at COLING 2018, asked participants to develop a classifier that could discriminate between Overtly Aggressive, Covertly Aggressive, and Non-aggressive texts.
Abstract
In this paper, we present the report and findings of the Shared Task on Aggression Identification organised as part of the First Workshop on Trolling, Aggression and Cyberbullying (TRAC-1) at COLING 2018. The task was to develop a classifier that could discriminate between Overtly Aggressive, Covertly Aggressive, and Non-aggressive texts. For this task, the participants were provided with a dataset of 15,000 aggression-annotated Facebook posts and comments each in Hindi (in both Roman and Devanagari script) and English for training and validation. For testing, two different sets were provided: one from Facebook and another from a different social media platform. A total of 130 teams registered to participate in the task, 30 teams submitted their test runs, and 20 teams also sent system description papers, which are included in the TRAC workshop proceedings. The best system obtained a weighted F-score of 0.64 for both Hindi and English on the Facebook test sets, while the best scores on the surprise set were 0.60 and 0.50 for English and Hindi respectively. The results presented in this report show how challenging the task is. The positive response from the community and the high level of participation in the first edition of this shared task also highlight the interest in this topic.

Citations
Proceedings Article

Predicting the Type and Target of Offensive Posts in Social Media

TL;DR: The Offensive Language Identification Dataset (OLID), a new dataset with tweets annotated for offensive content using a fine-grained three-layer annotation scheme, is compiled and made publicly available.
Proceedings Article

SemEval-2019 Task 6: Identifying and Categorizing Offensive Language in Social Media (OffensEval).

TL;DR: The SemEval-2019 Task 6 on Identifying and Categorizing Offensive Language in Social Media (OffensEval) was based on a new dataset, the Offensive Language Identification Dataset (OLID), which contains over 14,000 English tweets, and featured three sub-tasks.
Journal Article

Hate speech detection: Challenges and solutions.

TL;DR: This work identifies and examines challenges faced by online automatic approaches for hate speech detection in text, and proposes a multi-view SVM approach that achieves near state-of-the-art performance, while being simpler and producing more easily interpretable decisions than neural methods.
Proceedings Article

SemEval-2020 Task 12: Multilingual Offensive Language Identification in Social Media (OffensEval 2020)

TL;DR: The SemEval-2020 Task 12 on Multilingual Offensive Language Identification in Social Media (OffensEval 2020) included three subtasks corresponding to the hierarchical taxonomy of the OLID schema, and was offered in five languages: Arabic, Danish, English, Greek, and Turkish.
Journal Article

Resources and benchmark corpora for hate speech detection: a systematic review

TL;DR: This review systematically analyzes the resources made available by the community at large, including their development methodology, topical focus, language coverage, and other factors, to highlight a heterogeneous, growing landscape.
References
Journal Article

Bullying, cyberbullying, and suicide

TL;DR: Examining the extent to which a nontraditional form of peer aggression—cyberbullying—is also related to suicidal ideation among adolescents suggests that a suicide prevention and intervention component is essential within comprehensive bullying response programs implemented in schools.
Proceedings Article

Automated Hate Speech Detection and the Problem of Offensive Language

TL;DR: This work uses a crowd-sourced hate speech lexicon to collect tweets containing hate speech keywords and labels a sample of these tweets into three categories: those containing hate speech, those containing only offensive language, and those with neither.
Proceedings Article

Hateful Symbols or Hateful People? Predictive Features for Hate Speech Detection on Twitter

TL;DR: A list of criteria founded in critical race theory is provided and used to annotate a publicly available corpus of more than 16k tweets, and a dictionary based on the most indicative words in the data is presented.
Proceedings Article

A Survey on Hate Speech Detection using Natural Language Processing

TL;DR: A survey on hate speech detection describes key areas that have been explored to automatically recognize these types of utterances using natural language processing and discusses limits of those approaches.
Proceedings Article

Abusive Language Detection in Online User Content

TL;DR: A machine-learning-based method is developed to detect hate speech in online user comments from two domains, which outperforms a state-of-the-art deep learning approach, along with a corpus of user comments annotated for abusive language, the first of its kind.