Author

Cynthia Van Hee

Bio: Cynthia Van Hee is an academic researcher from Ghent University. The author has contributed to research on the topics of sentiment analysis and irony. The author has an h-index of 12 and has co-authored 25 publications receiving 681 citations.

Papers
Proceedings ArticleDOI
01 Jun 2018
TL;DR: This paper presents the first shared task on irony detection: given a tweet, automatic natural language processing systems should determine whether the tweet is ironic (Task A) and which type of irony (if any) is expressed (Task B). The results demonstrate that fine-grained irony classification is much more challenging than binary irony detection.
Abstract: This paper presents the first shared task on irony detection: given a tweet, automatic natural language processing systems should determine whether the tweet is ironic (Task A) and which type of irony (if any) is expressed (Task B). The ironic tweets were collected using irony-related hashtags (i.e. #irony, #sarcasm, #not) and were subsequently manually annotated to minimise the amount of noise in the corpus. Prior to distributing the data, hashtags that were used to collect the tweets were removed from the corpus. For both tasks, a training corpus of 3,834 tweets was provided, as well as a test set containing 784 tweets. Our shared tasks received submissions from 43 teams for the binary classification Task A and from 31 teams for the multiclass Task B. The highest classification scores obtained for both subtasks are respectively F1= 0.71 and F1= 0.51 and demonstrate that fine-grained irony classification is much more challenging than binary irony detection.

241 citations

Journal ArticleDOI
08 Oct 2018-PLOS ONE
TL;DR: This paper describes the collection and fine-grained annotation of a cyberbullying corpus for English and Dutch and performs a series of binary classification experiments to determine the feasibility of automatic cyberbullying detection.
Abstract: While social media offer great communication opportunities, they also increase the vulnerability of young people to threatening situations online. Recent studies report that cyberbullying constitutes a growing problem among youngsters. Successful prevention depends on the adequate detection of potentially harmful messages and the information overload on the Web requires intelligent systems to identify potential risks automatically. The focus of this paper is on automatic cyberbullying detection in social media text by modelling posts written by bullies, victims, and bystanders of online bullying. We describe the collection and fine-grained annotation of a cyberbullying corpus for English and Dutch and perform a series of binary classification experiments to determine the feasibility of automatic cyberbullying detection. We make use of linear support vector machines exploiting a rich feature set and investigate which information sources contribute the most for the task. Experiments on a hold-out test set reveal promising results for the detection of cyberbullying-related posts. After optimisation of the hyperparameters, the classifier yields an F1 score of 64% and 61% for English and Dutch respectively, and considerably outperforms baseline systems.

231 citations

Proceedings Article
01 Sep 2015
TL;DR: A new scheme for cyberbullying annotation is developed and applied, which describes the presence and severity of cyberbullying, a post author's role (harasser, victim or bystander) and a number of fine-grained categories related to cyberbullying, such as insults and threats.
Abstract: In the current era of online interactions, both positive and negative experiences are abundant on the Web. As in real life, negative experiences can have a serious impact on youngsters. Recent studies have reported cybervictimization rates among teenagers that vary between 20% and 40%. In this paper, we focus on cyberbullying as a particular form of cybervictimization and explore its automatic detection and fine-grained classification. Data containing cyberbullying was collected from the social networking site Ask.fm. We developed and applied a new scheme for cyberbullying annotation, which describes the presence and severity of cyberbullying, a post author's role (harasser, victim or bystander) and a number of fine-grained categories related to cyberbullying, such as insults and threats. We present experimental results on the automatic detection of cyberbullying and explore the feasibility of detecting the more fine-grained cyberbullying categories in online posts. For the first task, an F-score of 55.39% is obtained. We observe that the detection of the fine-grained categories (e.g. threats) is more challenging, presumably due to data sparsity, and because they are often expressed in a subtle and implicit way.

140 citations

11 Oct 2015
TL;DR: This paper presents the construction and annotation of a corpus of Dutch social media posts annotated with fine-grained cyberbullying-related text categories, such as insults and threats, and identifies the specific participants (harasser, victim or bystander) in a cyberbullying conversation to enhance the analysis of human interactions involving cyberbullying.
Abstract: The recent development of social media poses new challenges to the research community in analyzing online interactions between people. Social networking sites offer great opportunities for connecting with others, but also increase the vulnerability of young people to undesirable phenomena, such as cybervictimization. Recent research reports that on average, 20% to 40% of all teenagers have been victimized online. In this paper, we focus on cyberbullying as a particular form of cybervictimization. Successful prevention depends on the adequate detection of potentially harmful messages. However, given the massive information overload on the Web, there is a need for intelligent systems to identify potential risks automatically. We present the construction and annotation of a corpus of Dutch social media posts annotated with fine-grained cyberbullying-related text categories, such as insults and threats. Also, the specific participants (harasser, victim or bystander) in a cyberbullying conversation are identified to enhance the analysis of human interactions involving cyberbullying. Apart from describing our dataset construction and annotation, we present proof-of-concept experiments on the automatic identification of cyberbullying events and fine-grained cyberbullying categories.

73 citations

Proceedings Article
01 Dec 2016
TL;DR: A qualitative analysis of the output reveals that recognising irony that results from a polarity clash appears to be (much) more feasible than recognising other forms of ironic utterances (e.g., descriptions of situational irony).
Abstract: Recognising and understanding irony is crucial for the improvement of natural language processing tasks including sentiment analysis. In this study, we describe the construction of an English Twitter corpus and its annotation for irony based on a newly developed fine-grained annotation scheme. We also explore the feasibility of automatic irony recognition by exploiting a varied set of features including lexical, syntactic, sentiment and semantic (Word2Vec) information. Experiments on a held-out test set show that our irony classifier benefits from this combined information, yielding an F1-score of 67.66%. When explicit hashtag information like #irony is included in the data, the system even obtains an F1-score of 92.77%. A qualitative analysis of the output reveals that recognising irony that results from a polarity clash appears to be (much) more feasible than recognising other forms of ironic utterances (e.g., descriptions of situational irony).

33 citations


Cited by
Proceedings ArticleDOI
01 Apr 2017
TL;DR: A survey on hate speech detection describes key areas that have been explored to automatically recognize these types of utterances using natural language processing and discusses limits of those approaches.
Abstract: This paper presents a survey on hate speech detection. Given the steadily growing body of social media content, the amount of online hate speech is also increasing. Due to the massive scale of the web, methods that automatically detect hate speech are required. Our survey describes key areas that have been explored to automatically recognize these types of utterances using natural language processing. We also discuss limits of those approaches.

1,030 citations

Journal ArticleDOI
TL;DR: Deep learning has emerged as a powerful machine learning technique that learns multiple layers of representations or features of the data and produces state-of-the-art prediction results. This paper gives an overview of deep learning and surveys its current applications in sentiment analysis.
Abstract: Deep learning has emerged as a powerful machine learning technique that learns multiple layers of representations or features of the data and produces state-of-the-art prediction results. Along with the success of deep learning in many other application domains, deep learning is also popularly used in sentiment analysis in recent years. This paper first gives an overview of deep learning and then provides a comprehensive survey of its current applications in sentiment analysis.

917 citations

01 Jan 1992
TL;DR: This work presents a framework of principle for media assessment, covering media performance norms, research models and methods, media freedom, diversity, and objectivity (including news as information and the evaluative dimension of news) in the context of mass media.
Abstract:
PART ONE: MASS COMMUNICATION AND SOCIETY Public Communication and Public Interest Contested Territory Media Performance Traditions of Enquiry The 'Public Interest' in Communication
PART TWO: MEDIA PERFORMANCE NORMS Performance Norms in Media Policy Discourse The Newspaper Press Performance Norms in Media Policy Discourse Broadcasting A Framework of Principle for Media Assessment
PART THREE: RESEARCH MODELS AND METHODS Media Organizational Performance Models and Research Options
PART FOUR: MEDIA FREEDOM Concepts and Models of Media Freedom Media Freedom From Structure to Performance Media Freedom The Organizational Environment
PART FIVE: DIVERSITY Varieties and Processes of Diversity Taking the Measure of Diversity Media Reflection Media Access and Audience Choice
PART SIX: OBJECTIVITY Concepts of Objectivity A Framework for Objectivity Research Measuring Objectivity News as Information Measuring Objectivity The Evaluative Dimension of News
PART SEVEN: MASS MEDIA, ORDER AND SOCIAL CONTROL Media and the Maintenance of Public Order Policing the Symbolic Environment Solidarity and Social Identity
PART EIGHT: MEDIA AND CULTURE Questions of Culture and Mass Communication Cultural Identity and Autonomy Whose Media Culture?
PART NINE: IN CONCLUSION Changing Media, Changing Mores Implications for Assessment

738 citations

Book
01 Jun 2015
TL;DR: Sentiment analysis is the computational study of people's opinions, sentiments, emotions, moods, and attitudes. The problem offers numerous research challenges but promises insight useful to anyone interested in opinion analysis and social media analysis.
Abstract: Sentiment analysis is the computational study of people's opinions, sentiments, emotions, moods, and attitudes. This fascinating problem offers numerous research challenges, but promises insight useful to anyone interested in opinion analysis and social media analysis. This comprehensive introduction to the topic takes a natural-language-processing point of view to help readers understand the underlying structure of the problem and the language constructs commonly used to express opinions, sentiments, and emotions. The book covers core areas of sentiment analysis and also includes related topics such as debate analysis, intention mining, and fake-opinion detection. It will be a valuable resource for researchers and practitioners in natural language processing, computer science, management sciences, and the social sciences. In addition to traditional computational methods, this second edition includes recent deep learning methods to analyze and summarize sentiments and opinions, and also new material on emotion and mood analysis techniques, emotion-enhanced dialogues, and multimodal emotion analysis.

587 citations

Journal ArticleDOI
01 Sep 2016
TL;DR: This comprehensive introduction to sentiment analysis takes a natural-language-processing point of view to help readers understand the underlying structure of the problem and the language constructs commonly used to express opinions, sentiments, and emotions.

531 citations