Proceedings ArticleDOI

Semantic Textual Similarity of Sentences with Emojis

TLDR
The amount of semantic information lost by discounting emojis is qualitatively ascertained, and a mechanism for accounting for emojis in a semantic task is demonstrated.
Abstract
In this paper, we extend the task of semantic textual similarity to include sentences which contain emojis. Emojis are ubiquitous on social media today, but are often removed in the pre-processing stage of curating datasets for NLP tasks. In this paper, we qualitatively ascertain the amount of semantic information lost by discounting emojis, and demonstrate a mechanism for accounting for emojis in a semantic task. We create a sentence similarity dataset of 4000 pairs of tweets with emojis, which have been annotated for relatedness. The corpus contains tweets curated by common topic as well as by replacement of emojis; the latter was done to analyze the difference in semantics associated with different emojis. We aim to provide an understanding of the information lost by removing emojis through a qualitative analysis of the dataset. We also aim to present a method of using both emojis and words for downstream NLP tasks beyond sentiment analysis.
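The idea of keeping emojis through pre-processing can be illustrated with a minimal sketch (not the paper's implementation): treat each emoji as an ordinary token with its own embedding, average token vectors into a sentence vector, and compare sentences by cosine similarity. The embedding values below are toy numbers chosen for illustration; a real system would use trained word and emoji vectors.

```python
import math

# Hypothetical toy embeddings; an emoji gets a vector just like a word does.
EMB = {
    "i": [0.1, 0.2, 0.0],
    "love": [0.9, 0.1, 0.3],
    "pizza": [0.2, 0.8, 0.5],
    "😍": [0.8, 0.2, 0.4],    # emoji kept as a token instead of being stripped
    "hate": [-0.9, 0.0, -0.2],
}

def sentence_vector(tokens):
    """Average the embeddings of known tokens (emojis included)."""
    vecs = [EMB[t] for t in tokens if t in EMB]
    if not vecs:
        return [0.0, 0.0, 0.0]
    return [sum(dim) / len(vecs) for dim in zip(*vecs)]

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0

# Dropping the emoji changes the sentence vector only slightly; swapping a
# content word flips the meaning, and cosine similarity reflects both.
sim_keep = cosine(sentence_vector(["i", "love", "pizza", "😍"]),
                  sentence_vector(["i", "love", "pizza"]))
sim_swap = cosine(sentence_vector(["i", "love", "pizza", "😍"]),
                  sentence_vector(["i", "hate", "pizza"]))
```

With richer embeddings, the same pipeline lets emoji substitution (as in the dataset's replacement-based pairs) shift similarity scores measurably.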

Citations
Journal ArticleDOI

Cross-Cultural Polarity and Emotion Detection Using Sentiment Analysis and Deep Learning on COVID-19 Related Tweets

TL;DR: Deep long short-term memory models for estimating sentiment polarity and emotions from extracted tweets were trained to achieve state-of-the-art accuracy on the sentiment140 dataset, and the use of emoticons provided a novel way of validating supervised deep learning models on tweets extracted from Twitter.
Posted Content

Cross-Cultural Polarity and Emotion Detection Using Sentiment Analysis and Deep Learning - a Case Study on COVID-19.

TL;DR: This study aims to detect and analyze the sentiment polarity and emotions demonstrated during the initial phase of the pandemic and the lockdown period, employing natural language processing (NLP) and deep learning techniques on Twitter posts.
Book ChapterDOI

Social Media Sentiment Analysis Related to COVID-19 Vaccines: Case Studies in English and Greek Language

TL;DR: In this article, a supervised learning approach was applied to monitor the dynamics of public opinion on COVID-19 vaccines using Twitter data. The analysis revealed that overall negative, neutral, and positive sentiments stood at 36.5%, 39.9%, and 23.6%, respectively, in the English-language dataset, whereas overall negative and non-negative sentiments stood at 60.1% and 39.1%, respectively, in the Greek-language dataset.
Posted Content

Towards Explainable Fact Checking

TL;DR: In this article, the authors propose a model for explainable fact checking using natural language processing methods, which in turn utilize deep neural networks to learn higher-order features from text in order to make predictions.
References
Journal ArticleDOI

Long short-term memory

TL;DR: A novel, efficient, gradient based method called long short-term memory (LSTM) is introduced, which can learn to bridge minimal time lags in excess of 1000 discrete-time steps by enforcing constant error flow through constant error carousels within special units.
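The gated cell update summarized above can be sketched in a few lines. This is an illustrative single-step LSTM cell with scalar parameters (a hypothetical weight dictionary, not a trained model); the additive cell-state update `c = f * c_prev + i * g` is what lets error flow across long time lags (the "constant error carousel").

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def lstm_step(x, h_prev, c_prev, w):
    """One LSTM time step with hypothetical scalar parameters w (a dict)."""
    f = sigmoid(w["wf"] * x + w["uf"] * h_prev + w["bf"])    # forget gate
    i = sigmoid(w["wi"] * x + w["ui"] * h_prev + w["bi"])    # input gate
    o = sigmoid(w["wo"] * x + w["uo"] * h_prev + w["bo"])    # output gate
    g = math.tanh(w["wg"] * x + w["ug"] * h_prev + w["bg"])  # candidate value
    c = f * c_prev + i * g    # additive cell-state path: gradients pass through
    h = o * math.tanh(c)      # gated hidden state
    return h, c
```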
Proceedings ArticleDOI

Glove: Global Vectors for Word Representation

TL;DR: A new global log-bilinear regression model that combines the advantages of the two major model families in the literature, global matrix factorization and local context window methods, and produces a vector space with meaningful substructure.
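The log-bilinear objective behind GloVe can be written out for a single co-occurrence pair. This is a sketch of the per-pair weighted least-squares term J_ij = f(X_ij) · (w_i·w̃_j + b_i + b̃_j − log X_ij)², with the standard weighting function; the vectors and counts below are illustrative, not trained values.

```python
import math

def weight(x, x_max=100.0, alpha=0.75):
    """GloVe weighting function f(x): dampens rare and very frequent pairs."""
    return (x / x_max) ** alpha if x < x_max else 1.0

def glove_pair_loss(w_i, w_j, b_i, b_j, x_ij):
    """Weighted squared error between the dot product (+ biases) and log count."""
    dot = sum(a * b for a, b in zip(w_i, w_j))
    return weight(x_ij) * (dot + b_i + b_j - math.log(x_ij)) ** 2
```

Training minimizes the sum of this term over all nonzero entries of the co-occurrence matrix.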
Proceedings ArticleDOI

Bleu: a Method for Automatic Evaluation of Machine Translation

TL;DR: This paper proposes a method of automatic machine translation evaluation that is quick, inexpensive, and language-independent, that correlates highly with human evaluation, and that has little marginal cost per run.
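The core of the metric described above is modified n-gram precision: candidate n-gram counts are clipped by their counts in the reference, preventing a candidate from being rewarded for repeating a correct word. This is a minimal sketch of that one component; full BLEU additionally combines precisions over several n-gram orders and applies a brevity penalty.

```python
from collections import Counter

def modified_precision(candidate, reference, n=1):
    """Clipped n-gram precision of a tokenized candidate against one reference."""
    cand_ngrams = Counter(tuple(candidate[i:i + n])
                          for i in range(len(candidate) - n + 1))
    ref_ngrams = Counter(tuple(reference[i:i + n])
                         for i in range(len(reference) - n + 1))
    # Clip each candidate n-gram count by its count in the reference.
    clipped = sum(min(c, ref_ngrams[g]) for g, c in cand_ngrams.items())
    total = sum(cand_ngrams.values())
    return clipped / total if total else 0.0
```

The classic degenerate case, a candidate of seven repeated "the" tokens against "the cat is on the mat", scores 2/7 rather than 7/7, since only two occurrences of "the" are credited.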
Posted Content

Efficient Estimation of Word Representations in Vector Space

TL;DR: This paper proposes two novel model architectures for computing continuous vector representations of words from very large data sets; the quality of these representations is measured in a word similarity task, and the results are compared to the previously best performing techniques based on different types of neural networks.
Posted Content

ADADELTA: An Adaptive Learning Rate Method

Matthew D. Zeiler
22 Dec 2012
TL;DR: A novel per-dimension learning rate method for gradient descent called ADADELTA that dynamically adapts over time using only first order information and has minimal computational overhead beyond vanilla stochastic gradient descent is presented.
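The per-dimension rule summarized above can be sketched directly from its two running averages. This illustrative single-parameter version accumulates decayed averages of squared gradients and squared updates, then scales each step by RMS[Δx]/RMS[g], so no global learning rate is needed; `rho` and `eps` take the paper's commonly cited defaults.

```python
import math

def adadelta_step(grad, state, rho=0.95, eps=1e-6):
    """One ADADELTA update for a single parameter dimension.

    state = {"eg2": E[g^2], "edx2": E[dx^2]} are decayed running averages.
    Returns the parameter update dx and mutates state in place.
    """
    state["eg2"] = rho * state["eg2"] + (1 - rho) * grad * grad
    rms_g = math.sqrt(state["eg2"] + eps)
    rms_dx = math.sqrt(state["edx2"] + eps)
    dx = -(rms_dx / rms_g) * grad         # step scaled by ratio of RMS values
    state["edx2"] = rho * state["edx2"] + (1 - rho) * dx * dx
    return dx
```

Applied to a toy objective like f(x) = x² (gradient 2x), repeated calls shrink |x| using only first-order information.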