Proceedings ArticleDOI

Semantic Textual Similarity of Sentences with Emojis

TLDR
The amount of semantic information lost by discounting emojis is qualitatively ascertained, and a mechanism for accounting for emojis in a semantic task is demonstrated.
Abstract
In this paper, we extend the task of semantic textual similarity to include sentences which contain emojis. Emojis are ubiquitous on social media today, but are often removed in the pre-processing stage of curating datasets for NLP tasks. In this paper, we qualitatively ascertain the amount of semantic information lost by discounting emojis, and demonstrate a mechanism for accounting for emojis in a semantic task. We create a sentence similarity dataset of 4000 pairs of tweets with emojis, which have been annotated for relatedness. The corpus contains tweets curated by common topic as well as by replacement of emojis; the latter was done to analyze the difference in semantics associated with different emojis. We aim to provide an understanding of the information lost by removing emojis through a qualitative analysis of the dataset. We also aim to present a method of using both emojis and words for downstream NLP tasks beyond sentiment analysis.
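The idea of keeping emojis through pre-processing can be illustrated with a minimal sketch (not the paper's implementation): treat each emoji as an ordinary token with its own embedding, average token vectors into a sentence vector, and compare sentences by cosine similarity. The embedding values below are toy numbers chosen for illustration; a real system would use trained word and emoji vectors.

```python
import math

# Hypothetical toy embeddings; an emoji gets a vector just like a word does.
EMB = {
    "i": [0.1, 0.2, 0.0],
    "love": [0.9, 0.1, 0.3],
    "pizza": [0.2, 0.8, 0.5],
    "😍": [0.8, 0.2, 0.4],    # emoji kept as a token instead of being stripped
    "hate": [-0.9, 0.0, -0.2],
}

def sentence_vector(tokens):
    """Average the embeddings of known tokens (emojis included)."""
    vecs = [EMB[t] for t in tokens if t in EMB]
    if not vecs:
        return [0.0, 0.0, 0.0]
    return [sum(dim) / len(vecs) for dim in zip(*vecs)]

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0

# Dropping the emoji changes the sentence vector only slightly; swapping a
# content word flips the meaning, and cosine similarity reflects both.
sim_keep = cosine(sentence_vector(["i", "love", "pizza", "😍"]),
                  sentence_vector(["i", "love", "pizza"]))
sim_swap = cosine(sentence_vector(["i", "love", "pizza", "😍"]),
                  sentence_vector(["i", "hate", "pizza"]))
```

With richer embeddings, the same pipeline lets emoji substitution (as in the dataset's replacement-based pairs) shift similarity scores measurably.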

Citations
Journal ArticleDOI

Cross-Cultural Polarity and Emotion Detection Using Sentiment Analysis and Deep Learning on COVID-19 Related Tweets

TL;DR: Deep long short-term memory models for estimating sentiment polarity and emotions from extracted tweets were trained to achieve state-of-the-art accuracy on the sentiment140 dataset, and the use of emoticons provided a novel way of validating supervised deep learning models on tweets extracted from Twitter.
Posted Content

Cross-Cultural Polarity and Emotion Detection Using Sentiment Analysis and Deep Learning - a Case Study on COVID-19.

TL;DR: This study aims to detect and analyze the sentiment polarity and emotions demonstrated during the initial phase of the pandemic and the lockdown period, employing natural language processing (NLP) and deep learning techniques on Twitter posts.
Book ChapterDOI

Social Media Sentiment Analysis Related to COVID-19 Vaccines: Case Studies in English and Greek Language

TL;DR: In this article, a supervised learning approach was applied to monitor the dynamics of public opinion on COVID-19 vaccines using Twitter data. The analysis revealed that overall negative, neutral, and positive sentiments stood at 36.5%, 39.9%, and 23.6%, respectively, in the English-language dataset, whereas overall negative and non-negative sentiments stood at 60.1% and 39.1%, respectively, in the Greek-language dataset.
Posted Content

Towards Explainable Fact Checking

TL;DR: In this article, the authors propose a model for explainable fact checking using natural language processing methods, which in turn utilize deep neural networks to learn higher-order features from text in order to make predictions.
References
Journal ArticleDOI

Long short-term memory

TL;DR: A novel, efficient, gradient based method called long short-term memory (LSTM) is introduced, which can learn to bridge minimal time lags in excess of 1000 discrete-time steps by enforcing constant error flow through constant error carousels within special units.
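The gated cell update summarized above can be sketched in a few lines. This is an illustrative single-step LSTM cell with scalar parameters (a hypothetical weight dictionary, not a trained model); the additive cell-state update `c = f * c_prev + i * g` is what lets error flow across long time lags (the "constant error carousel").

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def lstm_step(x, h_prev, c_prev, w):
    """One LSTM time step with hypothetical scalar parameters w (a dict)."""
    f = sigmoid(w["wf"] * x + w["uf"] * h_prev + w["bf"])    # forget gate
    i = sigmoid(w["wi"] * x + w["ui"] * h_prev + w["bi"])    # input gate
    o = sigmoid(w["wo"] * x + w["uo"] * h_prev + w["bo"])    # output gate
    g = math.tanh(w["wg"] * x + w["ug"] * h_prev + w["bg"])  # candidate value
    c = f * c_prev + i * g    # additive cell-state path: gradients pass through
    h = o * math.tanh(c)      # gated hidden state
    return h, c
```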
Proceedings ArticleDOI

Glove: Global Vectors for Word Representation

TL;DR: A new global log-bilinear regression model that combines the advantages of the two major model families in the literature, global matrix factorization and local context window methods, and produces a vector space with meaningful substructure.
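The log-bilinear objective behind GloVe can be written out for a single co-occurrence pair. This is a sketch of the per-pair weighted least-squares term J_ij = f(X_ij) · (w_i·w̃_j + b_i + b̃_j − log X_ij)², with the standard weighting function; the vectors and counts below are illustrative, not trained values.

```python
import math

def weight(x, x_max=100.0, alpha=0.75):
    """GloVe weighting function f(x): dampens rare and very frequent pairs."""
    return (x / x_max) ** alpha if x < x_max else 1.0

def glove_pair_loss(w_i, w_j, b_i, b_j, x_ij):
    """Weighted squared error between the dot product (+ biases) and log count."""
    dot = sum(a * b for a, b in zip(w_i, w_j))
    return weight(x_ij) * (dot + b_i + b_j - math.log(x_ij)) ** 2
```

Training minimizes the sum of this term over all nonzero entries of the co-occurrence matrix.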
Proceedings ArticleDOI

Bleu: a Method for Automatic Evaluation of Machine Translation

TL;DR: This paper proposes a method of automatic machine translation evaluation that is quick, inexpensive, and language-independent, that correlates highly with human evaluation, and that has little marginal cost per run.
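The core of the metric described above is modified n-gram precision: candidate n-gram counts are clipped by their counts in the reference, preventing a candidate from being rewarded for repeating a correct word. This is a minimal sketch of that one component; full BLEU additionally combines precisions over several n-gram orders and applies a brevity penalty.

```python
from collections import Counter

def modified_precision(candidate, reference, n=1):
    """Clipped n-gram precision of a tokenized candidate against one reference."""
    cand_ngrams = Counter(tuple(candidate[i:i + n])
                          for i in range(len(candidate) - n + 1))
    ref_ngrams = Counter(tuple(reference[i:i + n])
                         for i in range(len(reference) - n + 1))
    # Clip each candidate n-gram count by its count in the reference.
    clipped = sum(min(c, ref_ngrams[g]) for g, c in cand_ngrams.items())
    total = sum(cand_ngrams.values())
    return clipped / total if total else 0.0
```

The classic degenerate case, a candidate of seven repeated "the" tokens against "the cat is on the mat", scores 2/7 rather than 7/7, since only two occurrences of "the" are credited.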
Posted Content

Efficient Estimation of Word Representations in Vector Space

TL;DR: This paper proposes two novel model architectures for computing continuous vector representations of words from very large data sets; the quality of these representations is measured in a word similarity task, and the results are compared to the previously best performing techniques based on different types of neural networks.
Posted Content

ADADELTA: An Adaptive Learning Rate Method

Matthew D. Zeiler
22 Dec 2012
TL;DR: A novel per-dimension learning rate method for gradient descent called ADADELTA that dynamically adapts over time using only first order information and has minimal computational overhead beyond vanilla stochastic gradient descent is presented.
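The per-dimension rule summarized above can be sketched directly from its two running averages. This illustrative single-parameter version accumulates decayed averages of squared gradients and squared updates, then scales each step by RMS[Δx]/RMS[g], so no global learning rate is needed; `rho` and `eps` take the paper's commonly cited defaults.

```python
import math

def adadelta_step(grad, state, rho=0.95, eps=1e-6):
    """One ADADELTA update for a single parameter dimension.

    state = {"eg2": E[g^2], "edx2": E[dx^2]} are decayed running averages.
    Returns the parameter update dx and mutates state in place.
    """
    state["eg2"] = rho * state["eg2"] + (1 - rho) * grad * grad
    rms_g = math.sqrt(state["eg2"] + eps)
    rms_dx = math.sqrt(state["edx2"] + eps)
    dx = -(rms_dx / rms_g) * grad         # step scaled by ratio of RMS values
    state["edx2"] = rho * state["edx2"] + (1 - rho) * dx * dx
    return dx
```

Applied to a toy objective like f(x) = x² (gradient 2x), repeated calls shrink |x| using only first-order information.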