Topic

Word embedding

About: Word embedding is a research topic. Over the lifetime, 4683 publications have been published within this topic receiving 153378 citations. The topic is also known as: word embeddings.


Papers
Proceedings ArticleDOI
19 Jul 2020
TL;DR: A novel learning-based evaluation metric, Unpaired Image Captioning Evaluation (UICE), which can be trained to distinguish between human-written and generated captions, and which can correctly judge the grammatical correctness of generated captions and the semantic consistency between captions and corresponding images.
Abstract: Recently, instead of pursuing high performance on classical evaluation metrics, the research focus of image captioning has shifted to generating sentences that are more vivid and stylized than human-written ones. However, there is still no applicable metric that can judge how close generated captions are to human-written ones. In this paper, we propose a novel learning-based evaluation metric, Unpaired Image Captioning Evaluation (UICE), which can be trained to distinguish between human-written and generated captions. Unlike existing metrics, UICE consists of two parts: a semantic alignment module that measures the semantic distance between extracted image features and caption meanings, and a syntactic discriminating module that judges how human-like a candidate caption is. The semantic alignment module is implemented by mapping the image features and the word embeddings into a unified tensor space. The syntactic discriminating module is learning-based, and can therefore be adapted to a user's own style by feeding it an additional personalized corpus during training. Extensive experiments indicate that our metric correctly judges the grammatical correctness of generated captions and the semantic consistency between captions and their corresponding images.
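The core of the semantic alignment module — projecting image features and a caption's word-embedding representation into one shared space and scoring their agreement — can be sketched as follows. This is a minimal illustration, not the paper's implementation: the function name, the use of cosine similarity, and the projection matrices are all assumptions.

```python
import numpy as np

def semantic_alignment_score(img_feat, cap_emb, W_img, W_cap):
    """Project image features and a caption embedding into a shared
    space via (hypothetical) linear maps, then score their cosine
    similarity. Higher scores mean better image-caption agreement."""
    u = W_img @ img_feat          # image features mapped into the shared space
    v = W_cap @ cap_emb           # caption embedding mapped into the same space
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))
```

In the actual metric these projections would be learned jointly; here, identity projections on already-aligned vectors simply recover plain cosine similarity.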
Posted Content
TL;DR: This article proposed two models representing the target words across the periods to predict the changing words using threshold and voting schemes, which achieved competent results, ranking third in the DIACR-Ita shared task at EVALITA 2020.
Abstract: We present our systems and findings on unsupervised lexical semantic change for the Italian language in the DIACR-Ita shared task at EVALITA 2020. The task is to determine whether a target word has evolved its meaning over time, relying only on raw text from two time-specific datasets. We propose two models that represent the target words across the periods to predict the changing words using threshold and voting schemes. Our first model relies solely on part-of-speech usage and an ensemble of distance measures. The second model uses word embedding representations to extract the neighbors' relative distances across spaces, and we propose "the average of absolute differences" to estimate lexical semantic change. Our models achieved competent results, ranking third in the DIACR-Ita competition. Furthermore, we experiment with the k_neighbor parameter of our second model to compare the impact of using "the average of absolute differences" versus the cosine distance used in Hamilton et al. (2016).
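The "average of absolute differences" idea can be sketched as follows: compare a target word's distances to a shared set of neighbor words in each period's embedding space, and average the absolute change. This is a minimal sketch under stated assumptions — the function name, the choice of cosine distance, and the precomputed anchor matrices are hypothetical, not the authors' code.

```python
import numpy as np

def neighbor_aad(vec_t1, vec_t2, anchors_t1, anchors_t2):
    """Average of absolute differences between a target word's cosine
    distances to a shared set of anchor (neighbor) words, computed in
    two time-specific embedding spaces. Rows of anchors_t1/anchors_t2
    are the same anchor words' vectors in period 1 and period 2."""
    def cos_dist(u, V):
        return 1.0 - (V @ u) / (np.linalg.norm(V, axis=1) * np.linalg.norm(u))
    d1 = cos_dist(vec_t1, anchors_t1)   # neighbor distances in period 1
    d2 = cos_dist(vec_t2, anchors_t2)   # neighbor distances in period 2
    return float(np.mean(np.abs(d1 - d2)))
```

A word whose neighborhood is stable across periods scores near zero; a word whose meaning shifted scores higher, so a threshold on this score flags changing words.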
Proceedings ArticleDOI
01 Sep 2019
TL;DR: This paper proposes an effective text classification scheme that incorporates word weights into word embeddings; extensive experimental results verify that its accuracy outperforms state-of-the-art schemes.
Abstract: As a fundamental task of natural language processing, text classification has been widely used in applications such as sentiment analysis and spam detection. In recent years, continuous-valued word embeddings learned by neural networks have attracted extensive attention. Although word embeddings achieve impressive results in capturing similarities and regularities between words, they fail to highlight the words most important for identifying a text's category. This deficiency can be mitigated by word weights, which convey each word's contribution to text categorization. To this end, we propose an effective text classification scheme that incorporates word weights into word embeddings. Specifically, to enrich the word representation, a bidirectional gated recurrent unit (Bi-GRU) is first employed to capture the context of each word. Then the word weights yielded by term frequency (TF) are used to modulate the Bi-GRU word representations when constructing the text representation. Extensive experimental results on several large text datasets verify that the accuracy of our proposed text classification scheme outperforms the state-of-the-art ones.
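The TF-weighting step can be illustrated in isolation: weight each word's vector by its term frequency in the document, then pool. This is a simplified stand-in for the paper's scheme — in the paper the weights modulate contextual Bi-GRU states, whereas here they modulate raw embeddings, and the function name and embedding table are hypothetical.

```python
import numpy as np
from collections import Counter

def tf_weighted_text_vector(tokens, embeddings):
    """Build a document representation by scaling each distinct word's
    embedding by its term frequency (count / document length), then
    averaging the scaled vectors. `embeddings` maps word -> np.ndarray."""
    tf = Counter(tokens)
    total = len(tokens)
    vecs = [(tf[w] / total) * embeddings[w]
            for w in set(tokens) if w in embeddings]
    return np.mean(vecs, axis=0)
```

Frequent (and thus presumably topical) words contribute more to the pooled vector than words that appear once, which is the intuition behind using TF as a word weight.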
Book ChapterDOI
01 Jan 2021
TL;DR: In this article, a deep learning-based approach is introduced that automatically performs sentiment analysis for hotel reviews using word embedding and gated recurrent unit, which outperformed the performance of the traditional machine learning methods in sentiment classification of hotel reviews with 89% accuracy and 92% F-score.
Abstract: As everything shifts online, the demand for sentiment analysis has expanded tremendously in recent years. Sentiment analysis is an automated process of analyzing people's opinions and feelings using natural language processing tools. Organizations in the tourism sector can benefit from sentiment analysis to accurately track their customers' opinions. A deep learning-based approach is introduced in this paper that automatically performs sentiment analysis on hotel reviews using word embeddings and a gated recurrent unit. Our deep learning model outperformed traditional machine learning methods in sentiment classification of hotel reviews, with 89% accuracy and a 92% F-score.
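The gated recurrent unit at the heart of such a classifier can be written out as a single recurrence. The sketch below shows the standard GRU equations over a sequence of word-embedding vectors; the weight layout and function names are assumptions for illustration, not the paper's model.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gru_step(x, h, params):
    """One standard GRU step on input vector x and hidden state h.
    params = (Wz, Uz, Wr, Ur, Wh, Uh); biases omitted for brevity."""
    Wz, Uz, Wr, Ur, Wh, Uh = params
    z = sigmoid(Wz @ x + Uz @ h)               # update gate
    r = sigmoid(Wr @ x + Ur @ h)               # reset gate
    h_tilde = np.tanh(Wh @ x + Uh @ (r * h))   # candidate state
    return (1.0 - z) * h + z * h_tilde         # gated interpolation

def encode_review(embedded_tokens, params, hidden=4):
    """Run the GRU over a review's word embeddings; the final hidden
    state would feed a sentiment classifier (e.g. a logistic layer)."""
    h = np.zeros(hidden)
    for x in embedded_tokens:
        h = gru_step(x, h, params)
    return h
```

The gates let the network decide, word by word, how much of the running review summary to keep versus overwrite, which is why GRUs handle negation and long reviews better than bag-of-words baselines.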

Network Information
Related Topics (5)
- Recurrent neural network: 29.2K papers, 890K citations (87% related)
- Unsupervised learning: 22.7K papers, 1M citations (86% related)
- Deep learning: 79.8K papers, 2.1M citations (85% related)
- Reinforcement learning: 46K papers, 1M citations (84% related)
- Graph (abstract data type): 69.9K papers, 1.2M citations (84% related)
Performance Metrics
No. of papers in the topic in previous years:

Year   Papers
2023   317
2022   716
2021   736
2020   1,025
2019   1,078
2018   788