Topic

Word embedding

About: Word embedding is a research topic in natural language processing concerned with representing words as dense vectors. Over the lifetime, 4,683 publications have been published within this topic, receiving 153,378 citations. The topic is also known as: word embeddings.


Papers
Posted Content
15 Jun 2017
TL;DR: This work surveys models that seek to learn cross-lingual embeddings and discusses them based on the type of approach and the nature of parallel data that they employ.
Abstract: Cross-lingual representations of words enable us to reason about word meaning in multilingual contexts and are a key facilitator of cross-lingual transfer when developing natural language processing models for low-resource languages. In this survey, we provide a comprehensive typology of cross-lingual word embedding models. We compare their data requirements and objective functions. The recurring theme of the survey is that many of the models presented in the literature optimize for the same objectives, and that seemingly different models are often equivalent modulo optimization strategies, hyper-parameters, and such. We also discuss the different ways cross-lingual word embeddings are evaluated, as well as future challenges and research horizons.

140 citations
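
One common family covered by such surveys is mapping-based (offline) alignment: monolingual embeddings are trained separately for each language and then linked by a linear map learned from a seed bilingual dictionary. Below is a minimal sketch of orthogonal Procrustes alignment on toy random vectors; the dimensions, noise level, and data are illustrative assumptions, not the survey's own method.

```python
# Minimal sketch of a mapping-based cross-lingual embedding approach
# (orthogonal Procrustes alignment). Toy data; real systems start from
# pretrained monolingual embeddings and a seed bilingual dictionary.
import numpy as np

rng = np.random.default_rng(0)
dim, n_pairs = 50, 200

# Hypothetical source- and target-language vectors for dictionary pairs.
X = rng.normal(size=(n_pairs, dim))                       # source-language vectors
true_Q = np.linalg.qr(rng.normal(size=(dim, dim)))[0]     # hidden rotation
Y = X @ true_Q + 0.01 * rng.normal(size=(n_pairs, dim))   # target-language vectors

# Procrustes: W = argmin ||XW - Y||_F subject to W orthogonal.
U, _, Vt = np.linalg.svd(X.T @ Y)
W = U @ Vt

# Mapped source vectors should now lie close to their translations.
cos = np.sum((X @ W) * Y, axis=1) / (
    np.linalg.norm(X @ W, axis=1) * np.linalg.norm(Y, axis=1))
print("mean cosine after alignment:", cos.mean())
```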

Proceedings Article
12 Feb 2016
TL;DR: This paper proposes a context-based neural network model for Twitter sentiment analysis, incorporating contextualized features from relevant Tweets into the model in the form of word embedding vectors.
Abstract: Sentiment classification on Twitter has attracted increasing research attention in recent years. Most existing work focuses on feature engineering based on the tweet content itself. In this paper, we propose a context-based neural network model for Twitter sentiment analysis, incorporating contextualized features from relevant tweets into the model in the form of word embedding vectors. Experiments on both balanced and unbalanced datasets show that our proposed models outperform the current state of the art.

140 citations
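
The following is a minimal sketch, not the paper's exact architecture, of the general idea of feeding context into the classifier as word embedding vectors: a tweet is represented by the average embedding of its own tokens concatenated with an averaged embedding of related tweets, and the result is passed to a simple classifier. The vocabulary, embeddings, and example tweets are toy placeholders.

```python
# Sketch: target-tweet embedding + context-tweet embedding -> classifier.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(1)
vocab = {"good": 0, "bad": 1, "great": 2, "awful": 3, "movie": 4, "day": 5}
emb = rng.normal(size=(len(vocab), 16))      # toy pretrained word embeddings

def avg_embedding(tokens):
    ids = [vocab[t] for t in tokens if t in vocab]
    return emb[ids].mean(axis=0) if ids else np.zeros(emb.shape[1])

def featurize(tweet_tokens, context_tweets):
    # target-tweet vector concatenated with the averaged context vector
    ctx = np.mean([avg_embedding(t) for t in context_tweets], axis=0)
    return np.concatenate([avg_embedding(tweet_tokens), ctx])

tweets = [(["good", "movie"], [["great", "day"]], 1),
          (["awful", "movie"], [["bad", "day"]], 0)]
X = np.stack([featurize(t, c) for t, c, _ in tweets])
y = [label for _, _, label in tweets]

clf = LogisticRegression().fit(X, y)
print(clf.predict(X))
```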

Proceedings Article
10 Apr 2018
TL;DR: A semantics-assisted non-negative matrix factorization (SeaNMF) model is proposed to discover topics in short texts; it incorporates word-context semantic correlations into the factorization, where the semantic relationships between words and their contexts are learned from the skip-gram view of the corpus.
Abstract: As a prevalent form of social communication on the Internet, billions of short texts are generated every day. Discovering knowledge from them has gained considerable interest from both industry and academia. Short texts have limited contextual information and are sparse, noisy, and ambiguous; hence, automatically learning topics from them remains an important challenge. To tackle this problem, we propose a semantics-assisted non-negative matrix factorization (SeaNMF) model to discover topics in short texts. It effectively incorporates word-context semantic correlations into the model, where the semantic relationships between words and their contexts are learned from the skip-gram view of the corpus. The SeaNMF model is solved using a block coordinate descent algorithm. We also develop a sparse variant of SeaNMF that achieves better model interpretability. Extensive quantitative evaluations on various real-world short-text datasets demonstrate the superior performance of the proposed models over several state-of-the-art methods in terms of topic coherence and classification accuracy. The qualitative semantic analysis demonstrates the interpretability of our models by discovering meaningful and consistent topics. With its simple formulation and superior performance, SeaNMF can serve as an effective standard topic model for short texts.

138 citations
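
A heavily simplified sketch of the joint-factorization idea behind SeaNMF (not the paper's block coordinate descent solver): factorize the term-document matrix and a nonnegative word-context correlation matrix so that both share the word-topic factor. The matrices, dimensions, weight alpha, and the multiplicative update rules below are illustrative assumptions.

```python
# Joint NMF sketch: A ~ W H (term-document) and S ~ W Wc^T (word-context),
# with the word-topic factor W shared between the two reconstructions.
import numpy as np

rng = np.random.default_rng(2)
n_words, n_docs, k, alpha, eps = 30, 20, 5, 1.0, 1e-9

A = rng.random((n_words, n_docs))   # toy term-document counts
S = rng.random((n_words, n_words))  # toy nonnegative word-context correlations

W = rng.random((n_words, k))        # word-topic factor (shared)
Wc = rng.random((n_words, k))       # context-topic factor
H = rng.random((k, n_docs))         # topic-document factor

for _ in range(200):                # multiplicative updates for brevity
    H *= (W.T @ A) / (W.T @ W @ H + eps)
    Wc *= (S.T @ W) / (Wc @ (W.T @ W) + eps)
    W *= (A @ H.T + alpha * S @ Wc) / (
        W @ (H @ H.T) + alpha * W @ (Wc.T @ Wc) + eps)

loss = np.linalg.norm(A - W @ H) ** 2 + alpha * np.linalg.norm(S - W @ Wc.T) ** 2
print("joint reconstruction loss:", round(loss, 3))
```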

Journal Article
Yuan Luo
TL;DR: The first models based on recurrent neural networks (specifically, Long Short-Term Memory, LSTM) for classifying relations from clinical notes show performance comparable to previously published systems while requiring no manual feature engineering.

137 citations
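
A minimal PyTorch sketch of this general setup, not the published model: embed the token sequence from a clinical note, run an LSTM over it, and classify the final hidden state into relation types. The vocabulary size, embedding and hidden dimensions, and number of relation classes are placeholder assumptions.

```python
# LSTM-over-word-embeddings relation classifier (illustrative sketch).
import torch
import torch.nn as nn

class RelationLSTM(nn.Module):
    def __init__(self, vocab_size, emb_dim=100, hidden=128, n_relations=5):
        super().__init__()
        self.emb = nn.Embedding(vocab_size, emb_dim)
        self.lstm = nn.LSTM(emb_dim, hidden, batch_first=True)
        self.out = nn.Linear(hidden, n_relations)

    def forward(self, token_ids):
        x = self.emb(token_ids)           # (batch, seq_len, emb_dim)
        _, (h_n, _) = self.lstm(x)        # h_n: (1, batch, hidden)
        return self.out(h_n[-1])          # logits over relation types

model = RelationLSTM(vocab_size=1000)
batch = torch.randint(0, 1000, (4, 25))   # 4 toy sentences, 25 tokens each
print(model(batch).shape)                 # torch.Size([4, 5])
```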

Proceedings Article
18 Apr 2019
TL;DR: The findings suggest that contextualized word embeddings are less biased than standard ones even when the latter are debiased.
Abstract: Gender bias strongly affects natural language processing applications. Word embeddings have been shown both to retain and to amplify gender biases present in current data sources. Recently, contextualized word embeddings have extended previous word embedding techniques by computing word vector representations that depend on the sentence in which the word appears. In this paper, we study the impact of this conceptual change in word embedding computation in relation to gender bias. Our analysis includes different measures previously applied in the literature to standard word embeddings. Our findings suggest that contextualized word embeddings are less biased than standard ones, even when the latter are debiased.

137 citations
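
One widely used style of measure in this literature is a direct-bias score: project word vectors onto a gender direction (for example, the difference between the vectors for "he" and "she") and average the absolute cosine over a target word set. The sketch below uses toy random vectors as stand-ins for real embeddings; with contextualized models the same measure would be computed on per-sentence vectors of each word. The word lists and dimensions are assumptions for illustration.

```python
# Direct-bias style measurement against a gender direction (toy vectors).
import numpy as np

rng = np.random.default_rng(3)
dim = 50
vectors = {w: rng.normal(size=dim) for w in
           ["he", "she", "doctor", "nurse", "engineer", "teacher"]}

def unit(v):
    return v / np.linalg.norm(v)

gender_dir = unit(vectors["he"] - vectors["she"])

def direct_bias(words):
    # mean absolute cosine between each word vector and the gender direction
    return np.mean([abs(np.dot(unit(vectors[w]), gender_dir)) for w in words])

print("direct bias (occupations):",
      round(direct_bias(["doctor", "nurse", "engineer", "teacher"]), 4))
```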


Network Information
Related Topics (5)
Recurrent neural network: 29.2K papers, 890K citations, 87% related
Unsupervised learning: 22.7K papers, 1M citations, 86% related
Deep learning: 79.8K papers, 2.1M citations, 85% related
Reinforcement learning: 46K papers, 1M citations, 84% related
Graph (abstract data type): 69.9K papers, 1.2M citations, 84% related
Performance
Metrics
No. of papers in the topic in previous years
Year    Papers
2023    317
2022    716
2021    736
2020    1,025
2019    1,078
2018    788