Topic
Word embedding
About: Word embedding is a research topic. Over its lifetime, 4683 publications have been published within this topic, receiving 153378 citations. The topic is also known as: word embeddings.
Papers
TL;DR: This paper applies capsule neural networks to the fake news detection task, using different embedding models for news items of different lengths, and outperforms state-of-the-art methods on the ISOT and LIAR datasets.
74 citations
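The capsule-network approach above relies on "routing by agreement" between lower- and higher-level capsules. The following is a toy NumPy sketch of the squashing nonlinearity and dynamic routing from Sabour et al.'s capsule formulation, not the paper's own implementation; all shapes and names here are illustrative assumptions.

```python
import numpy as np

def squash(s, axis=-1, eps=1e-8):
    # Squashing nonlinearity: keeps the vector's direction and maps
    # its length into [0, 1), so length can act as a probability.
    sq = np.sum(s**2, axis=axis, keepdims=True)
    return (sq / (1.0 + sq)) * s / np.sqrt(sq + eps)

def dynamic_routing(u_hat, n_iter=3):
    # u_hat: (n_in, n_out, d) prediction vectors from lower capsules.
    n_in, n_out, d = u_hat.shape
    b = np.zeros((n_in, n_out))                               # routing logits
    for _ in range(n_iter):
        c = np.exp(b) / np.exp(b).sum(axis=1, keepdims=True)  # coupling coeffs
        s = (c[..., None] * u_hat).sum(axis=0)                # (n_out, d)
        v = squash(s)                                         # output capsules
        b = b + (u_hat * v[None]).sum(axis=-1)                # agreement update
    return v

rng = np.random.default_rng(0)
caps = dynamic_routing(rng.normal(size=(6, 2, 4)))
print(caps.shape)  # -> (2, 4)
# output capsule lengths are strictly below 1 thanks to squash
print(np.linalg.norm(caps, axis=-1))
```

In the fake-news setting, the output capsule lengths would score classes such as real vs. fake.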
TL;DR: This work introduces the largest, most detailed, and reproducible experimental survey of OM measures and word-embedding models reported in the literature, based on evaluating both families of methods on the same software platform, with the aim of elucidating the state of the problem.
74 citations
01 Oct 2016 · TL;DR: A word spotting system based on convolutional neural networks that outperforms the previous state-of-the-art for word spotting on standard datasets and can perform word spotting using both query-by-string and query-by-example in a variety of word embedding spaces.
Abstract: In the last few years, deep convolutional neural networks have become ubiquitous in computer vision, achieving state-of-the-art results on problems like object detection, semantic segmentation, and image captioning. However, they have not yet been widely investigated in the document analysis community. In this paper, we present a word spotting system based on convolutional neural networks. We train a network to extract a powerful image representation, which we then embed into a word embedding space. This allows us to perform word spotting using both query-by-string and query-by-example in a variety of word embedding spaces, both learned and handcrafted, for verbatim as well as semantic word spotting. Our novel approach is versatile and the evaluation shows that it outperforms the previous state-of-the-art for word spotting on standard datasets.
74 citations
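The key idea above is embedding word images and query strings into a shared word-embedding space, then retrieving by similarity. A common handcrafted space for this is the PHOC (pyramidal histogram of characters); the sketch below is a simplified, hypothetical version of that idea, with cosine-similarity retrieval standing in for the paper's CNN image embeddings.

```python
import numpy as np

ALPHABET = "abcdefghijklmnopqrstuvwxyz"

def phoc(word, levels=(1, 2)):
    # Simplified pyramidal histogram of characters: at each pyramid
    # level, split the word into equal regions and mark which letters
    # fall (mostly) inside each region.
    word = word.lower()
    vec = []
    for L in levels:
        for r in range(L):
            lo, hi = r / L, (r + 1) / L
            region = np.zeros(len(ALPHABET))
            for i, ch in enumerate(word):
                c0, c1 = i / len(word), (i + 1) / len(word)
                overlap = max(0.0, min(c1, hi) - max(c0, lo))
                if overlap / (c1 - c0) >= 0.5 and ch in ALPHABET:
                    region[ALPHABET.index(ch)] = 1.0
            vec.append(region)
    return np.concatenate(vec)

def retrieve(query_vec, gallery, k=1):
    # Rank gallery entries by cosine similarity to the query.
    g = gallery / np.linalg.norm(gallery, axis=1, keepdims=True)
    q = query_vec / np.linalg.norm(query_vec)
    return np.argsort(-g @ q)[:k]

# Query-by-string: embed the string and rank. In the paper the gallery
# would hold CNN embeddings of word images instead of PHOC vectors.
words = ["apple", "banana", "cherry"]
gallery = np.stack([phoc(w) for w in words])
print(words[retrieve(phoc("apple"), gallery)[0]])  # -> apple
```

Query-by-example works the same way, except the query vector also comes from an image embedding rather than a string.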
TL;DR: This paper proposes to contextualize word embedding vectors using a nonlinear bag-of-words representation of the source sentence, and to represent special tokens with typed symbols to facilitate translating words that are not well suited to translation via continuous vectors.
Abstract: We first observe a potential weakness of continuous vector representations of symbols in neural machine translation. That is, the continuous vector representation, or word embedding vector, of a symbol encodes multiple dimensions of similarity, equivalent to encoding more than one meaning of the word. As a consequence, the encoder and decoder recurrent networks in neural machine translation need to spend a substantial amount of their capacity on disambiguating source and target words based on the context defined by the source sentence. Based on this observation, in this paper we propose to contextualize the word embedding vectors using a nonlinear bag-of-words representation of the source sentence. Additionally, we propose to represent special tokens (such as numbers, proper nouns and acronyms) with typed symbols to facilitate translating those words that are not well suited to translation via continuous vectors. Experiments on En-Fr and En-De reveal that the proposed contextualization and symbolization approaches significantly improve the translation quality of neural machine translation systems.
72 citations
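The contextualization idea above can be sketched very simply: summarize the source sentence as an averaged bag of word vectors, pass it through a nonlinearity, and use the result to shift each word's embedding. This is an illustrative assumption about the mechanism, not the paper's exact parameterization.

```python
import numpy as np

def contextualize(E, sentence_ids, W, b):
    # Nonlinear bag-of-words context: average the sentence's word
    # embeddings, apply a tanh layer, and add the resulting context
    # vector to every word embedding in the sentence.
    bow = E[sentence_ids].mean(axis=0)   # bag-of-words sentence summary
    ctx = np.tanh(W @ bow + b)           # nonlinear context vector
    return E[sentence_ids] + ctx         # context-shifted embeddings

rng = np.random.default_rng(1)
d = 8
E = rng.normal(size=(100, d))            # toy embedding table (hypothetical)
W, b = rng.normal(size=(d, d)), np.zeros(d)
out = contextualize(E, np.array([3, 17, 42]), W, b)
print(out.shape)  # -> (3, 8)
```

The intended effect is that the same word id yields different effective vectors in different sentences, relieving the recurrent encoder/decoder of some disambiguation work.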
TL;DR: This work offers an approach that automatically explores word- and character-level features: a recurrent neural network using bidirectional long short-term memory (LSTM) with Conditional Random Fields decoding (LSTM-CRF), which outperforms the best system in the DDI2013 challenge.
Abstract: Drug-Named Entity Recognition (DNER) for biomedical literature is a fundamental facilitator of Information Extraction. For this reason, the DDIExtraction2011 (DDI2011) and DDIExtraction2013 (DDI2013) challenges introduced a task aimed at recognizing drug names. State-of-the-art DNER approaches rely heavily on hand-engineered features and domain-specific knowledge, which are difficult to collect and define. We therefore offer an approach that automatically explores word- and character-level features: a recurrent neural network using bidirectional long short-term memory (LSTM) with Conditional Random Fields decoding (LSTM-CRF). Two kinds of word representations are used in this work: word embeddings, trained from a large amount of text, and character-based representations, which can capture orthographic features of words. Experimental results on the DDI2011 and DDI2013 datasets show the effectiveness of the proposed LSTM-CRF method. Our method outperforms the best system in the DDI2013 challenge.
72 citations
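The CRF decoding step in an LSTM-CRF picks the globally best tag sequence given per-token emission scores (here, from the BiLSTM) and tag-transition scores, via the Viterbi algorithm. Below is a minimal NumPy sketch of that decoding step only; the scores and the two-tag scheme (O/DRUG) are made-up toy values, not from the paper.

```python
import numpy as np

def viterbi(emissions, transitions):
    # CRF decoding: highest-scoring tag path.
    # emissions: (T, K) per-token tag scores (e.g. BiLSTM outputs)
    # transitions: (K, K) score of moving from tag i to tag j
    T, K = emissions.shape
    score = emissions[0].copy()
    back = np.zeros((T, K), dtype=int)
    for t in range(1, T):
        # cand[i, j] = best score ending in tag i, then moving to tag j
        cand = score[:, None] + transitions + emissions[t][None, :]
        back[t] = cand.argmax(axis=0)
        score = cand.max(axis=0)
    # backtrack from the best final tag
    tags = [int(score.argmax())]
    for t in range(T - 1, 0, -1):
        tags.append(int(back[t, tags[-1]]))
    return tags[::-1]

# Toy example: 3 tokens, 2 tags (0 = O, 1 = DRUG); hypothetical scores.
em = np.array([[2.0, 0.0], [0.0, 3.0], [1.0, 0.5]])
tr = np.array([[0.5, -0.5], [-1.0, 1.0]])
print(viterbi(em, tr))  # -> [0, 1, 1]
```

Note how the transition bonus for staying in DRUG (tr[1, 1] = 1.0) overrides the third token's slightly higher O emission, which is exactly the kind of label-consistency a CRF layer adds on top of per-token scores.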