Topic

Word embedding

About: Word embedding is a research topic. Over its lifetime, 4,683 publications have been published within this topic, receiving 153,378 citations. The topic is also known as: word embeddings.
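As background for the papers listed below: a word embedding assigns each word a dense vector so that geometric closeness (typically cosine similarity) reflects semantic relatedness. A minimal sketch with toy, hand-picked vectors (not from any trained model):

```python
import numpy as np

# Hypothetical 4-dimensional embeddings for three words.
# The values are illustrative placeholders, not trained weights.
embeddings = {
    "king":  np.array([0.8, 0.1, 0.7, 0.2]),
    "queen": np.array([0.7, 0.2, 0.8, 0.3]),
    "apple": np.array([0.1, 0.9, 0.0, 0.8]),
}

def cosine(u, v):
    # Cosine similarity: closer to 1 means more semantically related.
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

print(cosine(embeddings["king"], embeddings["queen"]))  # close to 1
print(cosine(embeddings["king"], embeddings["apple"]))  # much lower
```

In a real system these vectors would be learned from word co-occurrence patterns (e.g. by word2vec or GloVe) rather than written by hand.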


Papers
Journal ArticleDOI
TL;DR: This paper emphasizes the importance of reasoning for building interpretable, knowledge-driven neural NLP models that can handle complex tasks.

86 citations

Posted Content
TL;DR: A generative topic embedding model is proposed that outperforms eight existing methods with fewer features, and can generate coherent topics even from a single document.
Abstract: Word embedding maps words into a low-dimensional continuous embedding space by exploiting the local word collocation patterns in a small context window. On the other hand, topic modeling maps documents onto a low-dimensional topic space, by utilizing the global word collocation patterns in the same document. These two types of patterns are complementary. In this paper, we propose a generative topic embedding model to combine the two types of patterns. In our model, topics are represented by embedding vectors, and are shared across documents. The probability of each word is influenced by both its local context and its topic. A variational inference method yields the topic embeddings as well as the topic mixing proportions for each document. Jointly they represent the document in a low-dimensional continuous space. In two document classification tasks, our method performs better than eight existing methods, with fewer features. In addition, we illustrate with an example that our method can generate coherent topics even based on only one document.
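The abstract's core idea is that each word's probability depends on both its local context and its topic, with topics themselves represented as embedding vectors. A toy sketch of that scoring step (random placeholder vectors; the paper's actual model uses variational inference to learn these quantities):

```python
import numpy as np

rng = np.random.default_rng(0)
V, D = 5, 4                              # vocabulary size, embedding dimension

word_vecs   = rng.normal(size=(V, D))    # one embedding per vocabulary word
topic_vec   = rng.normal(size=D)         # a topic embedding, shared across documents
context_vec = rng.normal(size=D)         # e.g. the average embedding of a context window

# Paraphrase of the generative idea: each word's score combines its
# affinity to the local context and its affinity to the topic.
scores = word_vecs @ (context_vec + topic_vec)
probs = np.exp(scores) / np.exp(scores).sum()  # softmax over the vocabulary
assert abs(probs.sum() - 1.0) < 1e-9
```

The additive combination inside the softmax is one simple way to let both signals influence the word distribution; the paper derives the exact form from its generative model.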

86 citations

Journal ArticleDOI
TL;DR: Two neural network models are proposed that integrate traditional bag-of-words features with word context and consumer emotions, performing well on all datasets irrespective of sentiment polarity and product category.
Abstract: Fake consumer review detection has attracted much interest in recent years owing to the increasing number of Internet purchases. Existing approaches to detect fake consumer reviews use the review content, product and reviewer information and other features to detect fake reviews. However, as shown in recent studies, the semantic meaning of reviews might be particularly important for text classification. In addition, the emotions hidden in the reviews may represent another potential indicator of fake content. To improve the performance of fake review detection, here we propose two neural network models that integrate traditional bag-of-words as well as the word context and consumer emotions. Specifically, the models learn document-level representation by using three sets of features: (1) n-grams, (2) word embeddings and (3) various lexicon-based emotion indicators. Such a high-dimensional feature representation is used to classify fake reviews into four domains. To demonstrate the effectiveness of the presented detection systems, we compare their classification performance with several state-of-the-art methods for fake review detection. The proposed systems perform well on all datasets, irrespective of their sentiment polarity and product category.
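The document-level representation described above concatenates three feature sets: n-grams, word embeddings, and lexicon-based emotion indicators. A minimal sketch of that concatenation (all numbers are illustrative placeholders, not real features):

```python
import numpy as np

# (1) n-gram counts for a toy 3-term vocabulary.
ngram_counts = np.array([2.0, 0.0, 1.0])

# (2) word embeddings for each word in the review, pooled by averaging.
# Averaging is one common pooling choice; the paper's models may pool differently.
review_vecs = np.array([[0.1, 0.3],
                        [0.2, 0.5]])
embedding_avg = review_vecs.mean(axis=0)

# (3) lexicon-based emotion indicators (e.g. positive / negative intensity).
emotion_scores = np.array([0.7, 0.1])

# Concatenate into one high-dimensional feature vector for the classifier.
features = np.concatenate([ngram_counts, embedding_avg, emotion_scores])
print(features.shape)  # (7,)
```

The resulting vector would then be fed to the neural classifier that assigns reviews to the four domains.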

86 citations

Proceedings Article
05 Nov 2015
TL;DR: Results on both 2010 i2b2 and 2014 Semantic Evaluation data showed that binarized word embedding features outperformed other strategies for deriving distributed word representations; the low-cost approach can be adapted to other clinical natural language processing research.
Abstract: Clinical Named Entity Recognition (NER) is a critical task for extracting important patient information from clinical text to support clinical and translational research. This study explored the neural word embeddings derived from a large unlabeled clinical corpus for clinical NER. We systematically compared two neural word embedding algorithms and three different strategies for deriving distributed word representations. Two neural word embeddings were derived from the unlabeled Multiparameter Intelligent Monitoring in Intensive Care (MIMIC) II corpus (403,871 notes). The results from both 2010 i2b2 and 2014 Semantic Evaluation (SemEval) data showed that the binarized word embedding features outperformed other strategies for deriving distributed word representations. The binarized embedding features improved the F1-score of the Conditional Random Fields based clinical NER system by 2.3% on i2b2 data and 2.4% on SemEval data. The combined feature from the binarized embeddings and the Brown clusters improved the F1-score of the clinical NER system by 2.9% on i2b2 data and 2.7% on SemEval data. Our study also showed that the distributed word embedding features derived from a large unlabeled corpus can be better than the widely used Brown clusters. Further analysis found that the neural word embeddings captured a wide range of semantic relations, which could be discretized into distributed word representations to benefit the clinical NER system. The low-cost distributed feature representation can be adapted to any other clinical natural language processing research.
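The binarization step turns continuous embedding dimensions into discrete features a CRF can consume. The exact discretization scheme is the paper's; the sketch below uses one simple per-dimension thresholding rule as an assumed stand-in:

```python
import numpy as np

rng = np.random.default_rng(1)
emb = rng.normal(size=(6, 4))  # toy embedding matrix: 6 words x 4 dimensions

# Assumed binarization rule (the paper compared several strategies):
# per dimension, flag values above mean + 0.5*std as "high" and below
# mean - 0.5*std as "low"; everything else stays neutral.
mean, std = emb.mean(axis=0), emb.std(axis=0)
high = emb > mean + 0.5 * std
low = emb < mean - 0.5 * std

# Each word now yields discrete indicator features (e.g. "dim3=high")
# that can be added to a CRF-based NER system alongside n-grams.
binary_features = np.concatenate([high, low], axis=1).astype(int)
print(binary_features.shape)  # (6, 8)
```

Discretizing this way lets a linear CRF exploit embedding information without handling real-valued inputs, which matches the abstract's observation that the discretized representations benefit the clinical NER system.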

86 citations

Journal ArticleDOI
TL;DR: An analysis of a deep learning architecture for extreme multi-class, multi-label text classification over a hierarchical label set, together with a proposed methodology named Hierarchical Label Set Expansion (HLSE).

86 citations


Network Information
Related Topics (5)
Recurrent neural network: 29.2K papers, 890K citations (87% related)
Unsupervised learning: 22.7K papers, 1M citations (86% related)
Deep learning: 79.8K papers, 2.1M citations (85% related)
Reinforcement learning: 46K papers, 1M citations (84% related)
Graph (abstract data type): 69.9K papers, 1.2M citations (84% related)
Performance Metrics
No. of papers in the topic in previous years

Year    Papers
2023    317
2022    716
2021    736
2020    1,025
2019    1,078
2018    788