Word embedding

About: Word embedding is a research topic. Over its lifetime, 4,683 publications have been published within this topic, receiving 153,378 citations. The topic is also known as: word embeddings.

Papers

Posted Content•

Learning to Scale Multilingual Representations for Vision-Language Tasks.

Andrea Burns¹, Donghyun Kim¹, Derry Tanti Wijaya¹, Kate Saenko¹, Bryan A. Plummer¹

Boston University¹

09 Apr 2020-arXiv: Computer Vision and Pattern Recognition

TL;DR: A Scalable Multilingual Aligned Language Representation (SMALR) that supports many languages with few model parameters without sacrificing downstream task performance, and a cross-lingual consistency module that ensures predictions made for a query and its machine translation are comparable.

Abstract: Current multilingual vision-language models either require a large number of additional parameters for each supported language, or suffer performance degradation as languages are added. In this paper, we propose a Scalable Multilingual Aligned Language Representation (SMALR) that supports many languages with few model parameters without sacrificing downstream task performance. SMALR learns a fixed size language-agnostic representation for most words in a multilingual vocabulary, keeping language-specific features for just a few. We use a masked cross-language modeling loss to align features with context from other languages. Additionally, we propose a cross-lingual consistency module that ensures predictions made for a query and its machine translation are comparable. The effectiveness of SMALR is demonstrated with ten diverse languages, over twice the number supported in vision-language tasks to date. We evaluate on multilingual image-sentence retrieval and outperform prior work by 3-4% with less than 1/5th the training parameters compared to other word embedding methods.

34 citations

Journal Article•DOI•

SDN2GO: An Integrated Deep Learning Model for Protein Function Prediction.

Yideng Cai¹, Jiacheng Wang¹, Lei Deng¹,²

Central South University¹, Xinjiang University²

29 Apr 2020-Frontiers in Bioengineering and Biotechnology

TL;DR: An integrated deep-learning-based classification model, named SDN2GO, that predicts protein functions and outperforms other methods on each sub-ontology of GO. It borrows techniques from natural language processing to process domain information and uses a pre-trained deep learning sub-model to extract comprehensive domain features.

Abstract: The assignment of function to proteins at a large scale is essential for understanding the molecular mechanism of life. However, only a very small percentage of the more than 179 million proteins in UniProtKB have Gene Ontology (GO) annotations supported by experimental evidence. In this paper, we propose an integrated deep-learning-based classification model, named SDN2GO, to predict protein functions. SDN2GO applies convolutional neural networks to learn and extract features from sequences, protein domains, and known PPI networks, and then uses a weight classifier to integrate these features and achieve accurate predictions of GO terms. We constructed the training set and the independent test set according to the time-delayed principle of the Critical Assessment of Function Annotation (CAFA) and compared our model with two highly competitive methods and the classic BLAST method on the independent test set. The results show that our method outperforms the others on each sub-ontology of GO. We also investigated the effect of using protein domain information. We borrowed techniques from natural language processing (NLP) to process domain information and pre-trained a deep learning sub-model to extract comprehensive domain features. The experimental results demonstrate that the domain features we obtained substantially improve the performance of our model. Our deep learning models, together with the data pre-processing scripts, are publicly available as open-source software at https://github.com/Charrick/SDN2GO.

33 citations

Journal Article•DOI•

Speculation detection for Chinese clinical notes

Shaodian Zhang¹, Tian Kang¹, Xingting Zhang², Dong Wen², Noémie Elhadad¹, Jianbo Lei²

Columbia University¹, Peking University²

01 Apr 2016-Journal of Biomedical Informatics

TL;DR: It is demonstrated that word segmentation is critical for producing high-quality word embeddings that facilitate downstream information extraction applications, and it is suggested that a domain-dependent word segmenter can be vital to such a clinical NLP task in Chinese.

33 citations

Proceedings Article•DOI•

Using word embedding for bio-event extraction

Chen Li, Runqing Song, Maria Liakata, Andreas Vlachos, Stephanie Seneff, Xiangrong Zhang

01 Jul 2015

TL;DR: By using bag-of-words (BOW) features as the baseline, the result has been improved by the introduction of word-embedding features, and is comparable to the state-of-the-art solution.

Abstract: Bio-event extraction is an important phase towards the goal of extracting biological networks from the scientific literature. Recent advances in word embedding make computation of word distribution more efficient and possible. In this study, we investigate methods bringing distributional characteristics of words in the text into event extraction by using the latest word embedding methods. By using bag-of-words (BOW) features as the baseline, the result has been improved by the introduction of word-embedding features, and is comparable to the state-of-the-art solution.

33 citations
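
The contrast between bag-of-words and word-embedding features mentioned in the entry above can be illustrated with a small sketch. This is not the authors' pipeline: the toy vectors, the vocabulary, and the helper names (`bow_features`, `mean_embedding`, `cosine`) are all hypothetical and chosen only for illustration. It shows that two sentences with no words in common are orthogonal under bag-of-words counts, yet can still be close when represented by averaged word vectors.

```python
import math
from collections import Counter

# Toy 3-dimensional word vectors. Hypothetical values for illustration only;
# real embeddings would be learned from a corpus.
TOY_VECTORS = {
    "protein": [0.9, 0.1, 0.0],
    "binds": [0.1, 0.8, 0.1],
    "gene": [0.8, 0.2, 0.1],
    "activates": [0.2, 0.7, 0.2],
}


def bow_features(tokens, vocabulary):
    """Count how often each vocabulary word occurs in the token list."""
    counts = Counter(tokens)
    return [counts[word] for word in vocabulary]


def mean_embedding(tokens, vectors, dim=3):
    """Average the vectors of known tokens; return zeros if none are known."""
    known = [vectors[t] for t in tokens if t in vectors]
    if not known:
        return [0.0] * dim
    return [sum(column) / len(known) for column in zip(*known)]


def cosine(a, b):
    """Cosine similarity, defined as 0.0 when either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


vocabulary = sorted(TOY_VECTORS)
sentence_a = ["protein", "binds"]
sentence_b = ["gene", "activates"]

bow_similarity = cosine(
    bow_features(sentence_a, vocabulary),
    bow_features(sentence_b, vocabulary),
)
embedding_similarity = cosine(
    mean_embedding(sentence_a, TOY_VECTORS),
    mean_embedding(sentence_b, TOY_VECTORS),
)

print("BOW similarity:", bow_similarity)
print("Embedding similarity:", embedding_similarity)
```

Averaging is only one simple way to turn word vectors into sentence-level features; it is used here because it keeps the sketch short, not because the paper is known to use it.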

Journal Article•DOI•

Convolution-deconvolution word embedding: an end-to-end multi-prototype fusion embedding method for natural language processing

Kai Shuang¹, Zhixuan Zhang¹, Jonathan Loo², Sen Su¹

Beijing University of Posts and Telecommunications¹, University of West London²

01 Jan 2020-Information Fusion

TL;DR: In this paper, an end-to-end multi-prototype fusion embedding that fuses context-specific and task-specific information was proposed to solve the problem of polysemous-unaware word embedding.

33 citations

Performance

Metrics

Papers: 5,718
Citations: 201,647

No. of papers in the topic in previous years
Year	Papers
2023	317
2022	716
2021	736
2020	1,025
2019	1,078
2018	788
