Author

Yuexin Wu

Other affiliations: Microsoft, Tsinghua University
Bio: Yuexin Wu is an academic researcher at Carnegie Mellon University. The author has contributed to research in topics including computer science and search algorithms, has an h-index of 15, and has co-authored 27 publications receiving 1,421 citations. Previous affiliations of Yuexin Wu include Microsoft and Tsinghua University.

Papers
Proceedings ArticleDOI
07 Aug 2017
TL;DR: This paper presents the first attempt at applying deep learning to XMTC, with a family of new Convolutional Neural Network models tailored specifically for multi-label classification.
Abstract: Extreme multi-label text classification (XMTC) refers to the problem of assigning to each document its most relevant subset of class labels from an extremely large label collection, where the number of labels could reach hundreds of thousands or millions. The huge label space raises research challenges such as data sparsity and scalability. Significant progress has been made in recent years by the development of new machine learning methods, such as tree induction with large-margin partitions of the instance spaces and label-vector embedding in the target space. However, deep learning has not been explored for XMTC, despite its big successes in other related areas. This paper presents the first attempt at applying deep learning to XMTC, with a family of new Convolutional Neural Network (CNN) models which are tailored for multi-label classification in particular. With a comparative evaluation of 7 state-of-the-art methods on 6 benchmark datasets where the number of labels is up to 670,000, we show that the proposed CNN approach successfully scaled to the largest datasets, and consistently produced the best or the second-best results on all the datasets. On the Wikipedia dataset with over 2 million documents and 500,000 labels in particular, it outperformed the second-best method by 11.7% to 15.3% in precision@K and by 11.5% to 11.7% in NDCG@K for K = 1, 3, 5.
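
The core recipe the abstract describes, a CNN over word embeddings with one independent sigmoid output per label, can be sketched as below. This is a minimal illustration, not the authors' exact XML-CNN architecture; the class name, layer sizes, and kernel widths are assumptions.

```python
import torch
import torch.nn as nn

class MultiLabelTextCNN(nn.Module):
    """Illustrative CNN for multi-label text classification (not the
    paper's exact XML-CNN). Convolutions of several widths slide over
    word embeddings, max-pooling summarizes each feature map, and a
    final linear layer emits one logit per label so every label is
    predicted independently."""
    def __init__(self, vocab_size, num_labels, embed_dim=300,
                 num_filters=128, kernel_sizes=(2, 4, 8)):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.convs = nn.ModuleList(
            [nn.Conv1d(embed_dim, num_filters, k) for k in kernel_sizes])
        self.fc = nn.Linear(num_filters * len(kernel_sizes), num_labels)

    def forward(self, token_ids):                  # (batch, seq_len)
        x = self.embed(token_ids).transpose(1, 2)  # (batch, embed_dim, seq_len)
        pooled = [conv(x).relu().max(dim=2).values for conv in self.convs]
        return self.fc(torch.cat(pooled, dim=1))   # one logit per label

# Multi-label training treats each label as an independent binary task:
# loss = nn.BCEWithLogitsLoss()(model(batch), multi_hot_targets)
```

Ranking the resulting logits per document and keeping the top K is what precision@K and NDCG@K then evaluate.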

539 citations

Proceedings Article
17 Jul 2017
TL;DR: This paper proposes a novel framework for optimizing the latent representations with respect to the analogical properties of the embedded entities and relations, formulating the learning objective in a differentiable fashion.
Abstract: Large-scale multi-relational embedding refers to the task of learning the latent representations for entities and relations in large knowledge graphs. An effective and scalable solution for this problem is crucial for the true success of knowledge-based inference in a broad range of applications. This paper proposes a novel framework for optimizing the latent representations with respect to the analogical properties of the embedded entities and relations. By formulating the learning objective in a differentiable fashion, our model enjoys both theoretical power and computational scalability, and significantly outperformed a large number of representative baseline methods on benchmark datasets. Furthermore, the model offers an elegant unification of several well-known methods in multi-relational embedding, which can be proven to be special instantiations of our framework.
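
As a concrete point of reference, the bilinear scoring functions this framework unifies can be as simple as the DistMult special case sketched below, where each relation acts as a diagonal matrix. This is an illustration of one special instantiation, not the full analogical formulation (which further constrains the relation matrices).

```python
import torch

def bilinear_score(head, relation, tail):
    """DistMult-style triple score <h, diag(r), t>: one of the special
    cases unified by the analogical-embedding framework. Higher scores
    mark more plausible (head, relation, tail) facts."""
    return (head * relation * tail).sum(dim=-1)

# Toy usage with 4-dimensional embeddings for a single triple:
h, r, t = torch.randn(4), torch.randn(4), torch.randn(4)
print(bilinear_score(h, r, t))
```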

265 citations

Proceedings Article
05 Dec 2016
Abstract: We propose a novel extension of the encoder-decoder framework, called a review network. The review network is generic and can enhance any existing encoder-decoder model: in this paper, we consider RNN decoders with both CNN and RNN encoders. The review network performs a number of review steps with an attention mechanism on the encoder hidden states, and outputs a thought vector after each review step; the thought vectors are used as the input of the attention mechanism in the decoder. We show that conventional encoder-decoders are a special case of our framework. Empirically, we show that our framework improves over state-of-the-art encoder-decoder systems on the tasks of image captioning and source code captioning.
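
The review mechanism the abstract outlines can be sketched roughly as follows: each review step attends over the encoder hidden states and emits a thought vector, and the stacked thought vectors become the decoder's attention memory. The dot-product attention and tanh update here are simplifying assumptions; the paper's review units are recurrent.

```python
import torch
import torch.nn.functional as F

def review_steps(encoder_states, num_steps=8):
    """Illustrative review module (a simplification of the paper's
    recurrent review units). encoder_states: (batch, seq_len, dim)."""
    batch, seq_len, dim = encoder_states.shape
    state = torch.zeros(batch, dim)  # assumed initial review state
    thoughts = []
    for _ in range(num_steps):
        # Attend over encoder states, conditioned on the review state.
        scores = torch.bmm(encoder_states, state.unsqueeze(2)).squeeze(2)
        attn = F.softmax(scores, dim=1)                       # (batch, seq_len)
        context = torch.bmm(attn.unsqueeze(1), encoder_states).squeeze(1)
        state = torch.tanh(context)  # simplified state update
        thoughts.append(state)       # one thought vector per review step
    return torch.stack(thoughts, dim=1)  # the decoder attends over these
```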

219 citations

Posted Content
TL;DR: The review network performs a number of review steps with an attention mechanism on the encoder hidden states and outputs a thought vector after each review step; the thought vectors are used as the input of the attention mechanism in the decoder.
Abstract: We propose a novel extension of the encoder-decoder framework, called a review network. The review network is generic and can enhance any existing encoder-decoder model: in this paper, we consider RNN decoders with both CNN and RNN encoders. The review network performs a number of review steps with an attention mechanism on the encoder hidden states, and outputs a thought vector after each review step; the thought vectors are used as the input of the attention mechanism in the decoder. We show that conventional encoder-decoders are a special case of our framework. Empirically, we show that our framework improves over state-of-the-art encoder-decoder systems on the tasks of image captioning and source code captioning.

120 citations

Proceedings ArticleDOI
01 Jan 2018
TL;DR: This paper proposes an unsupervised learning approach that does not require any cross-lingual labeled data and optimizes the transformation functions in both directions simultaneously, based on distributional matching as well as minimizing the back-translation losses.
Abstract: Cross-lingual transfer of word embeddings aims to establish the semantic mappings among words in different languages by learning the transformation functions over the corresponding word embedding spaces. Successfully solving this problem would benefit many downstream tasks, such as transferring text classification models from resource-rich languages (e.g., English) to low-resource languages. Supervised methods for this problem rely on the availability of cross-lingual supervision, either using parallel corpora or bilingual lexicons as the labeled data for training, which may not be available for many low-resource languages. This paper proposes an unsupervised learning approach that does not require any cross-lingual labeled data. Given two monolingual word embedding spaces for any language pair, our algorithm optimizes the transformation functions in both directions simultaneously based on distributional matching as well as minimizing the back-translation losses. We use a neural network implementation to calculate the Sinkhorn distance, a well-defined distributional similarity measure, and optimize our objective through back-propagation. Our evaluation on benchmark datasets for bilingual lexicon induction and cross-lingual word similarity prediction shows stronger or competitive performance of the proposed method compared to other state-of-the-art supervised and unsupervised baseline methods over many language pairs.
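
Since the method hinges on a differentiable Sinkhorn distance, a minimal sketch of the standard entropy-regularized Sinkhorn iterations is given below. The uniform marginals, regularization weight, and iteration count are illustrative assumptions, not the paper's settings.

```python
import torch

def sinkhorn_distance(cost, eps=0.1, num_iters=50):
    """Entropy-regularized optimal-transport distance via Sinkhorn
    iterations. cost: (n, m) pairwise cost matrix, e.g. distances
    between transformed source embeddings and target embeddings.
    Uniform marginals are assumed for simplicity."""
    n, m = cost.shape
    mu = torch.full((n,), 1.0 / n)   # source marginal
    nu = torch.full((m,), 1.0 / m)   # target marginal
    K = torch.exp(-cost / eps)       # Gibbs kernel
    u = torch.ones(n)
    for _ in range(num_iters):       # alternate scaling to match marginals
        v = nu / (K.t() @ u)
        u = mu / (K @ v)
    transport = torch.diag(u) @ K @ torch.diag(v)
    return (transport * cost).sum()
```

Every operation above is differentiable, which is what lets the transformation functions be optimized through back-propagation as the abstract describes.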

97 citations


Cited by
Proceedings ArticleDOI
18 Jun 2018
TL;DR: In this paper, a bottom-up and top-down attention mechanism was proposed to enable attention to be calculated at the level of objects and other salient image regions, which achieved state-of-the-art results on the MSCOCO test server.
Abstract: Top-down visual attention mechanisms have been used extensively in image captioning and visual question answering (VQA) to enable deeper image understanding through fine-grained analysis and even multiple steps of reasoning. In this work, we propose a combined bottom-up and top-down attention mechanism that enables attention to be calculated at the level of objects and other salient image regions. This is the natural basis for attention to be considered. Within our approach, the bottom-up mechanism (based on Faster R-CNN) proposes image regions, each with an associated feature vector, while the top-down mechanism determines feature weightings. Applying this approach to image captioning, our results on the MSCOCO test server establish a new state-of-the-art for the task, achieving CIDEr / SPICE / BLEU-4 scores of 117.9, 21.5 and 36.9, respectively. Demonstrating the broad applicability of the method, applying the same approach to VQA we obtain first place in the 2017 VQA Challenge.
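
A simplified sketch of the top-down weighting step over bottom-up region features follows; the additive attention form and all dimensions are assumptions, not the paper's exact design.

```python
import torch
import torch.nn as nn

class TopDownAttention(nn.Module):
    """Illustrative top-down attention over bottom-up region features.
    regions: (batch, num_regions, feat_dim) vectors from a detector
    such as Faster R-CNN; context: (batch, ctx_dim) task state (the
    caption decoder state, or the question encoding in VQA)."""
    def __init__(self, feat_dim=2048, ctx_dim=512, hidden=512):
        super().__init__()
        self.proj = nn.Linear(feat_dim + ctx_dim, hidden)
        self.score = nn.Linear(hidden, 1)

    def forward(self, regions, context):
        ctx = context.unsqueeze(1).expand(-1, regions.size(1), -1)
        energy = self.score(torch.tanh(
            self.proj(torch.cat([regions, ctx], dim=2))))  # (batch, R, 1)
        weights = torch.softmax(energy, dim=1)    # one weight per region
        return (weights * regions).sum(dim=1)     # attended image feature
```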

2,904 citations

Posted Content
TL;DR: A combined bottom-up and top-down attention mechanism that enables attention to be calculated at the level of objects and other salient image regions is proposed, demonstrating the broad applicability of this approach to VQA.
Abstract: Top-down visual attention mechanisms have been used extensively in image captioning and visual question answering (VQA) to enable deeper image understanding through fine-grained analysis and even multiple steps of reasoning. In this work, we propose a combined bottom-up and top-down attention mechanism that enables attention to be calculated at the level of objects and other salient image regions. This is the natural basis for attention to be considered. Within our approach, the bottom-up mechanism (based on Faster R-CNN) proposes image regions, each with an associated feature vector, while the top-down mechanism determines feature weightings. Applying this approach to image captioning, our results on the MSCOCO test server establish a new state-of-the-art for the task, achieving CIDEr / SPICE / BLEU-4 scores of 117.9, 21.5 and 36.9, respectively. Demonstrating the broad applicability of the method, applying the same approach to VQA we obtain first place in the 2017 VQA Challenge.

2,248 citations

Journal ArticleDOI
TL;DR: This article provides a systematic review of existing knowledge graph embedding techniques, covering not only the state of the art but also the latest trends, organized by the type of information used in the embedding task.
Abstract: Knowledge graph (KG) embedding is to embed components of a KG including entities and relations into continuous vector spaces, so as to simplify the manipulation while preserving the inherent structure of the KG. It can benefit a variety of downstream tasks such as KG completion and relation extraction, and hence has quickly gained massive attention. In this article, we provide a systematic review of existing techniques, including not only the state-of-the-arts but also those with latest trends. Particularly, we make the review based on the type of information used in the embedding task. Techniques that conduct embedding using only facts observed in the KG are first introduced. We describe the overall framework, specific model design, typical training procedures, as well as pros and cons of such techniques. After that, we discuss techniques that further incorporate additional information besides facts. We focus specifically on the use of entity types, relation paths, textual descriptions, and logical rules. Finally, we briefly introduce how KG embedding can be applied to and benefit a wide variety of downstream tasks such as KG completion, relation extraction, question answering, and so forth.
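
As a concrete instance of "embedding using only facts observed in the KG", the classic TransE idea scores a triple by how well the relation translates the head to the tail. The sketch below is illustrative; the norm choice and margin value are assumptions.

```python
import torch

def transe_score(head, relation, tail, p=1):
    """TransE plausibility: a triple (h, r, t) is plausible when the
    translated head h + r lies close to the tail t. Lower is better."""
    return torch.norm(head + relation - tail, p=p, dim=-1)

def margin_loss(pos_score, neg_score, margin=1.0):
    """Margin-based ranking loss against a corrupted negative triple."""
    return torch.clamp(margin + pos_score - neg_score, min=0).mean()
```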

1,905 citations

Proceedings ArticleDOI
01 Jul 2018
TL;DR: The Conceptual Captions dataset, introduced in this paper, contains an order of magnitude more images than the MS-COCO dataset and represents a wider variety of both images and image caption styles.
Abstract: We present a new dataset of image caption annotations, Conceptual Captions, which contains an order of magnitude more images than the MS-COCO dataset (Lin et al., 2014) and represents a wider variety of both images and image caption styles. We achieve this by extracting and filtering image caption annotations from billions of webpages. We also present quantitative evaluations of a number of image captioning models and show that a model architecture based on Inception-ResNet-v2 (Szegedy et al., 2016) for image-feature extraction and Transformer (Vaswani et al., 2017) for sequence modeling achieves the best performance when trained on the Conceptual Captions dataset.

1,443 citations