Author

Guoyin Wang

Other affiliations: Amazon.com

Bio: Guoyin Wang is an academic researcher at Duke University whose research topics include generative models and automatic summarization. The author has an h-index of 13 and has co-authored 50 publications receiving 947 citations. Previous affiliations of Guoyin Wang include Amazon.com.

Papers

Proceedings Article•DOI•

Joint Embedding of Words and Labels for Text Classification

Guoyin Wang¹, Chunyuan Li¹, Wenlin Wang¹, Yizhe Zhang¹, Dinghan Shen¹, Xinyuan Zhang¹, Ricardo Henao¹, Lawrence Carin¹

Duke University¹

01 Jan 2018

TL;DR: This article proposed to view text classification as a label-word joint embedding problem, where each label is embedded in the same space with the word vectors and the attention is learned on a training set of labeled samples to ensure that the relevant words are weighted higher than the irrelevant ones.

Abstract: Word embeddings are effective intermediate representations for capturing semantic regularities between words when learning the representations of text sequences. We propose to view text classification as a label-word joint embedding problem: each label is embedded in the same space as the word vectors. We introduce an attention framework that measures the compatibility of embeddings between text sequences and labels. The attention is learned on a training set of labeled samples to ensure that, given a text sequence, the relevant words are weighted higher than the irrelevant ones. Our method maintains the interpretability of word embeddings and enjoys a built-in ability to leverage alternative sources of information in addition to input text sequences. Extensive results on several large text datasets show that the proposed framework outperforms the state-of-the-art methods by a large margin, in terms of both accuracy and speed.

311 citations

Proceedings Article•DOI•

Baseline Needs More Love: On Simple Word-Embedding-Based Models and Associated Pooling Mechanisms

Dinghan Shen¹, Guoyin Wang¹, Wenlin Wang¹, Martin Renqiang Min, Qinliang Su², Yizhe Zhang³, Chunyuan Li¹, Ricardo Henao¹, Lawrence Carin¹

Duke University¹, Sun Yat-sen University², Microsoft³

01 Jan 2018

TL;DR: This paper conducted a point-by-point comparative study between Simple Word-Embedding-based Models (SWEMs), consisting of parameter-free pooling operations, relative to word-embedding-based RNN/CNN models.

Abstract: Many deep learning architectures have been proposed to model the compositionality in text sequences, requiring a substantial number of parameters and expensive computations. However, there has not been a rigorous evaluation regarding the added value of sophisticated compositional functions. In this paper, we conduct a point-by-point comparative study between Simple Word-Embedding-based Models (SWEMs), consisting of parameter-free pooling operations, relative to word-embedding-based RNN/CNN models. Surprisingly, SWEMs exhibit comparable or even superior performance in the majority of cases considered. Based upon this understanding, we propose two additional pooling strategies over learned word embeddings: (i) a max-pooling operation for improved interpretability; and (ii) a hierarchical pooling operation, which preserves spatial (n-gram) information within text sequences. We present experiments on 17 datasets encompassing three tasks: (i) (long) document classification; (ii) text sequence matching; and (iii) short text tasks, including classification and tagging.
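
The parameter-free pooling operations named in this abstract can be illustrated with a small sketch. This is not the authors' code: the function names, the toy embeddings, and the window size are assumptions made for illustration, and the hierarchical variant is taken here to mean averaging over sliding local windows followed by max-pooling over the window averages.

```python
from typing import List

Vector = List[float]


def average_pool(embeddings: List[Vector]) -> Vector:
    """Element-wise mean over all word vectors in the sequence."""
    n = len(embeddings)
    return [sum(column) / n for column in zip(*embeddings)]


def max_pool(embeddings: List[Vector]) -> Vector:
    """Element-wise maximum over all word vectors in the sequence."""
    return [max(column) for column in zip(*embeddings)]


def hierarchical_pool(embeddings: List[Vector], window: int) -> Vector:
    """Average each sliding window of `window` words, then max-pool the averages.

    Averaging inside a window keeps some local (n-gram) word-order information
    that a single global average or max would discard.
    """
    if len(embeddings) <= window:
        # Sequence shorter than one window: fall back to a plain average.
        return average_pool(embeddings)
    local_means = [
        average_pool(embeddings[start:start + window])
        for start in range(len(embeddings) - window + 1)
    ]
    return max_pool(local_means)


# Toy sequence of three 2-dimensional "word embeddings" (hypothetical values).
sequence = [[1.0, 4.0], [3.0, 0.0], [5.0, 2.0]]
avg_vec = average_pool(sequence)
max_vec = max_pool(sequence)
hier_vec = hierarchical_pool(sequence, window=2)
```

Each function maps a variable-length sequence to one fixed-size vector without any trainable parameters, so the only learned components in such a model would be the word embeddings and whatever classifier consumes the pooled vector.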

203 citations

Proceedings Article•

Adversarial Text Generation via Feature-Mover's Distance

Liqun Chen¹, Shuyang Dai¹, Chenyang Tao¹, Haichao Zhang², Zhe Gan³, Dinghan Shen¹, Yizhe Zhang³, Guoyin Wang¹, Ruiyi Zhang¹, Lawrence Carin¹

Duke University¹, Baidu², Microsoft³

01 Jan 2018

TL;DR: This work proposes to improve text-generation GAN via a novel approach inspired by optimal transport, which leads to a highly discriminative critic and easy-to-optimize objective, overcoming the mode-collapsing and brittle-training problems in existing methods.

Abstract: Generative adversarial networks (GANs) have achieved significant success in generating real-valued data. However, the discrete nature of text hinders the application of GAN to text-generation tasks. Instead of using the standard GAN objective, we propose to improve text-generation GAN via a novel approach inspired by optimal transport. Specifically, we consider matching the latent feature distributions of real and synthetic sentences using a novel metric, termed the feature-mover's distance (FMD). This formulation leads to a highly discriminative critic and easy-to-optimize objective, overcoming the mode-collapsing and brittle-training problems in existing methods. Extensive experiments are conducted on a variety of tasks to evaluate the proposed model empirically, including unconditional text generation, style transfer from non-parallel text, and unsupervised cipher cracking. The proposed model yields superior performance, demonstrating wide applicability and effectiveness.

119 citations

Proceedings Article•

Deconvolutional paragraph representation learning

Yizhe Zhang¹, Dinghan Shen¹, Guoyin Wang¹, Zhe Gan¹, Ricardo Henao¹, Lawrence Carin¹

Duke University¹

01 Jan 2017

TL;DR: It is shown empirically that compared to RNNs, the proposed sequence-to-sequence, purely convolutional and deconvolutional autoencoding framework is better at reconstructing and correcting long paragraphs.

Abstract: Learning latent representations from long text sequences is an important first step in many natural language processing applications. Recurrent Neural Networks (RNNs) have become a cornerstone for this challenging task. However, the quality of sentences during RNN-based decoding (reconstruction) decreases with the length of the text. We propose a sequence-to-sequence, purely convolutional and deconvolutional autoencoding framework that is free of the above issue, while also being computationally efficient. The proposed method is simple, easy to implement, and can be leveraged as a building block for many applications. We show empirically that, compared to RNNs, our framework is better at reconstructing and correcting long paragraphs. Quantitative evaluation on semi-supervised text classification and summarization tasks demonstrates the potential for better utilization of long unlabeled text data.

100 citations

Posted Content•

Baseline Needs More Love: On Simple Word-Embedding-Based Models and Associated Pooling Mechanisms

Dinghan Shen¹, Guoyin Wang¹, Wenlin Wang¹, Martin Renqiang Min, Qinliang Su², Yizhe Zhang³, Chunyuan Li¹, Ricardo Henao¹, Lawrence Carin¹

Duke University¹, Sun Yat-sen University², Microsoft³

24 May 2018-arXiv: Computation and Language

TL;DR: This paper conducts a point-by-point comparative study between Simple Word-Embedding-based Models (SWEMs), consisting of parameter-free pooling operations, relative to word-embedding-based RNN/CNN models, and proposes two additional pooling strategies over learned word embeddings: a max-pooling operation for improved interpretability and a hierarchical pooling operation, which preserves spatial information within text sequences.

Abstract: Many deep learning architectures have been proposed to model the compositionality in text sequences, requiring a substantial number of parameters and expensive computations. However, there has not been a rigorous evaluation regarding the added value of sophisticated compositional functions. In this paper, we conduct a point-by-point comparative study between Simple Word-Embedding-based Models (SWEMs), consisting of parameter-free pooling operations, relative to word-embedding-based RNN/CNN models. Surprisingly, SWEMs exhibit comparable or even superior performance in the majority of cases considered. Based upon this understanding, we propose two additional pooling strategies over learned word embeddings: (i) a max-pooling operation for improved interpretability; and (ii) a hierarchical pooling operation, which preserves spatial (n-gram) information within text sequences. We present experiments on 17 datasets encompassing three tasks: (i) (long) document classification; (ii) text sequence matching; and (iii) short text tasks, including classification and tagging. The source code and datasets can be obtained from https:// this http URL.

57 citations

Cited by

Journal Article•

Study Report on “Unpaired Image-to-Image Translation using Cycle-Consistent Adversarial Networks”

杉山拓海

12 Sep 2017-Computers & Graphics

3,940 citations

IEEE transactions on pattern analysis and machine intelligence

IEEE Xplore

01 Jan 1979

TL;DR: This special issue gathers recent advances in learning with shared information methods and their applications in computer vision and multimedia analysis, and especially encourages papers addressing real-world computer vision and multimedia applications.

Abstract: In the real world, a realistic setting for computer vision or multimedia recognition problems is that some classes have abundant training data while many classes have only a small amount. Therefore, how to use frequent classes to help learn rare classes, for which training data is harder to collect, is an open question. Learning with Shared Information is an emerging topic in machine learning, computer vision and multimedia analysis. Components at different levels can be shared during the concept modeling and machine learning stages, such as generic object parts, attributes, transformations, regularization parameters and training examples. Regarding specific methods, multi-task learning, transfer learning and deep learning can be seen as using different strategies to share information. These learning with shared information methods are very effective in solving real-world large-scale problems. This special issue aims at gathering the recent advances in learning with shared information methods and their applications in computer vision and multimedia analysis. Both state-of-the-art works and literature reviews are welcome for submission. Papers addressing interesting real-world computer vision and multimedia applications are especially encouraged.

Topics of interest include, but are not limited to:
• Multi-task learning or transfer learning for large-scale computer vision and multimedia analysis
• Deep learning for large-scale computer vision and multimedia analysis
• Multi-modal approaches for large-scale computer vision and multimedia analysis
• Different sharing strategies, e.g., sharing generic object parts, sharing attributes, sharing transformations, sharing regularization parameters and sharing training examples
• Real-world computer vision and multimedia applications based on learning with shared information, e.g., event detection, object recognition, object detection, action recognition, human head pose estimation, object tracking, location-based services, semantic indexing
• New datasets and metrics to evaluate the benefit of the proposed sharing ability for the specific computer vision or multimedia problem
• Survey papers regarding the topic of learning with shared information

Authors who are unsure whether their planned submission is in scope may contact the guest editors prior to the submission deadline with an abstract, in order to receive feedback.

1,758 citations

Journal Article•DOI•

A Survey of the Usages of Deep Learning for Natural Language Processing

Daniel W. Otter¹, Julian Richard Medina¹, Jugal Kalita¹

University of Colorado Colorado Springs¹

01 Feb 2021-IEEE Transactions on Neural Networks

TL;DR: This article gives a brief introduction to natural language processing and an overview of deep learning architectures and methods, then summarizes recent studies covering core linguistic processing issues and many applications of computational linguistics.

Abstract: Over the last several years, the field of natural language processing has been propelled forward by an explosion in the use of deep learning models. This article provides a brief introduction to the field and a quick overview of deep learning architectures and methods. It then sifts through the plethora of recent studies and summarizes a large assortment of relevant contributions. Analyzed research areas include several core linguistic processing issues in addition to many applications of computational linguistics. A discussion of the current state of the art is then provided along with recommendations for future research in the field.

783 citations

Journal Article•DOI•

A Primer in BERTology: What We Know About How BERT Works

Anna Rogers¹, Olga Kovaleva¹, Anna Rumshisky¹

University of Massachusetts Lowell¹

01 Jan 2020-Transactions of the Association for Computational Linguistics

TL;DR: This paper surveys over 150 studies of the BERT model, reviewing the current state of knowledge about how BERT works, what kind of information it learns and how it is represented, common modifications to its training objectives and architecture, the overparameterization issue, and approaches to compression.

Abstract: Transformer-based models have pushed state of the art in many areas of NLP, but our understanding of what is behind their success is still limited. This paper is the first survey of over 150 studies of the popular BERT model. We review the current state of knowledge about how BERT works, what kind of information it learns and how it is represented, common modifications to its training objectives and architecture, the overparameterization issue and approaches to compression. We then outline directions for future research.

617 citations

Posted Content•

A Primer in BERTology: What we know about how BERT works

Anna Rogers¹, Olga Kovaleva², Anna Rumshisky²

University of Copenhagen¹, University of Massachusetts Lowell²

27 Feb 2020-arXiv: Computation and Language

TL;DR: This paper is the first survey of over 150 studies of the popular BERT model, reviewing the current state of knowledge about how BERT works, what kind of information it learns and how it is represented, common modifications to its training objectives and architecture, the overparameterization issue, and approaches to compression.

616 citations
