Home
/
Authors
/
Shu Wu

Author

Shu Wu

Other affiliations: Association for Computing Machinery, University of Science and Technology of China, Université de Sherbrooke ...read more

Bio: Shu Wu is an academic researcher from Chinese Academy of Sciences. The author has contributed to research in topics: Graph (abstract data type) & Recommender system. The author has an hindex of 28, co-authored 133 publications receiving 3509 citations. Previous affiliations of Shu Wu include Association for Computing Machinery & University of Science and Technology of China.

Papers published on a yearly basis

2021
2020
2019
2018
2017
2016
2015
2014
2013
2012
2011
2010
2007
2006
1997
1996

Papers

PDF

Open Access

More filters

Journal Article•DOI•

Session-Based Recommendation with Graph Neural Networks

[...]

Shu Wu¹, Yuyuan Tang², Yanqiao Zhu³, Liang Wang¹, Xing Xie⁴, Tieniu Tan¹ - Show less +2 more•Institutions (4)

Chinese Academy of Sciences¹, University of Science and Technology Beijing², Tongji University³, Microsoft⁴

17 Jul 2019

TL;DR: Wang et al. as discussed by the authors proposed Session-based Recommendation with Graph Neural Networks (SR-GNN) to capture complex transitions of items, which are difficult to be revealed by previous conventional sequential methods.

...read moreread less

Abstract: The problem of session-based recommendation aims to predict user actions based on anonymous sessions. Previous methods model a session as a sequence and estimate user representations besides item representations to make recommendations. Though achieved promising results, they are insufficient to obtain accurate user vectors in sessions and neglect complex transitions of items. To obtain accurate item embedding and take complex transitions of items into account, we propose a novel method, i.e. Session-based Recommendation with Graph Neural Networks, SR-GNN for brevity. In the proposed method, session sequences are modeled as graphstructured data. Based on the session graph, GNN can capture complex transitions of items, which are difficult to be revealed by previous conventional sequential methods. Each session is then represented as the composition of the global preference and the current interest of that session using an attention network. Extensive experiments conducted on two real datasets show that SR-GNN evidently outperforms the state-of-the-art session-based recommendation methods consistently.

...read moreread less

1,011 citations

Proceedings Article•

Predicting the next location: a recurrent model with spatial and temporal contexts

[...]

Qiang Liu¹, Shu Wu¹, Liang Wang¹, Tieniu Tan¹•Institutions (1)

Chinese Academy of Sciences¹

12 Feb 2016

TL;DR: RNN is extended and a novel method called Spatial Temporal Recurrent Neural Networks (ST-RNN) is proposed, which can model local temporal and spatial contexts in each layer with time-specific transition matrices for different time intervals and distance-specific transitions for different geographical distances.

...read moreread less

Abstract: Spatial and temporal contextual information plays a key role for analyzing user behaviors, and is helpful for predicting where he or she will go next. With the growing ability of collecting information, more and more temporal and spatial contextual information is collected in systems, and the location prediction problem becomes crucial and feasible. Some works have been proposed to address this problem, but they all have their limitations. Factorizing Personalized Markov Chain (FPMC) is constructed based on a strong independence assumption among different factors, which limits its performance. Tensor Factorization (TF) faces the cold start problem in predicting future actions. Recurrent Neural Networks (RNN) model shows promising performance comparing with PFMC and TF, but all these methods have problem in modeling continuous time interval and geographical distance. In this paper, we extend RNN and propose a novel method called Spatial Temporal Recurrent Neural Networks (ST-RNN). ST-RNN can model local temporal and spatial contexts in each layer with time-specific transition matrices for different time intervals and distance-specific transition matrices for different geographical distances. Experimental results show that the proposed ST-RNN model yields significant improvements over the competitive compared methods on two typical datasets, i.e., Global Terrorism Database (GTD) and Gowalla dataset.

...read moreread less

687 citations

Proceedings Article•DOI•

A Dynamic Recurrent Model for Next Basket Recommendation

[...]

Feng Yu¹, Qiang Liu¹, Shu Wu¹, Liang Wang¹, Tieniu Tan¹ - Show less +1 more•Institutions (1)

Chinese Academy of Sciences¹

07 Jul 2016

TL;DR: This work proposes a novel model, Dynamic REcurrent bAsket Model (DREAM), based on Recurrent Neural Network (RNN), which not only learns a dynamic representation of a user but also captures global sequential features among baskets.

...read moreread less

Abstract: Next basket recommendation becomes an increasing concern. Most conventional models explore either sequential transaction features or general interests of users. Further, some works treat users' general interests and sequential behaviors as two totally divided matters, and then combine them in some way for next basket recommendation. Moreover, the state-of-the-art models are based on the assumption of Markov Chains (MC), which only capture local sequential features between two adjacent baskets. In this work, we propose a novel model, Dynamic REcurrent bAsket Model (DREAM), based on Recurrent Neural Network (RNN). DREAM not only learns a dynamic representation of a user but also captures global sequential features among baskets. The dynamic representation of a specific user can reveal user's dynamic interests at different time, and the global sequential features reflect interactions of all baskets of the user over time. Experiment results on two public datasets indicate that DREAM is more effective than the state-of-the-art models for next basket recommendation.

...read moreread less

420 citations

Proceedings Article•DOI•

Graph Contrastive Learning with Adaptive Augmentation

[...]

Yanqiao Zhu¹, Yichen Xu², Feng Yu³, Qiang Liu¹, Shu Wu¹, Liang Wang¹ - Show less +2 more•Institutions (3)

Chinese Academy of Sciences¹, Beijing University of Posts and Telecommunications², Alibaba Group³

19 Apr 2021

TL;DR: This paper proposes a novel graph contrastive representation learning method with adaptive augmentation that incorporates various priors for topological and semantic aspects of the graph that consistently outperforms existing state-of-the-art baselines and even surpasses some supervised counterparts.

...read moreread less

Abstract: Recently, contrastive learning (CL) has emerged as a successful method for unsupervised graph representation learning. Most graph CL methods first perform stochastic augmentation on the input graph to obtain two graph views and maximize the agreement of representations in the two views. Despite the prosperous development of graph CL methods, the design of graph augmentation schemes—a crucial component in CL—remains rarely explored. We argue that the data augmentation schemes should preserve intrinsic structures and attributes of graphs, which will force the model to learn representations that are insensitive to perturbation on unimportant nodes and edges. However, most existing methods adopt uniform data augmentation schemes, like uniformly dropping edges and uniformly shuffling features, leading to suboptimal performance. In this paper, we propose a novel graph contrastive representation learning method with adaptive augmentation that incorporates various priors for topological and semantic aspects of the graph. Specifically, on the topology level, we design augmentation schemes based on node centrality measures to highlight important connective structures. On the node attribute level, we corrupt node features by adding more noise to unimportant node features, to enforce the model to recognize underlying semantic information. We perform extensive experiments of node classification on a variety of real-world datasets. Experimental results demonstrate that our proposed method consistently outperforms existing state-of-the-art baselines and even surpasses some supervised counterparts, which validates the effectiveness of the proposed contrastive framework with adaptive augmentation.

...read moreread less

359 citations

Posted Content•

Deep Graph Contrastive Representation Learning.

[...]

Yanqiao Zhu, Yichen Xu, Feng Yu, Qiang Liu, Shu Wu, Liang Wang - Show less +2 more

07 Jun 2020-arXiv: Learning

TL;DR: This paper proposes a novel framework for unsupervised graph representation learning by leveraging a contrastive objective at the node level, and generates two graph views by corruption and learns node representations by maximizing the agreement of node representations in these two views.

...read moreread less

Abstract: Graph representation learning nowadays becomes fundamental in analyzing graph-structured data. Inspired by recent success of contrastive methods, in this paper, we propose a novel framework for unsupervised graph representation learning by leveraging a contrastive objective at the node level. Specifically, we generate two graph views by corruption and learn node representations by maximizing the agreement of node representations in these two views. To provide diverse node contexts for the contrastive objective, we propose a hybrid scheme for generating graph views on both structure and attribute levels. Besides, we provide theoretical justification behind our motivation from two perspectives, mutual information and the classical triplet loss. We perform empirical experiments on both transductive and inductive learning tasks using a variety of real-world datasets. Experimental experiments demonstrate that despite its simplicity, our proposed method consistently outperforms existing state-of-the-art methods by large margins. Moreover, our unsupervised method even surpasses its supervised counterparts on transductive tasks, demonstrating its great potential in real-world applications.

...read moreread less

300 citations

1
2
3
4
…
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27

Collapse

Cited by

PDF

Open Access

More filters

Data Mining - Concepts and Techniques.

[...]

Petra Perner

01 Jan 2002

9,314 citations

Matrix Factorization Techniques for Recommender Systems

[...]

Patrick Seemann

01 Jan 2014

2,080 citations

Proceedings Article•DOI•

DeepFM: a factorization-machine based neural network for CTR prediction

[...]

Huifeng Guo¹, Ruiming Tang², Yunming Ye¹, Zhenguo Li², Xiuqiang He² - Show less +1 more•Institutions (2)

Harbin Institute of Technology¹, Huawei²

19 Aug 2017

TL;DR: This paper shows that it is possible to derive an end-to-end learning model that emphasizes both low- and high-order feature interactions, and combines the power of factorization machines for recommendation and deep learning for feature learning in a new neural network architecture.

...read moreread less

Abstract: Learning sophisticated feature interactions behind user behaviors is critical in maximizing CTR for recommender systems. Despite great progress, existing methods seem to have a strong bias towards low- or high-order interactions, or require expertise feature engineering. In this paper, we show that it is possible to derive an end-to-end learning model that emphasizes both low- and high-order feature interactions. The proposed model, DeepFM, combines the power of factorization machines for recommendation and deep learning for feature learning in a new neural network architecture. Compared to the latest Wide & Deep model from Google, DeepFM has a shared input to its "wide" and "deep" parts, with no need of feature engineering besides raw features. Comprehensive experiments are conducted to demonstrate the effectiveness and efficiency of DeepFM over the existing models for CTR prediction, on both benchmark data and commercial data.

...read moreread less

1,695 citations

Proceedings Article•DOI•

Self-Attentive Sequential Recommendation

[...]

Wang-Cheng Kang¹, Julian McAuley¹•Institutions (1)

University of California, San Diego¹

01 Nov 2018

TL;DR: In this article, a self-attention based sequential model (SASRec) is proposed, which uses an attention mechanism to identify which items are'relevant' from a user's action history, and use them to predict the next item.

...read moreread less

Abstract: Sequential dynamics are a key feature of many modern recommender systems, which seek to capture the 'context' of users' activities on the basis of actions they have performed recently. To capture such patterns, two approaches have proliferated: Markov Chains (MCs) and Recurrent Neural Networks (RNNs). Markov Chains assume that a user's next action can be predicted on the basis of just their last (or last few) actions, while RNNs in principle allow for longer-term semantics to be uncovered. Generally speaking, MC-based methods perform best in extremely sparse datasets, where model parsimony is critical, while RNNs perform better in denser datasets where higher model complexity is affordable. The goal of our work is to balance these two goals, by proposing a self-attention based sequential model (SASRec) that allows us to capture long-term semantics (like an RNN), but, using an attention mechanism, makes its predictions based on relatively few actions (like an MC). At each time step, SASRec seeks to identify which items are 'relevant' from a user's action history, and use them to predict the next item. Extensive empirical studies show that our method outperforms various state-of-the-art sequential models (including MC/CNN/RNN-based approaches) on both sparse and dense datasets. Moreover, the model is an order of magnitude more efficient than comparable CNN/RNN-based models. Visualizations on attention weights also show how our model adaptively handles datasets with various density, and uncovers meaningful patterns in activity sequences.

...read moreread less

1,202 citations

Journal Article•

Measuring statistical dependence with Hilbert-Schmidt norms

[...]

Arthur Gretton, Olivier Bousquet, Alexander J. Smola, Bernhard Schölkopf

01 Jan 2005-Lecture Notes in Computer Science

TL;DR: An independence criterion based on the eigen-spectrum of covariance operators in reproducing kernel Hilbert spaces (RKHSs), consisting of an empirical estimate of the Hilbert-Schmidt norm of the cross-covariance operator, or HSIC, is proposed.

...read moreread less

Abstract: We propose an independence criterion based on the eigen-spectrum of covariance operators in reproducing kernel Hilbert spaces (RKHSs), consisting of an empirical estimate of the Hilbert-Schmidt norm of the cross-covariance operator (we term this a Hilbert-Schmidt Independence Criterion, or HSIC). This approach has several advantages, compared with previous kernel-based independence criteria. First, the empirical estimate is simpler than any other kernel dependence test, and requires no user-defined regularisation. Second, there is a clearly defined population quantity which the empirical estimate approaches in the large sample limit, with exponential convergence guaranteed between the two: this ensures that independence tests based on HSIC do not suffer from slow learning rates. Finally, we show in the context of independent component analysis (ICA) that the performance of HSIC is competitive with that of previously published kernel-based criteria, and of other recently published ICA methods.

...read moreread less

1,134 citations

1
2
3
4
…
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200

Collapse