Author

Xingjian Shi

Other affiliations: Hong Kong University of Science and Technology, Shanghai Jiao Tong University

Bio: Xingjian Shi is an academic researcher at Amazon.com. The author has contributed to research on topics including computer science and deep learning, has an h-index of 20, and has co-authored 33 publications receiving 6,863 citations. Previous affiliations of Xingjian Shi include Hong Kong University of Science and Technology and Shanghai Jiao Tong University.

Papers published on a yearly basis (chart covering 2014 to 2023)

Papers

Posted Content•

Convolutional LSTM Network: A Machine Learning Approach for Precipitation Nowcasting

Xingjian Shi¹, Zhourong Chen¹, Hao Wang¹, Dit-Yan Yeung¹, Wai-kin Wong, Wang-chun Woo

Hong Kong University of Science and Technology¹

13 Jun 2015-arXiv: Computer Vision and Pattern Recognition

TL;DR: This paper proposes the convolutional LSTM (ConvLSTM) and uses it to build an end-to-end trainable model for precipitation nowcasting. Experiments show that it captures spatiotemporal correlations better and consistently outperforms FC-LSTM and the state-of-the-art operational ROVER algorithm.

Abstract: The goal of precipitation nowcasting is to predict the future rainfall intensity in a local region over a relatively short period of time. Very few previous studies have examined this crucial and challenging weather forecasting problem from the machine learning perspective. In this paper, we formulate precipitation nowcasting as a spatiotemporal sequence forecasting problem in which both the input and the prediction target are spatiotemporal sequences. By extending the fully connected LSTM (FC-LSTM) to have convolutional structures in both the input-to-state and state-to-state transitions, we propose the convolutional LSTM (ConvLSTM) and use it to build an end-to-end trainable model for the precipitation nowcasting problem. Experiments show that our ConvLSTM network captures spatiotemporal correlations better and consistently outperforms FC-LSTM and the state-of-the-art operational ROVER algorithm for precipitation nowcasting.
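
The abstract's central idea, replacing the fully connected input-to-state and state-to-state transitions of an LSTM with convolutions, can be illustrated with a minimal sketch. It makes several simplifying assumptions that are not taken from the listing: a single channel, zero padding, one shared scalar bias per gate, and no cell-state (peephole-style) terms, which some formulations include. The function names `conv2d_same` and `convlstm_step` and the `params` layout are hypothetical, chosen only for this illustration.

```python
import math


def conv2d_same(x, k):
    """Single-channel 2D cross-correlation with zero padding (output has x's shape)."""
    h, w = len(x), len(x[0])
    kh, kw = len(k), len(k[0])
    ph, pw = kh // 2, kw // 2
    out = [[0.0] * w for _ in range(h)]
    for r in range(h):
        for c in range(w):
            s = 0.0
            for i in range(kh):
                for j in range(kw):
                    rr, cc = r + i - ph, c + j - pw
                    if 0 <= rr < h and 0 <= cc < w:
                        s += k[i][j] * x[rr][cc]
            out[r][c] = s
    return out


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def convlstm_step(x, h_prev, c_prev, params):
    """One simplified ConvLSTM step on 2D grids.

    params maps each gate name ("i", "f", "o", "g") to a tuple
    (input kernel, hidden kernel, scalar bias). This layout is an
    assumption made for the sketch, not the paper's parameterization.
    """
    rows, cols = len(x), len(x[0])
    pre = {}
    for gate, (wx, wh, b) in params.items():
        ax = conv2d_same(x, wx)        # input-to-state convolution
        ah = conv2d_same(h_prev, wh)   # state-to-state convolution
        pre[gate] = [[ax[r][c] + ah[r][c] + b for c in range(cols)]
                     for r in range(rows)]
    h_new = [[0.0] * cols for _ in range(rows)]
    c_new = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            i = sigmoid(pre["i"][r][c])    # input gate
            f = sigmoid(pre["f"][r][c])    # forget gate
            o = sigmoid(pre["o"][r][c])    # output gate
            g = math.tanh(pre["g"][r][c])  # candidate cell update
            c_new[r][c] = f * c_prev[r][c] + i * g
            h_new[r][c] = o * math.tanh(c_new[r][c])
    return h_new, c_new
```

Because every gate is computed by convolution, the hidden and cell states keep the spatial layout of the input frame, which is what lets the model treat a radar map sequence as a spatiotemporal sequence rather than a flattened vector.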

4,487 citations

Proceedings Article•

Convolutional LSTM Network: a machine learning approach for precipitation nowcasting

Xingjian Shi¹, Zhourong Chen¹, Hao Wang¹, Dit-Yan Yeung¹, Wai-kin Wong, Wang-chun Woo

Hong Kong University of Science and Technology¹

07 Dec 2015

TL;DR: This paper proposes the convolutional LSTM (ConvLSTM), which captures spatiotemporal correlations better than the fully connected LSTM (FC-LSTM) and consistently outperforms it.

2,474 citations

Proceedings Article•DOI•

Dynamic Key-Value Memory Networks for Knowledge Tracing

Jiani Zhang¹, Xingjian Shi², Irwin King¹, Dit-Yan Yeung²

The Chinese University of Hong Kong¹, Hong Kong University of Science and Technology²

03 Apr 2017

TL;DR: This work introduces a new model called Dynamic Key-Value Memory Networks (DKVMN) that can exploit the relationships between underlying concepts and directly output a student's mastery level of each concept.

Abstract: Knowledge Tracing (KT) is a task of tracing evolving knowledge state of students with respect to one or more concepts as they engage in a sequence of learning activities. One important purpose of KT is to personalize the practice sequence to help students learn knowledge concepts efficiently. However, existing methods such as Bayesian Knowledge Tracing and Deep Knowledge Tracing either model knowledge state for each predefined concept separately or fail to pinpoint exactly which concepts a student is good at or unfamiliar with. To solve these problems, this work introduces a new model called Dynamic Key-Value Memory Networks (DKVMN) that can exploit the relationships between underlying concepts and directly output a student's mastery level of each concept. Unlike standard memory-augmented neural networks that facilitate a single memory matrix or two static memory matrices, our model has one static matrix called key, which stores the knowledge concepts and the other dynamic matrix called value, which stores and updates the mastery levels of corresponding concepts. Experiments show that our model consistently outperforms the state-of-the-art model in a range of KT datasets. Moreover, the DKVMN model can automatically discover underlying concepts of exercises typically performed by human annotations and depict the changing knowledge state of a student.
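
The split between a static key matrix and a dynamic value matrix can be sketched as follows. The abstract does not spell out the update rule, so this sketch assumes a softmax attention over the key slots for addressing and an erase-then-add update of the value slots, a common design for memory-augmented networks. The function names, and the choice to pass the erase and add vectors in directly rather than compute them from learned layers, are hypothetical simplifications.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def correlation_weights(query, key_memory):
    """Attention of an exercise embedding over the static concept keys."""
    scores = [sum(q * k for q, k in zip(query, slot)) for slot in key_memory]
    return softmax(scores)


def read_mastery(weights, value_memory):
    """Weighted sum of the value slots: a summary of the student's mastery."""
    dim = len(value_memory[0])
    return [sum(w * slot[d] for w, slot in zip(weights, value_memory))
            for d in range(dim)]


def write_mastery(weights, value_memory, erase, add):
    """Erase-then-add update of the dynamic value slots (assumed rule).

    Each slot is scaled by (1 - w * erase) and then shifted by w * add,
    so slots with higher attention weight change more. The key memory
    is never modified.
    """
    return [[v * (1.0 - w * e) + w * a for v, e, a in zip(slot, erase, add)]
            for w, slot in zip(weights, value_memory)]
```

In this sketch the same attention weights drive both reading and writing, so an exercise touches the mastery state only for the concepts its key lookup selects.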

307 citations

Posted Content•

Deep Learning for Precipitation Nowcasting: A Benchmark and A New Model

Xingjian Shi¹, Zhihan Gao¹, Leonard Lausen¹, Hao Wang¹, Dit-Yan Yeung¹, Wai-kin Wong², Wang-chun Woo²

Hong Kong University of Science and Technology¹, Hong Kong Observatory²

12 Jun 2017-arXiv: Computer Vision and Pattern Recognition

TL;DR: This work goes beyond ConvLSTM and proposes the Trajectory GRU (TrajGRU) model that can actively learn the location-variant structure for recurrent connections, and provides a benchmark that includes a real-world large-scale dataset from the Hong Kong Observatory.

Abstract: With the goal of making high-resolution forecasts of regional rainfall, precipitation nowcasting has become an important and fundamental technology underlying various public services ranging from rainstorm warnings to flight safety. Recently, the Convolutional LSTM (ConvLSTM) model has been shown to outperform traditional optical flow based methods for precipitation nowcasting, suggesting that deep learning models have a huge potential for solving the problem. However, the convolutional recurrence structure in ConvLSTM-based models is location-invariant while natural motion and transformation (e.g., rotation) are location-variant in general. Furthermore, since deep-learning-based precipitation nowcasting is a newly emerging area, clear evaluation protocols have not yet been established. To address these problems, we propose both a new model and a benchmark for precipitation nowcasting. Specifically, we go beyond ConvLSTM and propose the Trajectory GRU (TrajGRU) model that can actively learn the location-variant structure for recurrent connections. Besides, we provide a benchmark that includes a real-world large-scale dataset from the Hong Kong Observatory, a new training loss, and a comprehensive evaluation protocol to facilitate future research and gauge the state of the art.

276 citations

Posted Content•

Dynamic Key-Value Memory Networks for Knowledge Tracing

Jiani Zhang¹, Xingjian Shi², Irwin King¹, Dit-Yan Yeung²

The Chinese University of Hong Kong¹, Hong Kong University of Science and Technology²

24 Nov 2016-arXiv: Artificial Intelligence

TL;DR: This work introduces a new model called Dynamic Key-Value Memory Networks (DKVMN) that can exploit the relationships between underlying concepts and directly output a student's mastery level of each concept.

235 citations

Cited by

Journal Article•DOI•

A Comprehensive Survey on Graph Neural Networks

Zonghan Wu¹, Shirui Pan², Fengwen Chen¹, Guodong Long¹, Chengqi Zhang¹, Philip S. Yu³

University of Technology, Sydney¹, Monash University, Clayton campus², University of Illinois at Chicago³

01 Jan 2021-IEEE Transactions on Neural Networks and Learning Systems

TL;DR: This article provides a comprehensive overview of graph neural networks (GNNs) in data mining and machine learning fields and proposes a new taxonomy that divides the state-of-the-art GNNs into four categories: recurrent GNNs, convolutional GNNs, graph autoencoders, and spatial–temporal GNNs.

Abstract: Deep learning has revolutionized many machine learning tasks in recent years, ranging from image classification and video processing to speech recognition and natural language understanding. The data in these tasks are typically represented in the Euclidean space. However, there is an increasing number of applications, where data are generated from non-Euclidean domains and are represented as graphs with complex relationships and interdependency between objects. The complexity of graph data has imposed significant challenges on the existing machine learning algorithms. Recently, many studies on extending deep learning approaches for graph data have emerged. In this article, we provide a comprehensive overview of graph neural networks (GNNs) in data mining and machine learning fields. We propose a new taxonomy to divide the state-of-the-art GNNs into four categories, namely, recurrent GNNs, convolutional GNNs, graph autoencoders, and spatial–temporal GNNs. We further discuss the applications of GNNs across various domains and summarize the open-source codes, benchmark data sets, and model evaluation of GNNs. Finally, we propose potential research directions in this rapidly growing field.

4,584 citations

Posted Content•

An Empirical Evaluation of Generic Convolutional and Recurrent Networks for Sequence Modeling

Shaojie Bai, J. Zico Kolter, Vladlen Koltun

04 Mar 2018-arXiv: Learning

TL;DR: A systematic evaluation of generic convolutional and recurrent architectures for sequence modeling concludes that the common association between sequence modeling and recurrent networks should be reconsidered, and that convolutional networks should be regarded as a natural starting point for sequence modeling tasks.

Abstract: For most deep learning practitioners, sequence modeling is synonymous with recurrent networks. Yet recent results indicate that convolutional architectures can outperform recurrent networks on tasks such as audio synthesis and machine translation. Given a new sequence modeling task or dataset, which architecture should one use? We conduct a systematic evaluation of generic convolutional and recurrent architectures for sequence modeling. The models are evaluated across a broad range of standard tasks that are commonly used to benchmark recurrent networks. Our results indicate that a simple convolutional architecture outperforms canonical recurrent networks such as LSTMs across a diverse range of tasks and datasets, while demonstrating longer effective memory. We conclude that the common association between sequence modeling and recurrent networks should be reconsidered, and convolutional networks should be regarded as a natural starting point for sequence modeling tasks. To assist related work, we have made code available at this http URL .

2,776 citations

Posted Content•

Graph Neural Networks: A Review of Methods and Applications

Jie Zhou¹, Ganqu Cui¹, Shengding Hu¹, Zhengyan Zhang¹, Cheng Yang², Zhiyuan Liu¹, Lifeng Wang³, Changcheng Li³, Maosong Sun¹

Tsinghua University¹, Beijing University of Posts and Telecommunications², Tencent³

20 Dec 2018-arXiv: Learning

TL;DR: This survey proposes a general design pipeline for graph neural network (GNN) models, discusses the variants of each component, systematically categorizes the applications, and proposes four open problems for future research.

Abstract: Lots of learning tasks require dealing with graph data which contains rich relation information among elements. Modeling physics systems, learning molecular fingerprints, predicting protein interface, and classifying diseases demand a model to learn from graph inputs. In other domains such as learning from non-structural data like texts and images, reasoning on extracted structures (like the dependency trees of sentences and the scene graphs of images) is an important research topic which also needs graph reasoning models. Graph neural networks (GNNs) are neural models that capture the dependence of graphs via message passing between the nodes of graphs. In recent years, variants of GNNs such as graph convolutional network (GCN), graph attention network (GAT), graph recurrent network (GRN) have demonstrated ground-breaking performances on many deep learning tasks. In this survey, we propose a general design pipeline for GNN models and discuss the variants of each component, systematically categorize the applications, and propose four open problems for future research.

2,494 citations

Proceedings Article•DOI•

Spatio-temporal graph convolutional networks: a deep learning framework for traffic forecasting

Bing Yu¹, Haoteng Yin¹, Zhanxing Zhu¹

Peking University¹

13 Jul 2018

TL;DR: This paper proposes a novel deep learning framework, Spatio-Temporal Graph Convolutional Networks (STGCN), to tackle the time series prediction problem in the traffic domain.

Abstract: Timely accurate traffic forecast is crucial for urban traffic control and guidance. Due to the high nonlinearity and complexity of traffic flow, traditional methods cannot satisfy the requirements of mid-and-long term prediction tasks and often neglect spatial and temporal dependencies. In this paper, we propose a novel deep learning framework, Spatio-Temporal Graph Convolutional Networks (STGCN), to tackle the time series prediction problem in traffic domain. Instead of applying regular convolutional and recurrent units, we formulate the problem on graphs and build the model with complete convolutional structures, which enable much faster training speed with fewer parameters. Experiments show that our model STGCN effectively captures comprehensive spatio-temporal correlations through modeling multi-scale traffic networks and consistently outperforms state-of-the-art baselines on various real-world traffic datasets.

2,103 citations

Journal Article•DOI•

Deep learning and process understanding for data-driven Earth system science

Markus Reichstein¹, Gustau Camps-Valls², Bjorn Stevens¹, Martin Jung¹, Joachim Denzler³, Nuno Carvalhais¹,⁴, Prabhat⁵

Max Planck Society¹, University of Valencia², University of Jena³, Universidade Nova de Lisboa⁴, Lawrence Berkeley National Laboratory⁵

13 Feb 2019-Nature

TL;DR: It is argued that contextual cues should be used as part of deep learning to gain further process understanding of Earth system science problems, improving the predictive ability of seasonal forecasting and modelling of long-range spatial connections across multiple timescales.

Abstract: Machine learning approaches are increasingly used to extract patterns and insights from the ever-increasing stream of geospatial data, but current approaches may not be optimal when system behaviour is dominated by spatial or temporal context. Here, rather than amending classical machine learning, we argue that these contextual cues should be used as part of deep learning (an approach that is able to extract spatio-temporal features automatically) to gain further process understanding of Earth system science problems, improving the predictive ability of seasonal forecasting and modelling of long-range spatial connections across multiple timescales, for example. The next step will be a hybrid modelling approach, coupling physical process models with the versatility of data-driven machine learning.

2,014 citations
