Author

Xiangbo Shu

Nanjing University of Science and Technology

Other affiliations: National University of Singapore, Nanjing University

Bio: Xiangbo Shu is an academic researcher at Nanjing University of Science and Technology. The author has contributed to research on the topics of computer science and artificial intelligence, has an h-index of 21, and has co-authored 65 publications that have received 1,572 citations. Previous affiliations of Xiangbo Shu include the National University of Singapore and Nanjing University.

Papers published on a yearly basis: 2015 to 2023

Papers

Proceedings Article

Weakly-Shared Deep Transfer Networks for Heterogeneous-Domain Knowledge Propagation

Xiangbo Shu¹, Guo-Jun Qi², Jinhui Tang³, Jingdong Wang⁴

Nanjing University¹, University of Central Florida², Nanjing University of Science and Technology³, Microsoft⁴

13 Oct 2015

TL;DR: This paper develops weakly-shared Deep Transfer Networks (DTNs), a novel deep network architecture that transfers labeling information across heterogeneous domains, in particular from the text domain to the image domain.

Abstract: In recent years, deep networks have been successfully applied to model image concepts and have achieved competitive performance on many data sets. In spite of this impressive performance, conventional deep networks can suffer degraded performance when training examples are insufficient. This problem becomes extremely severe for deep networks with powerful representation structure, making them prone to overfitting by capturing nonessential or noisy information in a small data set. In this paper, to address this challenge, we develop a novel deep network structure capable of transferring labeling information across heterogeneous domains, especially from the text domain to the image domain. These weakly-shared Deep Transfer Networks (DTNs) can adequately mitigate the problem of insufficient image training data by bringing in rich labels from the text domain. Specifically, we present a novel architecture of DTNs to translate cross-domain information from text to image. To share the labels between the two domains, we build multiple weakly shared layers of features. These layers can represent both shared inter-domain features and domain-specific features, making this structure more flexible and powerful in jointly capturing complex data of different domains than strongly shared layers. Experiments on a real-world dataset show its competitive performance compared with other state-of-the-art methods.

202 citations

Proceedings Article

Recurrent Face Aging

Wei Wang¹, Zhen Cui², Yan Yan¹, Jiashi Feng³, Shuicheng Yan³, Xiangbo Shu³, Nicu Sebe¹

University of Trento¹, Southeast University², National University of Singapore³

01 Jun 2016

TL;DR: A recurrent face aging (RFA) framework, based on a recurrent neural network that can identify the ages of people from 0 to 80, is introduced; experiments show that it produces better aged faces than other state-of-the-art age progression methods.

Abstract: Modeling the aging process of the human face is important for cross-age face verification and recognition. In this paper, we introduce a recurrent face aging (RFA) framework based on a recurrent neural network which can identify the ages of people from 0 to 80. Due to the lack of labeled face data of the same person captured over a long range of ages, traditional face aging models usually split the ages into discrete groups and learn a one-step face feature transformation for each pair of adjacent age groups. However, those methods neglect the in-between evolving states between the adjacent age groups, and the synthesized faces often suffer from severe ghosting artifacts. Since human face aging is a smooth progression, it is more appropriate to age the face by going through smooth transition states. In this way, the ghosting artifacts can be effectively eliminated and the intermediate aged faces between two discrete age groups can also be obtained. Towards this target, we employ a two-layer gated recurrent unit as the basic recurrent module, whose bottom layer encodes a young face to a latent representation and whose top layer decodes the representation to a corresponding older face. The experimental results demonstrate that our proposed RFA produces better aged faces than other state-of-the-art age progression methods.

202 citations

Journal Article

Coherence Constrained Graph LSTM for Group Activity Recognition

Jinhui Tang¹, Xiangbo Shu¹, Rui Yan¹, Liyan Zhang²

Nanjing University of Science and Technology¹, Nanjing University of Aeronautics and Astronautics²

15 Jul 2019-IEEE Transactions on Pattern Analysis and Machine Intelligence

TL;DR: This work proposes a novel Coherence Constrained Graph LSTM (CCG-LSTM) with a Spatio-Temporal Context Coherence (STCC) constraint and a Global Context Coherence (GCC) constraint to effectively recognize group activity, by modeling the relevant motions of individuals while suppressing the irrelevant motions.

Abstract: This work aims to address the group activity recognition problem by exploring human motion characteristics. Traditional methods hold that the motions of all persons contribute equally to the group activity, which suppresses the contributions of some relevant motions to the whole activity while overstating some irrelevant motions. To handle this problem, we present a Spatio-Temporal Context Coherence (STCC) constraint and a Global Context Coherence (GCC) constraint to capture the relevant motions and quantify their contributions to the group activity, respectively. Based on this, we propose a novel Coherence Constrained Graph LSTM (CCG-LSTM) with STCC and GCC to effectively recognize group activity, by modeling the relevant motions of individuals while suppressing the irrelevant motions. Specifically, to capture the relevant motions, we build the CCG-LSTM with a temporal confidence gate and a spatial confidence gate to control the memory state updating in terms of the temporally previous state and the spatially neighboring states, respectively. In addition, an attention mechanism is employed to quantify the contribution of a certain motion by measuring the consistency between itself and the whole activity at each time step. Finally, we conduct experiments on two widely used datasets to illustrate the effectiveness of the proposed CCG-LSTM compared with state-of-the-art methods.

140 citations

Journal Article

Generalized Deep Transfer Networks for Knowledge Propagation in Heterogeneous Domains

Jinhui Tang¹, Xiangbo Shu¹, Zechao Li¹, Guo-Jun Qi², Jingdong Wang³

Nanjing University of Science and Technology¹, University of Central Florida², Microsoft³

18 Nov 2016

TL;DR: Novel generalized deep transfer networks (DTNs) are proposed that transfer label information across heterogeneous domains, from the textual domain to the visual domain; parameter- and representation-shared layers share the labels between the two domains and generate both domain-specific and shared inter-domain features.

Abstract: In recent years, deep neural networks have been successfully applied to model visual concepts and have achieved competitive performance on many tasks. Despite their impressive performance, traditional deep networks suffer degraded performance when sufficient training data is lacking. This problem becomes extremely severe for deep networks trained on a very small dataset, causing them to overfit by capturing nonessential or noisy information in the training set. To address this problem, we propose novel generalized deep transfer networks (DTNs), capable of transferring label information across heterogeneous domains, from the textual domain to the visual domain. The proposed framework can adequately mitigate the problem of insufficient training images by bringing in rich labels from the textual domain. Specifically, to share the labels between the two domains, we build parameter- and representation-shared layers. They are able to generate domain-specific and shared inter-domain features, making this architecture flexible and powerful in jointly capturing complex information from different domains. To evaluate the proposed method, we release a new dataset extended from NUS-WIDE at http://imag.njust.edu.cn/NUS-WIDE-128.html. Experimental results on this dataset show the superior performance of the proposed DTNs compared to existing state-of-the-art methods.

137 citations

Journal Article

Hierarchical Long Short-Term Concurrent Memory for Human Interaction Recognition

Xiangbo Shu¹, Jinhui Tang¹, Guo-Jun Qi², Wei Liu³, Jian Yang¹

Nanjing University of Science and Technology¹, Huawei², Tencent³

01 Mar 2021-IEEE Transactions on Pattern Analysis and Machine Intelligence

TL;DR: A novel Hierarchical Long Short-Term Concurrent Memory (H-LSTCM) is proposed to model the long-term inter-related dynamics among a group of persons for recognizing human interactions, and its effectiveness is validated against baseline and state-of-the-art methods.

Abstract: In this work, we aim to address the problem of human interaction recognition in videos by exploring the long-term inter-related dynamics among multiple persons. Recently, Long Short-Term Memory (LSTM) has become a popular choice to model individual dynamics for single-person action recognition due to its ability to capture the temporal motion information in a range. However, most existing LSTM-based methods focus only on capturing the dynamics of human interaction by simply combining all dynamics of individuals or modeling them as a whole. Such methods neglect the inter-related dynamics of how human interactions change over time. To this end, we propose a novel Hierarchical Long Short-Term Concurrent Memory (H-LSTCM) to model the long-term inter-related dynamics among a group of persons for recognizing human interactions. Specifically, we first feed each person's static features into a Single-Person LSTM to model the single-person dynamics. Subsequently, at one time step, the outputs of all Single-Person LSTM units are fed into a novel Concurrent LSTM (Co-LSTM) unit, which mainly consists of multiple sub-memory units, a new cell gate, and a new co-memory cell. In the Co-LSTM unit, each sub-memory unit stores individual motion information, while the Co-LSTM unit selectively integrates and stores inter-related motion information between multiple interacting persons from multiple sub-memory units via the cell gate and co-memory cell, respectively. Extensive experiments on several public datasets validate the effectiveness of the proposed H-LSTCM by comparing against baseline and state-of-the-art methods.

131 citations

Cited by

Journal Article

Learning Report on “Unpaired Image-to-Image Translation using Cycle-Consistent Adversarial Networks”

杉山拓海

12 Sep 2017-Computers & Graphics

3,940 citations

Journal Article

Deep visual domain adaptation: A survey

Mei Wang¹, Weihong Deng¹

Beijing University of Posts and Telecommunications¹

27 Oct 2018-Neurocomputing

TL;DR: This survey covers deep domain adaptation, a learning technique that has emerged to address the lack of massive amounts of labeled data and that leverages deep networks to learn more transferable representations by embedding domain adaptation in the deep learning pipeline.

1,211 citations

Proceedings Article

Age Progression/Regression by Conditional Adversarial Autoencoder

Zhifei Zhang¹, Yang Song¹, Hairong Qi¹

University of Tennessee¹

27 Feb 2017

TL;DR: A conditional adversarial autoencoder (CAAE) is proposed that learns a face manifold along which smooth age progression and regression can be realized simultaneously.

Abstract: If I provide you a face image of mine (without telling you the actual age when I took the picture) and a large number of face images that I crawled (containing labeled faces of different ages but not necessarily paired), can you show me what I would look like when I am 80 or what I was like when I was 5? The answer is probably a No. Most existing face aging works attempt to learn the transformation between age groups and thus would require paired samples as well as a labeled query image. In this paper, we look at the problem from a generative modeling perspective such that no paired samples are required. In addition, given an unlabeled image, the generative model can directly produce the image with the desired age attribute. We propose a conditional adversarial autoencoder (CAAE) that learns a face manifold, traversing on which smooth age progression and regression can be realized simultaneously. In CAAE, the face is first mapped to a latent vector through a convolutional encoder, and then the vector is projected to the face manifold conditional on age through a deconvolutional generator. The latent vector preserves personalized face features (i.e., personality) and the age condition controls progression vs. regression. Two adversarial networks are imposed on the encoder and generator, respectively, forcing them to generate more photo-realistic faces. Experimental results demonstrate the appealing performance and flexibility of the proposed framework by comparing with the state-of-the-art and ground truth.

766 citations

Neocognitron: A New Algorithm for Pattern Recognition Tolerant of Deformations and Shifts in Position

Kunihiko Fukushima, Sei Miyake

01 Jan 1983

TL;DR: The neocognitron recognizes stimulus patterns correctly without being affected by shifts in position or even by considerable distortions in shape of the stimulus patterns.

Abstract: Suggested by the structure of the visual nervous system, a new algorithm is proposed for pattern recognition. This algorithm can be realized with a multilayered network consisting of neuron-like cells. The network, “neocognitron”, is self-organized by unsupervised learning, and acquires the ability to recognize stimulus patterns according to the differences in their shapes: Any patterns which we human beings judge to be alike are also judged to be of the same category by the neocognitron. The neocognitron recognizes stimulus patterns correctly without being affected by shifts in position or even by considerable distortions in shape of the stimulus patterns.

649 citations

Proceedings Article

AgeDB: The First Manually Collected, In-the-Wild Age Database

Stylianos Moschoglou¹, Athanasios Papaioannou¹, Christos Sagonas¹, Jiankang Deng¹, Irene Kotsia², Stefanos Zafeiriou¹

Imperial College London¹, Middlesex University²

26 Jul 2017

TL;DR: This paper presents AgeDB, to the best of the authors' knowledge the first manually collected "in-the-wild" age database, containing images annotated with accurate-to-the-year, noise-free labels, which makes it suitable for experiments on age-invariant face verification, age estimation, and face age progression "in-the-wild".

Abstract: Over the last few years, increased interest has arisen with respect to age-related tasks in the Computer Vision community. As a result, several "in-the-wild" databases annotated with respect to the age attribute have become available in the literature. Nevertheless, one major drawback of these databases is that they are semi-automatically collected and annotated and thus contain noisy labels. Therefore, the algorithms that are evaluated on such databases are prone to noisy estimates. In order to overcome such drawbacks, we present in this paper the first, to the best of our knowledge, manually collected "in-the-wild" age database, dubbed AgeDB, containing images annotated with accurate-to-the-year, noise-free labels. As demonstrated by a series of experiments utilizing state-of-the-art algorithms, this unique property renders AgeDB suitable for experiments on age-invariant face verification, age estimation, and face age progression "in-the-wild".

520 citations
