Author

Tong Zhang

Bio: Tong Zhang is an academic researcher from South China University of Technology. The author has contributed to research on topics including convolutional neural networks and graphs (abstract data type). The author has an h-index of 28, has co-authored 118 publications, and has received 2618 citations. Previous affiliations of Tong Zhang include the University of Macau and the Hong Kong University of Science and Technology.


Papers
Journal Article
TL;DR: The proposed two-layer RNN model provides an effective way to exploit both the spatial and temporal dependencies of the input signals for emotion recognition, and experimental results demonstrate that the proposed STRNN method compares favorably with state-of-the-art methods.
Abstract: In this paper, we propose a novel deep learning framework, called spatial–temporal recurrent neural network (STRNN), to integrate feature learning from both the spatial and temporal information of signal sources into a unified spatial–temporal dependency model. In STRNN, to capture the spatially co-occurring variations of human emotions, a multidirectional recurrent neural network (RNN) layer is employed to capture long-range contextual cues by traversing the spatial regions of each temporal slice along different directions. A bidirectional temporal RNN layer is then used to learn discriminative features characterizing the temporal dependencies of the sequences produced by the spatial RNN layer. To further select the salient regions that are most discriminative for emotion recognition, we impose a sparse projection on the hidden states of the spatial and temporal domains to improve the model's discriminative ability. Consequently, the proposed two-layer RNN model provides an effective way to exploit both the spatial and temporal dependencies of the input signals for emotion recognition. Experimental results on public electroencephalogram and facial expression emotion datasets demonstrate that the proposed STRNN method compares favorably with state-of-the-art methods.
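
To make the two-layer structure concrete, here is a minimal sketch in the spirit of the description above: a spatial RNN scans the regions of each temporal slice, and a bidirectional temporal RNN scans the resulting slice-level features. This is an illustrative simplification, not the paper's implementation: the multidirectional spatial traversal is reduced to a single bidirectional scan over a fixed region ordering, the sparse projection is omitted, and all layer sizes are arbitrary.

```python
# Minimal sketch of a two-layer spatial-temporal RNN (simplifying assumptions
# noted above). Input: (batch, time, regions, feat_dim) region features.
import torch
import torch.nn as nn

class STRNNSketch(nn.Module):
    def __init__(self, feat_dim=32, hidden=64, n_classes=6):
        super().__init__()
        # Spatial RNN: scans the regions within one temporal slice.
        self.spatial_rnn = nn.GRU(feat_dim, hidden, batch_first=True,
                                  bidirectional=True)
        # Temporal RNN: scans the sequence of slice-level features.
        self.temporal_rnn = nn.GRU(2 * hidden, hidden, batch_first=True,
                                   bidirectional=True)
        self.classifier = nn.Linear(2 * hidden, n_classes)

    def forward(self, x):
        b, t, r, d = x.shape
        slices = x.reshape(b * t, r, d)           # one sequence per slice
        _, h = self.spatial_rnn(slices)           # h: (2, b*t, hidden)
        slice_feats = h.permute(1, 0, 2).reshape(b, t, -1)
        out, _ = self.temporal_rnn(slice_feats)   # (b, t, 2*hidden)
        return self.classifier(out.mean(dim=1))   # pool over time

# 4 clips, 10 temporal slices, 16 spatial regions, 32-dim region features
logits = STRNNSketch()(torch.randn(4, 10, 16, 32))
```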

335 citations

Journal Article
Tong Zhang, Wenming Zheng, Zhen Cui, Yuan Zong, Jingwei Yan, Keyu Yan
TL;DR: A novel deep neural network (DNN)-driven feature learning method is proposed and applied to multi-view facial expression recognition (FER), and the experimental results show that the algorithm outperforms state-of-the-art methods.
Abstract: In this paper, a novel deep neural network (DNN)-driven feature learning method is proposed and applied to multi-view facial expression recognition (FER). In this method, scale-invariant feature transform (SIFT) features corresponding to a set of landmark points are first extracted from each facial image. Then, a feature matrix consisting of the extracted SIFT feature vectors is used as input data and fed to a well-designed DNN model that learns optimal discriminative features for expression classification. The proposed DNN model employs several layers to characterize the relationship between the SIFT feature vectors and their corresponding high-level semantic information. By training the DNN model, we are able to learn a set of optimal features that are well suited to classifying facial expressions across different facial views. To evaluate the effectiveness of the proposed method, two non-frontal facial expression databases, namely BU-3DFE and Multi-PIE, are used to test our method, and the experimental results show that our algorithm outperforms the state-of-the-art methods.
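
The pipeline can be illustrated with a short sketch: SIFT descriptors at facial landmarks form a feature matrix that a small feed-forward network maps to expression classes. The landmark count, layer sizes, and class count below are assumptions for illustration, not the paper's exact architecture.

```python
# Illustrative SIFT-matrix-to-DNN pipeline (sizes are assumed, not the
# paper's). A real run would replace the random tensor with descriptors
# extracted at detected landmarks.
import torch
import torch.nn as nn

N_LANDMARKS, SIFT_DIM, N_EXPRESSIONS = 49, 128, 6  # assumed values

model = nn.Sequential(
    nn.Flatten(),                                   # (B, 49, 128) -> (B, 6272)
    nn.Linear(N_LANDMARKS * SIFT_DIM, 1024), nn.ReLU(),
    nn.Linear(1024, 256), nn.ReLU(),
    nn.Linear(256, N_EXPRESSIONS),                  # expression logits
)

sift_matrix = torch.randn(8, N_LANDMARKS, SIFT_DIM)  # stand-in for real SIFT features
logits = model(sift_matrix)
```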

262 citations

Book Chapter
Wenhao Jiang, Lin Ma, Yu-Gang Jiang, Wei Liu, Tong Zhang
08 Sep 2018
TL;DR: This paper proposes a novel recurrent fusion network (RFNet) for the image captioning task, which can exploit the interactions among the outputs of the image encoders and generate new compact and informative representations for the decoder.
Abstract: Recently, much progress has been made in image captioning, and an encoder-decoder framework has been adopted by all the state-of-the-art models. Under this framework, an input image is encoded by a convolutional neural network (CNN) and then translated into natural language with a recurrent neural network (RNN). Existing models built on this framework employ only one kind of CNN, e.g., ResNet or Inception-X, which describes the image content from only one specific viewpoint. Thus, the semantic meaning of the input image cannot be comprehensively understood, which limits performance. In this paper, to exploit the complementary information from multiple encoders, we propose a novel recurrent fusion network (RFNet) for the image captioning task. The fusion process in our model can exploit the interactions among the outputs of the image encoders and generate new compact and informative representations for the decoder. Experiments on the MSCOCO dataset demonstrate the effectiveness of our proposed RFNet, which sets a new state of the art for image captioning.
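
A small sketch may clarify the multi-encoder idea: features from several CNN encoders are projected to a shared space and fused into one compact representation for the caption decoder. The fusion below is a simple learned attention over encoders, used here as a stand-in for the paper's recurrent fusion procedure; encoder dimensions and the fused size are assumptions.

```python
# Hypothetical multi-encoder fusion (attention stand-in, not RFNet itself).
import torch
import torch.nn as nn

class EncoderFusionSketch(nn.Module):
    def __init__(self, enc_dims=(2048, 1536), fused_dim=512):
        super().__init__()
        # One projection per encoder into a shared fused space.
        self.projs = nn.ModuleList([nn.Linear(d, fused_dim) for d in enc_dims])
        self.score = nn.Linear(fused_dim, 1)   # attention weight per encoder

    def forward(self, feats):
        # feats: list of (batch, enc_dim) global features, one per encoder
        proj = torch.stack([p(f) for p, f in zip(self.projs, feats)], dim=1)
        w = torch.softmax(self.score(torch.tanh(proj)), dim=1)
        return (w * proj).sum(dim=1)           # (batch, fused_dim) for the decoder

# e.g., ResNet-style and Inception-style global features
fused = EncoderFusionSketch()([torch.randn(4, 2048), torch.randn(4, 1536)])
```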

242 citations

Proceedings Article
01 Sep 2017
TL;DR: A novel multi-scale convolutional neural network (MSCNN) for single-image crowd counting is proposed; it generates scale-relevant features for higher crowd counting performance in a single-column architecture, which is both accurate and cost-effective for practical applications.
Abstract: Crowd counting on static images is a challenging problem due to scale variations. Recently, deep neural networks have been shown to be effective for this task. However, existing neural-network-based methods often use multi-column or multi-network models to extract scale-relevant features, which complicates optimization and wastes computation. To this end, we propose a novel multi-scale convolutional neural network (MSCNN) for single-image crowd counting. Based on multi-scale blobs, the network is able to generate scale-relevant features for higher crowd counting performance in a single-column architecture, which is both accurate and cost-effective for practical applications. Experimental results show that our method outperforms the state-of-the-art methods in both accuracy and robustness with far fewer parameters.
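
The single-column multi-scale idea can be sketched as follows: one "multi-scale blob" applies parallel convolutions with different receptive fields and concatenates their outputs, so one column handles several scales. Kernel sizes, channel counts, and depth below are illustrative assumptions, not the paper's exact configuration.

```python
# Sketch of an inception-style multi-scale blob for density-map crowd
# counting (all sizes assumed).
import torch
import torch.nn as nn

class MultiScaleBlob(nn.Module):
    def __init__(self, in_ch, branch_ch):
        super().__init__()
        # Parallel branches with different receptive fields; padding
        # preserves the spatial size so outputs can be concatenated.
        self.branches = nn.ModuleList(
            [nn.Conv2d(in_ch, branch_ch, k, padding=k // 2) for k in (3, 5, 7)]
        )

    def forward(self, x):
        return torch.relu(torch.cat([b(x) for b in self.branches], dim=1))

counter = nn.Sequential(
    MultiScaleBlob(3, 16),        # 3 -> 48 channels
    MultiScaleBlob(48, 16),       # 48 -> 48 channels
    nn.Conv2d(48, 1, 1),          # 1x1 conv to a crowd density map
)
density = counter(torch.randn(1, 3, 224, 224))
count = density.sum()             # predicted count = integral of the density map
```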

191 citations

Posted Content
TL;DR: This paper evaluates the robustness of state-of-the-art face recognition models in the decision-based black-box attack setting, where the attackers have no access to the model parameters and gradients, but can only acquire hard-label predictions by sending queries to the target model.
Abstract: Face recognition has made remarkable progress in recent years due to the great improvement of deep convolutional neural networks (CNNs). However, deep CNNs are vulnerable to adversarial examples, which can cause serious consequences in real-world face recognition applications with security-sensitive purposes. Adversarial attacks are widely studied because they can identify the vulnerability of models before they are deployed. In this paper, we evaluate the robustness of state-of-the-art face recognition models in the decision-based black-box attack setting, where the attackers have no access to the model parameters and gradients, but can only acquire hard-label predictions by sending queries to the target model. This attack setting is more practical in real-world face recognition systems. To improve the efficiency of previous methods, we propose an evolutionary attack algorithm, which can model the local geometries of the search directions and reduce the dimension of the search space. Extensive experiments demonstrate the effectiveness of the proposed method, which induces a minimal perturbation to an input face image with fewer queries. We also successfully apply the proposed method to attack a real-world face recognition system.
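
A condensed sketch of a decision-based attack loop in the spirit described above: only hard-label queries are used, and a candidate perturbation is accepted when it stays adversarial while moving closer to the original image. The covariance adaptation that models local geometry in the paper is reduced here to a fixed-variance Gaussian proposal, and `query_label` is a hypothetical black-box oracle, not a real API.

```python
# Simplified decision-based evolutionary attack loop (assumptions above).
import numpy as np

def evolutionary_attack(x_orig, x_adv, query_label, true_id,
                        steps=1000, sigma=0.01):
    """x_orig, x_adv: float arrays in [0, 1]; x_adv starts adversarial.
    query_label(x) -> hard-label identity prediction (hypothetical oracle)."""
    for _ in range(steps):
        # Propose a mutation biased toward the original image.
        step_to_orig = 0.01 * (x_orig - x_adv)
        noise = sigma * np.random.randn(*x_adv.shape)
        candidate = np.clip(x_adv + step_to_orig + noise, 0.0, 1.0)
        # Accept only if still misclassified (dodging) and closer to x_orig.
        still_adversarial = query_label(candidate) != true_id
        closer = (np.linalg.norm(candidate - x_orig)
                  < np.linalg.norm(x_adv - x_orig))
        if still_adversarial and closer:
            x_adv = candidate   # smaller perturbation, same wrong label
    return x_adv
```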

180 citations


Cited by
Journal Article
TL;DR: A systematic survey of recent advances in image super-resolution (SR) using deep learning, which roughly groups existing SR studies into three major categories: supervised SR, unsupervised SR, and domain-specific SR.
Abstract: Image super-resolution (SR) is an important class of image processing techniques for enhancing the resolution of images and videos in computer vision. Recent years have witnessed remarkable progress in image super-resolution using deep learning techniques. This article aims to provide a comprehensive survey of recent advances in image super-resolution using deep learning approaches. In general, we can roughly group existing SR studies into three major categories: supervised SR, unsupervised SR, and domain-specific SR. In addition, we also cover other important issues, such as publicly available benchmark datasets and performance evaluation metrics. Finally, we conclude this survey by highlighting several future directions and open issues that should be further addressed by the community.

837 citations

Journal Article
TL;DR: Practical suggestions on the selection of many hyperparameters are provided in the hope that they will promote or guide the deployment of deep learning to EEG datasets in future research.
Abstract: Objective: Electroencephalography (EEG) analysis has been an important tool in neuroscience, with applications in basic research, neural engineering (e.g., brain-computer interfaces, BCIs), and even commercial products. Many of the analytical tools used in EEG studies have used machine learning to uncover relevant information for neural classification and neuroimaging. Recently, the availability of large EEG data sets and advances in machine learning have both led to the deployment of deep learning architectures, especially in the analysis of EEG signals and in understanding the information they may contain about brain functionality. The robust automatic classification of these signals is an important step towards making the use of EEG more practical in many applications and less reliant on trained professionals. Towards this goal, a systematic review of the literature on deep learning applications to EEG classification was performed to address the following critical questions: (1) Which EEG classification tasks have been explored with deep learning? (2) What input formulations have been used for training the deep networks? (3) Are there specific deep learning network structures suitable for specific types of tasks? Approach: A systematic literature review of EEG classification using deep learning was performed on the Web of Science and PubMed databases, resulting in 90 identified studies. Those studies were analyzed based on type of task, EEG preprocessing methods, input type, and deep learning architecture. Main results: For EEG classification tasks, convolutional neural networks, recurrent neural networks, and deep belief networks outperform stacked auto-encoders and multi-layer perceptron neural networks in classification accuracy. The tasks that used deep learning fell into six general groups: emotion recognition, motor imagery, mental workload, seizure detection, event-related potential detection, and sleep scoring. For each type of task, we describe the specific input formulation, major characteristics, and end classifier recommendations found through this review. Significance: This review summarizes the current practices and performance outcomes in the use of deep learning for EEG classification. Practical suggestions on the selection of many hyperparameters are provided in the hope that they will promote or guide the deployment of deep learning to EEG datasets in future research.

777 citations