Home
/
Authors
/
Xi Li

Author

Xi Li

Other affiliations: Chinese Academy of Sciences, Télécom ParisTech, University of Adelaide ...read more

Bio: Xi Li is an academic researcher from Zhejiang University. The author has contributed to research in topics: Deep learning & Video tracking. The author has an hindex of 42, co-authored 245 publications receiving 7217 citations. Previous affiliations of Xi Li include Chinese Academy of Sciences & Télécom ParisTech.

Papers published on a yearly basis

2023
2022
2021
2020
2019
2018
2017
2016
2015
2014
2013
2012
2011
2010
2009
2008
2007
2006
2005
2004
2003

Papers

PDF

Open Access

More filters

Proceedings Article•DOI•

Deeply-Learned Part-Aligned Representations for Person Re-identification

[...]

Liming Zhao¹, Xi Li¹, Yueting Zhuang¹, Jingdong Wang²•Institutions (2)

Zhejiang University¹, Microsoft²

01 Oct 2017

TL;DR: This paper proposes a simple yet effective human part-aligned representation for handling the body part misalignment problem, and shows state-of-the-art results over standard datasets, Market-1501,CUHK03, CUHK01 and VIPeR.

...read moreread less

Abstract: In this paper, we address the problem of person re-identification, which refers to associating the persons captured from different cameras. We propose a simple yet effective human part-aligned representation for handling the body part misalignment problem. Our approach decomposes the human body into regions (parts) which are discriminative for person matching, accordingly computes the representations over the regions, and aggregates the similarities computed between the corresponding regions of a pair of probe and gallery images as the overall matching score. Our formulation, inspired by attention models, is a deep neural network modeling the three steps together, which is learnt through minimizing the triplet loss function without requiring body part labeling information. Unlike most existing deep learning algorithms that learn a global or spatial partition-based local representation, our approach performs human body partition, and thus is more robust to pose changes and various human spatial distributions in the person bounding box. Our approach shows state-of-the-art results over standard datasets, Market-1501, CUHK03, CUHK01 and VIPeR. 1

...read moreread less

653 citations

Journal Article•DOI•

A survey of appearance models in visual object tracking

[...]

Xi Li¹, Weiming Hu¹, Chunhua Shen², Zhongfei Zhang³, Anthony Dick², Anton van den Hengel² - Show less +2 more•Institutions (3)

Chinese Academy of Sciences¹, University of Adelaide², Binghamton University³

08 Oct 2013-ACM Transactions on Intelligent Systems and Technology

TL;DR: A detailed review of the existing 2D appearance models for visual object tracking can be found in this article, where the authors decompose the problem of appearance modeling into two different processing stages: visual representation and statistical modeling.

...read moreread less

Abstract: Visual object tracking is a significant computer vision task which can be applied to many domains, such as visual surveillance, human computer interaction, and video compression. Despite extensive research on this topic, it still suffers from difficulties in handling complex object appearance changes caused by factors such as illumination variation, partial occlusion, shape deformation, and camera motion. Therefore, effective modeling of the 2D appearance of tracked objects is a key issue for the success of a visual tracker. In the literature, researchers have proposed a variety of 2D appearance models. To help readers swiftly learn the recent advances in 2D appearance models for visual object tracking, we contribute this survey, which provides a detailed review of the existing 2D appearance models. In particular, this survey takes a module-based architecture that enables readers to easily grasp the key points of visual object tracking. In this survey, we first decompose the problem of appearance modeling into two different processing stages: visual representation and statistical modeling. Then, different 2D appearance models are categorized and discussed with respect to their composition modules. Finally, we address several issues of interest as well as the remaining challenges for future research on this topic. The contributions of this survey are fourfold. First, we review the literature of visual representations according to their feature-construction mechanisms (i.e., local and global). Second, the existing statistical modeling schemes for tracking-by-detection are reviewed according to their model-construction mechanisms: generative, discriminative, and hybrid generative-discriminative. Third, each type of visual representations or statistical modeling techniques is analyzed and discussed from a theoretical or practical viewpoint. Fourth, the existing benchmark resources (e.g., source codes and video datasets) are examined in this survey.

...read moreread less

653 citations

Posted Content•

A Survey of Appearance Models in Visual Object Tracking

[...]

Xi Li¹, Weiming Hu¹, Chunhua Shen², Zhongfei Zhang³, Anthony Dick², Anton van den Hengel² - Show less +2 more•Institutions (3)

Chinese Academy of Sciences¹, University of Adelaide², Binghamton University³

20 Mar 2013-arXiv: Computer Vision and Pattern Recognition

TL;DR: This survey provides a detailed review of the existing 2D appearance models for visual object tracking and takes a module-based architecture that enables readers to easily grasp the key points ofVisual object tracking.

...read moreread less

Abstract: Visual object tracking is a significant computer vision task which can be applied to many domains such as visual surveillance, human computer interaction, and video compression. In the literature, researchers have proposed a variety of 2D appearance models. To help readers swiftly learn the recent advances in 2D appearance models for visual object tracking, we contribute this survey, which provides a detailed review of the existing 2D appearance models. In particular, this survey takes a module-based architecture that enables readers to easily grasp the key points of visual object tracking. In this survey, we first decompose the problem of appearance modeling into two different processing stages: visual representation and statistical modeling. Then, different 2D appearance models are categorized and discussed with respect to their composition modules. Finally, we address several issues of interest as well as the remaining challenges for future research on this topic. The contributions of this survey are four-fold. First, we review the literature of visual representations according to their feature-construction mechanisms (i.e., local and global). Second, the existing statistical modeling schemes for tracking-by-detection are reviewed according to their model-construction mechanisms: generative, discriminative, and hybrid generative-discriminative. Third, each type of visual representations or statistical modeling techniques is analyzed and discussed from a theoretical or practical viewpoint. Fourth, the existing benchmark resources (e.g., source code and video datasets) are examined in this survey.

...read moreread less

605 citations

Journal Article•DOI•

DeepSaliency: Multi-Task Deep Neural Network Model for Salient Object Detection

[...]

Xi Li¹, Liming Zhao¹, Lina Wei¹, Ming-Hsuan Yang², Fei Wu¹, Yueting Zhuang¹, Haibin Ling³, Jingdong Wang⁴ - Show less +4 more•Institutions (4)

Zhejiang University¹, University of California, Merced², Temple University³, Microsoft⁴

09 Jun 2016-IEEE Transactions on Image Processing

TL;DR: This paper proposes a multi-task deep saliency model based on a fully convolutional neural network with global input (whole raw images) and global output (Whole saliency maps) and presents a graph Laplacian regularized nonlinear regression model for saliency refinement.

...read moreread less

Abstract: A key problem in salient object detection is how to effectively model the semantic properties of salient objects in a data-driven manner. In this paper, we propose a multi-task deep saliency model based on a fully convolutional neural network with global input (whole raw images) and global output (whole saliency maps). In principle, the proposed saliency model takes a data-driven strategy for encoding the underlying saliency prior information, and then sets up a multi-task learning scheme for exploring the intrinsic correlations between saliency detection and semantic image segmentation. Through collaborative feature learning from such two correlated tasks, the shared fully convolutional layers produce effective features for object perception. Moreover, it is capable of capturing the semantic information on salient objects across different levels using the fully convolutional layers, which investigate the feature-sharing properties of salient object detection with a great reduction of feature redundancy. Finally, we present a graph Laplacian regularized nonlinear regression model for saliency refinement. Experimental results demonstrate the effectiveness of our approach in comparison with the state-of-the-art approaches.

...read moreread less

497 citations

Proceedings Article•DOI•

Contextual Hypergraph Modeling for Salient Object Detection

[...]

Xi Li¹, Yao Li¹, Chunhua Shen¹, Anthony Dick¹, Anton van den Hengel¹ - Show less +1 more•Institutions (1)

University of Adelaide¹

01 Dec 2013

TL;DR: This work model an image as a hyper graph that utilizes a set of hyper edges to capture the contextual properties of image pixels or regions to solve the problem of salient object detection.

...read moreread less

Abstract: Salient object detection aims to locate objects that capture human attention within images. Previous approaches often pose this as a problem of image contrast analysis. In this work, we model an image as a hyper graph that utilizes a set of hyper edges to capture the contextual properties of image pixels or regions. As a result, the problem of salient object detection becomes one of finding salient vertices and hyper edges in the hyper graph. The main advantage of hyper graph modeling is that it takes into account each pixel's (or region's) affinity with its neighborhood as well as its separation from image background. Furthermore, we propose an alternative approach based on center-versus-surround contextual contrast analysis, which performs salient object detection by optimizing a cost-sensitive support vector machine (SVM) objective function. Experimental results on four challenging datasets demonstrate the effectiveness of the proposed approaches against the state-of-the-art approaches to salient object detection.

...read moreread less

208 citations

1
2
3
4
…
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57

Collapse

Cited by

PDF

Open Access

More filters

Journal Article•DOI•

Machine learning

[...]

Thomas G. Dietterich¹•Institutions (1)

Oregon State University¹

01 Dec 1996-ACM Computing Surveys

TL;DR: Machine learning addresses many of the same research questions as the fields of statistics, data mining, and psychology, but with differences of emphasis.

...read moreread less

Abstract: Machine Learning is the study of methods for programming computers to learn. Computers are applied to a wide range of tasks, and for most of these it is relatively easy for programmers to design and implement the necessary software. However, there are many tasks for which this is difficult or impossible. These can be divided into four general categories. First, there are problems for which there exist no human experts. For example, in modern automated manufacturing facilities, there is a need to predict machine failures before they occur by analyzing sensor readings. Because the machines are new, there are no human experts who can be interviewed by a programmer to provide the knowledge necessary to build a computer system. A machine learning system can study recorded data and subsequent machine failures and learn prediction rules. Second, there are problems where human experts exist, but where they are unable to explain their expertise. This is the case in many perceptual tasks, such as speech recognition, hand-writing recognition, and natural language understanding. Virtually all humans exhibit expert-level abilities on these tasks, but none of them can describe the detailed steps that they follow as they perform them. Fortunately, humans can provide machines with examples of the inputs and correct outputs for these tasks, so machine learning algorithms can learn to map the inputs to the outputs. Third, there are problems where phenomena are changing rapidly. In finance, for example, people would like to predict the future behavior of the stock market, of consumer purchases, or of exchange rates. These behaviors change frequently, so that even if a programmer could construct a good predictive computer program, it would need to be rewritten frequently. A learning program can relieve the programmer of this burden by constantly modifying and tuning a set of learned prediction rules. Fourth, there are applications that need to be customized for each computer user separately. Consider, for example, a program to filter unwanted electronic mail messages. Different users will need different filters. It is unreasonable to expect each user to program his or her own rules, and it is infeasible to provide every user with a software engineer to keep the rules up-to-date. A machine learning system can learn which mail messages the user rejects and maintain the filtering rules automatically. Machine learning addresses many of the same research questions as the fields of statistics, data mining, and psychology, but with differences of emphasis. Statistics focuses on understanding the phenomena that have generated the data, often with the goal of testing different hypotheses about those phenomena. Data mining seeks to find patterns in the data that are understandable by people. Psychological studies of human learning aspire to understand the mechanisms underlying the various learning behaviors exhibited by people (concept learning, skill acquisition, strategy change, etc.).

...read moreread less

13,246 citations

Pattern Recognition and Machine Learning

[...]

Christopher M. Bishop¹•Institutions (1)

Microsoft¹

01 Jan 2006

TL;DR: Probability distributions of linear models for regression and classification are given in this article, along with a discussion of combining models and combining models in the context of machine learning and classification.

...read moreread less

Abstract: Probability Distributions.- Linear Models for Regression.- Linear Models for Classification.- Neural Networks.- Kernel Methods.- Sparse Kernel Machines.- Graphical Models.- Mixture Models and EM.- Approximate Inference.- Sampling Methods.- Continuous Latent Variables.- Sequential Data.- Combining Models.

...read moreread less

10,141 citations

Proceedings Article•DOI•

Online Object Tracking: A Benchmark

[...]

Yi Wu¹, Jongwoo Lim², Ming-Hsuan Yang¹•Institutions (2)

University of California, Merced¹, Hanyang University²

23 Jun 2013

TL;DR: Large scale experiments are carried out with various evaluation criteria to identify effective approaches for robust tracking and provide potential future research directions in this field.

...read moreread less

Abstract: Object tracking is one of the most important components in numerous applications of computer vision. While much progress has been made in recent years with efforts on sharing code and datasets, it is of great importance to develop a library and benchmark to gauge the state of the art. After briefly reviewing recent advances of online object tracking, we carry out large scale experiments with various evaluation criteria to understand how these algorithms perform. The test image sequences are annotated with different attributes for performance evaluation and analysis. By analyzing quantitative results, we identify effective approaches for robust tracking and provide potential future research directions in this field.

...read moreread less

3,828 citations

Journal Article•DOI•

Object Detection With Deep Learning: A Review

[...]

Zhong-Qiu Zhao¹, Peng Zheng¹, Shou-Tao Xu¹, Xindong Wu²•Institutions (2)

Hefei University of Technology¹, University of Louisiana at Lafayette²

28 Jan 2019-IEEE Transactions on Neural Networks

TL;DR: In this article, a review of deep learning-based object detection frameworks is provided, focusing on typical generic object detection architectures along with some modifications and useful tricks to improve detection performance further.

...read moreread less

Abstract: Due to object detection’s close relationship with video analysis and image understanding, it has attracted much research attention in recent years. Traditional object detection methods are built on handcrafted features and shallow trainable architectures. Their performance easily stagnates by constructing complex ensembles that combine multiple low-level image features with high-level context from object detectors and scene classifiers. With the rapid development in deep learning, more powerful tools, which are able to learn semantic, high-level, deeper features, are introduced to address the problems existing in traditional architectures. These models behave differently in network architecture, training strategy, and optimization function. In this paper, we provide a review of deep learning-based object detection frameworks. Our review begins with a brief introduction on the history of deep learning and its representative tool, namely, the convolutional neural network. Then, we focus on typical generic object detection architectures along with some modifications and useful tricks to improve detection performance further. As distinct specific detection tasks exhibit different characteristics, we also briefly survey several specific tasks, including salient object detection, face detection, and pedestrian detection. Experimental analyses are also provided to compare various methods and draw some meaningful conclusions. Finally, several promising directions and tasks are provided to serve as guidelines for future work in both object detection and relevant neural network-based learning systems.

...read moreread less

3,097 citations

The PASCAL Visual Object Classes Challenge

[...]

Jianguo Zhang

01 Jan 2006

3,012 citations

1
2
3
4
…
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200

Collapse