Home
/
Authors
/
Kinh Tieu

Author

Kinh Tieu

Other affiliations: Mitsubishi Electric Research Laboratories, Yale University

Bio: Kinh Tieu is an academic researcher from Massachusetts Institute of Technology. The author has contributed to research in topics: Image retrieval & Feature vector. The author has an hindex of 16, co-authored 31 publications receiving 2206 citations. Previous affiliations of Kinh Tieu include Mitsubishi Electric Research Laboratories & Yale University.

Papers

PDF

Open Access

More filters

Proceedings Article•DOI•

Boosting image retrieval

[...]

Kinh Tieu¹, Paul A. Viola¹•Institutions (1)

Massachusetts Institute of Technology¹

15 Jun 2000

TL;DR: An approach for image retrieval using a very large number of highly selective features and efficient online learning based on the assumption that each image is generated by a sparse set of visual "causes" and that images which are visually similar share causes.

...read moreread less

Abstract: We present an approach for image retrieval using a very large number of highly selective features and efficient online learning. Our approach is predicated on the assumption that each image is generated by a sparse set of visual "causes" and that images which are visually similar share causes. We propose a mechanism for computing a very large number of highly selective features which capture some aspects of this causal structure (in our implementation there are over 45,000 highly selective features). At query time a user selects a few example images, and a technique known as "boosting" is used to learn a classification function in this feature space. By construction, the boosting procedure learns a simple classifier which only relies on 20 of the features. As a result a very large database of images can be scanned rapidly, perhaps a million images per second. Finally we will describe a set of experiments performed using our retrieval system on a database of 3000 images.

...read moreread less

504 citations

Journal Article•DOI•

Boosting Image Retrieval

[...]

Kinh Tieu¹, Paul A. Viola²•Institutions (2)

Massachusetts Institute of Technology¹, Mitsubishi Electric Research Laboratories²

01 Jan 2004-International Journal of Computer Vision

TL;DR: This work proposes a mechanism for computing a very large number of highly selective features which capture some aspects of this causal structure and shows results on a wide variety of image queries.

...read moreread less

Abstract: We present an approach for image retrieval using a very large number of highly selective features and efficient learning of queries. Our approach is predicated on the assumption that each image is generated by a sparse set of visual “causes” and that images which are visually similar share causes. We propose a mechanism for computing a very large number of highly selective features which capture some aspects of this causal structure (in our implementation there are over 46,000 highly selective features). At query time a user selects a few example images, and the AdaBoost algorithm is used to learn a classification function which depends on a small number of the most appropriate features. This yields a highly efficient classification function. In addition we show that the AdaBoost framework provides a natural mechanism for the incorporation of relevance feedback. Finally we show results on a wide variety of image queries.

...read moreread less

419 citations

Book Chapter•DOI•

Learning semantic scene models by trajectory analysis

[...]

Xiaogang Wang¹, Kinh Tieu¹, Eric Grimson¹•Institutions (1)

Massachusetts Institute of Technology¹

07 May 2006

TL;DR: An unsupervised learning framework to segment a scene into semantic regions and to build semantic scene models from long-term observations of moving objects in the scene is described and novel clustering algorithms which use both similarity and comparison confidence are introduced.

...read moreread less

Abstract: In this paper, we describe an unsupervised learning framework to segment a scene into semantic regions and to build semantic scene models from long-term observations of moving objects in the scene. First, we introduce two novel similarity measures for comparing trajectories in far-field visual surveillance. The measures simultaneously compare the spatial distribution of trajectories and other attributes, such as velocity and object size, along the trajectories. They also provide a comparison confidence measure which indicates how well the measured image-based similarity approximates true physical similarity. We also introduce novel clustering algorithms which use both similarity and comparison confidence. Based on the proposed similarity measures and clustering methods, a framework to learn semantic scene models by trajectory analysis is developed. Trajectories are first clustered into vehicles and pedestrians, and then further grouped based on spatial and velocity distributions. Different trajectory clusters represent different activities. The geometric and statistical models of structures in the scene, such as roads, walk paths, sources and sinks, are automatically learned from the trajectory clusters. Abnormal activities are detected using the semantic scene models. The system is robust to low-level tracking errors.

...read moreread less

318 citations

Proceedings Article•DOI•

Fully automatic pose-invariant face recognition via 3D pose normalization

[...]

Akshay Asthana¹, Tim K. Marks¹, Michael Jones¹, Kinh Tieu¹, M. V. Rohith¹ - Show less +1 more•Institutions (1)

Mitsubishi Electric Research Laboratories¹

06 Nov 2011

TL;DR: This paper proposes a 3D pose normalization method that is completely automatic and leverages the accurate 2D facial feature points found by the system and outperforms other comparable methods convincingly.

...read moreread less

Abstract: An ideal approach to the problem of pose-invariant face recognition would handle continuous pose variations, would not be database specific, and would achieve high accuracy without any manual intervention. Most of the existing approaches fail to match one or more of these goals. In this paper, we present a fully automatic system for pose-invariant face recognition that not only meets these requirements but also outperforms other comparable methods. We propose a 3D pose normalization method that is completely automatic and leverages the accurate 2D facial feature points found by the system. The current system can handle 3D pose variation up to ±45° in yaw and ±30° in pitch angles. Recognition experiments were conducted on the USF 3D, Multi-PIE, CMU-PIE, FERET, and FacePix databases. Our system not only shows excellent generalization by achieving high accuracy on all 5 databases but also outperforms other methods convincingly.

...read moreread less

243 citations

Proceedings Article•DOI•

Inference of non-overlapping camera network topology by measuring statistical dependence

[...]

Kinh Tieu¹, G. Dalley¹, W.E.L. Grimson¹•Institutions (1)

Massachusetts Institute of Technology¹

17 Oct 2005

TL;DR: An approach for inferring the topology of a camera network by measuring statistical dependence between observations in different cameras is presented, accomplished by non-parametric estimates of statistical dependence and Bayesian integration of the unknown correspondence.

...read moreread less

Abstract: We present an approach for inferring the topology of a camera network by measuring statistical dependence between observations in different cameras. Two cameras are considered connected if objects seen departing in one camera is seen arriving in the other. This is captured by the degree of statistical dependence between the cameras. The nature of dependence is characterized by the distribution of observation transformations between cameras, such as departure to arrival transition times, and color appearance. We show how to measure statistical dependence when the correspondence between observations in different cameras is unknown. This is accomplished by non-parametric estimates of statistical dependence and Bayesian integration of the unknown correspondence. Our approach generalizes previous work which assumed restricted parametric transition distributions and only implicitly dealt with unknown correspondence. Results are shown on simulated and real data. We also describe a technique for learning the absolute locations of the cameras with Global Positioning System (GPS) side information

...read moreread less

161 citations

1
2
3
4
…
5
6
7

Collapse

Cited by

PDF

Open Access

More filters

Proceedings Article•DOI•

Rapid object detection using a boosted cascade of simple features

[...]

Paul A. Viola¹, Michael Jones•Institutions (1)

Mitsubishi¹

01 Dec 2001

TL;DR: A machine learning approach for visual object detection which is capable of processing images extremely rapidly and achieving high detection rates and the introduction of a new image representation called the "integral image" which allows the features used by the detector to be computed very quickly.

...read moreread less

Abstract: This paper describes a machine learning approach for visual object detection which is capable of processing images extremely rapidly and achieving high detection rates. This work is distinguished by three key contributions. The first is the introduction of a new image representation called the "integral image" which allows the features used by our detector to be computed very quickly. The second is a learning algorithm, based on AdaBoost, which selects a small number of critical visual features from a larger set and yields extremely efficient classifiers. The third contribution is a method for combining increasingly more complex classifiers in a "cascade" which allows background regions of the image to be quickly discarded while spending more computation on promising object-like regions. The cascade can be viewed as an object specific focus-of-attention mechanism which unlike previous approaches provides statistical guarantees that discarded regions are unlikely to contain the object of interest. In the domain of face detection the system yields detection rates comparable to the best previous systems. Used in real-time applications, the detector runs at 15 frames per second without resorting to image differencing or skin color detection.

...read moreread less

18,620 citations

Journal Article•DOI•

Robust Real-Time Face Detection

[...]

Paul A. Viola¹, Michael Jones²•Institutions (2)

Microsoft¹, Mitsubishi Electric²

01 May 2004-International Journal of Computer Vision

TL;DR: In this paper, a face detection framework that is capable of processing images extremely rapidly while achieving high detection rates is described. But the detection performance is limited to 15 frames per second.

...read moreread less

Abstract: This paper describes a face detection framework that is capable of processing images extremely rapidly while achieving high detection rates. There are three key contributions. The first is the introduction of a new image representation called the “Integral Image” which allows the features used by our detector to be computed very quickly. The second is a simple and efficient classifier which is built using the AdaBoost learning algorithm (Freund and Schapire, 1995) to select a small number of critical visual features from a very large set of potential features. The third contribution is a method for combining classifiers in a “cascade” which allows background regions of the image to be quickly discarded while spending more computation on promising face-like regions. A set of experiments in the domain of face detection is presented. The system yields face detection performance comparable to the best previous systems (Sung and Poggio, 1998; Rowley et al., 1998; Schneiderman and Kanade, 2000; Roth et al., 2000). Implemented on a conventional desktop, face detection proceeds at 15 frames per second.

...read moreread less

13,037 citations

Proceedings Article•DOI•

Robust real-time face detection

[...]

Paul A. Viola¹, Michael Jones²•Institutions (2)

Microsoft¹, Mitsubishi Electric²

07 Jul 2001

TL;DR: A new image representation called the “Integral Image” is introduced which allows the features used by the detector to be computed very quickly and a method for combining classifiers in a “cascade” which allows background regions of the image to be quickly discarded while spending more computation on promising face-like regions.

...read moreread less

Abstract: This paper describes a face detection framework that is capable of processing images extremely rapidly while achieving high detection rates. There are three key contributions. The first is the introduction of a new image representation called the "Integral Image" which allows the features used by our detector to be computed very quickly. The second is a simple and efficient classifier which is built using the AdaBoost learning algo- rithm (Freund and Schapire, 1995) to select a small number of critical visual features from a very large set of potential features. The third contribution is a method for combining classifiers in a "cascade" which allows back- ground regions of the image to be quickly discarded while spending more computation on promising face-like regions. A set of experiments in the domain of face detection is presented. The system yields face detection perfor- mance comparable to the best previous systems (Sung and Poggio, 1998; Rowley et al., 1998; Schneiderman and Kanade, 2000; Roth et al., 2000). Implemented on a conventional desktop, face detection proceeds at 15 frames per second.

...read moreread less

10,592 citations

Journal Article•DOI•

Content-based image retrieval at the end of the early years

[...]

Arnold W. M. Smeulders¹, Marcel Worring¹, Simone Santini², Amarnath Gupta², Ramesh Jain - Show less +1 more•Institutions (2)

University of Amsterdam¹, University of California, San Diego²

01 Dec 2000-IEEE Transactions on Pattern Analysis and Machine Intelligence

TL;DR: The working conditions of content-based retrieval: patterns of use, types of pictures, the role of semantics, and the sensory gap are discussed, as well as aspects of system engineering: databases, system architecture, and evaluation.

...read moreread less

Abstract: Presents a review of 200 references in content-based image retrieval. The paper starts with discussing the working conditions of content-based retrieval: patterns of use, types of pictures, the role of semantics, and the sensory gap. Subsequent sections discuss computational steps for image retrieval systems. Step one of the review is image processing for retrieval sorted by color, texture, and local geometry. Features for retrieval are discussed next, sorted by: accumulative and global features, salient points, object and shape features, signs, and structural combinations thereof. Similarity of pictures and objects in pictures is reviewed for each of the feature types, in close connection to the types and means of feedback the user of the systems is capable of giving by interaction. We briefly discuss aspects of system engineering: databases, system architecture, and evaluation. In the concluding section, we present our view on: the driving force of the field, the heritage from computer vision, the influence on computer vision, the role of similarity and of interaction, the need for databases, the problem of evaluation, and the role of the semantic gap.

...read moreread less

6,447 citations

Journal Article•DOI•

Object tracking: A survey

[...]

Alper Yilmaz¹, Omar Javed, Mubarak Shah²•Institutions (2)

Ohio State University¹, University of Central Florida²

25 Dec 2006-ACM Computing Surveys

TL;DR: The goal of this article is to review the state-of-the-art tracking methods, classify them into different categories, and identify new trends to discuss the important issues related to tracking including the use of appropriate image features, selection of motion models, and detection of objects.

...read moreread less

Abstract: The goal of this article is to review the state-of-the-art tracking methods, classify them into different categories, and identify new trends. Object tracking, in general, is a challenging problem. Difficulties in tracking objects can arise due to abrupt object motion, changing appearance patterns of both the object and the scene, nonrigid object structures, object-to-object and object-to-scene occlusions, and camera motion. Tracking is usually performed in the context of higher-level applications that require the location and/or shape of the object in every frame. Typically, assumptions are made to constrain the tracking problem in the context of a particular application. In this survey, we categorize the tracking methods on the basis of the object and motion representations used, provide detailed descriptions of representative methods in each category, and examine their pros and cons. Moreover, we discuss the important issues related to tracking including the use of appropriate image features, selection of motion models, and detection of objects.

...read moreread less

5,318 citations

1
2
3
4
…
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200

Collapse