Mei-Chen Yeh

Other affiliations: University of California, Santa Barbara, Intel, National Tsing Hua University

Bio: Mei-Chen Yeh is an academic researcher at National Taiwan Normal University. The author has contributed to research on contextual image classification and video copy detection, has an h-index of 12, and has co-authored 49 publications receiving 2,058 citations. Previous affiliations of Mei-Chen Yeh include University of California, Santa Barbara and Intel.

Papers published on a yearly basis (years shown on chart): 2002, 2004, 2005, 2006, 2008, 2009, 2010, 2011, 2012, 2013, 2014, 2015, 2016, 2017, 2019, 2020, 2021, 2022, 2023

Papers

Proceedings Article

Fast Human Detection Using a Cascade of Histograms of Oriented Gradients

Qiang Zhu¹, Mei-Chen Yeh¹, Kwang-Ting Cheng¹, Shai Avidan²

University of California, Santa Barbara¹, Mitsubishi Electric Research Laboratories²

17 Jun 2006

TL;DR: This work integrates the cascade-of-rejectors approach with the Histograms of Oriented Gradients features to achieve a fast and accurate human detection system that can process 5 to 30 frames per second depending on the density in which the image is scanned, while maintaining an accuracy level similar to existing methods.

Abstract: We integrate the cascade-of-rejectors approach with the Histograms of Oriented Gradients (HoG) features to achieve a fast and accurate human detection system. The features used in our system are HoGs of variable-size blocks that capture salient features of humans automatically. Using AdaBoost for feature selection, we identify the appropriate set of blocks, from a large set of possible blocks. In our system, we use the integral image representation and a rejection cascade which significantly speed up the computation. For a 320 × 280 image, the system can process 5 to 30 frames per second depending on the density in which we scan the image, while maintaining an accuracy level similar to existing methods.
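
The integral image representation mentioned in the abstract can be illustrated with a minimal sketch. It shows how a summed-area table lets the sum over any rectangular block be read with four lookups, regardless of block size, which is what makes variable-size blocks cheap to evaluate. The function names `integral_image` and `rect_sum` are assumptions made for this illustration and do not come from the paper; in a HoG setting one would presumably keep one such table per orientation bin.

```python
def integral_image(img):
    """Return a (h+1) x (w+1) summed-area table for a 2D list of numbers."""
    h = len(img)
    w = len(img[0]) if h else 0
    # Extra zero row and column so lookups at the image border need no special case.
    ii = [[0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        row_sum = 0
        for x in range(w):
            row_sum += img[y][x]
            ii[y + 1][x + 1] = ii[y][x + 1] + row_sum
    return ii


def rect_sum(ii, top, left, bottom, right):
    """Sum of img[y][x] for top <= y < bottom and left <= x < right (four lookups)."""
    return ii[bottom][right] - ii[top][right] - ii[bottom][left] + ii[top][left]


# Hypothetical 3x3 "image" used only to exercise the two functions.
image = [[1, 2, 3],
         [4, 5, 6],
         [7, 8, 9]]
table = integral_image(image)
whole = rect_sum(table, 0, 0, 3, 3)         # sum over the full image
lower_right = rect_sum(table, 1, 1, 3, 3)   # sum over the 2x2 block 5, 6, 8, 9
```

Building the table costs one pass over the image; after that, the cost of a block sum does not depend on the block's area.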

1,626 citations

Proceedings Article

Video copy detection by fast sequence matching

Mei-Chen Yeh¹, Kwang-Ting Cheng¹

University of California, Santa Barbara¹

08 Jul 2009

TL;DR: This paper views video copy detection as a local alignment problem between two frame sequences and proposes a two-level filtration approach which achieves significant acceleration to the matching process and is 18X faster than the original sequence matching algorithms.

Abstract: Sequence matching techniques are effective for comparing two videos. However, existing approaches suffer from demanding computational costs and thus are not scalable for large-scale applications. In this paper we view video copy detection as a local alignment problem between two frame sequences and propose a two-level filtration approach which achieves significant acceleration to the matching process. First, we propose to use an adaptive vocabulary tree to index all frame descriptors extracted from the video database. In this step, each video is treated as a "bag of frames." Such an indexing structure not only provides a rich vocabulary for representing videos, but also enables efficient computation of a pyramid matching kernel between videos. This vocabulary tree filters those videos that are dissimilar to the query based on their histogram pyramid representations. Second, we propose a fast edit-distance-based sequence matching method that avoids unnecessary comparisons between dissimilar frame pairs. This step reduces the quadratic runtime to a linear time with respect to the lengths of the sequences under comparison. Experiments on the MUSCLE VCD benchmark demonstrate that our approach is effective and efficient. It is 18X faster than the original sequence matching algorithms. This technique can be applied to several other visual retrieval tasks including shape retrieval. We demonstrate that the proposed method can also achieve a significant speedup for the shape retrieval task on the MPEG-7 shape dataset.
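
To make the "local alignment between two frame sequences" framing concrete, here is a minimal sketch of a standard Smith-Waterman style local alignment score. This is the plain quadratic-time baseline, not the accelerated method the abstract proposes. The function name `local_alignment_score`, the `is_similar` frame comparison callback, and the scoring values are all assumptions chosen for illustration.

```python
def local_alignment_score(query, reference, is_similar, match=1, mismatch=-1, gap=-1):
    """Best local alignment score between two frame sequences.

    Runs in O(len(query) * len(reference)) time, i.e. the quadratic cost
    that the abstract describes as the bottleneck of sequence matching.
    """
    prev = [0] * (len(reference) + 1)
    best = 0
    for q in query:
        curr = [0]
        for j, r in enumerate(reference, start=1):
            diag = prev[j - 1] + (match if is_similar(q, r) else mismatch)
            # A local alignment may restart anywhere, so scores never drop below 0.
            score = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            curr.append(score)
            best = max(best, score)
        prev = curr
    return best


# Frames are stand-in integers here; real frame descriptors would need
# a thresholded distance inside is_similar.
same = lambda a, b: a == b
copied_segment = local_alignment_score([1, 2, 3], [9, 1, 2, 3, 9], same)
unrelated = local_alignment_score([1, 2, 3], [7, 8, 9], same)
```

A high score relative to the query length suggests that a segment of the reference video is a copy of the query; a filtering step that skips dissimilar frame pairs would avoid filling most of this table.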

70 citations

Proceedings Article

Multimodal fusion using learned text concepts for image categorization

Qiang Zhu¹, Mei-Chen Yeh¹, Kwang-Ting Cheng¹

University of California, Santa Barbara¹

23 Oct 2006

TL;DR: A multimodal fusion scheme which improves the image classification accuracy by incorporating the information derived from the embedded texts detected in the image under classification.

Abstract: Conventional image categorization techniques primarily rely on low-level visual cues. In this paper, we describe a multimodal fusion scheme which improves the image classification accuracy by incorporating the information derived from the embedded texts detected in the image under classification. Specific to each image category, a text concept is first learned from a set of labeled texts in images of the target category using Multiple Instance Learning [1]. For an image under classification which contains multiple detected text lines, we calculate a weighted Euclidean distance between each text line and the learned text concept of the target category. Subsequently, the minimum distance and the low-level visual cues are jointly used as the features for SVM-based classification. Experiments on a challenging image database demonstrate that the proposed fusion framework achieves a higher accuracy than the state-of-the-art methods for image classification.
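
The distance step described in the abstract can be sketched as follows. This assumes each detected text line and the learned text concept are fixed-length numeric feature vectors, and that the weights are per-dimension non-negative numbers; the abstract does not specify these representations, so the function names and the toy values are hypothetical.

```python
import math


def weighted_euclidean(x, concept, weights):
    """Weighted Euclidean distance between a text-line vector and a concept vector."""
    return math.sqrt(sum(w * (a - b) ** 2 for a, b, w in zip(x, concept, weights)))


def min_concept_distance(text_lines, concept, weights):
    """Smallest distance from any detected text line to the learned concept."""
    return min(weighted_euclidean(line, concept, weights) for line in text_lines)


def fused_features(visual_cues, text_lines, concept, weights):
    """Append the minimum text distance to the visual features (assumed fusion by concatenation)."""
    return list(visual_cues) + [min_concept_distance(text_lines, concept, weights)]


# Toy values for illustration only.
concept = [3.0, 4.0]
weights = [1.0, 1.0]
lines = [[0.0, 0.0], [3.0, 3.0]]
features = fused_features([0.2, 0.7], lines, concept, weights)
```

Taking the minimum means a single text line close to the category's concept is enough to signal the category, which matches the multiple-instance view in which only some text lines in an image are relevant.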

48 citations

Patent

Image buffering techniques

Yi-Jen Chiu¹, Mei-Chen Yeh¹

Intel¹

31 Mar 2006

TL;DR: A system, apparatus, method and article to perform buffering techniques are described; the apparatus may include a buffer having a fixed number of storage slots that store reconstructed picture representations received from an image processing module.

Abstract: A system, apparatus, method and article to perform buffering techniques are described. The apparatus may include a buffer having a fixed number of storage slots that store reconstructed picture representations received from an image processing module. Also, the apparatus may include a buffer status unit to store multiple information items to indicate one or more buffer characteristics of the buffer. Further, the apparatus may include a buffer control module to manage storage within the buffer.

46 citations

Proceedings Article

A compact, effective descriptor for video copy detection

Mei-Chen Yeh¹, Kwang-Ting Cheng¹

University of California, Santa Barbara¹

19 Oct 2009

TL;DR: A new frame-level descriptor is proposed that encodes the internal structure of a video frame by computing the pair-wise correlations between geometrically pre-indexed blocks and is conceptually simple, small in size, and fast to compute.

Abstract: Large-scale video copy detection tasks require a compact and computationally efficient descriptor that is robust to the various transformations typically applied to generate copies. In this paper, we propose a new frame-level descriptor for such a task. The descriptor encodes the internal structure of a video frame by computing the pair-wise correlations between geometrically pre-indexed blocks. It is conceptually simple, small in size, and fast to compute. Experiments using the MUSCLE VCD benchmark show its superior performance compared to existing approaches.
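
One possible reading of "pair-wise correlations between geometrically pre-indexed blocks" is sketched below. It is an interpretation for illustration, not the paper's exact descriptor: the frame is assumed to be a 2D list of grayscale intensities, blocks are indexed in raster order on a fixed grid, and the Pearson correlation is assumed as the correlation measure. The names `split_blocks`, `correlation`, and `frame_descriptor` are hypothetical.

```python
import math
from itertools import combinations


def split_blocks(frame, rows, cols):
    """Split a 2D list of intensities into rows*cols blocks, indexed in raster order."""
    block_h = len(frame) // rows
    block_w = len(frame[0]) // cols
    blocks = []
    for r in range(rows):
        for c in range(cols):
            blocks.append([
                frame[y][x]
                for y in range(r * block_h, (r + 1) * block_h)
                for x in range(c * block_w, (c + 1) * block_w)
            ])
    return blocks


def correlation(a, b):
    """Pearson correlation of two equal-length pixel lists; 0.0 if either is constant."""
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0
    return cov / math.sqrt(var_a * var_b)


def frame_descriptor(frame, rows=2, cols=2):
    """One correlation value per unordered pair of blocks, in a fixed pair order."""
    blocks = split_blocks(frame, rows, cols)
    return [correlation(a, b) for a, b in combinations(blocks, 2)]


# Toy 4x4 frame; a 2x2 grid gives 4 blocks and therefore 6 block pairs.
frame = [[1, 2, 2, 4],
         [3, 4, 6, 8],
         [4, 3, 0, 0],
         [2, 1, 0, 0]]
descriptor = frame_descriptor(frame)
```

Under this reading the descriptor length depends only on the grid size, not on the frame resolution, and correlation is unchanged by a uniform brightness or contrast change applied to the whole frame, which is consistent with the abstract's emphasis on compactness and robustness.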

36 citations

Cited by

Pattern Recognition and Machine Learning

Christopher M. Bishop¹

Microsoft¹

01 Jan 2006

TL;DR: This textbook covers probability distributions, linear models for regression and classification, neural networks, kernel methods, sparse kernel machines, graphical models, mixture models and EM, approximate inference, sampling methods, continuous latent variables, sequential data, and combining models.

Abstract: Probability Distributions; Linear Models for Regression; Linear Models for Classification; Neural Networks; Kernel Methods; Sparse Kernel Machines; Graphical Models; Mixture Models and EM; Approximate Inference; Sampling Methods; Continuous Latent Variables; Sequential Data; Combining Models.

10,141 citations

Journal Article

Joint Face Detection and Alignment Using Multitask Cascaded Convolutional Networks

Kaipeng Zhang¹, Zhanpeng Zhang², Zhifeng Li¹, Yu Qiao¹

Chinese Academy of Sciences¹, The Chinese University of Hong Kong²

26 Aug 2016, IEEE Signal Processing Letters

TL;DR: This letter proposes a deep cascaded multitask framework that exploits the inherent correlation between detection and alignment to boost their performance, using a cascaded architecture with three stages of carefully designed deep convolutional networks to predict face and landmark locations in a coarse-to-fine manner.

Abstract: Face detection and alignment in unconstrained environments are challenging due to various poses, illuminations, and occlusions. Recent studies show that deep learning approaches can achieve impressive performance on these two tasks. In this letter, we propose a deep cascaded multitask framework that exploits the inherent correlation between detection and alignment to boost their performance. In particular, our framework leverages a cascaded architecture with three stages of carefully designed deep convolutional networks to predict face and landmark locations in a coarse-to-fine manner. In addition, we propose a new online hard sample mining strategy that further improves the performance in practice. Our method achieves superior accuracy over the state-of-the-art techniques on the challenging Face Detection Data Set and Benchmark (FDDB) and WIDER FACE benchmarks for face detection, and on the Annotated Facial Landmarks in the Wild (AFLW) benchmark for face alignment, while keeping real-time performance.

3,980 citations

Journal Article

Pedestrian Detection: An Evaluation of the State of the Art

Piotr Dollár¹, Christian Wojek², Bernt Schiele², Pietro Perona¹

California Institute of Technology¹, Max Planck Society²

01 Apr 2012, IEEE Transactions on Pattern Analysis and Machine Intelligence

TL;DR: An extensive evaluation of the state of the art in a unified framework of monocular pedestrian detection using sixteen pretrained state-of-the-art detectors across six data sets and proposes a refined per-frame evaluation methodology.

Abstract: Pedestrian detection is a key problem in computer vision, with several applications that have the potential to positively impact quality of life. In recent years, the number of approaches to detecting pedestrians in monocular images has grown steadily. However, multiple data sets and widely varying evaluation protocols are used, making direct comparisons difficult. To address these shortcomings, we perform an extensive evaluation of the state of the art in a unified framework. We make three primary contributions: 1) We put together a large, well-annotated, and realistic monocular pedestrian detection data set and study the statistics of the size, position, and occlusion patterns of pedestrians in urban scenes, 2) we propose a refined per-frame evaluation methodology that allows us to carry out probing and informative comparisons, including measuring performance in relation to scale and occlusion, and 3) we evaluate the performance of sixteen pretrained state-of-the-art detectors across six data sets. Our study allows us to assess the state of the art and provides a framework for gauging future efforts. Our experiments show that despite significant progress, performance still has much room for improvement. In particular, detection is disappointing at low resolutions and for partially occluded pedestrians.

3,170 citations

Journal Article

Fast Feature Pyramids for Object Detection

Piotr Dollár¹, Ron Appel², Serge Belongie³, Pietro Perona²

Microsoft¹, California Institute of Technology², Cornell University³

01 Aug 2014, IEEE Transactions on Pattern Analysis and Machine Intelligence

TL;DR: For a broad family of features, this work finds that features computed at octave-spaced scale intervals are sufficient to approximate features on a finely-sampled pyramid, and this approximation yields considerable speedups with negligible loss in detection accuracy.

Abstract: Multi-resolution image features may be approximated via extrapolation from nearby scales, rather than being computed explicitly. This fundamental insight allows us to design object detection algorithms that are as accurate as, and considerably faster than, the state-of-the-art. The computational bottleneck of many modern detectors is the computation of features at every scale of a finely-sampled image pyramid. Our key insight is that one may compute finely sampled feature pyramids at a fraction of the cost, without sacrificing performance: for a broad family of features we find that features computed at octave-spaced scale intervals are sufficient to approximate features on a finely-sampled pyramid. Extrapolation is inexpensive as compared to direct feature computation. As a result, our approximation yields considerable speedups with negligible loss in detection accuracy. We modify three diverse visual recognition systems to use fast feature pyramids and show results on both pedestrian detection (measured on the Caltech, INRIA, TUD-Brussels and ETH data sets) and general object detection (measured on the PASCAL VOC). The approach is general and is widely applicable to vision algorithms requiring fine-grained multi-scale analysis. Our approximation is valid for images with broad spectra (most natural images) and fails for images with narrow band-pass spectra (e.g., periodic textures).

2,000 citations

Proceedings Article

An HOG-LBP human detector with partial occlusion handling

Xiaoyu Wang¹, Tony X. Han¹, Shuicheng Yan²

University of Missouri¹, National University of Singapore²

01 Sep 2009

TL;DR: By combining Histograms of Oriented Gradients (HOG) and Local Binary Pattern (LBP) as the feature set, this work proposes a novel human detection approach capable of handling partial occlusion and achieves the best human detection performance on the INRIA dataset.

Abstract: By combining Histograms of Oriented Gradients (HOG) and Local Binary Pattern (LBP) as the feature set, we propose a novel human detection approach capable of handling partial occlusion. Two kinds of detectors, i.e., global detector for whole scanning windows and part detectors for local regions, are learned from the training data using linear SVM. For each ambiguous scanning window, we construct an occlusion likelihood map by using the response of each block of the HOG feature to the global detector. The occlusion likelihood map is then segmented by Mean-shift approach. The segmented portion of the window with a majority of negative response is inferred as an occluded region. If partial occlusion is indicated with high likelihood in a certain scanning window, part detectors are applied on the unoccluded regions to achieve the final classification on the current scanning window. With the help of the augmented HOG-LBP feature and the global-part occlusion handling method, we achieve a detection rate of 91.3% with FPPW = 10⁻⁶, 94.7% with FPPW = 10⁻⁵, and 97.9% with FPPW = 10⁻⁴ on the INRIA dataset, which, to our best knowledge, is the best human detection performance on the INRIA dataset. The global-part occlusion handling method is further validated using synthesized occlusion data constructed from the INRIA and Pascal dataset.

1,838 citations
