Author

Terence Sim

Other affiliations: Carnegie Mellon University
Bio: Terence Sim is an academic researcher from the National University of Singapore. The author has contributed to research in the topics of facial recognition systems and face detection, has an h-index of 30, and has co-authored 125 publications receiving 6,917 citations. Previous affiliations of Terence Sim include Carnegie Mellon University.


Papers
Journal ArticleDOI
TL;DR: In the Fall of 2000, a database of more than 40,000 facial images of 68 people was collected using the Carnegie Mellon University 3D Room, imaging each person across 13 different poses, under 43 different illumination conditions, and with four different expressions.
Abstract: In the Fall of 2000, we collected a database of more than 40,000 facial images of 68 people. Using the Carnegie Mellon University 3D Room, we imaged each person across 13 different poses, under 43 different illumination conditions, and with four different expressions. We call this the CMU pose, illumination, and expression (PIE) database. We describe the imaging hardware, the collection procedure, the organization of the images, several possible uses, and how to obtain the database.

1,880 citations

Proceedings ArticleDOI
20 May 2002
TL;DR: Between October 2000 and December 2000, a database of over 40,000 facial images of 68 people was collected using the CMU 3D Room, imaging each person across 13 different poses, under 43 different illumination conditions, and with four different expressions.
Abstract: Between October 2000 and December 2000, we collected a database of over 40,000 facial images of 68 people. Using the CMU (Carnegie Mellon University) 3D Room, we imaged each person across 13 different poses, under 43 different illumination conditions, and with four different expressions. We call this database the CMU Pose, Illumination and Expression (PIE) database. In this paper, we describe the imaging hardware, the collection procedure, the organization of the database, several potential uses of the database, and how to obtain the database.

1,697 citations

Proceedings ArticleDOI
07 Jun 2015
TL;DR: This paper proposes a novel approach to correlation filter estimation that takes advantage of inherent computational redundancies in the frequency domain, dramatically reduces boundary effects, and is able to implicitly exploit all possible patches densely extracted from training examples during learning process.
Abstract: Correlation filters take advantage of specific properties in the Fourier domain allowing them to be estimated efficiently: O(ND log D) in the frequency domain, versus O(D³ + ND²) spatially, where D is the signal length and N is the number of signals. Recent extensions to correlation filters, such as MOSSE, have reignited interest in their use in the vision community due to their robustness and attractive computational properties. In this paper we demonstrate, however, that this computational efficiency comes at a cost. Specifically, we demonstrate that only a 1/D proportion of shifted examples are unaffected by boundary effects, which has a dramatic effect on detection/tracking performance. In this paper, we propose a novel approach to correlation filter estimation that: (i) takes advantage of inherent computational redundancies in the frequency domain, (ii) dramatically reduces boundary effects, and (iii) is able to implicitly exploit all possible patches densely extracted from training examples during the learning process. Impressive object tracking and detection results are presented in terms of both accuracy and computational efficiency.

373 citations
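The boundary-effect argument is easiest to see against the baseline this paper improves on. Below is a minimal NumPy sketch of a MOSSE-style correlation filter trained in the frequency domain, which is what gives the O(ND log D) cost; it is not the authors' boundary-aware estimator, and the function names and regularizer value are illustrative. Because the FFT treats every training patch as circularly shifted, only about 1/D of the implicit shifted examples are free of wrap-around artifacts, which is exactly the cost the paper addresses.

```python
import numpy as np

def train_mosse(examples, target, lam=1e-2):
    # Closed-form correlation filter in the frequency domain.
    # examples: list of N patches, each of shape (D1, D2)
    # target:   desired response map, e.g. a centred Gaussian peak
    # Cost is O(N * D * log D) with D = D1 * D2, versus
    # O(D^3 + N * D^2) for the equivalent spatial least squares.
    G = np.fft.fft2(target)
    num = np.zeros_like(G)
    den = np.full(G.shape, lam, dtype=complex)   # ridge regulariser
    for x in examples:
        X = np.fft.fft2(x)
        num += np.conj(X) * G
        den += np.conj(X) * X
    return num / den                             # filter, frequency domain

def respond(H_f, patch):
    # Response map over all circular shifts of the test patch;
    # the peak marks the detected location. Note the circularity:
    # shifts wrap around, which is the boundary effect discussed above.
    return np.real(np.fft.ifft2(H_f * np.fft.fft2(patch)))
```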

Journal ArticleDOI
TL;DR: This paper presents a simple yet effective approach for estimating the amount of spatially varying defocus blur at edge locations, and demonstrates that the method provides a reliable estimate of the defocus map.

370 citations
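The TL;DR above is terse, so here is a rough sketch of the standard re-blurring trick for measuring defocus at edges, which, to the best of my reading, is the idea behind this paper: re-blur the image with a known Gaussian and use the ratio of gradient magnitudes at an edge to recover the unknown blur scale. The parameter values and edge threshold are illustrative assumptions, and the paper's step of propagating sparse edge estimates into a full defocus map is omitted.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def defocus_at_edges(img, sigma0=1.0, edge_thresh=0.05, eps=1e-8):
    # For a step edge blurred by a Gaussian of scale sigma, re-blurring
    # with a known sigma0 scales the edge gradient magnitude by
    #   R = sqrt(sigma^2 + sigma0^2) / sigma,
    # hence sigma = sigma0 / sqrt(R^2 - 1).
    # Assumes intensities roughly in [0, 1].
    img = img.astype(float)
    gy, gx = np.gradient(img)
    g1 = np.hypot(gx, gy)                        # original gradients
    ry, rx = np.gradient(gaussian_filter(img, sigma0))
    g2 = np.hypot(rx, ry)                        # re-blurred gradients
    R = g1 / (g2 + eps)
    R = np.clip(R, 1.0 + 1e-6, None)             # ratio is >= 1 in theory
    sigma = sigma0 / np.sqrt(R**2 - 1.0)
    sigma[g1 < edge_thresh] = np.nan             # only trust strong edges
    return sigma                                 # sparse defocus estimates
```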

Proceedings ArticleDOI
01 Dec 2013
TL;DR: A novel framework, referred to as a multi-channel correlation filter, is proposed for learning a multi-channel detector/filter efficiently in the frequency domain, in terms of both training time and memory footprint.
Abstract: Modern descriptors like HOG and SIFT are now commonly used in vision for pattern detection within images and video. From a signal processing perspective, this detection process can be efficiently posed as a correlation/convolution between a multi-channel image and a multi-channel detector/filter which results in a single-channel response map indicating where the pattern (e.g. object) has occurred. In this paper, we propose a novel framework for learning a multi-channel detector/filter efficiently in the frequency domain, both in terms of training time and memory footprint, which we refer to as a multi-channel correlation filter. To demonstrate the effectiveness of our strategy, we evaluate it across a number of visual detection/localization tasks where we: (i) exhibit superior performance to current state-of-the-art correlation filters, and (ii) achieve superior computational and memory efficiency compared to state-of-the-art spatial detectors.

276 citations
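As a reading aid, here is a minimal sketch of the per-frequency trick the abstract describes: in the Fourier domain, the large multi-channel least-squares problem decouples into one tiny K×K ridge-regression solve per frequency bin. This is an illustrative reimplementation under my own conventions (unvectorised for clarity), not the authors' code.

```python
import numpy as np

def train_mccf(examples, target, lam=1e-2):
    # examples: (N, K, D1, D2) -- N training images with K feature channels
    # target:   (D1, D2) desired single-channel response map
    N, K, D1, D2 = examples.shape
    X = np.fft.fft2(examples, axes=(-2, -1))        # (N, K, D1, D2)
    G = np.fft.fft2(target)                         # (D1, D2)
    H = np.zeros((K, D1, D2), dtype=complex)
    for u in range(D1):                             # per-frequency solves:
        for v in range(D2):                         # each is only K x K
            xs = X[:, :, u, v]                      # (N, K) channel values
            A = xs.T @ xs.conj() + lam * np.eye(K)  # K x K normal equations
            b = xs.T @ np.full(N, np.conj(G[u, v]))
            H[:, u, v] = np.linalg.solve(A, b)
    return H

def respond(H, image_channels):
    # Per-channel correlation, summed into one response map.
    Z = np.fft.fft2(image_channels, axes=(-2, -1))  # (K, D1, D2)
    return np.real(np.fft.ifft2((np.conj(H) * Z).sum(axis=0)))
```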


Cited by
Journal ArticleDOI
TL;DR: Machine learning addresses many of the same research questions as the fields of statistics, data mining, and psychology, but with differences of emphasis.
Abstract: Machine Learning is the study of methods for programming computers to learn. Computers are applied to a wide range of tasks, and for most of these it is relatively easy for programmers to design and implement the necessary software. However, there are many tasks for which this is difficult or impossible. These can be divided into four general categories. First, there are problems for which there exist no human experts. For example, in modern automated manufacturing facilities, there is a need to predict machine failures before they occur by analyzing sensor readings. Because the machines are new, there are no human experts who can be interviewed by a programmer to provide the knowledge necessary to build a computer system. A machine learning system can study recorded data and subsequent machine failures and learn prediction rules. Second, there are problems where human experts exist, but where they are unable to explain their expertise. This is the case in many perceptual tasks, such as speech recognition, handwriting recognition, and natural language understanding. Virtually all humans exhibit expert-level abilities on these tasks, but none of them can describe the detailed steps that they follow as they perform them. Fortunately, humans can provide machines with examples of the inputs and correct outputs for these tasks, so machine learning algorithms can learn to map the inputs to the outputs. Third, there are problems where phenomena are changing rapidly. In finance, for example, people would like to predict the future behavior of the stock market, of consumer purchases, or of exchange rates. These behaviors change frequently, so that even if a programmer could construct a good predictive computer program, it would need to be rewritten frequently. A learning program can relieve the programmer of this burden by constantly modifying and tuning a set of learned prediction rules. Fourth, there are applications that need to be customized for each computer user separately. Consider, for example, a program to filter unwanted electronic mail messages. Different users will need different filters. It is unreasonable to expect each user to program his or her own rules, and it is infeasible to provide every user with a software engineer to keep the rules up-to-date. A machine learning system can learn which mail messages the user rejects and maintain the filtering rules automatically. Machine learning addresses many of the same research questions as the fields of statistics, data mining, and psychology, but with differences of emphasis. Statistics focuses on understanding the phenomena that have generated the data, often with the goal of testing different hypotheses about those phenomena. Data mining seeks to find patterns in the data that are understandable by people. Psychological studies of human learning aspire to understand the mechanisms underlying the various learning behaviors exhibited by people (concept learning, skill acquisition, strategy change, etc.).

13,246 citations

Proceedings ArticleDOI
07 Jun 2015
TL;DR: A system that directly learns a mapping from face images to a compact Euclidean space where distances directly correspond to a measure of face similarity, and achieves state-of-the-art face recognition performance using only 128 bytes per face.
Abstract: Despite significant recent advances in the field of face recognition [10, 14, 15, 17], implementing face verification and recognition efficiently at scale presents serious challenges to current approaches. In this paper we present a system, called FaceNet, that directly learns a mapping from face images to a compact Euclidean space where distances directly correspond to a measure of face similarity. Once this space has been produced, tasks such as face recognition, verification and clustering can be easily implemented using standard techniques with FaceNet embeddings as feature vectors.

8,289 citations
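The key property, as the abstract says, is that verification reduces to a distance check in the embedding space. Below is a minimal sketch, assuming precomputed, unit-normalised embeddings; the threshold and margin values are illustrative, not the paper's tuned numbers (FaceNet trains the embedding with a triplet loss, shown second).

```python
import numpy as np

def verify(emb_a, emb_b, threshold=1.1):
    # Two faces match if the squared L2 distance between their
    # embeddings falls below a threshold tuned on a validation set.
    # The threshold value here is an illustrative assumption.
    return np.sum((emb_a - emb_b) ** 2) < threshold

def triplet_loss(anchor, positive, negative, margin=0.2):
    # Triplet loss for training such embeddings: pull the anchor
    # towards a positive (same identity) and push it away from a
    # negative (different identity) by at least the margin.
    d_ap = np.sum((anchor - positive) ** 2, axis=-1)
    d_an = np.sum((anchor - negative) ** 2, axis=-1)
    return np.maximum(d_ap - d_an + margin, 0.0).mean()
```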

01 Oct 2008
TL;DR: The database contains labeled face photographs spanning the range of conditions typically encountered in everyday life, and exhibits “natural” variability in factors such as pose, lighting, race, accessories, occlusions, and background.
Abstract: Most face databases have been created under controlled conditions to facilitate the study of specific parameters on the face recognition problem. These parameters include such variables as position, pose, lighting, background, camera quality, and gender. While there are many applications for face recognition technology in which one can control the parameters of image acquisition, there are also many applications in which the practitioner has little or no control over such parameters. This database, Labeled Faces in the Wild, is provided as an aid in studying the latter, unconstrained, recognition problem. The database contains labeled face photographs spanning the range of conditions typically encountered in everyday life. The database exhibits “natural” variability in factors such as pose, lighting, race, accessories, occlusions, and background. In addition to describing the details of the database, we provide specific experimental paradigms for which the database is suitable. This is done in an effort to make research performed with the database as consistent and comparable as possible. We provide baseline results, including results of a state of the art face recognition system combined with a face alignment system. To facilitate experimentation on the database, we provide several parallel databases, including an aligned version.

5,742 citations

01 Jan 1964
TL;DR: In this book, remembering is examined as a study in social psychology, presenting a theory of remembering and introducing the notion of a collective unconscious.
Abstract: Part I. Experimental Studies: 1. Experiment in psychology; 2. Experiments on perceiving; 3. Experiments on imaging; 4-8. Experiments on remembering: (a) the method of description, (b) the method of repeated reproduction, (c) the method of picture writing, (d) the method of serial reproduction, (e) the method of serial reproduction of picture material; 9. Perceiving, recognizing, remembering; 10. A theory of remembering; 11. Images and their functions; 12. Meaning. Part II. Remembering as a Study in Social Psychology: 13. Social psychology; 14. Social psychology and the matter of recall; 15. Social psychology and the manner of recall; 16. Conventionalism; 17. The notion of a collective unconscious; 18. The basis of social recall; 19. A summary and some conclusions.

5,690 citations

Journal ArticleDOI
TL;DR: A new kernelized correlation filter (KCF) is derived that, unlike other kernel algorithms, has the exact same complexity as its linear counterpart; a fast multi-channel extension via a linear kernel, called the dual correlation filter (DCF), is also proposed. Both outperform top-ranking trackers such as Struck and TLD on a 50-video benchmark, despite being implemented in a few lines of code.
Abstract: The core component of most modern trackers is a discriminative classifier, tasked with distinguishing between the target and the surrounding environment. To cope with natural image changes, this classifier is typically trained with translated and scaled sample patches. Such sets of samples are riddled with redundancies—any overlapping pixels are constrained to be the same. Based on this simple observation, we propose an analytic model for datasets of thousands of translated patches. By showing that the resulting data matrix is circulant, we can diagonalize it with the discrete Fourier transform, reducing both storage and computation by several orders of magnitude. Interestingly, for linear regression our formulation is equivalent to a correlation filter, used by some of the fastest competitive trackers. For kernel regression, however, we derive a new kernelized correlation filter (KCF) that, unlike other kernel algorithms, has the exact same complexity as its linear counterpart. Building on it, we also propose a fast multi-channel extension of linear correlation filters, via a linear kernel, which we call the dual correlation filter (DCF). Both KCF and DCF outperform top-ranking trackers such as Struck or TLD on a 50-video benchmark, despite running at hundreds of frames per second, and being implemented in a few lines of code (Algorithm 1). To encourage further developments, our tracking framework was made open-source.

4,994 citations
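Since the abstract emphasises that the tracker fits in a few lines, here is a minimal 1-D NumPy sketch of the circulant trick it describes: the kernel matrix over all circular shifts diagonalises under the DFT, so kernel ridge regression has a closed form as a pointwise division in the frequency domain. Function names and the Gaussian-kernel normalisation are my own choices, not the authors' exact code.

```python
import numpy as np

def gaussian_correlation(x, z, sigma):
    # Kernel values between x and every circular shift of z, computed
    # for all shifts at once with a single FFT ("kernel correlation").
    xz = np.real(np.fft.ifft(np.conj(np.fft.fft(x)) * np.fft.fft(z)))
    d = x @ x + z @ z - 2.0 * xz          # squared distance per shift
    # dividing by len(x) is a common normalisation choice, not mandatory
    return np.exp(-np.maximum(d, 0.0) / (sigma**2 * len(x)))

def train_kcf(x, y, sigma=0.5, lam=1e-4):
    # Kernel ridge regression over all circular shifts of template x.
    # Because the kernel matrix of shifts is circulant, the DFT
    # diagonalises it: alpha_hat = y_hat / (k_hat + lambda).
    k = gaussian_correlation(x, x, sigma)
    return np.fft.fft(y) / (np.fft.fft(k) + lam)   # dual coefficients

def respond(alpha_f, x, z, sigma):
    # Response at every shift of a test signal z; the peak is the target.
    kz = gaussian_correlation(x, z, sigma)
    return np.real(np.fft.ifft(alpha_f * np.fft.fft(kz)))
```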