Author

Matthew Turk

Bio: Matthew Turk is an academic researcher from Toyota Technological Institute at Chicago. The author has contributed to research in topics: Augmented reality & Facial recognition system. The author has an h-index of 55 and has co-authored 198 publications receiving 30972 citations. Previous affiliations of Matthew Turk include Massachusetts Institute of Technology & University of California.


Papers
Proceedings ArticleDOI
TL;DR: This work reviews and categorizes algorithms for content-aware image retargeting, i.e., resizing an image while taking its content into consideration to preserve important regions and minimize distortions, a challenging problem because it requires preserving the relevant information while maintaining an aesthetically pleasing image for the user.
Abstract: Advances in imaging technology have made the capture and display of digital images ubiquitous. A variety of displays are used to view them, ranging from high-resolution computer monitors to low-resolution mobile devices, and images often have to undergo changes in size and aspect ratio to adapt to different screens. Also, displaying and printing documents with embedded images frequently entail resizing of the images to comply with the overall layout. Straightforward image resizing operators, such as scaling, often do not produce satisfactory results, since they are oblivious to image content. In this work, we review and categorize algorithms for content-aware image retargeting, i.e., resizing an image while taking its content into consideration to preserve important regions and minimize distortions. This is a challenging problem, as it requires preserving the relevant information while maintaining an aesthetically pleasing image for the user. The techniques typically start by computing an importance map which represents the relevance of every pixel, and then apply an operator that resizes the image while taking into account the importance map and additional constraints. We intend this review to be useful to researchers and practitioners interested in image retargeting.

146 citations
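
The two-stage pipeline described in the retargeting abstract above (compute an importance map, then apply a resizing operator guided by it) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the review: the importance map is plain gradient magnitude, the operator simply drops whole columns with the lowest total importance, and the function names `importance_map` and `retarget_width` are hypothetical.

```python
import numpy as np

def importance_map(gray):
    """Per-pixel importance as gradient magnitude (an assumed, simple choice)."""
    gy, gx = np.gradient(gray.astype(float))
    return np.hypot(gx, gy)

def retarget_width(gray, new_width):
    """Shrink the width by dropping the columns with the lowest total importance."""
    imp = importance_map(gray)
    col_score = imp.sum(axis=0)
    # Keep the new_width highest-scoring columns, preserving left-to-right order.
    keep = np.sort(np.argsort(col_score)[-new_width:])
    return gray[:, keep]

# Toy image: flat background with a textured block in columns 4..7.
img = np.zeros((6, 12))
img[1:5, 4:8] = np.arange(16).reshape(4, 4)
out = retarget_width(img, 8)
print(out.shape)
```

Published operators such as seam carving or warping are considerably more careful than whole-column removal; the sketch only shows how an importance map can steer which pixels survive a resize.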

Proceedings ArticleDOI
21 Sep 2012
TL;DR: A framework and prototype implementation for unobtrusive mobile remote collaboration on tasks that involve the physical environment using the Augmented Reality paradigm and model-free, markerless visual tracking to facilitate decoupled, live updated views of the environment and world-stabilized annotations while supporting a moving camera and unknown, unprepared environments is described.
Abstract: We describe a framework and prototype implementation for unobtrusive mobile remote collaboration on tasks that involve the physical environment. Our system uses the Augmented Reality paradigm and model-free, markerless visual tracking to facilitate decoupled, live updated views of the environment and world-stabilized annotations while supporting a moving camera and unknown, unprepared environments. In order to evaluate our concept and prototype, we conducted a user study with 48 participants in which a remote expert instructed a local user to operate a mock-up airplane cockpit. Users performed significantly better with our prototype (40.8 tasks completed on average) as well as with static annotations (37.3) than without annotations (28.9). 79% of the users preferred our prototype despite noticeably imperfect tracking.

139 citations

Journal ArticleDOI
TL;DR: In this paper, the authors present Event Horizon Telescope (EHT) 1.3 mm measurements of the radio source located at the position of the supermassive black hole Sagittarius A* (Sgr A*), collected during the 2017 April 5–11 campaign.
Abstract: We present Event Horizon Telescope (EHT) 1.3 mm measurements of the radio source located at the position of the supermassive black hole Sagittarius A* (Sgr A*), collected during the 2017 April 5–11 campaign. The observations were carried out with eight facilities at six locations across the globe. Novel calibration methods are employed to account for Sgr A*'s flux variability. The majority of the 1.3 mm emission arises from horizon scales, where intrinsic structural source variability is detected on timescales of minutes to hours. The effects of interstellar scattering on the image and its variability are found to be subdominant to intrinsic source structure. The calibrated visibility amplitudes, particularly the locations of the visibility minima, are broadly consistent with a blurred ring with a diameter of ∼50 μas, as determined in later works in this series. Contemporaneous multiwavelength monitoring of Sgr A* was performed at 22, 43, and 86 GHz and at near-infrared and X-ray wavelengths. Several X-ray flares from Sgr A* are detected by Chandra, one at low significance jointly with Swift on 2017 April 7 and the other at higher significance jointly with NuSTAR on 2017 April 11. The brighter April 11 flare is not observed simultaneously by the EHT but is followed by a significant increase in millimeter flux variability immediately after the X-ray outburst, indicating a likely connection in the emission physics near the event horizon. We compare Sgr A*’s broadband flux during the EHT campaign to its historical spectral energy distribution and find that both the quiescent emission and flare emission are consistent with its long-term behavior.

137 citations

Proceedings ArticleDOI
17 Oct 2003
TL;DR: A nonlinear alignment algorithm is proposed that keeps the semantic similarity of facial expression from different subjects on one generalized manifold, and nonlinear alignment is shown to outperform linear alignment in expression classification.
Abstract: We propose the concept of manifold of facial expression based on the observation that images of a subject's facial expressions define a smooth manifold in the high dimensional image space. Such a manifold representation can provide a unified framework for facial expression analysis. We first apply active wavelet networks (AWN) on the image sequences for facial feature localization. To learn the structure of the manifold in the feature space derived by AWN, we investigated two types of embeddings from a high dimensional space to a low dimensional space: locally linear embedding (LLE) and Lipschitz embedding. Our experiments show that LLE is suitable for visualizing expression manifolds. After applying Lipschitz embedding, the expression manifold can be approximately considered as a super-spherical surface in the embedding space. For manifolds derived from different subjects, we propose a nonlinear alignment algorithm that keeps the semantic similarity of facial expression from different subjects on one generalized manifold. We also show that nonlinear alignment outperforms linear alignment in expression classification.

135 citations

Proceedings ArticleDOI
27 Jun 2004
TL;DR: A probabilistic video-based facial expression recognition method on manifolds is proposed; experiments demonstrate that the probabilistic approach can recognize expression transitions effectively, and image sequences of changing expressions are synthesized through the manifold model.
Abstract: In this paper, we propose a probabilistic video-based facial expression recognition method on manifolds. The concept of the manifold of facial expression is based on the observation that the images of all possible facial deformations of an individual make a smooth manifold embedded in a high dimensional image space. An enhanced Lipschitz embedding is developed to embed the aligned face appearance in a low dimensional space while keeping the main structure of the manifold. In the embedded space, a complete expression sequence becomes a path on the expression manifold, emanating from a center that corresponds to the neutral expression. Each path consists of several clusters. A probabilistic model of transition between the clusters and paths is learned through training videos in the embedded space. The likelihood of one kind of facial expression is modeled as a mixture density with the clusters as mixture centers. The transition between different expressions is represented as the evolution of the posterior probability of the six basic paths. The experimental results demonstrate that the probabilistic approach can recognize expression transitions effectively. We also synthesize image sequences of changing expressions through the manifold model.

125 citations


Cited by
Journal ArticleDOI
22 Dec 2000-Science
TL;DR: An approach to solving dimensionality reduction problems that uses easily measured local metric information to learn the underlying global geometry of a data set and efficiently computes a globally optimal solution, and is guaranteed to converge asymptotically to the true structure.
Abstract: Scientists working with large volumes of high-dimensional data, such as global climate patterns, stellar spectra, or human gene distributions, regularly confront the problem of dimensionality reduction: finding meaningful low-dimensional structures hidden in their high-dimensional observations. The human brain confronts the same problem in everyday perception, extracting from its high-dimensional sensory inputs (30,000 auditory nerve fibers or 10^6 optic nerve fibers) a manageably small number of perceptually relevant features. Here we describe an approach to solving dimensionality reduction problems that uses easily measured local metric information to learn the underlying global geometry of a data set. Unlike classical techniques such as principal component analysis (PCA) and multidimensional scaling (MDS), our approach is capable of discovering the nonlinear degrees of freedom that underlie complex natural observations, such as human handwriting or images of a face under different viewing conditions. In contrast to previous algorithms for nonlinear dimensionality reduction, ours efficiently computes a globally optimal solution, and, for an important class of data manifolds, is guaranteed to converge asymptotically to the true structure.

13,652 citations

Journal ArticleDOI
TL;DR: A face recognition algorithm which is insensitive to large variation in lighting direction and facial expression is developed, based on Fisher's linear discriminant and produces well separated classes in a low-dimensional subspace, even under severe variations in lighting and facial expressions.
Abstract: We develop a face recognition algorithm which is insensitive to large variation in lighting direction and facial expression. Taking a pattern classification approach, we consider each pixel in an image as a coordinate in a high-dimensional space. We take advantage of the observation that the images of a particular face, under varying illumination but fixed pose, lie in a 3D linear subspace of the high dimensional image space, if the face is a Lambertian surface without shadowing. However, since faces are not truly Lambertian surfaces and do indeed produce self-shadowing, images will deviate from this linear subspace. Rather than explicitly modeling this deviation, we linearly project the image into a subspace in a manner which discounts those regions of the face with large deviation. Our projection method is based on Fisher's linear discriminant and produces well separated classes in a low-dimensional subspace, even under severe variation in lighting and facial expressions. The eigenface technique, another method based on linearly projecting the image space to a low dimensional subspace, has similar computational requirements. Yet, extensive experimental results demonstrate that the proposed "Fisherface" method has error rates that are lower than those of the eigenface technique for tests on the Harvard and Yale face databases.

11,674 citations

Journal ArticleDOI
21 Oct 1999-Nature
TL;DR: An algorithm for non-negative matrix factorization is demonstrated that is able to learn parts of faces and semantic features of text and is in contrast to other methods that learn holistic, not parts-based, representations.
Abstract: Is perception of the whole based on perception of its parts? There is psychological and physiological evidence for parts-based representations in the brain, and certain computational theories of object recognition rely on such representations. But little is known about how brains or computers might learn the parts of objects. Here we demonstrate an algorithm for non-negative matrix factorization that is able to learn parts of faces and semantic features of text. This is in contrast to other methods, such as principal components analysis and vector quantization, that learn holistic, not parts-based, representations. Non-negative matrix factorization is distinguished from the other methods by its use of non-negativity constraints. These constraints lead to a parts-based representation because they allow only additive, not subtractive, combinations. When non-negative matrix factorization is implemented as a neural network, parts-based representations emerge by virtue of two properties: the firing rates of neurons are never negative and synaptic strengths do not change sign.

11,500 citations
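
The additive-only constraint described in the non-negative matrix factorization abstract above can be made concrete with a short sketch. The abstract does not state an update rule, so the choice here is an assumption: the widely used multiplicative updates for the squared-error objective, which keep both factors non-negative because they only ever multiply by non-negative ratios. The function name `nmf` and the parameters `rank`, `iters`, `eps`, and `seed` are hypothetical.

```python
import numpy as np

def nmf(V, rank, iters=200, eps=1e-9, seed=0):
    """Factor a non-negative matrix V (n x m) as W @ H with W, H >= 0."""
    rng = np.random.default_rng(seed)
    n, m = V.shape
    W = rng.random((n, rank)) + eps
    H = rng.random((rank, m)) + eps
    for _ in range(iters):
        # Multiplicative updates: factors stay non-negative by construction.
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H

# Toy data: a non-negative matrix of exact rank 3.
rng = np.random.default_rng(1)
V = rng.random((8, 3)) @ rng.random((3, 10))
W, H = nmf(V, rank=3)
print(np.linalg.norm(V - W @ H))
```

Because no entry of `W` or `H` can become negative, each column of `V` is reconstructed purely by adding up columns of `W`, which is the property the abstract credits for parts-based representations.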

Journal ArticleDOI
TL;DR: This work considers the problem of automatically recognizing human faces from frontal views with varying expression and illumination, as well as occlusion and disguise, and proposes a general classification algorithm for (image-based) object recognition based on a sparse representation computed by ℓ1-minimization.
Abstract: We consider the problem of automatically recognizing human faces from frontal views with varying expression and illumination, as well as occlusion and disguise. We cast the recognition problem as one of classifying among multiple linear regression models and argue that new theory from sparse signal representation offers the key to addressing this problem. Based on a sparse representation computed by ℓ1-minimization, we propose a general classification algorithm for (image-based) object recognition. This new framework provides new insights into two crucial issues in face recognition: feature extraction and robustness to occlusion. For feature extraction, we show that if sparsity in the recognition problem is properly harnessed, the choice of features is no longer critical. What is critical, however, is whether the number of features is sufficiently large and whether the sparse representation is correctly computed. Unconventional features such as downsampled images and random projections perform just as well as conventional features such as eigenfaces and Laplacianfaces, as long as the dimension of the feature space surpasses a certain threshold, predicted by the theory of sparse representation. This framework can handle errors due to occlusion and corruption uniformly by exploiting the fact that these errors are often sparse with respect to the standard (pixel) basis. The theory of sparse representation helps predict how much occlusion the recognition algorithm can handle and how to choose the training images to maximize robustness to occlusion. We conduct extensive experiments on publicly available databases to verify the efficacy of the proposed algorithm and corroborate the above claims.

9,658 citations

01 Jan 1999
TL;DR: In this article, non-negative matrix factorization is used to learn parts of faces and semantic features of text, which is in contrast to principal components analysis and vector quantization that learn holistic, not parts-based, representations.
Abstract: Is perception of the whole based on perception of its parts? There is psychological and physiological evidence for parts-based representations in the brain, and certain computational theories of object recognition rely on such representations. But little is known about how brains or computers might learn the parts of objects. Here we demonstrate an algorithm for non-negative matrix factorization that is able to learn parts of faces and semantic features of text. This is in contrast to other methods, such as principal components analysis and vector quantization, that learn holistic, not parts-based, representations. Non-negative matrix factorization is distinguished from the other methods by its use of non-negativity constraints. These constraints lead to a parts-based representation because they allow only additive, not subtractive, combinations. When non-negative matrix factorization is implemented as a neural network, parts-based representations emerge by virtue of two properties: the firing rates of neurons are never negative and synaptic strengths do not change sign.

9,604 citations