Author

Matthew Turk

Bio: Matthew Turk is an academic researcher at the Toyota Technological Institute at Chicago. He has contributed to research topics including augmented reality and facial recognition systems, has an h-index of 55, and has co-authored 198 publications receiving 30,972 citations. His previous affiliations include the Massachusetts Institute of Technology and the University of California.


Papers
Proceedings ArticleDOI
17 Jun 2007
TL;DR: Proposes H-ISOSOM, an algorithm that learns an organized structure of a non-convex, large-scale manifold, represents it by a set of hierarchically organized maps, and iteratively learns the nonlinearity of each patch at finer levels to obtain a concise representation.
Abstract: We present an algorithm, Hierarchical ISOmetric Self-Organizing Map (H-ISOSOM), for a concise, organized manifold representation of complex, non-linear, large-scale, high-dimensional input data in a low-dimensional space. The main contribution of our algorithm is threefold. First, we modify the previous ISOSOM algorithm with a local linear interpolation (LLI) technique, which maps data samples from the low-dimensional space back to the high-dimensional space and makes the complete mapping pseudo-invertible. The modified ISOSOM (M-ISOSOM) follows the global geometric structure of the data and also preserves local geometric relations, reducing nonlinear mapping distortion and making the learning more accurate. Second, we propose the H-ISOSOM algorithm to address the computational complexity of Isomap, SOM, and LLI, as well as the nonlinear complexity of highly twisted manifolds. H-ISOSOM learns an organized structure of a non-convex, large-scale manifold and represents it by a set of hierarchically organized maps. The hierarchical structure follows a coarse-to-fine strategy: according to the coarse global structure, it "unfolds" the manifold at the coarse level and decomposes the sample data into small patches, then iteratively learns the nonlinearity of each patch at finer levels. The algorithm simultaneously reorganizes and clusters the data samples in a low-dimensional space to obtain the concise representation. Third, we give quantitative comparisons of the proposed method with similar methods on standard data sets. Finally, we apply H-ISOSOM to the problem of appearance-based hand pose estimation. Encouraging experimental results validate the effectiveness and efficiency of H-ISOSOM.
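For intuition, here is a minimal coarse-to-fine sketch of the H-ISOSOM idea as the abstract describes it: embed the data with Isomap to "unfold" the manifold at the coarse level, partition the samples with a small SOM, and recurse on each patch at a finer level. It omits the LLI back-mapping, and the tiny SOM, function names, and parameter values are illustrative assumptions, not the authors' implementation.

```python
import numpy as np
from sklearn.manifold import Isomap

def fit_som(X, n_units=9, epochs=50, lr=0.5, seed=0):
    """Tiny 1-D SOM: returns unit weights and the winning unit per sample."""
    rng = np.random.default_rng(seed)
    W = X[rng.choice(len(X), n_units, replace=False)].copy()
    for t in range(epochs):
        sigma = max(n_units / 2 * (1 - t / epochs), 0.5)  # shrinking neighborhood
        alpha = lr * (1 - t / epochs)                     # decaying learning rate
        for x in X[rng.permutation(len(X))]:
            win = np.argmin(np.linalg.norm(W - x, axis=1))
            h = np.exp(-((np.arange(n_units) - win) ** 2) / (2 * sigma ** 2))
            W += alpha * h[:, None] * (x - W)
    winners = np.array([np.argmin(np.linalg.norm(W - x, axis=1)) for x in X])
    return W, winners

def h_isosom(X, depth=2, n_units=9, min_patch=30):
    """Coarse-to-fine: Isomap unfolds, a SOM partitions, then recurse per patch."""
    Y = Isomap(n_neighbors=10, n_components=2).fit_transform(X)
    W, winners = fit_som(Y, n_units)
    node = {"units": W, "children": {}}
    if depth > 1:
        for k in range(n_units):
            patch = X[winners == k]
            if len(patch) >= min_patch:  # only refine patches with enough samples
                node["children"][k] = h_isosom(patch, depth - 1, n_units, min_patch)
    return node
```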

9 citations

Proceedings ArticleDOI
07 Oct 2016
TL;DR: A new optical design for head-mounted displays (HMD) that has an exceptionally wide field of view (FOV) based on seamless lenses and screens curved around the eyes is presented, suggesting a feasible way to significantly expand the FOV of HMDs.
Abstract: We present a new optical design for head-mounted displays (HMDs) with an exceptionally wide field of view (FOV); it can even cover the full human FOV. It is based on seamless lenses and screens curved around the eyes. We constructed several compact and lightweight proof-of-concept prototypes of the optical design. One of them far exceeds the human FOV, although the anatomy of the human head limits the effective FOV. The presented optical design has advantages such as compactness, light weight, low cost, and a superwide FOV with high resolution. The prototypes are promising, and though this is still work in progress and display functionality is not yet implemented, they suggest a feasible way to significantly expand the FOV of HMDs.

9 citations

Book ChapterDOI
05 Dec 2005
TL;DR: Using a system that animates a character procedurally, this research provides tools to modify the character's body movements in real time, so that they reflect the character's mood, personality, interest, bodily pain, and emotions, all of which make up the current mental state of the character.
Abstract: Virtual agents are used to interact with humans in a myriad of applications. However, the agents often lack the believability necessary to maximize their effectiveness. These agents, or characters, lack personality and emotions, and therefore the capacity to emotionally connect and interact with the human. This deficiency prevents the viewer from identifying with the characters on a personal level. This research explores the possibility of automating the expression of a character’s mental state through its body language. Using a system that animates a character procedurally, we provide tools to modify the character’s body movements in real-time, so that they reflect the character’s mood, personality, interest, bodily pain, and emotions, all of which make up the current mental state of the character.
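As a toy illustration of the idea (not the authors' system; every parameter name below is hypothetical), a mental-state vector can be mapped to per-frame modifiers that a procedural animator consumes:

```python
from dataclasses import dataclass

@dataclass
class MentalState:
    mood: float      # -1 (sad) .. +1 (happy)
    interest: float  #  0 (bored) .. 1 (engaged)
    pain: float      #  0 (none) .. 1 (severe)

def body_language_params(s: MentalState) -> dict:
    """Map a mental state to animation modifiers (all names hypothetical)."""
    return {
        "spine_slump": max(0.0, -s.mood) * 0.6 + 0.3 * s.pain,  # sadness and pain hunch the torso
        "gesture_amplitude": 0.5 + 0.5 * s.interest + 0.2 * max(0.0, s.mood),
        "head_tilt_toward_target": s.interest,                  # engaged characters orient to the speaker
        "walk_speed_scale": 1.0 - 0.4 * s.pain,                 # pain slows locomotion
    }

print(body_language_params(MentalState(mood=-0.8, interest=0.2, pain=0.5)))
```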

8 citations

Proceedings ArticleDOI
14 Oct 2019
TL;DR: The results suggest that the incorporation of additional modalities related to eye-movements and muscle activity may improve the efficacy of mobile EEG-based BCI systems, creating the potential for ubiquitous BCI.
Abstract: Brain Computer Interfaces (BCIs) typically utilize electroencephalography (EEG) to enable control of a computer through brain signals. However, EEG is susceptible to a large amount of noise, especially from muscle activity, making it difficult to use in ubiquitous computing environments where mobility and physicality are important features. In this work, we present a novel multimodal approach for classifying the P300 event related potential (ERP) component by coupling EEG signals with nonscalp electrodes (NSE) that measure ocular and muscle artifacts. We demonstrate the effectiveness of our approach on a new dataset where the P300 signal was evoked with participants on a stationary bike under three conditions of physical activity: rest, low-intensity, and high-intensity exercise. We show that intensity of physical activity impacts the performance of both our proposed model and existing state-of-the-art models. After incorporating signals from nonscalp electrodes our proposed model performs significantly better for the physical activity conditions. Our results suggest that the incorporation of additional modalities related to eye-movements and muscle activity may improve the efficacy of mobile EEG-based BCI systems, creating the potential for ubiquitous BCI.
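A minimal sketch of the early-fusion idea on synthetic stand-in data: features from scalp EEG epochs and nonscalp (ocular/muscle) channels are concatenated and fed to a standard shrinkage-LDA P300 classifier. The dimensions, the injected deflection, and the classifier choice are illustrative assumptions, not the paper's dataset or model.

```python
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
n_epochs, n_eeg, n_nse, n_times = 400, 8, 4, 64
y = rng.integers(0, 2, n_epochs)                   # 1 = target stimulus (P300 present)

eeg = rng.normal(size=(n_epochs, n_eeg, n_times))
nse = rng.normal(size=(n_epochs, n_nse, n_times))  # nonscalp EOG/EMG channels
eeg[y == 1, :, 30:40] += 0.5                       # toy P300-like deflection in targets

# Early fusion: stack flattened EEG and nonscalp features per epoch.
X = np.concatenate([eeg.reshape(n_epochs, -1), nse.reshape(n_epochs, -1)], axis=1)

clf = LinearDiscriminantAnalysis(solver="lsqr", shrinkage="auto")
print("cross-validated accuracy:", cross_val_score(clf, X, y, cv=5).mean())
```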

8 citations

Journal ArticleDOI
TL;DR: A new marker and a new detection and identification method that is designed to work under blurred or defocused conditions are proposed that can increase the performance and robustness of AR systems and other vision applications that require detection or tracking of defined markers.
Abstract: Planar markers enable an augmented reality (AR) system to estimate the pose of objects from images containing them. However, conventional markers are difficult to detect in blurred or defocused images. We propose a new marker and a new detection and identification method that is designed to work under such conditions. The problem of conventional markers is that their patterns consist of high-frequency components such as sharp edges which are attenuated in blurred or defocused images. Our marker consists of a single low-frequency component. We call it a mono-spectrum marker. The mono-spectrum marker can be detected in real time with a GPU. In experiments, we confirm that the mono-spectrum marker can be accurately detected in blurred and defocused images in real time. Using these markers can increase the performance and robustness of AR systems and other vision applications that require detection or tracking of defined markers.
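The underlying intuition can be checked numerically: Gaussian blur is a low-pass filter, so a pattern built from a single low spatial frequency keeps its dominant spectral peak while a sharp-edged pattern loses most of its energy. The 1-D toy below illustrates that principle only; it is not the paper's marker design or detection method.

```python
import numpy as np

n = 128
x = np.arange(n)
low_freq = np.sin(2 * np.pi * 2 * x / n)         # mono-spectrum-like: one low frequency
edges = np.sign(np.sin(2 * np.pi * 16 * x / n))  # square wave: sharp edges, high harmonics

def blur(sig, sigma=4.0):
    """Gaussian blur applied in the frequency domain (circular convolution)."""
    f = np.fft.rfftfreq(n)  # frequencies in cycles per sample
    return np.fft.irfft(np.fft.rfft(sig) * np.exp(-2 * (np.pi * sigma * f) ** 2), n)

for name, sig in [("low-frequency", low_freq), ("sharp-edged", edges)]:
    kept = np.abs(np.fft.rfft(blur(sig))).max() / np.abs(np.fft.rfft(sig)).max()
    print(f"{name} pattern keeps {kept:.0%} of its peak spectral magnitude after blur")
```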

8 citations


Cited by
Journal ArticleDOI
22 Dec 2000, Science
TL;DR: An approach to solving dimensionality reduction problems that uses easily measured local metric information to learn the underlying global geometry of a data set and efficiently computes a globally optimal solution, and is guaranteed to converge asymptotically to the true structure.
Abstract: Scientists working with large volumes of high-dimensional data, such as global climate patterns, stellar spectra, or human gene distributions, regularly confront the problem of dimensionality reduction: finding meaningful low-dimensional structures hidden in their high-dimensional observations. The human brain confronts the same problem in everyday perception, extracting from its high-dimensional sensory inputs (30,000 auditory nerve fibers or 10^6 optic nerve fibers) a manageably small number of perceptually relevant features. Here we describe an approach to solving dimensionality reduction problems that uses easily measured local metric information to learn the underlying global geometry of a data set. Unlike classical techniques such as principal component analysis (PCA) and multidimensional scaling (MDS), our approach is capable of discovering the nonlinear degrees of freedom that underlie complex natural observations, such as human handwriting or images of a face under different viewing conditions. In contrast to previous algorithms for nonlinear dimensionality reduction, ours efficiently computes a globally optimal solution and, for an important class of data manifolds, is guaranteed to converge asymptotically to the true structure.
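The pipeline this describes is Isomap: build a k-nearest-neighbor graph over the samples, approximate geodesic distances by graph shortest paths, then embed those distances with classical MDS. scikit-learn packages exactly this, so here is a short sketch on the standard swiss-roll example (the neighborhood size is an arbitrary choice):

```python
import numpy as np
from sklearn.datasets import make_swiss_roll
from sklearn.manifold import Isomap

X, t = make_swiss_roll(n_samples=1500, random_state=0)  # 3-D points on a 2-D manifold
Y = Isomap(n_neighbors=12, n_components=2).fit_transform(X)

# The first recovered coordinate should track the roll's intrinsic parameter t
# (up to sign), which a linear method like PCA cannot unroll.
corr = np.corrcoef(Y[:, 0], t)[0, 1]
print(f"correlation with intrinsic coordinate: {abs(corr):.2f}")
```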

13,652 citations

Journal ArticleDOI
TL;DR: A face recognition algorithm which is insensitive to large variation in lighting direction and facial expression is developed, based on Fisher's linear discriminant and produces well separated classes in a low-dimensional subspace, even under severe variations in lighting and facial expressions.
Abstract: We develop a face recognition algorithm which is insensitive to large variation in lighting direction and facial expression. Taking a pattern classification approach, we consider each pixel in an image as a coordinate in a high-dimensional space. We take advantage of the observation that the images of a particular face, under varying illumination but fixed pose, lie in a 3D linear subspace of the high-dimensional image space, provided the face is a Lambertian surface without shadowing. However, since faces are not truly Lambertian surfaces and do indeed produce self-shadowing, images will deviate from this linear subspace. Rather than explicitly modeling this deviation, we linearly project the image into a subspace in a manner which discounts those regions of the face with large deviation. Our projection method is based on Fisher's linear discriminant and produces well-separated classes in a low-dimensional subspace, even under severe variation in lighting and facial expressions. The eigenface technique, another method based on linearly projecting the image space to a low-dimensional subspace, has similar computational requirements. Yet, extensive experimental results demonstrate that the proposed "Fisherface" method has error rates that are lower than those of the eigenface technique for tests on the Harvard and Yale face databases.
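A compact Fisherface-style pipeline, sketched with scikit-learn as a stand-in for the paper's setup: PCA first, so the within-class scatter matrix is non-singular, then Fisher's linear discriminant, then nearest-neighbor classification in the discriminant subspace. The Olivetti dataset and the component counts below are illustrative choices, not the Harvard/Yale experiments.

```python
from sklearn.datasets import fetch_olivetti_faces
from sklearn.decomposition import PCA
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline

faces = fetch_olivetti_faces()  # 400 images of 40 subjects, 64x64 pixels
Xtr, Xte, ytr, yte = train_test_split(
    faces.data, faces.target, test_size=0.25, stratify=faces.target, random_state=0)

fisherfaces = make_pipeline(
    PCA(n_components=100, whiten=True),  # the classic choice is N - c components
    LinearDiscriminantAnalysis(),        # at most c - 1 = 39 discriminant axes
    KNeighborsClassifier(n_neighbors=1))
print("test accuracy:", fisherfaces.fit(Xtr, ytr).score(Xte, yte))
```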

11,674 citations

Journal ArticleDOI
21 Oct 1999, Nature
TL;DR: An algorithm for non-negative matrix factorization is demonstrated that is able to learn parts of faces and semantic features of text and is in contrast to other methods that learn holistic, not parts-based, representations.
Abstract: Is perception of the whole based on perception of its parts? There is psychological and physiological evidence for parts-based representations in the brain, and certain computational theories of object recognition rely on such representations. But little is known about how brains or computers might learn the parts of objects. Here we demonstrate an algorithm for non-negative matrix factorization that is able to learn parts of faces and semantic features of text. This is in contrast to other methods, such as principal components analysis and vector quantization, that learn holistic, not parts-based, representations. Non-negative matrix factorization is distinguished from the other methods by its use of non-negativity constraints. These constraints lead to a parts-based representation because they allow only additive, not subtractive, combinations. When non-negative matrix factorization is implemented as a neural network, parts-based representations emerge by virtue of two properties: the firing rates of neurons are never negative and synaptic strengths do not change sign.
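The multiplicative update rules from this line of work make the constraint concrete: each update multiplies W and H by a non-negative ratio, so non-negativity is preserved and only additive combinations can form. A short numpy sketch for the Frobenius-norm objective (iteration count, rank, and initialization are arbitrary choices):

```python
import numpy as np

def nmf(V, r, n_iter=200, eps=1e-9, seed=0):
    """Factor a non-negative V (m x n) into W (m x r) @ H (r x n)."""
    rng = np.random.default_rng(seed)
    m, n = V.shape
    W = rng.random((m, r)) + eps
    H = rng.random((r, n)) + eps
    for _ in range(n_iter):
        H *= (W.T @ V) / (W.T @ W @ H + eps)  # multiplicative update for H
        W *= (V @ H.T) / (W @ H @ H.T + eps)  # multiplicative update for W
    return W, H

V = np.abs(np.random.default_rng(1).normal(size=(60, 40)))
W, H = nmf(V, r=5)
print("relative reconstruction error:", np.linalg.norm(V - W @ H) / np.linalg.norm(V))
```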

11,500 citations

Journal ArticleDOI
TL;DR: This work considers the problem of automatically recognizing human faces from frontal views with varying expression and illumination, as well as occlusion and disguise, and proposes a general classification algorithm for (image-based) object recognition based on a sparse representation computed by ℓ1-minimization.
Abstract: We consider the problem of automatically recognizing human faces from frontal views with varying expression and illumination, as well as occlusion and disguise. We cast the recognition problem as one of classifying among multiple linear regression models and argue that new theory from sparse signal representation offers the key to addressing this problem. Based on a sparse representation computed by ℓ1-minimization, we propose a general classification algorithm for (image-based) object recognition. This new framework provides new insights into two crucial issues in face recognition: feature extraction and robustness to occlusion. For feature extraction, we show that if sparsity in the recognition problem is properly harnessed, the choice of features is no longer critical. What is critical, however, is whether the number of features is sufficiently large and whether the sparse representation is correctly computed. Unconventional features such as downsampled images and random projections perform just as well as conventional features such as eigenfaces and Laplacianfaces, as long as the dimension of the feature space surpasses a certain threshold predicted by the theory of sparse representation. This framework can handle errors due to occlusion and corruption uniformly by exploiting the fact that these errors are often sparse with respect to the standard (pixel) basis. The theory of sparse representation helps predict how much occlusion the recognition algorithm can handle and how to choose the training images to maximize robustness to occlusion. We conduct extensive experiments on publicly available databases to verify the efficacy of the proposed algorithm and corroborate the above claims.
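A toy sketch of the classification scheme (commonly called SRC): code a test sample as a sparse combination of all training samples with an ℓ1-regularized solver, then assign the class whose training columns give the smallest reconstruction residual. The Lasso solver, regularization weight, and synthetic data below are stand-ins, not the paper's exact ℓ1 solver or experiments.

```python
import numpy as np
from sklearn.linear_model import Lasso

def src_predict(A, labels, x, alpha=0.01):
    """A: d x N matrix of unit-norm training columns; x: test vector of length d."""
    coef = Lasso(alpha=alpha, fit_intercept=False, max_iter=5000).fit(A, x).coef_
    classes = np.unique(labels)
    residuals = [np.linalg.norm(x - A[:, labels == c] @ coef[labels == c])
                 for c in classes]
    return classes[int(np.argmin(residuals))]

# Toy data: two classes living in different random 5-D subspaces of R^50.
rng = np.random.default_rng(0)
d, per_class = 50, 20
B0, B1 = rng.normal(size=(d, 5)), rng.normal(size=(d, 5))
A = np.hstack([B0 @ rng.random((5, per_class)), B1 @ rng.random((5, per_class))])
A /= np.linalg.norm(A, axis=0)  # unit-norm columns, as SRC assumes
labels = np.array([0] * per_class + [1] * per_class)
x = B1 @ rng.random(5)
x /= np.linalg.norm(x)
print("predicted class:", src_predict(A, labels, x))  # expect class 1
```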

9,658 citations
