
Showing papers by "Ioannis Pitas published in 2010"


Journal ArticleDOI
TL;DR: A novel facial expression classification (FEC) method is proposed that uses an iterative feature selection process based on a class separability measure to create salient feature vectors (SFVs), where each SFV is composed of a selected feature subset.

69 citations


Journal ArticleDOI
TL;DR: An integrated method for automatic color-based 2-D image fragment reassembly is presented, in which the most robust, best-performing algorithms are identified at each step and their results are fed to the next step.
Abstract: The problem of reassembling image fragments arises in many scientific fields, such as forensics and archaeology. In the field of archaeology, the pictorial excavation findings are almost always in the form of painting fragments. The manual execution of this task is very difficult, as it requires a great amount of time, skill and effort. Thus, the automation of such work is very important and can lead to faster, more efficient painting reassembly and to a significant reduction in the human effort involved. In this paper, an integrated method for automatic color-based 2-D image fragment reassembly is presented. The proposed 2-D reassembly technique is divided into four steps. Initially, the image fragments that are probably spatially adjacent are identified, utilizing techniques employed in content-based image retrieval systems. The second step is to identify the matching contour segments for every retained pair of image fragments, via a dynamic programming technique. The next step is to identify the optimal transformation that aligns the matching contour segments. Many registration techniques have been evaluated to this end. Finally, the overall image is reassembled from its properly aligned fragments. This is achieved via a novel algorithm, which exploits the alignment angles found during the previous step. In each stage, the most robust algorithms with the best performance are identified and their results are fed to the next step. We have experimented with the proposed method using digitally scanned images of actual torn pieces of paper image prints and produced very satisfactory reassembly results.
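The contour matching step can be illustrated with a small sketch. The code below is not the authors' implementation; it aligns turning-angle descriptors of two fragment contours with a simple Needleman-Wunsch-style dynamic program, and the descriptor choice, gap cost and toy contours are assumptions made for the example.

```python
# A minimal sketch (not the paper's algorithm) of matching two fragment contour
# segments with dynamic programming, assuming each contour is an (N, 2) array of
# boundary points and that a turning-angle descriptor is an adequate local feature.
import numpy as np

def turning_angles(contour):
    """Per-point turning angle of a closed 2-D contour, a crude curvature proxy."""
    d = np.diff(contour, axis=0, append=contour[:1])
    ang = np.arctan2(d[:, 1], d[:, 0])
    return np.unwrap(np.diff(ang, append=ang[:1]))

def dp_match(desc_a, desc_b, gap_cost=0.5):
    """Needleman-Wunsch-style alignment of two descriptor sequences.
    Returns the full cost matrix; matching sub-segments can be recovered by
    backtracking from low-cost cells."""
    n, m = len(desc_a), len(desc_b)
    D = np.zeros((n + 1, m + 1))
    D[1:, 0] = np.arange(1, n + 1) * gap_cost
    D[0, 1:] = np.arange(1, m + 1) * gap_cost
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            match = D[i - 1, j - 1] + abs(desc_a[i - 1] - desc_b[j - 1])
            D[i, j] = min(match, D[i - 1, j] + gap_cost, D[i, j - 1] + gap_cost)
    return D

# Toy usage: two noisy copies of the same jagged contour should align cheaply.
t = np.linspace(0, 2 * np.pi, 200, endpoint=False)
contour_a = np.c_[np.cos(t) * (1 + 0.1 * np.sin(5 * t)), np.sin(t)]
contour_b = contour_a + np.random.default_rng(0).normal(scale=0.005, size=contour_a.shape)
cost = dp_match(turning_angles(contour_a), turning_angles(contour_b))
print("alignment cost:", cost[-1, -1])
```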

51 citations


Proceedings Article
01 Jan 2010
TL;DR: The method has been tested, with very good results, on a variety of sequences, including a database of video sequences showing human faces changing from the neutral state to a fully formed facial expression.
Abstract: This paper presents a method for generalizing human facial expressions by means of a statistical analysis of human facial expressions coming from various persons. The data used for the statistical analysis are obtained by tracking a generic facial wireframe model in video sequences depicting the formation of the different human facial expressions, starting from a neutral state. Wireframe node tracking is performed by a pyramidal variant of the well-known Kanade-Lucas-Tomasi (KLT) tracker. The loss of tracked features is handled through a model deformation procedure that increases the robustness of the tracking algorithm. Tracking initialization is performed in a semi-automatic fashion, i.e., the facial wireframe model is fitted to an image representing a neutral facial expression, exploiting physics-based deformable shape modeling. The dynamic facial expression output model is MPEG-4 compliant. The method has been tested on a variety of sequences with very good results, including a database of video sequences representing human faces changing from the neutral state to the one that represents a fully formed human facial expression.
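As a rough illustration of the tracking stage, the sketch below uses OpenCV's pyramidal KLT implementation (calcOpticalFlowPyrLK) to follow a few wireframe node positions across a video; the video file name, the node coordinates and the handling of lost nodes are placeholders, not the paper's model-deformation procedure.

```python
# A minimal sketch of pyramidal KLT tracking of wireframe node positions.
import cv2
import numpy as np

cap = cv2.VideoCapture("expression_sequence.avi")   # hypothetical input video
ok, frame = cap.read()
prev_gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)

# Wireframe node positions in the first (neutral) frame, shape (N, 1, 2), float32.
nodes = np.float32([[120, 80], [140, 82], [130, 110]]).reshape(-1, 1, 2)

lk_params = dict(winSize=(15, 15), maxLevel=3,        # 3 pyramid levels
                 criteria=(cv2.TERM_CRITERIA_EPS | cv2.TERM_CRITERIA_COUNT, 30, 0.01))

while True:
    ok, frame = cap.read()
    if not ok:
        break
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    new_nodes, status, _ = cv2.calcOpticalFlowPyrLK(prev_gray, gray, nodes, None, **lk_params)
    lost = status.ravel() == 0
    if lost.any():
        # The paper recovers lost nodes by deforming the wireframe model;
        # here we simply keep their previous positions as a placeholder.
        new_nodes[lost] = nodes[lost]
    nodes, prev_gray = new_nodes, gray

cap.release()
```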

35 citations


Proceedings ArticleDOI
10 Dec 2010
TL;DR: A novel view-invariant movement recognition method is presented, in which the position of each camera relative to the subject is identified using morphological operations and the proportions of the human body; experiments show that the method achieves very satisfactory recognition rates.
Abstract: In this paper, a novel view-invariant movement recognition method is presented. A multi-camera setup is used to capture the movement from different observation angles. Identification of the position of each camera with respect to the subject's body is achieved by a procedure based on morphological operations and the proportions of the human body. Binary body masks from frames of all cameras, consistently arranged through the previous procedure, are concatenated to produce the so-called multi-view binary mask. These masks are rescaled and vectorized to create feature vectors in the input space. Fuzzy vector quantization is performed to associate input feature vectors with movement representations, and linear discriminant analysis is used to map movements to a low-dimensional discriminant feature space. Experimental results show that the method can achieve very satisfactory recognition rates.
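A minimal sketch of the feature construction and discriminant mapping is given below; it concatenates rescaled binary masks from all (already ordered) cameras and applies LDA with scikit-learn on toy data. The fuzzy vector quantization stage of the paper is omitted for brevity, and the data are placeholders.

```python
# Build multi-view binary-mask feature vectors and map them with LDA (sketch only).
import numpy as np
import cv2
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

def multiview_feature(masks_per_camera, size=(32, 32)):
    """Concatenate rescaled binary masks from all cameras into one feature vector."""
    resized = [cv2.resize(m.astype(np.float32), size, interpolation=cv2.INTER_NEAREST)
               for m in masks_per_camera]
    return np.concatenate([r.ravel() for r in resized])

# Toy data: 20 frames, 4 cameras each, two movement classes (placeholders).
rng = np.random.default_rng(0)
training_frames = [[(rng.random((64, 48)) > 0.5).astype(np.uint8) for _ in range(4)]
                   for _ in range(20)]
training_labels = np.array([i % 2 for i in range(20)])

X = np.stack([multiview_feature(frame_masks) for frame_masks in training_frames])
lda = LinearDiscriminantAnalysis(n_components=len(set(training_labels)) - 1)
Z = lda.fit_transform(X, training_labels)     # low-dimensional discriminant features
print("discriminant feature shape:", Z.shape)
```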

28 citations


Book ChapterDOI
15 Sep 2010
TL;DR: Experiments conducted on the XM2VTS database demonstrate that PCA+CDA outperforms PCA, LDA and PCA+LDA in cross-validation inside the database, and the behavior of these algorithms as the size of the training set decreases is explored to demonstrate their robustness.
Abstract: In this paper, the problem of frontal view recognition on still images is confronted, using subspace learning methods. The aim is to acquire the frontal images of a person in order to achieve better results in later face or facial expression recognition. For this purpose, we utilize a relatively new subspace learning technique, Clustering-based Discriminant Analysis (CDA), against two subspace learning techniques for dimensionality reduction that are well known in the literature, Principal Component Analysis (PCA) and Linear Discriminant Analysis (LDA). We also concisely describe spectral clustering, which is proposed in this work as a preprocessing step to the CDA algorithm. As classifiers, we use the K-Nearest Neighbor, the Nearest Centroid and the novel Nearest Cluster Centroid classifiers. Experiments conducted on the XM2VTS database demonstrate that PCA+CDA outperforms PCA, LDA and PCA+LDA in cross-validation inside the database. Finally, the behavior of these algorithms as the size of the training set decreases is explored to demonstrate their robustness.
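The sketch below is not a CDA implementation (CDA is not available in standard libraries); it only illustrates the overall pipeline shape on toy data: PCA for dimensionality reduction, per-class clustering (K-Means here, whereas the paper proposes spectral clustering as a CDA preprocessing step) and a nearest cluster centroid decision rule.

```python
# PCA + per-class clustering + nearest-cluster-centroid classification (sketch only).
import numpy as np
from sklearn.decomposition import PCA
from sklearn.cluster import KMeans

def fit_cluster_centroids(X, y, clusters_per_class=2):
    """Cluster each class separately and keep the labelled cluster centroids."""
    centroids, labels = [], []
    for c in np.unique(y):
        km = KMeans(n_clusters=clusters_per_class, n_init=10, random_state=0)
        km.fit(X[y == c])
        centroids.append(km.cluster_centers_)
        labels.extend([c] * clusters_per_class)
    return np.vstack(centroids), np.array(labels)

def nearest_cluster_centroid(X, centroids, centroid_labels):
    """Assign each sample the label of its closest cluster centroid."""
    d = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
    return centroid_labels[d.argmin(axis=1)]

# Toy data standing in for frontal / non-frontal face images.
rng = np.random.default_rng(1)
X = np.r_[rng.normal(0, 1, (50, 100)), rng.normal(2, 1, (50, 100))]
y = np.r_[np.zeros(50, int), np.ones(50, int)]

Xp = PCA(n_components=10).fit_transform(X)
centroids, centroid_labels = fit_cluster_centroids(Xp, y)
pred = nearest_cluster_centroid(Xp, centroids, centroid_labels)
print("training accuracy:", (pred == y).mean())
```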

12 citations


Journal ArticleDOI
TL;DR: Experimental results show that the proposed system for content-based identification of image replicas can identify replicas with high accuracy and facilitate a wide range of applications such as copyright protection, content-based monitoring, content-aware multimedia management, etc.

11 citations


Book ChapterDOI
01 Jan 2010
TL;DR: Emerging biometric modalities that have appeared in recent years to improve the performance of biometric recognition systems are presented and divided into two major categories, intrusive and non-intrusive, according to the level of user nuisance each system causes.
Abstract: Various anthropometric studies have been conducted in the last decade in order to investigate how different physiological or behavioral human characteristics can be used as identity evidence to prove the individuality of each person. Some of these characteristics are: face, eyes, ears, teeth, fingers, hands, feet, veins, voice, signature, typing style and gait. Since the first biometric security systems appeared in the market, an increasing demand for novel techniques covering all the different scenarios has been observed. Every new method appears to outmatch some of its competitors but, at the same time, presents disadvantages compared to others. However, there is still no method that constitutes a single panacea for all the different scenarios and security demands. This is the reason why researchers are in a continuous search for more efficient and generic biometric modalities that can be used in various applications. In this chapter, emerging biometric modalities that have appeared in recent years to improve the performance of biometric recognition systems are presented. The presented methods are divided into two major categories, intrusive and non-intrusive, according to the level of user nuisance each system causes.

7 citations


Journal ArticleDOI
TL;DR: This paper proposes an online shape learning algorithm based on the self-balancing binary search tree data structure for the storage and retrieval of shape templates, and introduces a similarity measure used to decide how to traverse the tree and even backtrack through the search path to find more candidate matches.

6 citations


Book ChapterDOI
15 Sep 2010
TL;DR: Based on systematic experiments, enriching the database with translated, scaled and rotated images is proposed to counter the low robustness of subspace techniques for facial expression recognition.
Abstract: In this paper, the robustness of appearance-based, subspace learning techniques for facial expression recognition under geometrical transformations is explored. A plethora of facial expression recognition algorithms is presented and tested using three well-known facial expression databases. Although it is common knowledge that appearance-based methods are sensitive to image registration errors, there is no systematic experiment reported in the literature and the problem is considered, a priori, solved. However, when it comes to automatic real-world applications, inaccuracies are expected, and systematic preprocessing is needed. After a series of experiments, we observed a strong correlation between the performance and the bounding box position. The mere investigation of the bounding box's optimal characteristics is insufficient, due to the inherent constraints a real-world application imposes, and an alternative approach is demanded. Based on systematic experiments, enriching the database with translated, scaled and rotated images is proposed to counter the low robustness of subspace techniques for facial expression recognition.
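The proposed database enrichment can be sketched as a simple geometric augmentation step; the offsets, scales and angles below are illustrative assumptions, not the values used in the paper.

```python
# Generate translated, scaled and rotated copies of a facial image (sketch only).
import cv2
import numpy as np

def enrich(image, shifts=(-2, 0, 2), scales=(0.95, 1.0, 1.05), angles=(-5, 0, 5)):
    """Return a list of geometrically perturbed copies of `image`."""
    h, w = image.shape[:2]
    out = []
    for dx in shifts:
        for dy in shifts:
            for s in scales:
                for a in angles:
                    M = cv2.getRotationMatrix2D((w / 2, h / 2), a, s)
                    M[:, 2] += (dx, dy)            # add the translation component
                    out.append(cv2.warpAffine(image, M, (w, h),
                                              borderMode=cv2.BORDER_REPLICATE))
    return out

# Toy usage on a synthetic grayscale "face".
face = (np.random.default_rng(0).random((64, 64)) * 255).astype(np.uint8)
augmented = enrich(face)
print(len(augmented), "augmented samples per image")
```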

6 citations


Proceedings ArticleDOI
23 Aug 2010
TL;DR: A new method for the incremental training of multiclass Support Vector Machines is presented that provides computational efficiency when the training data collection is sequentially enriched and dynamic adaptation of the classifier is required.
Abstract: We present a new method for the incremental training of multiclass Support Vector Machines that provides computational efficiency for training problems in the case where the training data collection is sequentially enriched and dynamic adaptation of the classifier is required. An auxiliary function has been designed that incorporates some desired characteristics in order to provide an upper bound of the objective function summarizing the multiclass classification task. The global minimizer for the enriched dataset is then found using a warm-start algorithm, since faster convergence is expected when starting from the previous global minimum. Experimental evidence on two data collections verified that our method is faster than retraining the classifier from scratch, while the achieved classification accuracy is maintained at the same level.
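The paper's auxiliary-function warm-start algorithm is not available in standard libraries; the sketch below only illustrates the setting it addresses, i.e. updating a multiclass linear classifier (via scikit-learn's partial_fit) as the training collection is enriched in batches instead of retraining from scratch.

```python
# Incremental multiclass training on sequentially arriving batches (sketch only,
# not the paper's auxiliary-function / warm-start algorithm).
import numpy as np
from sklearn.linear_model import SGDClassifier

rng = np.random.default_rng(0)
classes = np.array([0, 1, 2])
clf = SGDClassifier(loss="hinge", alpha=1e-4, random_state=0)   # linear SVM-style loss

for batch in range(5):                        # data collection enriched in batches
    y = rng.integers(0, 3, size=100)
    X = rng.normal(size=(100, 20)) + y[:, None]   # toy class-dependent mean shift
    clf.partial_fit(X, y, classes=classes)        # update from the current solution

print("classes seen:", clf.classes_)
```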

5 citations


Journal ArticleDOI
01 Oct 2010
TL;DR: This special issue has been put together from extended versions of papers presented at the 2007 IEEE International Workshop on Machine Learning for Signal Processing, held in Thessaloniki, Greece, between August 27 and 29, 2007, and represents a wide range of topics including feature extraction and classification, nonlinear learning methods, speech separation, image and video processing, and inference and programming methods.
Abstract: Machine Learning (ML) is a generic term referring to methods and algorithms that learn based on empirical observations. In recent years, the field has matured considerably in both methodology and real-world application domains. ML methods belong not only to the classical supervised, unsupervised, or reinforcement learning paradigms (often associated with neural networks) but also to an increasingly wide range of methodologies including kernel methods, support vector machines (SVMs), Bayesian learning, etc. ML has also become particularly important for the solution of problems in signal processing. Machine learning for signal processing combines many ideas from adaptive signal/image processing, optimization theory, learning theory and models, and statistics in order to solve complex real-world signal processing applications. The range of applications is also growing, including pattern recognition, adaptive filtering, computer vision, content-based image and video retrieval, data mining, cognitive radio, robot control, data fusion, blind signal processing, sparse component analysis, brain-computer interfaces, etc. This special issue has been put together from extended versions of papers presented at the 2007 IEEE International Workshop on Machine Learning for Signal Processing (MLSP-2007), held in Thessaloniki, Greece, between August 27 and 29, 2007. The guest editor committee invited the authors of the top-ranking papers (based on the review scores they received at MLSP-2007) to submit their extended papers to the special issue. The committee also decided to invite the keynote and plenary speakers to submit their contributions in a full paper format. All papers went through a regular reviewing process and were duly revised, if necessary, prior to acceptance. Since the workshop covered the overall area of machine learning, the papers represent a wide range of topics including feature extraction and classification, nonlinear learning methods, speech separation, image and video processing, and inference and programming methods. We would like to take this opportunity to thank the rest of the organizing committee of the workshop for their help and support: Tulay Adali, University of Maryland, Baltimore County, USA; Jan Larsen, Technical University of Denmark; Theophilos Papadimitriou, University of Thrace, Greece; Marc Van Hulle, Katholieke Universiteit Leuven, Belgium; Scott Douglas, Southern Methodist University, TX, USA; and Deniz Erdogmus, Oregon Health & Science University, OR, USA.
I. Pitas (*) Aristotle University of Thessaloniki, Thessaloniki 54124, Greece, e-mail: pitas@aiia.csd.auth.gr

Proceedings ArticleDOI
03 Dec 2010
TL;DR: The aim of this paper is to present a new method for multiview object or human body (or body part) detection, using a single-view detector in every view of a scene captured by multiple cameras and then combining the results using the 3D information of the scene.
Abstract: The aim of this paper is to present a new method for multiview object or human body (or body part) detection. The basic idea consists of using a single-view detector in every view of a scene captured by multiple cameras and then combining the results using the 3D information of the scene. The method can improve the results of the single-view detector, while also localizing the object/human in 3D space. This provides a robust way of rejecting false detections, recovering missed detections and associating the results of the single-view detector across views.
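A minimal sketch of the geometric combination idea, assuming two calibrated cameras: corresponding detection centres are triangulated to a 3-D point with OpenCV, the kind of 3-D localization that can then be used to reject false detections and associate detections across views. The intrinsics, projection matrices and detection centres below are placeholders, not the paper's setup.

```python
# Triangulate corresponding single-view detections to a 3-D position (sketch only).
import numpy as np
import cv2

# Hypothetical calibration: shared intrinsics K and two camera poses.
K = np.array([[800.0, 0.0, 320.0], [0.0, 800.0, 240.0], [0.0, 0.0, 1.0]])
P1 = K @ np.hstack([np.eye(3), np.zeros((3, 1))])
R = cv2.Rodrigues(np.array([[0.0], [0.3], [0.0]]))[0]         # small rotation about Y
P2 = K @ np.hstack([R, np.array([[-0.5], [0.0], [0.0]])])     # translated second camera

# Detection centres (e.g. head positions) in each view, shape (2, N).
pts1 = np.array([[320.0], [240.0]])
pts2 = np.array([[300.0], [238.0]])

X_h = cv2.triangulatePoints(P1, P2, pts1, pts2)    # homogeneous 4xN result
X = (X_h[:3] / X_h[3]).T                           # 3-D positions of the detections
print("triangulated 3-D point:", X[0])
```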

Proceedings ArticleDOI
01 Nov 2010
TL;DR: Three recently devised semantic analysis algorithms, whose results can be of use in numerous applications, one of them being multimedia postproduction, are reviewed in this paper.
Abstract: Semantic analysis of video has witnessed a significant increase of research activities during the last years. Human-centered video analysis plays a central role in this research, since humans are the most frequently encountered entities in a video. Results of human-centered video analysis can be of use in numerous applications, one of them being multimedia postproduction. Three recently devised semantic analysis algorithms are reviewed in this paper.

Proceedings ArticleDOI
17 Jun 2010
TL;DR: Recent research results in a number of diverse areas, such as face/person detection, human activity recognition and face/facial expression recognition, are reviewed, using either single-view or multi-view visual (image/video) sources.
Abstract: The interest of the scientific community for anthropocentric (human-centered) video analysis stems from the fact that the extracted information (e.g. human presence, identity, body posture, emotional status, body part movements, activities) can be utilised in various important applications. One such application domain is film and games postproduction, where the anthropocentric video analysis results can be used in various tasks, such as audiovisual material indexing and retrieval or automatic semantic annotation. In this paper, we shall review recent research results in a number of diverse areas, such as face/person detection, human activity recognition and face/facial expression recognition, from either single-view or multi-view visual (image/video) sources.

Book ChapterDOI
15 Sep 2010
TL;DR: A system capable of dynamically learning shapes, while also allowing the dynamic deletion of shapes already learned, is presented; it uses a self-balancing Binary Search Tree data structure.
Abstract: In this paper, we present a system capable of dynamically learning shapes in a way that also allows for the dynamic deletion of shapes already learned. It uses a self-balancing Binary Search Tree (BST) data structure into which we can insert shapes that we can later retrieve, and from which we can also delete inserted shapes. The information concerning the inserted shapes is distributed over the tree's nodes in such a way that it is retained even after the structure of the tree changes due to insertions, deletions and the rebalancing these operations can cause. Experiments show that the structure is robust enough to provide similar retrieval rates after many insertions and deletions.
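A small sketch of the insert / retrieve / delete interface described above: a sorted list with binary search (Python's bisect), keyed by a scalar shape descriptor, stands in for the chapter's self-balancing BST, and the compactness descriptor is an assumption made for the example.

```python
# Ordered storage, retrieval and deletion of shapes keyed by a descriptor (sketch only).
import bisect
import numpy as np

class ShapeStore:
    def __init__(self):
        self._keys, self._shapes = [], []

    @staticmethod
    def descriptor(contour):
        """Compactness (perimeter^2 / area) of a closed (N, 2) contour as a toy key."""
        d = np.diff(contour, axis=0, append=contour[:1])
        perimeter = np.linalg.norm(d, axis=1).sum()
        x, y = contour[:, 0], contour[:, 1]
        area = 0.5 * abs(np.dot(x, np.roll(y, -1)) - np.dot(y, np.roll(x, -1)))
        return perimeter ** 2 / max(area, 1e-9)

    def insert(self, contour):
        key = self.descriptor(contour)
        i = bisect.bisect_left(self._keys, key)
        self._keys.insert(i, key)
        self._shapes.insert(i, contour)

    def delete(self, contour):
        key = self.descriptor(contour)
        i = bisect.bisect_left(self._keys, key)
        if i < len(self._keys) and np.isclose(self._keys[i], key):
            del self._keys[i], self._shapes[i]

    def query(self, contour):
        """Return the stored shape whose key is closest to the query's key."""
        key = self.descriptor(contour)
        i = bisect.bisect_left(self._keys, key)
        candidates = [j for j in (i - 1, i) if 0 <= j < len(self._keys)]
        best = min(candidates, key=lambda j: abs(self._keys[j] - key))
        return self._shapes[best]
```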

Proceedings ArticleDOI
01 Nov 2010
TL;DR: Results on face images extracted from the XM2VTS dataset show that feeding the DNMF subspace data into the SVM is the approach that provides the best results.
Abstract: In this paper, we investigate the potential benefits of combining, within a classification task, a discriminant linear subspace feature extraction technique, namely Discriminant Nonnegative Matrix Factorization (Discriminant NMF or DNMF), with a Support Vector Machine (SVM) classifier. The aim was to investigate whether this combination provides better classification results compared to a template matching method operating on the DNMF space or on the raw data and an SVM classifier operating on the raw data, when applied to the frontal facial pose recognition problem. The latter is a two-class problem (frontal and non-frontal facial images). DNMF is based on a supervised training procedure and works by imposing additional criteria on the NMF objective function that aim at increasing class separability in the lower-dimensional space. Results on face images extracted from the XM2VTS dataset show that feeding the DNMF subspace data into the SVM is the approach that provides the best results.
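A rough sketch of the pipeline shape only: standard NMF from scikit-learn (which lacks the discriminant constraints of DNMF) followed by a linear SVM on the projected coefficients, applied to toy non-negative data standing in for vectorized face images.

```python
# NMF feature extraction feeding an SVM classifier (sketch only; not DNMF).
import numpy as np
from sklearn.decomposition import NMF
from sklearn.svm import SVC
from sklearn.pipeline import make_pipeline

rng = np.random.default_rng(0)
X = rng.random((200, 400))                 # 200 vectorised non-negative "face images"
y = rng.integers(0, 2, 200)                # frontal (1) vs non-frontal (0) labels

model = make_pipeline(NMF(n_components=30, init="nndsvda", max_iter=500, random_state=0),
                      SVC(kernel="linear", C=1.0))
model.fit(X, y)
print("training accuracy:", model.score(X, y))
```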

Proceedings ArticleDOI
19 Jul 2010
TL;DR: A novel color-based, two-step, coarse-to-fine video replica detection system is proposed that aims at robustness to attacks and has been evaluated on a database of short videos with good results.
Abstract: A novel color-based, two-step, coarse-to-fine video replica detection system is proposed in this paper. The first step uses an R-tree in order to perform a coarse selection of the database (original) videos that potentially match the query video. A training procedure that utilizes attacked versions of the database videos and aims at achieving robustness to attacks is used, and a frame-based voting procedure is also involved. A refinement step follows, which processes the set of videos returned by the first step in order to select the final matching video (if any). The performance of the system has been evaluated on a database of short videos with good results.
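The coarse step can be sketched as follows, with a k-d tree over per-frame color histograms standing in for the paper's R-tree and a simple frame-based voting rule; the database, histogram bins and query clip are toy assumptions, and the refinement step is not shown.

```python
# Coarse video replica selection by frame-wise colour-histogram lookup and voting
# (sketch only; a k-d tree stands in for the paper's R-tree index).
import numpy as np
from scipy.spatial import cKDTree

def frame_histograms(frames, bins=8):
    """Per-frame RGB colour histograms, each L1-normalised."""
    hists = []
    for f in frames:                       # f: HxWx3 uint8 frame
        h, _ = np.histogramdd(f.reshape(-1, 3), bins=(bins,) * 3,
                              range=((0, 256),) * 3)
        hists.append(h.ravel() / h.sum())
    return np.array(hists)

# Hypothetical database: 5 short videos of 30 random frames each.
rng = np.random.default_rng(0)
db_videos = [[rng.integers(0, 256, (24, 32, 3), dtype=np.uint8) for _ in range(30)]
             for _ in range(5)]
db_feats, db_video_ids = [], []
for vid, frames in enumerate(db_videos):
    h = frame_histograms(frames)
    db_feats.append(h)
    db_video_ids.extend([vid] * len(h))
tree = cKDTree(np.vstack(db_feats))
db_video_ids = np.array(db_video_ids)

# Coarse selection: each query frame votes for the video of its nearest database frame.
query = db_videos[2][5:20]                 # a clip taken from database video 2
_, idx = tree.query(frame_histograms(query), k=1)
votes = np.bincount(db_video_ids[idx], minlength=len(db_videos))
print("candidate video:", votes.argmax(), "votes:", votes)
```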