Showing papers by "Ioannis Pitas published in 2008"

PDF

Open Access

Journal Article•DOI•

An analysis of facial expression recognition under partial facial image occlusion

[...]

Irene Kotsia¹, Ioan Buciu¹, Ioannis Pitas¹•Institutions (1)

01 Jul 2008-Image and Vision Computing

TL;DR: The way partial occlusion affects human observers when recognizing facial expressions is indicated and conclusions regarding the pairs of facial expressions misclassifications that each type of Occlusion introduces are drawn.

...read moreread less

210 citations

Journal Article•DOI•

Texture and shape information fusion for facial expression and facial action unit recognition

[...]

Irene Kotsia¹, Stefanos Zafeiriou¹, Ioannis Pitas¹•Institutions (1)

Aristotle University of Thessaloniki¹

01 Mar 2008-Pattern Recognition

TL;DR: A novel method based on fusion of texture and shape information is proposed for facial expression and Facial Action Unit (FAU) recognition from video sequences and the accuracy achieved is 92.3% when recognizing the seven basic facial expressions.

...read moreread less

114 citations

Journal Article•DOI•

Nonnegative Matrix Factorization in Polynomial Feature Space

[...]

Ioan Buciu¹, Nikos Nikolaidis¹, Ioannis Pitas¹•Institutions (1)

Aristotle University of Thessaloniki¹

01 Jun 2008-IEEE Transactions on Neural Networks

TL;DR: This paper proposes a generalization of the NMF algorithm by translating the objective function into a Hilbert space (also called feature space) under nonnegativity constraints, and develops an approach that allows high-order dependencies between the basis images while keeping the nonNegativity constraints on both basis images and coefficients.

...read moreread less

Abstract: Plenty of methods have been proposed in order to discover latent variables (features) in data sets. Such approaches include the principal component analysis (PCA), independent component analysis (ICA), factor analysis (FA), etc., to mention only a few. A recently investigated approach to decompose a data set with a given dimensionality into a lower dimensional space is the so-called nonnegative matrix factorization (NMF). Its only requirement is that both decomposition factors are nonnegative. To approximate the original data, the minimization of the NMF objective function is performed in the Euclidean space, where the difference between the original data and the factors can be minimized by employing L 2-norm. In this paper, we propose a generalization of the NMF algorithm by translating the objective function into a Hilbert space (also called feature space) under nonnegativity constraints. With the help of kernel functions, we developed an approach that allows high-order dependencies between the basis images while keeping the nonnegativity constraints on both basis images and coefficients. Two practical applications, namely, facial expression and face recognition, show the potential of the proposed approach.

...read moreread less

88 citations

Journal Article•DOI•

A Virtual System for Cavity Preparation in Endodontics

[...]

Ioannis Marras¹, Nikolaos Nikolaidis¹, Georgios Mikrogeorgis¹, Kleoniki Lyroudia¹, Ioannis Pitas¹ - Show less +1 more•Institutions (1)

Aristotle University of Thessaloniki¹

01 Apr 2008-Journal of Dental Education

TL;DR: A novel virtual teeth drilling system designed to aid dentists, dental students, and researchers in getting acquainted with teeth anatomy, the handling of drilling instruments, and the challenges associated with drilling procedures during endodontic therapy is presented.

...read moreread less

Abstract: This article presents a novel virtual teeth drilling system designed to aid dentists, dental students, and researchers in getting acquainted with teeth anatomy, the handling of drilling instruments, and the challenges associated with drilling procedures during endodontic therapy. The system is designed to be used for educational and research purposes in dental schools. The application features a 3D face and oral cavity model constructed using anatomical data that can be adapted to the characteristics of a specific patient using either facial photographs or 3D data. Animation of the models is also feasible. Virtual drilling using a Phantom Desktop (Sensable Technologies Inc., Woburn, MA) force feedback haptic device is performed within the oral cavity on 3D volumetric and surface models of teeth, obtained from serial cross sections of natural teeth. Final results and intermediate steps of the drilling procedure can be saved on a file for future use. The application has the potential to be a very promising educational and research tool that allows the user to practice virtual teeth drilling for endodontic cavity preparation or other related procedures on high-detail teeth models placed within an adaptable and animated 3D face and oral cavity model.

...read moreread less

56 citations

Journal Article•DOI•

Discriminant Graph Structures for Facial Expression Recognition

[...]

Stefanos Zafeiriou¹, Ioannis Pitas²•Institutions (2)

Imperial College London¹, Aristotle University of Thessaloniki²

01 Dec 2008-IEEE Transactions on Multimedia

TL;DR: A new technique for the selection of the most discriminant facial landmarks for every facial expression (discriminant expression-specific graphs) is applied and a novel kernel-based technique for discriminant feature extraction from graphs is presented.

...read moreread less

Abstract: In this paper, a series of advances in elastic graph matching for facial expression recognition are proposed. More specifically, a new technique for the selection of the most discriminant facial landmarks for every facial expression (discriminant expression-specific graphs) is applied. Furthermore, a novel kernel-based technique for discriminant feature extraction from graphs is presented. This feature extraction technique remedies some of the limitations of the typical kernel Fisher discriminant analysis (KFDA) which provides a subspace of very limited dimensionality (i.e., one or two dimensions) in two-class problems. The proposed methods have been applied to the Cohn-Kanade database in which very good performance has been achieved in a fully automatic manner.

...read moreread less

55 citations

Journal Article•DOI•

Combining Fuzzy Vector Quantization With Linear Discriminant Analysis for Continuous Human Movement Recognition

[...]

Nikolaos Gkalelis¹, Anastasios Tefas¹, Ioannis Pitas¹•Institutions (1)

Aristotle University of Thessaloniki¹

01 Nov 2008-IEEE Transactions on Circuits and Systems for Video Technology

TL;DR: A novel method based on fuzzy vector quantization (FVQ) and linear discriminant analysis (LDA) that allows for simple Mahalanobis or cosine distance comparison of not aligned human movements, aiding the design of a real-time continuous human movement recognition algorithm.

...read moreread less

Abstract: In this paper, a novel method for continuous human movement recognition based on fuzzy vector quantization (FVQ) and linear discriminant analysis (LDA) is proposed. We regard a movement as a unique combination of basic movement patterns, the so-called dynemes. The proposed algorithm combines FVQ and LDA to discover the most discriminative dynemes as well as represent and discriminate the different human movements in terms of these dynemes. This method allows for simple Mahalanobis or cosine distance comparison of not aligned human movements, taking into account implicitly time shifts and internal speed variations, and, thus, aiding the design of a real-time continuous human movement recognition algorithm. The effectiveness and robustness of this method is shown by experimental results on a standard dataset with videos captured under real conditions, and on a new video dataset created using motion capture data.

...read moreread less

43 citations

Journal Article•DOI•

A Comparative Study of Chaotic and White Noise Signals in Digital Watermarking

[...]

Aidan Mooney¹, John G. Keating¹, Ioannis Pitas²•Institutions (2)

Maynooth University¹, Aristotle University of Thessaloniki²

01 Mar 2008-Chaos Solitons & Fractals

TL;DR: This analysis focuses on the watermarked images after they have been subjected to common image distortion attacks and shows that signals generated from highpass chaotic signals have superior performance than highpass noise signals, in the presence of such attacks.

...read moreread less

Abstract: Digital watermarking is an ever increasing and important discipline, especially in the modern electronically-driven world. Watermarking aims to embed a piece of information into digital documents which their owner can use to prove that the document is theirs, at a later stage. In this paper, performance analysis of watermarking schemes is performed on white noise sequences and chaotic sequences for the purpose of watermark generation. Pseudorandom sequences are compared with chaotic sequences generated from the chaotic skew tent map. In particular, analysis is performed on highpass signals generated from both these watermark generation schemes, along with analysis on lowpass watermarks and white noise watermarks. This analysis focuses on the watermarked images after they have been subjected to common image distortion attacks. It is shown that signals generated from highpass chaotic signals have superior performance than highpass noise signals, in the presence of such attacks. It is also shown that watermarks generated from lowpass chaotic signals have superior performance over the other signal types analysed.

...read moreread less

36 citations

Journal Article•DOI•

Emerging biometric modalities: a survey

[...]

Georgios Goudelis¹, Anastasios Tefas¹, Ioannis Pitas¹•Institutions (1)

Aristotle University of Thessaloniki¹

01 Dec 2008-Journal on Multimodal User Interfaces

TL;DR: The emerging biometric modalities are presented and there is still no method able to completely satisfy the current security needs.

...read moreread less

Abstract: Many body parts, personal characteristics and signaling methods have recently been suggested and used for biometrics systems: fingers, hands, feet, faces, eyes, ears, teeth, veins, voices, signatures, typing styles and gaits. A continuously increasing number of biometric techniques have risen in order to fulfill the different kinds of demands in the market. Every method presents a number of advantages compared to the others as each technique has been created to subserve different kinds of requirements. However, there is still no method able to completely satisfy the current security needs. This is the reason why researchers continuously drive their efforts to newer methods that will provide a higher security stage. In this paper, the emerging biometric modalities are presented.

...read moreread less

34 citations

Journal Article•DOI•

Evaluation of FISH image analysis system on assessing HER2 amplification in breast carcinoma cases.

[...]

Zenonas Theodosiou¹, Ioannis Kasampalidis¹, Georgia Karayannopoulou¹, Ioannis Kostopoulos¹, Mattheos Bobos¹, Generoso Bevilacqua², Paolo Aretini, Antonina Starita², Kleoniki Lyroudia¹, Ioannis Pitas¹ - Show less +6 more•Institutions (2)

Aristotle University of Thessaloniki¹, University of Pisa²

01 Feb 2008-The Breast

TL;DR: The evaluation shows that the developed FISH image analysis software can accelerate evaluation of HER2 status in most breast cancer cases.

...read moreread less

32 citations

Journal Article•DOI•

Dynamic training using multistage clustering for face recognition

[...]

Marios Kyperountas¹, Anastasios Tefas¹, Ioannis Pitas¹•Institutions (1)

Aristotle University of Thessaloniki¹

01 Mar 2008-Pattern Recognition

TL;DR: Experimental results indicate that the proposed framework provides a promising solution to the face recognition problem.

...read moreread less

21 citations

Journal Article•DOI•

Camera Motion Estimation Using a Novel Online Vector Field Model in Particle Filters

[...]

Symeon Nikitidis¹, Stefanos Zafeiriou¹, Ioannis Pitas¹•Institutions (1)

Aristotle University of Thessaloniki¹

01 Aug 2008-IEEE Transactions on Circuits and Systems for Video Technology

TL;DR: A novel stochastic vector field model is proposed, which can handle smooth motion patterns derived from long periods of stable camera motion and can also cope with rapid camera motion changes and periods when the camera remains still.

...read moreread less

Abstract: In this paper, a novel algorithm for parametric camera motion estimation is introduced. More particularly, a novel stochastic vector field model is proposed, which can handle smooth motion patterns derived from long periods of stable camera motion and can also cope with rapid camera motion changes and periods when the camera remains still. The stochastic vector field model is established from a set of noisy measurements, such as motion vectors derived, e.g., from block matching techniques, in order to provide an estimation of the subsequent camera motion in the form of a motion vector field. A set of rules for a robust and online update of the camera motion model parameters is also proposed, based on the expectation maximization algorithm. The proposed model is embedded in a particle filters framework in order to predict the future camera motion based on current and prior observations. We estimate the subsequent camera motion by finding the optimum affine transform parameters so that, when applied to the current video frame, the resulting motion vector field to approximate the one estimated by the stochastic model. Extensive experimental results verify the usefulness of the proposed scheme in camera motion pattern classification and in the accurate estimation of the 2D affine camera transform motion parameters. Moreover, the camera motion estimation has been incorporated into an object tracker in order to investigate if the new schema improves its tracking efficiency, when camera motion and tracked object motion are combined.

...read moreread less

Journal Article•DOI•

Audio-Assisted Movie Dialogue Detection

[...]

M. Kotti¹, Dimitrios Ververidis¹, Georgios Evangelopoulos², I. Panagakis¹, Constantine Kotropoulos¹, Petros Maragos², Ioannis Pitas¹ - Show less +3 more•Institutions (2)

Aristotle University of Thessaloniki¹, National and Kapodistrian University of Athens²

01 Nov 2008-IEEE Transactions on Circuits and Systems for Video Technology

TL;DR: An audio-assisted system is investigated that detects if a movie scene is a dialogue or not based on actor indicator functions, which define if an actor speaks at a certain time instant.

...read moreread less

Abstract: An audio-assisted system is investigated that detects if a movie scene is a dialogue or not. The system is based on actor indicator functions. That is, functions which define if an actor speaks at a certain time instant. In particular, the cross-correlation and the magnitude of the corresponding the cross-power spectral density of a pair of indicator functions are input to various classifiers, such as voted perceptrons, radial basis function networks, random trees, and support vector machines for dialogue/non-dialogue detection. To boost classifier efficiency AdaBoost is also exploited. The aforementioned classifiers are trained using ground truth indicator functions determined by human annotators for 41 dialogue and another 20 non-dialogue audio instances. For testing, actual indicator functions are derived by applying audio activity detection and actor clustering to audio recordings. 23 instances are randomly chosen among the aforementioned 41 dialogue instances, 17 of which correspond to dialogue scenes and 6 to non-dialogue ones. Accuracy ranging between 0.739 and 0.826 is reported.

...read moreread less

Proceedings Article•DOI•

Texture and Shape Information Fusion for Facial Action Unit Recognition

[...]

Irene Kotsia¹, Stefanos Zafeiriou¹, Nikos Nikolaidis¹, Ioannis Pitas¹•Institutions (1)

Aristotle University of Thessaloniki¹

10 Feb 2008

TL;DR: A novel method that fuses texture and shape information to achieve Facial Action Unit (FAU) recognition from video sequences is proposed and the accuracy achieved is equal to 92.1% when recognizing the 17 FAUs that are responsible for facial expression development.

...read moreread less

Abstract: A novel method that fuses texture and shape information to achieve Facial Action Unit (FAU) recognition from video sequences is proposed. In order to extract the texture information, a subspace method based on Discriminant Non- negative Matrix Factorization (DNMF) is applied on the difference images of the video sequence, calculated taking under consideration the neutral and the most expressive frame, to extract the desired classification label. The shape information consists of the deformed Candide facial grid (more specifically the grid node displacements between the neutral and the most expressive facial expression frame) that corresponds to the facial expression depicted in the video sequence. The shape information is afterwards classified using a two-class Support Vector Machine (SVM) system. The fusion of texture and shape information is performed using Median Radial Basis Functions (MRBFs) Neural Networks (NNs) in order to detect the set of present FAUs. The accuracy achieved in the Cohn-Kanade database is equal to 92.1% when recognizing the 17 FAUs that are responsible for facial expression development.

...read moreread less

Journal Article•DOI•

Face-Based Digital Signatures for Video Retrieval

[...]

Costas Cotsaces¹, Nikos Nikolaidis¹, Ioannis Pitas¹•Institutions (1)

Aristotle University of Thessaloniki¹

01 Apr 2008-IEEE Transactions on Circuits and Systems for Video Technology

TL;DR: A method is developed that uses the pre-extracted output of face detection and recognition to perform fast semantic query-by-example retrieval of video segments that is resistant to perturbations in the signature and independent of the exact boundaries of the query segment.

...read moreread less

Abstract: The characterization of a video segment by a digital signature is a fundamental task in video processing. It is necessary for video indexing and retrieval, copyright protection, and other tasks. Semantic video signatures are those that are based on high-level content information rather than on low-level features of the video stream. The major advantage of such signatures is that they are highly invariant to nearly all types of distortion. A major semantic feature of a video is the appearance of specific persons in specific video frames. Because of the great amount of research that has been performed on the subject of face detection and recognition, the extraction of such information is generally tractable, or will be in the near future. We have developed a method that uses the pre-extracted output of face detection and recognition to perform fast semantic query-by-example retrieval of video segments. We also give the results of the experimental evaluation of our method on a database of real video. One advantage of our approach is that the evaluation of similarity is convolution-based, and is thus resistant to perturbations in the signature and independent of the exact boundaries of the query segment.

...read moreread less

Book Chapter•DOI•

Language engineering and information theoretic methods in protein sequence similarity studies

[...]

Alina Bogan-Marta¹, Alina Bogan-Marta², A. Hategan³, Ioannis Pitas¹•Institutions (3)

Aristotle University of Thessaloniki¹, University of Oradea², Tampere University of Technology³

01 Jan 2008

TL;DR: This paper aims to demonstrate the efforts towards in-situ applicability of artificial intelligence in the area of signal processing and its applications to machine learning.

...read moreread less

Abstract: 1 University of Oradea, Department of Computers, Universtatii No1, 410087, Oradea, Romania, alinab@uoradea.ro 2 Tampere University of Technology, Institute of Signal Processing, Korkeakoulunkatu 1, P.O. Box 553, FIN-33101, {andrea.hategan,ioan.tabus}@tut.fi 3 Aristotle University of Thessaloniki, Department of Informatics, Artificial Intelligence and Information Analysis Laboratory, Box 451,Thessaloniki, Greece,

...read moreread less

Multi-modal emotion-related data collection within a virtual earthquake emulator

[...]

Dimitrios Ververidis, Irene Kotsia, Constantine Kotropoulos, Ioannis Pitas¹•Institutions (1)

Aristotle University of Thessaloniki¹

01 Jan 2008

TL;DR: It is assessed whether pupils’ spontaneous emotional state can be accurately recognized, using classifiers trained on elicited emotional speech and videos.

...read moreread less

Abstract: The collection of emotion-related signals, such as face video sequences, speech utterances, galvanic skin response, and blood pressure from pupils in a virtual reality environment, when the pupils attempt to evacuate a school during an earthquake, is addressed in this paper. We assess whether pupils’ spontaneous emotional state can be accurately recognized, using classifiers trained on elicited emotional speech and videos.

...read moreread less

Journal Article•DOI•

Piecewise Linear Digital Curve Representation and Compression Using Graph Theory and a Line Segment Alphabet

[...]

Andras Hajdu¹, Ioannis Pitas¹•Institutions (1)

Aristotle University of Thessaloniki¹

01 Feb 2008-IEEE Transactions on Image Processing

TL;DR: This paper proposes a graph theory-based algorithm for tracing the curve directly to eliminate the decomposition needs, and assigns a polygon approximation to the curve which consists of letters coming from an alphabet of line segments.

...read moreread less

Abstract: The use of an alphabet of line segments to compose a curve is a possible approach for curve data compression. Many approaches are developed with the drawback that they can process simple curves only. Curves having more sophisticated topology with self-intersections can be handled by methods considering recursive decomposition of the canvas containing the curve. In this paper, we propose a graph theory-based algorithm for tracing the curve directly to eliminate the decomposition needs. This approach obviously improves the compression performance, as longer line segments can be used. We tune our method further by selecting optimal turns at junctions during tracing the curve. We assign a polygon approximation to the curve which consists of letters coming from an alphabet of line segments. We also discuss how other application fields can take advantage of the provided curve description scheme.

...read moreread less

Proceedings Article•DOI•

Single camera pointing gesture recognition for interaction in edutainment applications

[...]

Z. Cernekova¹, C. Malerczyk, Nikos Nikolaidis², Ioannis Pitas²•Institutions (2)

Comenius University in Bratislava¹, Aristotle University of Thessaloniki²

21 Apr 2008

TL;DR: Experiments show very promising results for recognizing the pointing gestures by using a single camera using a GVF-snake to detect the pointing hand of the user.

...read moreread less

Abstract: In this paper, a method for recognizing pointing gestures without markers is proposed. The video-based system uses one camera only, which observes the user in front of a large screen and identifies the 2D position pointed by him/her on this screen, his/her arm being in the fully extended position towards the screen. A GVF-snake is used in order to detect the pointing hand of the user, which is tracked in the following frames using the particle filters tracker. The center of gravity of the snake is used as a feature point and is transformed using linear transformation directly into the canvas coordinates. The method was tested on a large screen using applications designed for a wide range of different and even technically unversed users such as an image exploration for a virtual museum exhibit or intuitive interaction applications for gaming purposes. Experiments show very promising results for recognizing the pointing gestures by using a single camera.

...read moreread less

Journal Article•DOI•

Automated Facial Pose Extraction From Video Sequences Based on Mutual Information

[...]

Georgios Goudelis¹, Anastasios Tefas¹, Ioannis Pitas¹•Institutions (1)

Aristotle University of Thessaloniki¹

01 Mar 2008-IEEE Transactions on Circuits and Systems for Video Technology

TL;DR: The proposed method uses a novel pose estimation algorithm based on mutual information to extract any required facial poses from video sequences and outperforms a principal component analysis reconstruction method that was used as a benchmark.

...read moreread less

Abstract: Estimation of the facial pose in video sequences is one of the major issues in many vision systems such as face-based biometrics, scene understanding for humans, and others. The proposed method uses a novel pose estimation algorithm based on mutual information to extract any required facial poses from video sequences. The method extracts the poses automatically and classifies them according to view angle. Experimental results on the XM2VTS video database and on a new database created for the needs of this research indicated a pose classification rate of 99.2% while it was shown that it outperforms a principal component analysis reconstruction method that was used as a benchmark.

...read moreread less

Proceedings Article•

Gender determination using a Support Vector Machine Variant

[...]

Stefanos Zafeiriou¹, Anastasios Tefas¹, Ioannis Pitas¹•Institutions (1)

Aristotle University of Thessaloniki¹

01 Aug 2008

TL;DR: A modified class of Support Vector Machines (SVMs) inspired from the optimization of Fisher's discriminant ratio is presented and a novel class of nonlinear decision surfaces is presented by solving the proposed optimization problem in arbitrary Hilbert spaces defined by Mercer's kernels.

...read moreread less

Abstract: In this paper a modified class of Support Vector Machines (SVMs) inspired from the optimization of Fisher's discriminant ratio is presented. Moreover, we present a novel class of nonlinear decision surfaces by solving the proposed optimization problem in arbitrary Hilbert spaces defined by Mercer's kernels. The effectiveness of the proposed approach is demonstrated by comparing it with the standard SVMs and other classifiers, like Kernel Fisher Discriminant Analysis (KFDA) in gender determination.

...read moreread less

Proceedings Article•DOI•

Face recognition via adaptive discriminant clustering

[...]

Marios Kyperountas¹, Anastasios Tefas¹, Ioannis Pitas¹•Institutions (1)

Aristotle University of Thessaloniki¹

12 Dec 2008

TL;DR: Results indicate that the proposed framework provides a promising solution to the face recognition problem by accommodating multiple clustering steps.

...read moreread less

Abstract: This paper presents a methodology that tackles the face recognition problem by accommodating multiple clustering steps. At each clustering step, the test and training faces are projected to a discriminant space and the projected training data are partitioned into clusters using the k-means algorithm. Then a subset of the training data clusters is selected, based on how similar the faces in these clusters are to the test face. In the clustering step that follows a new discriminant space is defined by processing this subset and both the test and training data are projected to this space. This process is repeated until one final cluster is selected and the most similar, to the test face, face class contained is set as the identity match. The UMIST and XM2VTS face databases have been used to evaluate the algorithm and results indicate that the proposed framework provides a promising solution to the face recognition problem.

...read moreread less

Proceedings Article•DOI•

Sparse human movement representation and recognition

[...]

Nikolaos Gkalelis, Anastasios Tefas¹, Ioannis Pitas•Institutions (1)

Aristotle University of Thessaloniki¹

05 Nov 2008

TL;DR: This method allows for simple Mahalanobis or cosine distance comparison of movements, taking implicitly into account time shifts and internal speed variations, and, thus, aiding the design of a real-time movement recognition algorithm.

...read moreread less

Abstract: In this paper a novel method for human movement representation and recognition is proposed. A movement type is regarded as a unique combination of basic movement patterns, the so-called dynemes. The fuzzy c-mean (FCM) algorithm is used to identify the dynemes in the input space and allow the expression of a posture in terms of these dynemes. In the so-called dyneme space, the sparse posture representations of a movement are combined to represent the movement as a single point in that space, and linear discriminant analysis (LDA) is further employed to increase movement type discrimination and compactness of representation. This method allows for simple Mahalanobis or cosine distance comparison of movements, taking implicitly into account time shifts and internal speed variations, and, thus, aiding the design of a real-time movement recognition algorithm.

...read moreread less

Proceedings Article•DOI•

Global Gabor features for rotation invariant object classification

[...]

Ioan Buciu¹, Ioan Nafornita², Ioannis Pitas³•Institutions (3)

University of Oradea¹, Politehnica University of Timișoara², Aristotle University of Thessaloniki³

10 Oct 2008

TL;DR: The robustness of the Gabor approach when globally applied to extract relevant discriminative features and the method out-performs other state-of-the-art techniques compared in the paper such as, principal component analysis (PCA) or linear discriminant analysis (LDA).

...read moreread less

Abstract: The human visual system can rapidly and accurately recognize a large number of various objects in cluttered scenes under widely varying and difficult viewing conditions, such as illuminations changing, occlusion, scaling or rotation. One of the state-of-the-art feature extraction techniques used in image recognition and processing is based on the Gabor wavelet model. This paper deals with the application of the aforementioned model for object classification task with respect to the rotation issue. Three training sample sizes were applied to assess the methodpsilas performance. Experiments ran on the COIL-100 database show the robustness of the Gabor approach when globally applied to extract relevant discriminative features. The method out-performs other state-of-the-art techniques compared in the paper such as, principal component analysis (PCA) or linear discriminant analysis (LDA).

...read moreread less

Book Chapter•DOI•

Movie Analysis with Emphasis to Dialogue and Action Scene Detection

[...]

Emmanouil Benetos¹, Spyridon Siatras¹, Constantine Kotropoulos¹, Nikos Nikolaidis¹, Ioannis Pitas¹ - Show less +1 more•Institutions (1)

Aristotle University of Thessaloniki¹

01 Jan 2008

TL;DR: Semantic content-based video indexing offers a promising solution for efficient digital movie management by extracting, characterizing, and organizing video content by analyzing the visual, aural, and textual information sources of video.

...read moreread less

Abstract: Movies constitute a large portion of the entertainment industry, as over 9.000 hours of video are released every year [21]. As the bandwidth available to users increases, online movie stores – the equivalent of popular digital music stores – are emerging. They provide users an opportunity to build large personal movie repositories. The convenience of digital movie repositories will be in doubt, unless multimedia data management is employed for organizing, navigating, browsing, searching, and viewing multimedia content. Semantic content-based video indexing offers a promising solution for efficient digital movie management. Semantic video indexing aims at extracting, characterizing, and organizing video content by analyzing the visual, aural, and textual information sources of video. The need for content-based audiovisual analysis has been realized by the MPEG committee, leading to the creation of the MPEG-7 standard [16]. The current approaches for automatic movie analysis and annotation mostly focus on the visual information, while the aural information receives little or no attention. However, the integration of the aural information with the visual one can improve semantic movie content analysis. The predominant approach to semantic movie analysis is to initially extract some low-level audiovisual features (such as color and texture from images or energy and pitch from audio), derive some mid-level entities (such as video shots, keyframes, appearance of faces and audio classes), and finally understand video semantic content by analyzing and combining these entities. A hierarchical video indexing structure is displayed in Fig. 7.1. Movie analysis aims at obtaining a structured organization of the movie content and understanding its embedded semantics like humans do. It has been handled in different ways, depending on the analysis level and the assumptions on the film syntax described in Section 7.1. Most movie analysis efforts concentrate on movie scene or shot detection, while other works focus on the separation of dialogue and non-dialogue scenes. Several efforts have been made for dialogue scene detection, some efforts have concentrated to

...read moreread less

Proceedings Article•

An Online Self-Balancing Binary Search Tree for Hierarchical Shape Matching

[...]

Nikolaos Tsapanos¹, Anastasios Tefas, Ioannis Pitas¹•Institutions (1)

Aristotle University of Thessaloniki¹

01 Jan 2008

TL;DR: A similarity measure is introduced with which to make decisions on how to traverse the tree and backtrack to find more possible matches and every basic operation a binary search tree can perform adapted to a tree of shapes is described.

...read moreread less

Abstract: In this paper we propose a self-balanced binary search tree data structure for shape matching This was originally developed as a fast method of silhouette matching in videos recorded from IR cameras by firemen during rescue operations We introduce a similarity measure with which we can make decisions on how to traverse the tree and backtrack to find more possible matches Then we describe every basic operation a binary search tree can perform adapted to a tree of shapes Note that as a binary search tree, all operations can be performed in O(logn) time and are very fast and efficient Finally we present experimental data evaluating the performance of our proposed data structure

...read moreread less

Proceedings Article•

MUSCLE movie-database: a multimodal corpus with rich annotation for dialogue and saliency detection

[...]

Dimitris Spachos, A. Zlantintsi, Vassiliki Moschou¹, P. Antonopoulos, Emmanouil Benetos¹, M. Kotti¹, K. Tzimouli, Constantine Kotropoulos¹, Nikos Nikolaidis¹, Petros Maragos², Ioannis Pitas¹ - Show less +7 more•Institutions (2)

Aristotle University of Thessaloniki¹, National Technical University of Athens²

31 May 2008

TL;DR: An annotated multimodal movie corpus has been collected to be used as a test bed for development and assessment of content-based multimedia processing, such as speaker clustering, speaker turn detection, visual speech activity detection, face detection,Face clustered, scene segmentation, saliency detection, and visual dialogue detection.

...read moreread less

Abstract: Semantic annotation of multimedia content is important for training, testing, and assessing content-based algorithms for indexing, organization, browsing, and retrieval. To this end, an annotated multimodal movie corpus has been collected to be used as a test bed for development and assessment of content-based multimedia processing, such as speaker clustering, speaker turn detection, visual speech activity detection, face detection, face clustering, scene segmentation, saliency detection, and visual dialogue detection. All metadata are saved in XML format following the MPEG-7 ISO prototype to ensure data compatibility and reusability. The entire MUSCLE movie database is available for download through the web. Visual speech activity and dialogue detection algorithms that have been developed within the software package DIVA3D and tested on this database are also briefly described. Furthermore, we review existing annotation tools with emphasis on the novel annotation tool Anthropos7 Editor.

...read moreread less

Proceedings Article•

3D Human Face Modelling From Uncalibrated Images Using Spline Based Deformation.

[...]

Nikos Barbalios¹, Nikos Nikolaidis¹, Ioannis Pitas¹•Institutions (1)

Aristotle University of Thessaloniki¹

01 Jan 2008

TL;DR: A robust structure from motion (SfM) algorithm is applied over a set of manually selected salient image features to retrieve an estimate of their 3D coordinates, which are utilized to deform a generic 3D face model, using smoothing splines, and adapt it to the characteristics of a human face.

...read moreread less

Abstract: Accurate and plausible 3D face reconstruction remains a difficult problem up to this day, despite the tremendous advances in computer technology and the continuous growth of the applications utilizing 3D face models (e.g. biometrics, movies, gaming). In this paper, a two-step technique for efficient 3D face reconstruction from a set of face images acquired using an uncalibrated camera is presented. Initially, a robust structure from motion (SfM) algorithm is applied over a set of manually selected salient image features to retrieve an estimate of their 3D coordinates. These estimates are further utilized to deform a generic 3D face model, using smoothing splines, and adapt it to the characteristics of a human face.

...read moreread less

Book Chapter•DOI•

Anthropocentric Semantic Information Extraction from Movies

[...]

Nicholas Vretos¹, V. Solachidis¹, Ioannis Pitas¹•Institutions (1)

Aristotle University of Thessaloniki¹

01 Jan 2008

TL;DR: This chapter describes new methods for anthropocentric semantic video analysis, and will concentrate its efforts to provide a uniform framework by which media analysis can be rendered more useful for retrieval applications as well as for human–computer interaction based application.

...read moreread less

Abstract: In this chapter we will describe new methods for anthropocentric semantic video analysis, and will concentrate our efforts to provide a uniform framework by which media analysis can be rendered more useful for retrieval applications as well as for human–computer interaction based application. The main idea behind anthropocentric video analysis is that a film is to be viewed as an artwork and not as a mere of frames following each others. We will show that this kind of analysis which is a straightforward approach of human perception of a movie can finally produce some interesting results of the overall annotation of a video content. “Anthropos” which is the greek word for “human” show the intent of our proposition to concentrate in humans in a movie. Humans are the most essential part of a movie and thus we track down all important features that we can get from low-level and mid-level feature algorithms such as face detection, face tracking, eye detection, visual speech recognition, 3D face reconstruction, face clustering, face verification and facial expressions extraction. All these algorithms produce results which are stored in an MPEG-7 inspired description scheme set which implements the way humans are connecting those features. Therefore as a results we have a structured information of all features that can be found for a specific human (e.g. actor). As it will be shown in this chapter this approach as a straightforward approach of human perception provides a new way of media analysis in the semantic level.

...read moreread less

Proceedings Article•

Human movement recognition using fuzzy clustering and discriminant analysis

[...]

Nikolaos Gkalelis, Anastasios Tefas¹, Ioannis Pitas•Institutions (1)

Aristotle University of Thessaloniki¹

01 Aug 2008

TL;DR: A novel method for human movement representation and recognition using principal component analysis plus linear discriminant analysis (PCA plus LDA) to project the postures of a movement to the identified dynemes.

...read moreread less

Abstract: In this paper a novel method for human movement representation and recognition is proposed. A movement is regarded as a sequence of basic movement patterns, the so-called dynemes. Initially, the fuzzy c-mean (FCM) algorithm is used to identify the dynemes in the input space, and then principal component analysis plus linear discriminant analysis (PCA plus LDA) is employed to project the postures of a movement to the identified dynemes. In this space, the posture representations of the movement are combined to represent the movement in terms of its comprising dynemes. This representation allows for efficient Mahalanobis or cosine-based nearest centroid classification of variable length movements.

...read moreread less

Book Chapter•DOI•

Discriminant Non-negative Matrix Factorization and Projected Gradients for Frontal Face Verification

[...]

Irene Kotsia¹, Stefanos Zafeiriou¹, Ioannis Pitas¹•Institutions (1)

Aristotle University of Thessaloniki¹

23 Dec 2008

TL;DR: A novel Discriminant Non-negative Matrix Factorization (DNMF) method that uses projected gradients that guarantees the algorithm's convergence to a stationary point, contrary to the methods introduced so far, that only ensure the non-increasing behavior of the algorithms' cost function.

...read moreread less

Abstract: A novel Discriminant Non-negative Matrix Factorization (DNMF) method that uses projected gradients, is presented in this paper. The proposed algorithm guarantees the algorithm's convergence to a stationary point, contrary to the methods introduced so far, that only ensure the non-increasing behavior of the algorithm's cost function. The proposed algorithm employs some extra modifications that make the method more suitable for classification tasks. The usefulness of the proposed technique to the frontal face verification problem is also demonstrated.

...read moreread less