scispace - formally typeset
Search or ask a question
Author

Ioannis Pitas

Other affiliations: University of Bristol, University of York, University of Toronto  ...read more
Bio: Ioannis Pitas is an academic researcher from Aristotle University of Thessaloniki. The author has contributed to research in topics: Facial recognition system & Digital watermarking. The author has an hindex of 76, co-authored 795 publications receiving 24787 citations. Previous affiliations of Ioannis Pitas include University of Bristol & University of York.


Papers
More filters
Journal ArticleDOI
TL;DR: This work investigates several ways to exploit scene depth information, implicitly available through the modality of stereoscopic disparity in 3D videos, with the purpose of augmenting performance in the problem of recognizing complex human activities in natural settings.
Abstract: This work investigates several ways to exploit scene depth information, implicitly available through the modality of stereoscopic disparity in 3D videos, with the purpose of augmenting performance in the problem of recognizing complex human activities in natural settings. The standard state-of-the-art activity recognition algorithmic pipeline consists in the consecutive stages of video description, video representation and video classification. Multimodal, depth-aware modifications to standard methods are being proposed and studied, both for video description and for video representation, that indirectly incorporate scene geometry information derived from stereo disparity. At the description level, this is made possible by suitably manipulating video interest points based on disparity data. At the representation level, the followed approach represents each video by multiple vectors corresponding to different disparity zones, resulting in multiple activity descriptions defined by disparity characteristics. In both cases, a scene segmentation is thus implicitly implemented, based on the distance of each imaged object from the camera during video acquisition. The investigated approaches are flexible and able to cooperate with any monocular low-level feature descriptor. They are evaluated using a publicly available activity recognition dataset of unconstrained stereoscopic 3D videos, consisting in extracts from Hollywood movies, and compared both against competing depth-aware approaches and a state-of-the-art monocular algorithm. Quantitative evaluation reveals that some of the examined approaches achieve state-of-the-art performance.

11 citations

Proceedings ArticleDOI
13 Nov 1994
TL;DR: Two nonlinear digital filters for multichannel signal processing are presented and a method for the approximate calculation of output mean and variance for one of these filters is proposed.
Abstract: Two nonlinear digital filters for multichannel signal processing are presented in this paper. A method for the approximate calculation of output mean and variance for one of these filters is proposed. The results of simulations, showing great advantages of the described filters, are also presented. >

11 citations

Proceedings ArticleDOI
18 May 2016
TL;DR: A distributed clustering scheme that is based on Trimmed Kernel k-Means, which employs subsampling, in order to be able to efficiently perform clustering on an extremely large dataset, while still providing clustering performance competitive with other state of the art kernel approaches.
Abstract: Data clustering is an unsupervised learning task that has found many applications in various scientific fields. The goal is to find subgroups of closely related data samples (clusters) in a set of unlabeled data. A classic clustering algorithm is the so-called k-Means. It is very popular, however, it is also unable to handle cases in which the clusters are not linearly separable. Kernel k-Means is a state of the art clustering algorithm, which employs the kernel trick, in order to perform clustering on a higher dimensionality space, thus overcoming the limitations of classic k-Means regarding the non linear separability of the input data. It has recently received a distributed implementation, named Trimmed Kernel k-Means, following the MapReduce distributed computing model. In addition to performing the computations in a distributed manner, Trimmed Kernel k-Means also trims the kernel matrix, in order to reduce the memory requirements and improve performance. The trimming of each row of the kernel matrix is achieved by attempting to estimate the cardinality of the cluster that the corresponding sample belongs to, and removing the kernel matrix entries connecting the sample to samples that probably belong to another cluster. The Spark cluster computing framework was used for the distributed implementation. In this paper, we present a distributed clustering scheme that is based on Trimmed Kernel k-Means, which employs subsampling, in order to be able to efficiently perform clustering on an extremely large dataset. The results indicate that the proposed method run much faster than the original Trimmed Kernel k-Means, while still providing clustering performance competitive with other state of the art kernel approaches.

11 citations

Proceedings ArticleDOI
01 Dec 2014
TL;DR: A method for video summarization that operates on a video segment level by introducing a novel formulation of the Support Vector Data Description method that exploits subclass information in its optimization process and evaluating the proposed approach in three Hollywood movies.
Abstract: In this paper, we describe a method for video summarization that operates on a video segment level. We formulate this problem as the one of automatic video segment selection based on a learning process that employs salient video segment paradigms. We design an hierarchical learning scheme that consists of two steps. At the first step, an unsupervised process is performed in order to determine salient video segment types. The second step is a supervised learning process that is performed for each of the salient video segment type independently. For the latter case, since only salient training examples are available, the problem is stated as an one-class classification problem. In order to take into account subclass information that may appear in the video segment types, we introduce a novel formulation of the Support Vector Data Description method that exploits subclass information in its optimization process. We evaluate the proposed approach in three Hollywood movies, where the performance of the proposed Subclass SVDD (SSVDD) algorithm is compared with that of related methods. Experimental results show that the adoption of both hierarchical learning and the proposed SSVDD method contribute to the final classification performance.

11 citations

Journal ArticleDOI
TL;DR: Experimental results show that the proposed system for content-based identification of image replicas can identify replicas with high accuracy and facilitate a wide range of applications such as copyright protection,content-based monitoring, content-aware multimedia management, etc.

11 citations


Cited by
More filters
Journal ArticleDOI

[...]

08 Dec 2001-BMJ
TL;DR: There is, I think, something ethereal about i —the square root of minus one, which seems an odd beast at that time—an intruder hovering on the edge of reality.
Abstract: There is, I think, something ethereal about i —the square root of minus one. I remember first hearing about it at school. It seemed an odd beast at that time—an intruder hovering on the edge of reality. Usually familiarity dulls this sense of the bizarre, but in the case of i it was the reverse: over the years the sense of its surreal nature intensified. It seemed that it was impossible to write mathematics that described the real world in …

33,785 citations

Journal ArticleDOI
TL;DR: In this paper, the authors provide an up-to-date critical survey of still-and video-based face recognition research, and provide some insights into the studies of machine recognition of faces.
Abstract: As one of the most successful applications of image analysis and understanding, face recognition has recently received significant attention, especially during the past several years. At least two reasons account for this trend: the first is the wide range of commercial and law enforcement applications, and the second is the availability of feasible technologies after 30 years of research. Even though current machine recognition systems have reached a certain level of maturity, their success is limited by the conditions imposed by many real applications. For example, recognition of face images acquired in an outdoor environment with changes in illumination and/or pose remains a largely unsolved problem. In other words, current systems are still far away from the capability of the human perception system.This paper provides an up-to-date critical survey of still- and video-based face recognition research. There are two underlying motivations for us to write this survey paper: the first is to provide an up-to-date review of the existing literature, and the second is to offer some insights into the studies of machine recognition of faces. To provide a comprehensive survey, we not only categorize existing recognition techniques but also present detailed descriptions of representative methods within each category. In addition, relevant topics such as psychophysical studies, system evaluation, and issues of illumination and pose variation are covered.

6,384 citations

Journal ArticleDOI
TL;DR: In this article, the authors categorize and evaluate face detection algorithms and discuss relevant issues such as data collection, evaluation metrics and benchmarking, and conclude with several promising directions for future research.
Abstract: Images containing faces are essential to intelligent vision-based human-computer interaction, and research efforts in face processing include face recognition, face tracking, pose estimation and expression recognition. However, many reported methods assume that the faces in an image or an image sequence have been identified and localized. To build fully automated systems that analyze the information contained in face images, robust and efficient face detection algorithms are required. Given a single image, the goal of face detection is to identify all image regions which contain a face, regardless of its 3D position, orientation and lighting conditions. Such a problem is challenging because faces are non-rigid and have a high degree of variability in size, shape, color and texture. Numerous techniques have been developed to detect faces in a single image, and the purpose of this paper is to categorize and evaluate these algorithms. We also discuss relevant issues such as data collection, evaluation metrics and benchmarking. After analyzing these algorithms and identifying their limitations, we conclude with several promising directions for future research.

3,894 citations