Author

Ioannis Pitas

Other affiliations: University of Bristol, University of York, University of Toronto
Bio: Ioannis Pitas is an academic researcher from Aristotle University of Thessaloniki. The author has contributed to research in topics: Facial recognition system & Digital watermarking. The author has an h-index of 76 and has co-authored 795 publications receiving 24,787 citations. Previous affiliations of Ioannis Pitas include University of Bristol and University of York.


Papers
Proceedings ArticleDOI
10 Oct 2008
TL;DR: Experiments on the COIL-100 database show the robustness of the Gabor approach when globally applied to extract relevant discriminative features; the method outperforms other state-of-the-art techniques compared in the paper, such as principal component analysis (PCA) and linear discriminant analysis (LDA).
Abstract: The human visual system can rapidly and accurately recognize a wide variety of objects in cluttered scenes under widely varying and difficult viewing conditions, such as illumination changes, occlusion, scaling or rotation. One of the state-of-the-art feature extraction techniques used in image recognition and processing is based on the Gabor wavelet model. This paper deals with the application of the aforementioned model to the object classification task with respect to the rotation issue. Three training sample sizes were applied to assess the method's performance. Experiments run on the COIL-100 database show the robustness of the Gabor approach when globally applied to extract relevant discriminative features. The method outperforms other state-of-the-art techniques compared in the paper, such as principal component analysis (PCA) and linear discriminant analysis (LDA).
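The abstract above describes globally applied Gabor wavelet features for rotation-robust object classification. The following is a minimal illustrative sketch, not the authors' implementation: it builds a small Gabor filter bank with NumPy and pools the filter responses into a feature vector. The filter parameters, bank size and pooling choice are assumptions.

```python
# Minimal Gabor feature-extraction sketch (illustrative only, not the paper's exact setup).
import numpy as np
from scipy.signal import fftconvolve

def gabor_kernel(size=31, sigma=4.0, theta=0.0, wavelength=8.0, gamma=0.5):
    """Real part of a 2-D Gabor kernel with orientation `theta`."""
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1]
    x_t = x * np.cos(theta) + y * np.sin(theta)
    y_t = -x * np.sin(theta) + y * np.cos(theta)
    envelope = np.exp(-(x_t**2 + (gamma * y_t)**2) / (2 * sigma**2))
    carrier = np.cos(2 * np.pi * x_t / wavelength)
    return envelope * carrier

def gabor_features(image, n_orientations=8, wavelengths=(4.0, 8.0, 16.0)):
    """Convolve the image with a filter bank and pool each response map (mean + std of magnitude)."""
    feats = []
    for wl in wavelengths:
        for k in range(n_orientations):
            theta = k * np.pi / n_orientations
            resp = fftconvolve(image, gabor_kernel(theta=theta, wavelength=wl), mode="same")
            feats.extend([np.abs(resp).mean(), np.abs(resp).std()])
    return np.asarray(feats)

if __name__ == "__main__":
    img = np.random.rand(128, 128)       # stand-in for a COIL-100 object image
    print(gabor_features(img).shape)     # 3 wavelengths * 8 orientations * 2 stats -> (48,)
```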

4 citations

Proceedings ArticleDOI
23 May 2022
TL;DR: A novel DPP-based regularizer is proposed that exploits a pretrained DNN-based image captioner in order to additionally enforce maximal key-frame diversity from the perspective of textual semantic content.
Abstract: Most unsupervised Deep Neural Networks (DNNs) for video summarization rely on adversarial learning, autoencoding and training without utilizing any ground-truth summary. In several cases, the Convolutional Neural Network (CNN)-derived video frame representations are sequentially fed to a Long Short-Term Memory (LSTM) network, which selects key-frames and, during training, attempts to reconstruct the original/full video from the summary, while confusing an adversarially optimized Discriminator. Additionally, regularizers aiming at maximizing the summary’s visual semantic diversity can be employed, such as the Determinantal Point Process (DPP) loss term. In this paper, a novel DPP-based regularizer is proposed that exploits a pretrained DNN-based image captioner in order to additionally enforce maximal key-frame diversity from the perspective of textual semantic content. Thus, the selected key-frames are encouraged to differ not only with regard to what objects they depict, but also with regard to their textual descriptions, which may additionally capture activities, scene context, etc. Empirical evaluation indicates that the proposed regularizer leads to state-of-the-art performance.
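As a rough illustration of the kind of diversity term described above (a sketch, not the authors' code): given embeddings of the captions generated for the currently selected key-frames, a DPP-style loss can penalize a small determinant of their similarity kernel, encouraging textually diverse selections. The embedding source and the cosine-similarity kernel are assumptions.

```python
# Sketch of a DPP-style diversity regularizer over key-frame caption embeddings (PyTorch).
import torch
import torch.nn.functional as F

def dpp_diversity_loss(caption_embeddings: torch.Tensor, eps: float = 1e-4) -> torch.Tensor:
    """caption_embeddings: (k, d) embeddings of captions for the k selected key-frames.

    Builds a cosine-similarity kernel L and returns -log det(L + eps*I); the determinant
    grows as the captions become more diverse, so minimizing this loss rewards diversity.
    """
    z = F.normalize(caption_embeddings, dim=1)            # unit-norm rows
    L = z @ z.t()                                         # (k, k) cosine-similarity kernel
    L = L + eps * torch.eye(L.size(0), device=L.device)   # numerical stability
    return -torch.logdet(L)

if __name__ == "__main__":
    emb = torch.randn(5, 512, requires_grad=True)         # e.g. 5 key-frame caption embeddings
    loss = dpp_diversity_loss(emb)
    loss.backward()
    print(loss.item())
```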

4 citations

Journal Article
TL;DR: An inflator for an inflatable vehicle occupant protection device, such as an air bag, includes a housing storing a non-pressurized source of inflation fluid that contains a tubular filter structure formed of gas-permeable material for removing additional particulate matter from the inflation gas.

3 citations

Book ChapterDOI
11 Nov 2005
TL;DR: Several classes are proposed to extend the MPEG-7 standard so that it can handle video media data in a more uniform and anthropocentric way, and it is shown that the corresponding scheme produces a new profile which is more flexible across all types of applications.
Abstract: MPEG-7 has emerged as the standard for multimedia data content description. As the standard is still in its early stages, it is evolving towards a direction in which semantic content description can be implemented. In this paper we provide a number of classes to extend the MPEG-7 standard so that it can handle video media data in a more uniform and anthropocentric way. Many descriptors (Ds) and description schemes (DSs) already provided by the MPEG-7 standard can help to implement the semantics of a medium. However, by grouping together several MPEG-7 classes and adding new Ds, better results can be produced in video production and video analysis tasks. Several classes are proposed in this context, and we show that the corresponding scheme produces a new profile which is more flexible in all the types of applications described in [1].
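To make the idea of grouping descriptors (Ds) and description schemes (DSs) into a profile more concrete, here is a small illustrative sketch; the element names (e.g. ActorDS, DialogueD, VideoSegmentDS) are hypothetical examples and are not taken from the paper or the standard.

```python
# Illustrative sketch: assembling a toy MPEG-7-style description for a video segment.
# Element names such as "ActorDS" and "DialogueD" are hypothetical examples.
import xml.etree.ElementTree as ET

def build_segment_description(segment_id: str, actors, dialogue: str) -> ET.Element:
    seg = ET.Element("VideoSegmentDS", attrib={"id": segment_id})
    for name in actors:                        # hypothetical anthropocentric extension
        actor = ET.SubElement(seg, "ActorDS")
        ET.SubElement(actor, "NameD").text = name
    ET.SubElement(seg, "DialogueD").text = dialogue
    return seg

if __name__ == "__main__":
    root = ET.Element("ProfileDescription")
    root.append(build_segment_description("seg-001", ["Anchor", "Reporter"], "Opening remarks"))
    print(ET.tostring(root, encoding="unicode"))
```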

3 citations


Cited by
Journal ArticleDOI
08 Dec 2001 - BMJ
TL;DR: There is, I think, something ethereal about i, the square root of minus one: it seemed an odd beast at the time, an intruder hovering on the edge of reality.
Abstract: There is, I think, something ethereal about i —the square root of minus one. I remember first hearing about it at school. It seemed an odd beast at that time—an intruder hovering on the edge of reality. Usually familiarity dulls this sense of the bizarre, but in the case of i it was the reverse: over the years the sense of its surreal nature intensified. It seemed that it was impossible to write mathematics that described the real world in …

33,785 citations

Journal ArticleDOI
TL;DR: In this paper, the authors provide an up-to-date critical survey of still- and video-based face recognition research and offer some insights into the studies of machine recognition of faces.
Abstract: As one of the most successful applications of image analysis and understanding, face recognition has recently received significant attention, especially during the past several years. At least two reasons account for this trend: the first is the wide range of commercial and law enforcement applications, and the second is the availability of feasible technologies after 30 years of research. Even though current machine recognition systems have reached a certain level of maturity, their success is limited by the conditions imposed by many real applications. For example, recognition of face images acquired in an outdoor environment with changes in illumination and/or pose remains a largely unsolved problem. In other words, current systems are still far away from the capability of the human perception system. This paper provides an up-to-date critical survey of still- and video-based face recognition research. There are two underlying motivations for us to write this survey paper: the first is to provide an up-to-date review of the existing literature, and the second is to offer some insights into the studies of machine recognition of faces. To provide a comprehensive survey, we not only categorize existing recognition techniques but also present detailed descriptions of representative methods within each category. In addition, relevant topics such as psychophysical studies, system evaluation, and issues of illumination and pose variation are covered.
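The survey above categorizes many recognition approaches; as one concrete, classical illustration (a minimal sketch on synthetic data, not a method proposed in the survey), the PCA-based "eigenfaces" pipeline projects face images onto a low-dimensional subspace and classifies by nearest neighbor.

```python
# Minimal eigenfaces-style sketch (PCA projection + nearest-neighbor matching) on synthetic data.
import numpy as np

rng = np.random.default_rng(0)
gallery = rng.random((40, 32 * 32))            # 40 "face" images, flattened 32x32 pixels
labels = np.repeat(np.arange(10), 4)           # 10 identities, 4 images each

mean_face = gallery.mean(axis=0)
centered = gallery - mean_face
# Principal components of the gallery (rows of Vt are the eigenfaces).
_, _, Vt = np.linalg.svd(centered, full_matrices=False)
eigenfaces = Vt[:20]                           # keep 20 components

gallery_proj = centered @ eigenfaces.T         # (40, 20) subspace coordinates

def identify(probe: np.ndarray) -> int:
    """Project a probe image and return the label of the nearest gallery image."""
    coords = (probe - mean_face) @ eigenfaces.T
    dists = np.linalg.norm(gallery_proj - coords, axis=1)
    return int(labels[np.argmin(dists)])

print(identify(gallery[7]))                    # gallery image 7 belongs to identity 1
```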

6,384 citations

Journal ArticleDOI
TL;DR: In this article, the authors categorize and evaluate face detection algorithms and discuss relevant issues such as data collection, evaluation metrics and benchmarking, and conclude with several promising directions for future research.
Abstract: Images containing faces are essential to intelligent vision-based human-computer interaction, and research efforts in face processing include face recognition, face tracking, pose estimation and expression recognition. However, many reported methods assume that the faces in an image or an image sequence have been identified and localized. To build fully automated systems that analyze the information contained in face images, robust and efficient face detection algorithms are required. Given a single image, the goal of face detection is to identify all image regions which contain a face, regardless of its 3D position, orientation and lighting conditions. Such a problem is challenging because faces are non-rigid and have a high degree of variability in size, shape, color and texture. Numerous techniques have been developed to detect faces in a single image, and the purpose of this paper is to categorize and evaluate these algorithms. We also discuss relevant issues such as data collection, evaluation metrics and benchmarking. After analyzing these algorithms and identifying their limitations, we conclude with several promising directions for future research.
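As a concrete pointer for the task described above (an illustration using OpenCV's bundled classical Haar-cascade detector, not one of the specific methods surveyed), a single-image face detector can be run as follows; the image path is a placeholder.

```python
# Sketch: single-image frontal face detection with OpenCV's bundled Haar-cascade model.
import cv2

def detect_faces(image_path: str):
    """Return a list of (x, y, w, h) boxes for detected frontal faces."""
    cascade = cv2.CascadeClassifier(
        cv2.data.haarcascades + "haarcascade_frontalface_default.xml")
    img = cv2.imread(image_path)
    if img is None:
        raise FileNotFoundError(image_path)
    gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    # scaleFactor and minNeighbors trade off recall against false positives.
    return cascade.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5)

if __name__ == "__main__":
    print(detect_faces("example.jpg"))   # "example.jpg" is a placeholder path
```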

3,894 citations