
Showing papers by "Hazim Kemal Ekenel published in 2011"


Proceedings ArticleDOI
20 Jun 2011
TL;DR: A common framework for real-time action unit detection and emotion recognition is presented that was developed for the emotion recognition and action unit detection sub-challenges of the FG 2011 Facial Expression Recognition and Analysis Challenge.
Abstract: In this paper, we present a common framework for real-time action unit detection and emotion recognition that we have developed for the emotion recognition and action unit detection sub-challenges of the FG 2011 Facial Expression Recognition and Analysis Challenge. For these tasks we employed a local appearance-based face representation approach using the discrete cosine transform, which has been shown to be very effective and robust for face recognition. Using these features, we trained multiple one-versus-all support vector machine classifiers corresponding to the individual classes of the specific task. With this framework we achieve 24.2% and 7.6% absolute improvement over the overall baseline results on the emotion recognition and action unit detection sub-challenges, respectively.
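The method pairs block-wise DCT features with one-versus-all classifiers. A minimal sketch of the two pieces, assuming a per-block 2-D DCT with low-frequency coefficients kept in zig-zag order and a linear stand-in for the SVM decision (function names and the linear scoring are illustrative, not the authors' code):

```python
import math

def dct_1d(v):
    """Orthonormal 1-D DCT-II of a list of floats."""
    n = len(v)
    out = []
    for k in range(n):
        s = sum(v[i] * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
                for i in range(n))
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        out.append(scale * s)
    return out

def dct_2d(block):
    """2-D DCT-II of a square block (list of rows), applied separably."""
    n = len(block)
    rows = [dct_1d(r) for r in block]
    cols = [dct_1d([rows[i][j] for i in range(n)]) for j in range(n)]
    # transpose back so coeffs[u][v] indexes (vertical, horizontal) frequency
    return [[cols[j][i] for j in range(n)] for i in range(n)]

def zigzag_features(block, n_coeffs=5):
    """Keep the lowest-frequency DCT coefficients in zig-zag order, DC first."""
    n = len(block)
    coeffs = dct_2d(block)
    order = sorted(((i, j) for i in range(n) for j in range(n)),
                   key=lambda p: (p[0] + p[1],
                                  p[1] if (p[0] + p[1]) % 2 else p[0]))
    return [coeffs[i][j] for i, j in order[:n_coeffs]]

def one_vs_all_predict(feats, classifiers):
    """classifiers: {label: (weights, bias)}; choose the highest-scoring class."""
    scores = {c: sum(w * f for w, f in zip(wb[0], feats)) + wb[1]
              for c, wb in classifiers.items()}
    return max(scores, key=scores.get)
```

For a uniform block, only the DC coefficient is nonzero, which is the usual sanity check for a DCT feature extractor.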

30 citations


Journal ArticleDOI
TL;DR: The proposed system employs a face tracker that can track faces up to full profile views and provides temporal association of the face images in the video, so that instead of using single images for query or target, whole tracks can be used.
Abstract: In this paper, we present a system for person re-identification in TV series. In the context of video retrieval, person re-identification refers to the task where a user clicks on a person in a video frame and the system then finds other occurrences of the same person in the same or different videos. The main characteristic of this scenario is that no previously collected training data is available, so no person-specific models can be trained in advance. Additionally, the query data is limited to the image that the user clicks on. These conditions pose a great challenge to the re-identification system, which has to find the same person in other shots despite large variations in the person's appearance. In this study, facial appearance is used as the re-identification cue, since, in contrast to surveillance-oriented re-identification studies, the person can have different clothing in different shots. In order to increase the amount of available face data, the proposed system employs a face tracker that can track faces up to full profile views. This makes it possible to use a profile face image as the query image and also to retrieve images with non-frontal poses. It also provides temporal association of the face images in the video, so that instead of using single images for query or target, whole tracks can be used. A fast and robust face recognition algorithm is used to find matching faces. If the match result is highly confident, our system adds the matching face track to the query set. Finally, if the user is not satisfied with the number of returned results, the system can present a small number of candidate face images and let the user confirm the ones that belong to the queried person. These features help to increase the variation in the query set, making it possible to retrieve results with different poses, illumination conditions, etc. The system is extensively evaluated on two episodes of the TV series Coupling, showing very promising results.
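The query-expansion step, where highly confident matches are added back to the query set, can be sketched as follows. The similarity function, thresholds, and track representation are placeholders; the paper's actual matcher is a face recognition algorithm over tracked faces:

```python
def re_identify(query_tracks, gallery_tracks, similarity, accept_thr, expand_thr):
    """Rank gallery face tracks against a query set; matches above
    expand_thr are folded back into the query set, so later gallery
    tracks can match poses/illuminations absent from the original query.
    similarity(a, b) returns a score in [0, 1]."""
    query = list(query_tracks)
    results = []
    for track in gallery_tracks:
        score = max(similarity(q, track) for q in query)
        if score >= accept_thr:
            results.append((track, score))
            if score >= expand_thr:
                query.append(track)  # highly confident match: enrich the query set
    return sorted(results, key=lambda r: r[1], reverse=True)
```

The expansion matters because a gallery track that is too dissimilar to the original click may still be close to an already-accepted track.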

26 citations


Proceedings ArticleDOI
01 Nov 2011
TL;DR: This work proposes a framework for simultaneously detecting the presence of multiple facial action units using kernel partial least square regression (KPLS), which has the advantage of being easily extensible to learn more face related labels, while at the same time being computationally efficient.
Abstract: In this work, we propose a framework for simultaneously detecting the presence of multiple facial action units using kernel partial least squares regression (KPLS). This method has the advantage of being easily extensible to learn more face-related labels, while at the same time being computationally efficient. We compare the approach to linear and non-linear support vector machines (SVM) and evaluate its performance on the extended Cohn-Kanade (CK+) dataset and the GEneva Multimodal Emotion Portrayals (GEMEP-FERA) dataset, as well as across databases. It is shown that KPLS achieves around 2% absolute improvement over the SVM-based approach in terms of the two-alternative forced choice (2AFC) score when trained on CK+ and tested on CK+ and GEMEP-FERA. It achieves around 6% absolute improvement over the SVM-based approach when trained on GEMEP-FERA and tested on CK+. We also show that KPLS handles non-additive AU combinations better than SVM-based approaches trained to detect single AUs only.
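The 2AFC score used for evaluation is the probability that a randomly chosen positive sample outscores a randomly chosen negative one, with ties counted as half; this is equivalent to the ROC AUC. A direct, if quadratic, computation:

```python
def two_afc(pos_scores, neg_scores):
    """Two-alternative forced choice (2AFC) score: P(random positive
    outscores random negative), ties counted as 0.5.  Equals ROC AUC."""
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))
```

A sort-based implementation brings this to O(n log n) for large score lists, but the nested loop above states the definition most plainly.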

23 citations


Proceedings ArticleDOI
01 Jan 2011
TL;DR: This paper presents a new discriminative face model based on boosting pseudo census transform features, considered to be less sensitive to illumination changes, which yields a more robust alignment algorithm.
Abstract: Face alignment using a deformable face model has attracted broad interest in recent years for its wide range of applications in facial analysis. Previous work has shown that discriminative deformable models have better generalization capacity than generative models [8, 9]. In this paper, we present a new discriminative face model based on boosting pseudo census transform features. This feature is considered to be less sensitive to illumination changes, which yields a more robust alignment algorithm. The alignment is based on maximizing the score of a boosted strong classifier, which indicates whether the current alignment is correct or incorrect. The proposed approach has been evaluated extensively on several databases. The experimental results show that our approach generalizes better on unseen data than the Haar feature-based approach. Moreover, its training procedure is much faster due to the low dimensionality of the configuration space of the proposed feature.
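Census-transform-style features encode local ordinal structure, which survives monotonic illumination changes. The paper's exact pseudo census transform definition is not reproduced here; the sketch below shows one plausible census-style variant (comparison against the patch mean with a small tolerance is an assumption, in the spirit of the modified census transform):

```python
def pseudo_census(patch, tol=2):
    """Census-style 9-bit code for a 3x3 patch (list of 3 rows): each
    pixel is compared against the patch mean; the tolerance makes the
    code less sensitive to noise.  Adding a constant brightness offset
    to every pixel leaves the code unchanged."""
    pixels = [p for row in patch for p in row]
    mean = sum(pixels) / 9.0
    code = 0
    for p in pixels:
        code = (code << 1) | (1 if p > mean + tol else 0)
    return code
```

A boosted strong classifier would then score an alignment hypothesis by summing lookup-table weights indexed by such codes at chosen landmark positions.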

7 citations


Proceedings ArticleDOI
29 Dec 2011
TL;DR: A real-time face detector based on Gabor features is described and an efficient discrete encoding method for the Gabor feature vector is proposed that enables us to use a computationally efficient multi-stage classifier based on boosting and winnowing.
Abstract: We describe a real-time face detector based on Gabor features. While Gabor features often lead to improved performance, they are often avoided because they are perceived as computationally expensive. We address this in two ways. First, we propose an efficient discrete encoding method for the Gabor feature vector. This enables us to use a computationally efficient multi-stage classifier based on boosting and winnowing. Second, we accelerate the remaining expensive computations using the parallelism provided by graphics processing units (GPUs). With these innovations, the resulting detector runs at 16.8 fps on 640 × 480 images on a PC equipped with an i5 CPU and a GTX 465 graphics card.
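The point of a discrete encoding is that once each real-valued Gabor coefficient is quantized to a few levels, the boosted multi-stage classifier can score a feature with a single table lookup instead of floating-point arithmetic. A hedged sketch, where the bin count, calibration range, and table layout are assumptions rather than the paper's scheme:

```python
def quantize_features(feats, lo, hi, n_bins=8):
    """Map each real-valued Gabor coefficient to one of n_bins discrete
    levels over a calibrated range [lo, hi]."""
    span = hi - lo
    codes = []
    for f in feats:
        f = min(max(f, lo), hi)           # clamp to the calibrated range
        b = int((f - lo) / span * n_bins)
        codes.append(min(b, n_bins - 1))  # value hi falls in the top bin
    return codes

def lut_score(codes, tables):
    """Sum per-feature lookup-table weights (as learned by boosting)."""
    return sum(tables[i][c] for i, c in enumerate(codes))
```

A cascade would evaluate such lookup sums stage by stage, rejecting non-face windows early.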

6 citations


Proceedings ArticleDOI
20 Apr 2011
TL;DR: Magic Mirror is a face swapping tool that replaces the user's face with the face of a famous person selected from a database, via a user interface that enables selection of the replacement face and directly reflects the changed appearance.
Abstract: Magic Mirror is a face swapping tool that replaces the user's face with a selected famous person's face from a database. The system interacts with the user via a user interface which enables the selection of the replacement face and directly reflects the changed appearance. First, we apply a face detection mechanism to locate the face in the frame coming from the capturing device. Then, we feed the detection result to an active appearance model to get the exact shape of the face. Using the extracted information, we replace the user's face with the selected target face. We display the output after some post-processing for color and lighting adjustments.
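The color and lighting post-processing can be approximated by matching the per-channel statistics of the swapped-in face to the surrounding frame. The exact adjustment used by Magic Mirror is not specified in the abstract, so this mean/std color transfer is only an illustrative stand-in:

```python
def channel_stats(pixels, ch):
    """Mean and standard deviation of one color channel."""
    vals = [p[ch] for p in pixels]
    mean = sum(vals) / len(vals)
    var = sum((v - mean) ** 2 for v in vals) / len(vals)
    return mean, var ** 0.5

def color_transfer(src_pixels, ref_pixels):
    """Shift and scale the swapped-in face region (src) so each channel's
    mean and std match the surrounding frame (ref).  Pixels are (r, g, b)
    tuples; output values are clamped to [0, 255]."""
    s = [channel_stats(src_pixels, ch) for ch in range(3)]
    r = [channel_stats(ref_pixels, ch) for ch in range(3)]
    out = []
    for p in src_pixels:
        q = []
        for ch in range(3):
            ms, ss = s[ch]
            mr, sr = r[ch]
            scale = sr / ss if ss > 1e-6 else 1.0
            q.append(min(255, max(0, round((p[ch] - ms) * scale + mr))))
        out.append(tuple(q))
    return out
```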

5 citations


15 Nov 2011
TL;DR: The Quaero group is a consortium of French and German organizations working on multimedia indexing and retrieval; LIG and KIT participated in the semantic indexing task, LIG helped organize that task, and the multimedia event detection task was addressed with a system derived from their generic concept-indexing system for videos.
Abstract: The Quaero group is a consortium of French and German organizations working on Multimedia Indexing and Retrieval. LIG and KIT participated in the semantic indexing task, and LIG participated in the organization of this task. LIG also participated in the multimedia event detection task. This paper describes these participations. For the semantic indexing task, our approach uses a six-stage processing pipeline for computing scores for the likelihood that a video shot contains a target concept. These scores are then used to produce a ranked list of the images or shots that are most likely to contain the target concept. The pipeline is composed of the following steps: descriptor extraction, descriptor optimization, classification, fusion of descriptor variants, higher-level fusion, and re-ranking. We used a number of different descriptors and a hierarchical fusion strategy. We also used conceptual feedback by adding a vector of classification scores to the pool of descriptors. The best Quaero run has a Mean Inferred Average Precision of 0.1529, which ranked us 3rd out of 19 participants. We participated in the multimedia event detection task with a system derived from the generic one we have for general-purpose concept indexing in videos, treating the target events as concepts. Detection scores on videos are produced from the scores on shots.

4 citations


Proceedings ArticleDOI
TL;DR: An appearance-based multimodal gesture recognition framework is presented, which combines different groups of features, such as facial expression features and hand motion features, extracted from image frames captured by a single web camera.
Abstract: The use of gesture as a natural interface plays a vital role in achieving intelligent Human Computer Interaction (HCI). Human gestures include different components of visual actions, such as motion of the hands, facial expression, and torso, to convey meaning. So far, in the field of gesture recognition, most previous work has focused on the manual component of gestures. In this paper, we present an appearance-based multimodal gesture recognition framework, which combines different groups of features, such as facial expression features and hand motion features, extracted from image frames captured by a single web camera. We consider 12 classes of human gestures with facial expressions conveying neutral, negative, and positive meanings, drawn from American Sign Language (ASL). We combine the features at two levels by employing two fusion strategies. At the feature level, an early feature combination is performed by concatenating and weighting the different feature groups, and LDA is used to choose the most discriminative elements by projecting the features onto a discriminative expression space. The second strategy is applied at the decision level: weighted decisions from the single modalities are fused in a later stage. A condensation-based algorithm is adopted for classification. We collected a data set with three to seven recording sessions and conducted experiments with the combination techniques. Experimental results showed that facial analysis improves hand gesture recognition and that decision-level fusion performs better than feature-level fusion.
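The decision-level strategy, which the experiments favored, weights the per-class scores produced by each modality and picks the best fused class. A minimal sketch (the dictionary layout and weights are illustrative, not the paper's implementation):

```python
def decision_fusion(modality_scores, weights):
    """Late (decision-level) fusion: each modality supplies per-class
    scores; the fused decision is the class with the highest weighted sum.
    modality_scores: {modality: {cls: score}}, weights: {modality: w}."""
    fused = {}
    for m, scores in modality_scores.items():
        for cls, s in scores.items():
            fused[cls] = fused.get(cls, 0.0) + weights[m] * s
    return max(fused, key=fused.get), fused
```

Note how a strong facial-expression cue can overturn the hand modality's top choice, which is exactly the behavior that lets facial analysis improve hand gesture recognition.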

1 citation


Proceedings Article
01 Jan 2011
TL;DR: A system which automatically detects a list of important targets such as anchor speakers or active politicians in broadcast news videos and achieves a very high precision with a reasonable recall rate is proposed.
Abstract: Automatic face identification in multimedia archives such as broadcast news videos is useful for indexing or retrieving documents based on important persons that appear in the video. In this paper, we propose a system which automatically detects a list of important targets such as anchor speakers or active politicians in broadcast news videos. This involves several steps including detecting faces in various conditions, associating faces to tracks and identifying whether a face track contains certain faces defined in a watch list. We evaluated this system on a database, which contains about 36 hours of broadcast news videos. Experiments show that our system achieves a very high precision with a reasonable recall rate.
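Track-level identification against a watch list can be sketched as averaging per-frame similarity scores over the face track and rejecting weak matches as unknown; the averaging and the "unknown" fallback are assumptions about the method, but the rejection threshold is what trades recall for the high precision the paper reports:

```python
def identify_track(frame_scores, watch_list, accept_thr):
    """Average each watch-list identity's per-frame similarity over the
    whole face track; return 'unknown' unless the best identity clears
    the acceptance threshold (raising it favors precision over recall).
    frame_scores: list of {person: similarity} dicts, one per frame."""
    best, best_score = "unknown", accept_thr
    for person in watch_list:
        avg = sum(f[person] for f in frame_scores) / len(frame_scores)
        if avg > best_score:
            best, best_score = person, avg
    return best
```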

1 citation


01 Jan 2011
TL;DR: This evaluation assesses the content-based system's performance on the diversified content of the blip.tv web-video corpus, which is described in detail in [5].
Abstract: In this paper, we run our content-based video genre classification system on the MediaEval evaluation corpus. Our system is based on several low-level audio-visual cues, as well as cognitive and structural information. The purpose of this evaluation is to assess our content-based system's performance on the diversified content of the blip.tv web-video corpus, which is described in detail in [5].

1 citation




Proceedings ArticleDOI
20 Apr 2011
TL;DR: In this paper, an initial study on an IMDB plug-in for cast identification in movies is presented: the user clicks on the face of a person of interest, the system detects and tracks the face to obtain a face sequence, and the sequence is matched against face image sets collected from the web.
Abstract: In this paper, we present an initial study on an IMDB plug-in for cast identification in movies. In the system, training face images are collected using Google image search. While watching a movie, the user clicks on the face of the person he is interested in to acquire information. Afterwards, the system first tries to detect close-to-frontal faces; if it cannot find any, it runs a profile face detector. The detected face is then tracked backwards and forwards within the shot, and in this way a face sequence is obtained. Matching is performed between the face sequence extracted from the movie and the face image sets collected from the web. IMDB page links for the three closest persons resulting from the matching process are then presented to the user. In this study, we addressed the following three points: matching between a face sequence and face image sets, the effect of automatically collected noisy training images from the web on performance, and, finally, the performance effect of utilizing prior information from the cast list and performing the classification within a limited number of classes. Experiments have shown that matching between a face sequence and face image sets is a difficult problem.
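Matching a face sequence against per-actor web image sets is a set-to-set comparison; one common choice is the minimum pairwise distance between the two sets. The sketch below (the distance function and feature representation are placeholders, and the paper does not specify its set distance) returns the three closest cast members, e.g. to link their IMDB pages:

```python
def rank_cast(track_feats, cast_sets, distance):
    """Match an extracted face sequence against per-actor web image sets.
    The track-to-set distance is the smallest pairwise distance between a
    track frame and a gallery image; the three closest actors are returned.
    cast_sets: {actor: [feature, ...]}."""
    dists = []
    for actor, image_feats in cast_sets.items():
        d = min(distance(t, g) for t in track_feats for g in image_feats)
        dists.append((d, actor))
    return [actor for _, actor in sorted(dists)[:3]]
```

Restricting `cast_sets` to the movie's known cast list is exactly the prior-information experiment the paper describes: it shrinks the candidate pool before matching.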