
Showing papers by "Hazim Kemal Ekenel published in 2017"


Journal ArticleDOI
TL;DR: The results indicate that high levels of noise, blur, missing pixels, and brightness have a detrimental effect on the verification performance of all models, whereas the impact of contrast changes and compression artifacts is limited.
Abstract: Deep convolutional neural network (CNN) based approaches are the state of the art in various computer vision tasks, including face recognition. Considerable research effort is currently being directed towards further improving deep CNNs by focusing on more powerful model architectures and better learning techniques. However, studies systematically exploring the strengths and weaknesses of existing deep models for face recognition are still relatively scarce in the literature. In this paper, we try to fill this gap and study the effects of different covariates on the verification performance of four recent deep CNN models using the Labeled Faces in the Wild (LFW) dataset. Specifically, we investigate the influence of covariates related to image quality (blur, JPEG compression, occlusion, noise, image brightness, contrast, and missing pixels) and model characteristics (CNN architecture, color information, and descriptor computation), and analyze their impact on the face verification performance of AlexNet, VGG-Face, GoogLeNet, and SqueezeNet. Based on comprehensive and rigorous experimentation, we identify the strengths and weaknesses of the deep learning models, and present key areas for potential future research. Our results indicate that high levels of noise, blur, missing pixels, and brightness have a detrimental effect on the verification performance of all models, whereas the impact of contrast changes and compression artifacts is limited. We also find that the descriptor computation strategy and color information do not have a significant influence on performance.
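A minimal sketch (not the paper's protocol) of the kind of covariate probing described above: one image of a verification pair is degraded with an assumed parameterisation of blur, JPEG compression, or noise, and the cosine similarity between CNN embeddings is thresholded. `embed` is a placeholder for any of the evaluated networks' feature extractors.

```python
import io
import numpy as np
from PIL import Image, ImageFilter

def degrade(img: Image.Image, kind: str, level: float) -> Image.Image:
    """Apply one image-quality covariate (parameterisations are assumptions)."""
    if kind == "blur":
        return img.filter(ImageFilter.GaussianBlur(radius=level))
    if kind == "jpeg":
        buf = io.BytesIO()
        img.save(buf, format="JPEG", quality=int(level))   # level = JPEG quality factor
        buf.seek(0)
        return Image.open(buf).convert("RGB")
    if kind == "noise":
        arr = np.asarray(img, dtype=np.float32)
        arr += np.random.normal(0.0, level, arr.shape)      # additive Gaussian noise
        return Image.fromarray(np.clip(arr, 0, 255).astype(np.uint8))
    raise ValueError(f"unknown covariate: {kind}")

def cosine(a: np.ndarray, b: np.ndarray) -> float:
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12))

def verify_pair(embed, img_a: Image.Image, img_b: Image.Image,
                kind: str, level: float, threshold: float = 0.5) -> bool:
    """Degrade one image of a pair, then threshold the embedding similarity."""
    return cosine(embed(img_a), embed(degrade(img_b, kind, level))) >= threshold
```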

120 citations


Proceedings ArticleDOI
TL;DR: The top performer of the UERC was found to ensure robust performance on a smaller part of the dataset, but still exhibited a significant performance drop when the entire dataset comprising 3,704 subjects was used for testing.
Abstract: In this paper we present the results of the Unconstrained Ear Recognition Challenge (UERC), a group benchmarking effort centered around the problem of person recognition from ear images captured in uncontrolled conditions. The goal of the challenge was to assess the performance of existing ear recognition techniques on a challenging large-scale dataset and identify open problems that need to be addressed in the future. Five groups from three continents participated in the challenge and contributed six ear recognition techniques for the evaluation, while multiple baselines were made available for the challenge by the UERC organizers. A comprehensive analysis was conducted with all participating approaches addressing essential research questions pertaining to the sensitivity of the technology to head rotation, flipping, gallery size, large-scale recognition and others. The top performer of the UERC was found to ensure robust performance on a smaller part of the dataset (with 180 subjects) regardless of image characteristics, but still exhibited a significant performance drop when the entire dataset comprising 3,704 subjects was used for testing.

50 citations


Journal ArticleDOI
TL;DR: This work presents a novel face deidentification pipeline, which ensures anonymity by synthesizing artificial surrogate faces using generative neural networks (GNNs) to deidentify subjects in images or video, while preserving non-identity-related aspects of the data and consequently enabling data utilization.
Abstract: Face deidentification is an active topic amongst privacy and security researchers. Early deidentification methods relying on image blurring or pixelisation have been replaced in recent years with techniques based on formal anonymity models that provide privacy guarantees and retain certain characteristics of the data even after deidentification. The latter aspect is important, as it allows the deidentified data to be used in applications for which identity information is irrelevant. In this work, the authors present a novel face deidentification pipeline, which ensures anonymity by synthesising artificial surrogate faces using generative neural networks (GNNs). The generated faces are used to deidentify subjects in images or videos, while preserving non-identity-related aspects of the data and consequently enabling data utilisation. Since generative networks are highly adaptive and can utilise diverse parameters (pertaining to the appearance of the generated output in terms of facial expressions, gender, race, etc.), they represent a natural choice for the problem of face deidentification. To demonstrate the feasibility of the authors’ approach, they perform experiments using automated recognition tools and human annotators. Their results show that the recognition performance on deidentified images is close to chance, suggesting that the deidentification process based on GNNs is effective.
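A minimal sketch of the overall replacement step, under stated assumptions rather than the authors' pipeline: `detect_face` and `generate_face` are placeholders for a face detector and an attribute-conditioned generative model, with the attributes carrying the non-identity information (expression, gender, etc.) that the method aims to preserve.

```python
import numpy as np

def deidentify(image: np.ndarray, detect_face, generate_face, attributes: dict) -> np.ndarray:
    """Replace every detected face in `image` with a synthetic surrogate face."""
    out = image.copy()
    for (x, y, w, h) in detect_face(image):                  # face boxes as (x, y, w, h)
        surrogate = generate_face(attributes, size=(h, w))   # HxWx3 uint8 synthetic face
        out[y:y + h, x:x + w] = surrogate                    # original identity removed here
    return out
```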

50 citations


Book ChapterDOI
10 Jul 2017
TL;DR: A fully automated computer vision application for littering quantification based on images taken from the streets and sidewalks using a deep learning based framework to localize and classify different types of wastes.
Abstract: Littering quantification is an important step for improving the cleanliness of cities. When human interpretation is too cumbersome, or in some cases impossible, an objective index of cleanliness could reduce littering through awareness actions. In this paper, we present a fully automated computer vision application for littering quantification based on images taken from streets and sidewalks. We have employed a deep learning based framework to localize and classify different types of wastes. Since there was no waste dataset available, we built our own acquisition system, mounted on a vehicle, and collected images containing different types of wastes. These images were then annotated for training and benchmarking the developed system. Our results on real case scenarios show accurate detection of littering against varied backgrounds.

50 citations


Proceedings ArticleDOI
01 Aug 2017
TL;DR: In this article, LiDAR data is utilized to generate region proposals by processing the three dimensional point cloud that it provides and these candidate regions are then further processed by a state-of-the-art CNN classifier that has been fine-tuned for pedestrian detection.
Abstract: Pedestrian detection is an important component for safety of autonomous vehicles, as well as for traffic and street surveillance. There are extensive benchmarks on this topic and it has been shown to be a challenging problem when applied on real use-case scenarios. In purely image-based pedestrian detection approaches, the state-of-the-art results have been achieved with convolutional neural networks (CNN) and surprisingly few detection frameworks have been built upon multi-cue approaches. In this work, we develop a new pedestrian detector for autonomous vehicles that exploits LiDAR data, in addition to visual information. In the proposed approach, LiDAR data is utilized to generate region proposals by processing the three dimensional point cloud that it provides. These candidate regions are then further processed by a state-of-the-art CNN classifier that we have fine-tuned for pedestrian detection. We have extensively evaluated the proposed detection process on the KITTI dataset. The experimental results show that the proposed LiDAR space clustering approach provides a very efficient way of generating region proposals leading to higher recall rates and fewer misses for pedestrian detection. This indicates that LiDAR data can provide auxiliary information for CNN-based approaches.
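A minimal sketch (assumed details, not the paper's implementation) of the proposal-generation idea described above: LiDAR points are clustered, each cluster's bounding box is projected into the image with a camera projection matrix such as the one supplied by KITTI calibration, and the resulting crops are scored by a fine-tuned CNN classifier (here a placeholder `cnn_score`).

```python
import numpy as np
from sklearn.cluster import DBSCAN

def lidar_region_proposals(points: np.ndarray, P: np.ndarray,
                           eps: float = 0.5, min_samples: int = 10):
    """points: Nx3 LiDAR points in (rectified) camera coordinates; P: 3x4 projection matrix.
    Returns candidate 2D boxes (x1, y1, x2, y2) in the image plane."""
    labels = DBSCAN(eps=eps, min_samples=min_samples).fit_predict(points)
    boxes = []
    for label in set(labels) - {-1}:                         # -1 marks noise points
        cluster = points[labels == label]
        homog = np.hstack([cluster, np.ones((len(cluster), 1))])  # Nx4 homogeneous points
        uv = (P @ homog.T).T
        uv = uv[:, :2] / uv[:, 2:3]                          # perspective division
        boxes.append((uv[:, 0].min(), uv[:, 1].min(), uv[:, 0].max(), uv[:, 1].max()))
    return boxes

def detect_pedestrians(image, points, P, cnn_score, threshold=0.5):
    """Keep the proposals that the fine-tuned CNN scores above `threshold`."""
    detections = []
    for (x1, y1, x2, y2) in lidar_region_proposals(points, P):
        crop = image[int(y1):int(y2), int(x1):int(x2)]
        if crop.size and cnn_score(crop) >= threshold:
            detections.append((x1, y1, x2, y2))
    return detections
```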

41 citations


Journal ArticleDOI
TL;DR: In this paper, a face deidentification pipeline is presented, which ensures anonymity by synthesizing artificial surrogate faces using generative neural networks (GNNs), which are used to deidentify subjects in images or video, while preserving non-identity-related aspects of the data and consequently enabling data utilization.
Abstract: Face deidentification is an active topic amongst privacy and security researchers. Early deidentification methods relying on image blurring or pixelization were replaced in recent years with techniques based on formal anonymity models that provide privacy guarantees and at the same time aim at retaining certain characteristics of the data even after deidentification. The latter aspect is particularly important, as it allows the deidentified data to be exploited in applications for which identity information is irrelevant. In this work we present a novel face deidentification pipeline, which ensures anonymity by synthesizing artificial surrogate faces using generative neural networks (GNNs). The generated faces are used to deidentify subjects in images or video, while preserving non-identity-related aspects of the data and consequently enabling data utilization. Since generative networks are very adaptive and can utilize a diverse set of parameters (pertaining to the appearance of the generated output in terms of facial expressions, gender, race, etc.), they represent a natural choice for the problem of face deidentification. To demonstrate the feasibility of our approach, we perform experiments using automated recognition tools and human annotators. Our results show that the recognition performance on deidentified images is close to chance, suggesting that the deidentification process based on GNNs is highly effective.

39 citations


Book ChapterDOI
TL;DR: In this paper, a fully automated computer vision application for littering quantification based on images taken from the streets and sidewalks is presented. But there was no waste dataset available, so they built their acquisition system mounted on a vehicle and collected images containing different types of wastes.
Abstract: Littering quantification is an important step for improving the cleanliness of cities. When human interpretation is too cumbersome, or in some cases impossible, an objective index of cleanliness could reduce littering through awareness actions. In this paper, we present a fully automated computer vision application for littering quantification based on images taken from streets and sidewalks. We have employed a deep learning based framework to localize and classify different types of wastes. Since there was no waste dataset available, we built our own acquisition system, mounted on a vehicle, and collected images containing different types of wastes. These images were then annotated for training and benchmarking the developed system. Our results on real case scenarios show accurate detection of littering against varied backgrounds.

16 citations


Posted Content
TL;DR: The Unconstrained Ear Recognition Challenge (UERC) as mentioned in this paper was a group benchmarking effort centered around the problem of person recognition from ear images captured in uncontrolled conditions, where the goal was to assess the performance of existing ear recognition techniques on a challenging large-scale dataset and identify open problems that need to be addressed in the future.
Abstract: In this paper we present the results of the Unconstrained Ear Recognition Challenge (UERC), a group benchmarking effort centered around the problem of person recognition from ear images captured in uncontrolled conditions. The goal of the challenge was to assess the performance of existing ear recognition techniques on a challenging large-scale dataset and identify open problems that need to be addressed in the future. Five groups from three continents participated in the challenge and contributed six ear recognition techniques for the evaluation, while multiple baselines were made available for the challenge by the UERC organizers. A comprehensive analysis was conducted with all participating approaches addressing essential research questions pertaining to the sensitivity of the technology to head rotation, flipping, gallery size, large-scale recognition and others. The top performer of the UERC was found to ensure robust performance on a smaller part of the dataset (with 180 subjects) regardless of image characteristics, but still exhibited a significant performance drop when the entire dataset comprising 3,704 subjects was used for testing.

15 citations


Proceedings ArticleDOI
01 Oct 2017
TL;DR: In this article, Gaussian Mixture Models (GMMs) are used to capture statistical relationships among convolution filters learned from a well-trained network and transfer this knowledge to another network.
Abstract: In this paper, we introduce a new regularization technique for transfer learning. The aim of the proposed approach is to capture statistical relationships among convolution filters learned from a well-trained network and transfer this knowledge to another network. Since convolution filters of the prevalent deep Convolutional Neural Network (CNN) models share a number of similar patterns, in order to speed up the learning procedure, we capture such correlations by Gaussian Mixture Models (GMMs) and transfer them using a regularization term. We have conducted extensive experiments on the CIFAR10, Places2, and CM-Places datasets to assess generalizability, task transferability, and cross-model transferability of the proposed approach, respectively. The experimental results show that the feature representations have efficiently been learned and transferred through the proposed statistical regularization scheme. Moreover, our method is an architecture independent approach, which is applicable for a variety of CNN architectures.
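A minimal sketch of one interpretation of the statistical regularization described above (not the authors' exact formulation): a GMM is fitted to the flattened convolution filters of a well-trained source network, and the target network's filters are penalised by their negative log-likelihood under that GMM, added to the task loss as a regularization term.

```python
import numpy as np
import torch
from sklearn.mixture import GaussianMixture

def fit_filter_gmm(source_conv_weight: torch.Tensor, n_components: int = 5) -> GaussianMixture:
    """source_conv_weight: (out_ch, in_ch, k, k). One GMM sample per flattened filter."""
    filters = source_conv_weight.detach().cpu().numpy().reshape(source_conv_weight.shape[0], -1)
    return GaussianMixture(n_components=n_components, covariance_type="diag").fit(filters)

def gmm_regularizer(target_conv_weight: torch.Tensor, gmm: GaussianMixture) -> torch.Tensor:
    """Differentiable negative log-likelihood of the target filters under the source GMM."""
    w = target_conv_weight.reshape(target_conv_weight.shape[0], -1)           # (F, D)
    means = torch.as_tensor(gmm.means_, dtype=w.dtype, device=w.device)        # (K, D)
    var = torch.as_tensor(gmm.covariances_, dtype=w.dtype, device=w.device)    # (K, D), diagonal
    logw = torch.log(torch.as_tensor(gmm.weights_, dtype=w.dtype, device=w.device))
    diff = w.unsqueeze(1) - means.unsqueeze(0)                                  # (F, K, D)
    log_comp = -0.5 * ((diff ** 2 / var).sum(-1)
                       + torch.log(var).sum(-1)
                       + var.shape[-1] * np.log(2 * np.pi))                     # (F, K)
    return -torch.logsumexp(logw + log_comp, dim=1).mean()

# usage (hypothetical layer name and weighting):
# total_loss = task_loss + lambda_reg * gmm_regularizer(model.conv1.weight, gmm)
```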

13 citations


Book ChapterDOI
TL;DR: This work shows that the proposed method outperforms the baseline techniques on the OuluVS2 audiovisual database for frontal-view phrase recognition, with cross-validation and test sentence correctness reaching 79% and 73%, respectively, compared to the baseline's 74% on cross-validation.
Abstract: Automatic visual speech recognition is an interesting problem in pattern recognition, especially when audio data is noisy or not readily available. It is also a very challenging task, mainly because of the lower amount of information in the visual articulations compared to the audible utterance. In this work, principal component analysis is applied to image patches - extracted from the video data - to learn the weights of a two-stage convolutional network. Block histograms are then extracted as the unsupervised learning features. These features are employed to learn a recurrent neural network with a set of long short-term memory cells to obtain spatiotemporal features. Finally, the obtained features are used in a tandem GMM-HMM system for speech recognition. Our results show that the proposed method outperforms the baseline techniques on the OuluVS2 audiovisual database for frontal-view phrase recognition, with cross-validation and test sentence correctness reaching 79% and 73%, respectively, compared to the baseline's 74% on cross-validation.
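A minimal, PCANet-style sketch of the unsupervised filter-learning stage described above, under assumed patch and filter sizes: principal directions of mean-removed image patches are reshaped into convolution filters; the paper applies this in two stages and then takes block histograms as features.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def learn_pca_filters(frames: np.ndarray, patch_size: int = 7,
                      n_filters: int = 8, max_patches: int = 100_000) -> np.ndarray:
    """frames: (N, H, W) grayscale mouth-region images. Returns (n_filters, k, k) filters."""
    patches = []
    for img in frames:
        win = sliding_window_view(img, (patch_size, patch_size))
        win = win.reshape(-1, patch_size * patch_size).astype(np.float64)
        patches.append(win - win.mean(axis=1, keepdims=True))    # remove per-patch mean
    X = np.concatenate(patches, axis=0)
    if len(X) > max_patches:                                      # subsample for tractability
        X = X[np.random.choice(len(X), max_patches, replace=False)]
    # leading principal directions of the patch set become convolution filters
    _, _, Vt = np.linalg.svd(X, full_matrices=False)
    return Vt[:n_filters].reshape(n_filters, patch_size, patch_size)
```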

5 citations


Proceedings ArticleDOI
01 May 2017
TL;DR: CerbB2 tumor scores were generated for the cell fragments, which were classified with high performance with the aid of convolutional neural networks (CNNs).
Abstract: This study proposes a unique approach to classifying CerbB2 tumor cell scores in breast cancer based on deep learning models. Another contribution of the study is the creation of a dataset from original breast cancer tissues. For the purpose of training, validating, and testing the deep learning models, cell fragments were generated from sample tissue images. CerbB2 tumor scores were generated for the cell fragments, which were then classified with high performance with the aid of convolutional neural networks (CNNs).

Posted Content
TL;DR: The experimental results show that the proposed LiDAR space clustering approach provides a very efficient way of generating region proposals leading to higher recall rates and fewer misses for pedestrian detection, indicating that LiDAR data can provide auxiliary information for CNN-based approaches.
Abstract: Pedestrian detection is an important component for safety of autonomous vehicles, as well as for traffic and street surveillance. There are extensive benchmarks on this topic and it has been shown to be a challenging problem when applied on real use-case scenarios. In purely image-based pedestrian detection approaches, the state-of-the-art results have been achieved with convolutional neural networks (CNN) and surprisingly few detection frameworks have been built upon multi-cue approaches. In this work, we develop a new pedestrian detector for autonomous vehicles that exploits LiDAR data, in addition to visual information. In the proposed approach, LiDAR data is utilized to generate region proposals by processing the three dimensional point cloud that it provides. These candidate regions are then further processed by a state-of-the-art CNN classifier that we have fine-tuned for pedestrian detection. We have extensively evaluated the proposed detection process on the KITTI dataset. The experimental results show that the proposed LiDAR space clustering approach provides a very efficient way of generating region proposals leading to higher recall rates and fewer misses for pedestrian detection. This indicates that LiDAR data can provide auxiliary information for CNN-based approaches.

Proceedings ArticleDOI
15 May 2017
TL;DR: This study used state-of-the-art convolutional neural network models to correctly label and classify documents so that they can be archived in an accessible manner.
Abstract: Despite the increase in digitization, the use of documents is still very common today. It is essential that these documents are correctly labeled and classified so that they can be archived in an accessible manner. In this study, we used state-of-the-art convolutional neural network models to satisfy this need. Convolutional neural networks achieve high performance compared to alternative methods in the field of classification, due to the strong and rich features they can learn from large data through deep architectures. For the experiments, we have used a dataset containing 400,000 images of 16 different document classes. The state-of-the-art deep learning models have been fine-tuned and compared in detail. The VGG-16 architecture achieved the best performance on this dataset, with a 90.93% correct classification rate.
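A minimal sketch of the fine-tuning setup described above, with assumed framework and hyper-parameters (the paper does not state them): an ImageNet-pretrained VGG-16 has its final fully connected layer replaced with a 16-way document-class head and is trained with a small learning rate.

```python
import torch
import torch.nn as nn
from torchvision import models

def build_document_classifier(num_classes: int = 16) -> nn.Module:
    model = models.vgg16(weights=models.VGG16_Weights.IMAGENET1K_V1)  # ImageNet-pretrained
    model.classifier[6] = nn.Linear(4096, num_classes)                # replace 1000-way head
    return model

model = build_document_classifier()
optimizer = torch.optim.SGD(model.parameters(), lr=1e-3, momentum=0.9)
criterion = nn.CrossEntropyLoss()

# standard fine-tuning loop over 224x224 document images (dataloader assumed):
# for images, labels in loader:
#     optimizer.zero_grad()
#     loss = criterion(model(images), labels)
#     loss.backward()
#     optimizer.step()
```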

Proceedings ArticleDOI
01 Apr 2017
TL;DR: This paper aims to distinguish whether the subject is under-challenged or over-challenged using psychophysiological signal data collected from biofeedback sensors while executing the tasks with RehabRoby.
Abstract: Investigation into robot-assisted rehabilitation systems, and robot-assisted systems that are capable of detecting patients' emotions and then modifying the rehabilitation task to better suit the patients' abilities by taking their emotions into account, has gained momentum in recent years. In this paper, our aim is to distinguish whether the subject is under-challenged or over-challenged using psychophysiological signal data collected from biofeedback sensors while executing tasks with RehabRoby. Initially, features are extracted from the physiological signals (Blood Volume Pulse (BVP), Skin Conductance (SC), and Skin Temperature (ST)). The extracted features are examined in terms of their contribution to the classification of the overstressed/over-challenged and bored/under-challenged states using analysis of variance (ANOVA). The most significant features are selected, and various classification methods are used to classify the overstressed/over-challenged and bored/under-challenged states.
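A minimal sketch (not the study's code) of this feature-selection-then-classify pipeline: ANOVA F-test selection of the most discriminative physiological features, followed by one possible classifier; the paper compares several classification methods, and the SVM here is just an illustrative choice.

```python
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

def challenge_level_classifier(n_selected_features: int = 10):
    """ANOVA-based feature selection followed by an RBF-kernel SVM."""
    return make_pipeline(
        StandardScaler(),
        SelectKBest(score_func=f_classif, k=n_selected_features),  # ANOVA F-test selection
        SVC(kernel="rbf"),
    )

# X: (n_trials, n_features) statistics extracted from BVP/SC/ST signals
# y: 0 = under-challenged/bored, 1 = over-challenged/overstressed
# scores = cross_val_score(challenge_level_classifier(), X, y, cv=5)
```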

Proceedings ArticleDOI
25 Aug 2017
TL;DR: Results show that the complementary information contained in recordings from different view angles improves the results significantly and the sentence correctness on the test set is increased from 76% for the highest performing single view to up to 83% when combining this view with the frontal and $60^\circ$ view angles.
Abstract: Visual speech recognition is a challenging research problem with a particular practical application of aiding audio speech recognition in noisy scenarios. Multiple camera setups can be beneficial for visual speech recognition systems in terms of improved performance and robustness. In this paper, we explore this aspect and provide a comprehensive study on combining multiple views for visual speech recognition. The thorough analysis covers fusion of all possible view angle combinations both at the feature level and the decision level. The employed visual speech recognition system in this study extracts features through a PCA-based convolutional neural network, followed by an LSTM network. Finally, these features are processed in a tandem system, being fed into a GMM-HMM scheme. The decision fusion acts after this point by combining the Viterbi path log-likelihoods. The results show that the complementary information contained in recordings from different view angles improves the results significantly. For example, the sentence correctness on the test set is increased from 76% for the highest performing single view (30 degrees) to up to 83% when combining this view with the frontal and 60-degree view angles.
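A minimal sketch of the decision-level fusion step described above, under an assumed (possibly weighted) summation scheme: each view's recognizer assigns a Viterbi path log-likelihood to every candidate phrase, the per-view scores are combined, and the best-scoring phrase is selected. The phrase strings below are only illustrative.

```python
from typing import Dict, List, Optional

def fuse_views(view_scores: List[Dict[str, float]],
               view_weights: Optional[List[float]] = None) -> str:
    """view_scores[v][phrase] = Viterbi path log-likelihood of `phrase` under view v."""
    if view_weights is None:
        view_weights = [1.0] * len(view_scores)          # equal weighting by default
    phrases = view_scores[0].keys()
    fused = {p: sum(w * scores[p] for w, scores in zip(view_weights, view_scores))
             for p in phrases}
    return max(fused, key=fused.get)                     # best fused hypothesis

# example with two views and two candidate phrases:
# fuse_views([{"excuse me": -120.4, "thank you": -133.1},
#             {"excuse me": -118.9, "thank you": -125.7}])  -> "excuse me"
```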

Posted Content
TL;DR: In this article, a comprehensive study on combining multiple views for visual speech recognition is presented, which covers fusion of all possible view angle combinations both at feature level and decision level, and the results show that complementary information contained in recordings from different view angles improves the results significantly.
Abstract: Visual speech recognition is a challenging research problem with a particular practical application of aiding audio speech recognition in noisy scenarios. Multiple camera setups can be beneficial for the visual speech recognition systems in terms of improved performance and robustness. In this paper, we explore this aspect and provide a comprehensive study on combining multiple views for visual speech recognition. The thorough analysis covers fusion of all possible view angle combinations both at feature level and decision level. The employed visual speech recognition system in this study extracts features through a PCA-based convolutional neural network, followed by an LSTM network. Finally, these features are processed in a tandem system, being fed into a GMM-HMM scheme. The decision fusion acts after this point by combining the Viterbi path log-likelihoods. The results show that the complementary information contained in recordings from different view angles improves the results significantly. For example, the sentence correctness on the test set is increased from 76% for the highest performing single view ($30^\circ$) to up to 83% when combining this view with the frontal and $60^\circ$ view angles.

Proceedings ArticleDOI
01 May 2017
TL;DR: The Extended Cohn-Kanade (CK+) dataset, which is commonly used for facial expression classification, is chosen, and match and mismatch facial expressions are classified using support vector machines to provide a baseline approach for the proposed pair matching formulation.
Abstract: In this study, facial expression recognition is defined as a pair matching problem. Our objectives in formulating the task this way are to be able to decide whether the facial expressions in unlabeled images of two people are the same or different, and to benefit from the pair matching methods that have been studied for many years in the face recognition field. The Extended Cohn-Kanade (CK+) dataset, which is commonly used for facial expression classification, is chosen to obtain match and mismatch pairs. To provide a baseline approach for the proposed pair matching formulation, feature extraction using local binary patterns is applied, and match and mismatch facial expressions are classified using support vector machines. A matching accuracy of 99.28% was achieved.
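A minimal sketch of a baseline of this kind, with assumed LBP parameters and pair representation (the paper does not specify them): uniform LBP histograms are extracted from each face image, a pair is represented by the absolute difference of its two histograms, and an SVM decides whether the expressions match.

```python
import numpy as np
from skimage.feature import local_binary_pattern
from sklearn.svm import SVC

def lbp_histogram(gray: np.ndarray, n_points: int = 8, radius: int = 1) -> np.ndarray:
    """Normalized uniform-LBP histogram of a grayscale face image."""
    lbp = local_binary_pattern(gray, n_points, radius, method="uniform")
    hist, _ = np.histogram(lbp, bins=n_points + 2, range=(0, n_points + 2), density=True)
    return hist

def pair_feature(img_a: np.ndarray, img_b: np.ndarray) -> np.ndarray:
    """Represent an expression pair by the absolute difference of its LBP histograms."""
    return np.abs(lbp_histogram(img_a) - lbp_histogram(img_b))

# X = np.stack([pair_feature(a, b) for a, b in pairs]); y = 1 for match, 0 for mismatch
# clf = SVC(kernel="linear").fit(X, y)
```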

Posted Content
TL;DR: In this article, Gaussian Mixture Models (GMMs) are used to capture statistical relationships among convolution filters learned from a well-trained network and transfer this knowledge to another network.
Abstract: In this paper, we introduce a new regularization technique for transfer learning. The aim of the proposed approach is to capture statistical relationships among convolution filters learned from a well-trained network and transfer this knowledge to another network. Since convolution filters of the prevalent deep Convolutional Neural Network (CNN) models share a number of similar patterns, in order to speed up the learning procedure, we capture such correlations by Gaussian Mixture Models (GMMs) and transfer them using a regularization term. We have conducted extensive experiments on the CIFAR10, Places2, and CMPlaces datasets to assess generalizability, task transferability, and cross-model transferability of the proposed approach, respectively. The experimental results show that the feature representations have efficiently been learned and transferred through the proposed statistical regularization scheme. Moreover, our method is an architecture independent approach, which is applicable for a variety of CNN architectures.