Venkatesh K. Subramanian
Other affiliations: Indian Institutes of Technology
Bio: Venkatesh K. Subramanian is an academic researcher at the Indian Institute of Technology Kanpur. The author has contributed to research topics including mobile robots and discriminators, has an h-index of 13, and has co-authored 40 publications receiving 565 citations. Previous affiliations of Venkatesh K. Subramanian include the Indian Institutes of Technology.
TL;DR: Simulation results validate that the proposed AFERS is more efficient than existing approaches, and the recognition results obtained from fused features are distinctly superior both to recognition using individual features and to recognition using a direct concatenation of the individual feature vectors.
Abstract: This paper presents a novel automatic facial expression recognition system (AFERS) using a deep network framework. The proposed AFERS consists of four steps: 1) geometric feature extraction; 2) regional local binary pattern (LBP) feature extraction; 3) fusion of both feature sets using autoencoders; and 4) classification using a Kohonen self-organizing map (SOM)-based classifier. This paper makes three distinct contributions. First, the proposed deep network consisting of autoencoders and the SOM-based classifier is computationally more efficient and more accurate. Second, the fusion of geometric features with LBP features using autoencoders provides a better representation of facial expression. Third, the SOM-based classifier has been improved by using soft-threshold logic and a better learning algorithm. The performance of the proposed approach is validated on two widely used databases (DBs): 1) MMI and 2) extended Cohn-Kanade (CK+). Average recognition accuracies of 97.55% on the MMI DB and 98.95% on the CK+ DB are obtained with the proposed algorithm. The recognition results obtained from fused features are distinctly superior both to recognition using individual features and to recognition using a direct concatenation of the individual feature vectors. Simulation results validate that the proposed AFERS is more efficient than existing approaches.
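The fusion step can be illustrated with a minimal single-hidden-layer autoencoder in NumPy: the bottleneck activation over the concatenated geometric and LBP vectors serves as the fused feature. All dimensions and data below are synthetic placeholders, not the paper's actual configuration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-ins for the paper's two feature streams:
# geometric features (e.g. landmark distances) and regional LBP histograms.
geom = rng.random((100, 10))   # 100 samples, 10 geometric features
lbp = rng.random((100, 59))    # 100 samples, 59-bin uniform-LBP histogram

x = np.hstack([geom, lbp])     # direct concatenation, 69-D input

# One-hidden-layer autoencoder: the bottleneck activation is the fused feature.
n_in, n_hid = x.shape[1], 16
w_enc = rng.normal(0, 0.1, (n_in, n_hid))
w_dec = rng.normal(0, 0.1, (n_hid, n_in))
lr = 0.05

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

for _ in range(500):
    h = sigmoid(x @ w_enc)          # encode
    x_hat = h @ w_dec               # linear decode
    err = x_hat - x                 # reconstruction error
    # Gradient descent on the squared reconstruction loss.
    g_dec = h.T @ err / len(x)
    g_enc = x.T @ ((err @ w_dec.T) * h * (1 - h)) / len(x)
    w_dec -= lr * g_dec
    w_enc -= lr * g_enc

fused = sigmoid(x @ w_enc)          # 16-D fused representation per sample
print(fused.shape)
```

In the paper's pipeline this fused vector would then feed the SOM-based classifier; here the example stops at the fused representation itself.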
TL;DR: The experimental results show that the proposed model is very efficient in recognizing six basic emotions while ensuring significant increase in average classification accuracy over radial basis function and multi-layered perceptron.
Abstract: This paper presents a novel emotion recognition model using the system identification approach. A comprehensive data-driven model using an extended Kohonen self-organizing map (KSOM) has been developed whose input is a 26-dimensional facial geometric feature vector comprising eye, lip, and eyebrow feature points. The analytical face model using this 26-dimensional geometric feature vector effectively describes the facial changes due to different expressions, and the paper includes an automated generation scheme for this feature vector. The proposed non-heuristic model has been developed using training data from the MMI facial expression database. The emotion recognition accuracy of the proposed scheme has been compared with radial basis function network, multi-layer perceptron, and support vector machine based recognition schemes. The experimental results show that the proposed model is very efficient in recognizing six basic emotions while ensuring a significant increase in average classification accuracy over the radial basis function and multi-layer perceptron models; the average recognition rate of the proposed method is also comparatively better than the multi-class support vector machine. Highlights: We propose an emotion recognition model using system identification. A twenty-six-dimensional geometric feature vector is extracted using three different algorithms. Classification uses an intermediate Kohonen self-organizing map layer. A comparative study with radial basis function, multi-layer perceptron, and support vector machine classifiers is reported. Recognition results show a significant increase in average recognition accuracy over radial basis function and multi-layer perceptron, and a marginal improvement over support vector machine.
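A 26-dimensional geometric feature vector can arise from 13 landmark points (13 × 2 coordinates) normalized for translation and scale; the specific landmarks and normalization below are illustrative assumptions, not the paper's exact scheme.

```python
import numpy as np

# Hypothetical 13 facial landmarks (eye corners, eyebrows, lips);
# 13 x 2 = 26-D once flattened, matching the paper's dimensionality.
points = np.array([
    [33, 40], [47, 40],            # left eye corners
    [63, 40], [77, 40],            # right eye corners
    [30, 30], [40, 26], [50, 30],  # left eyebrow
    [60, 30], [70, 26], [80, 30],  # right eyebrow
    [40, 75], [55, 70], [70, 75],  # lip corners and upper-lip midpoint
], dtype=float)

# Normalize for translation and scale: centre on the inter-ocular midpoint
# and divide by the inter-ocular distance.
left_eye_c = points[0:2].mean(axis=0)
right_eye_c = points[2:4].mean(axis=0)
origin = (left_eye_c + right_eye_c) / 2
iod = np.linalg.norm(right_eye_c - left_eye_c)

feature = ((points - origin) / iod).ravel()   # 26-D geometric feature vector
print(feature.shape)
```

Because of the normalization, the same face shifted or uniformly scaled in the image yields the identical 26-D vector, which is the property a geometric feature vector needs before it is fed to a classifier.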
TL;DR: An improved version of a visual servo controller is proposed that uses feedback linearization to overcome the chattering phenomenon present in sliding mode-based controllers used previously.
Abstract: The ability to follow a human is an important requirement for a service robot designed to work alongside humans in homes or workplaces. This paper describes the development and implementation of a novel robust visual controller for a human-following robot. This visual controller consists of two parts: 1) a robust algorithm that tracks a human visible in the camera view and 2) a servo controller that generates the motion commands necessary for the robot to follow the target human. The tracking algorithm uses point-based features, such as speeded-up robust features (SURF), to detect a human under challenging conditions such as variation in illumination, pose change, full or partial occlusion, and abrupt camera motion. The novel contributions in the tracking algorithm include the following: 1) a dynamic object model that evolves over time to deal with short-term changes while maintaining stability over the long run; 2) an online K-D tree-based classifier, used along with a Kalman filter, to differentiate a case of pose change from a case of partial or full occlusion; and 3) a method to detect pose change due to out-of-plane rotations, a difficult problem that leads to frequent tracking failures in a human-following robot. An improved version of a visual servo controller is proposed that uses feedback linearization to overcome the chattering phenomenon present in the sliding mode-based controllers used previously. The efficacy of the proposed approach is demonstrated through various simulations and real-life experiments with an actual mobile robot platform.
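One ingredient of the tracker, the Kalman filter over the target's image-plane position, can be sketched with a standard constant-velocity model; the matrices and noise levels below are illustrative defaults, not the paper's tuned values.

```python
import numpy as np

# Constant-velocity Kalman filter over image-plane position (x, y).
dt = 1.0
F = np.array([[1, 0, dt, 0],
              [0, 1, 0, dt],
              [0, 0, 1, 0],
              [0, 0, 0, 1]], dtype=float)   # state transition
H = np.array([[1, 0, 0, 0],
              [0, 1, 0, 0]], dtype=float)   # only position is observed
Q = 0.01 * np.eye(4)                        # process noise
R = 1.0 * np.eye(2)                         # measurement noise

x = np.zeros(4)          # state: [px, py, vx, vy]
P = np.eye(4)

def kf_step(x, P, z):
    # Predict.
    x = F @ x
    P = F @ P @ F.T + Q
    # Update with a noisy detection z = (px, py).
    S = H @ P @ H.T + R
    K = P @ H.T @ np.linalg.inv(S)
    x = x + K @ (z - H @ x)
    P = (np.eye(4) - K @ H) @ P
    return x, P

# Track a target drifting right at ~1 px/frame with noisy detections.
rng = np.random.default_rng(1)
for t in range(30):
    z = np.array([t * 1.0, 5.0]) + rng.normal(0, 0.5, 2)
    x, P = kf_step(x, P, z)
print(x[:2])   # filtered position estimate
```

During occlusion, a tracker of this kind skips the update step and coasts on the prediction, which is what lets the filter's prediction be compared against the classifier's verdict to separate occlusion from pose change.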
01 Jan 2021
TL;DR: In this paper, the authors propose a domain adaptation method based on a generative framework, where the trained classifier is used for generating samples from the source classes and a new classifier is also adapted for the target domain.
Abstract: Unsupervised domain adaptation methods solve the adaptation problem for an unlabeled target set, assuming that the source dataset is available with all labels. However, actual source samples are not always available in practical cases, due to memory constraints, privacy concerns, or challenges in sharing data. This practical scenario creates a bottleneck in the domain adaptation problem. This paper addresses this challenging scenario by proposing a domain adaptation technique that does not need any source data; instead of the source data, we are provided only with a classifier trained on the source data. Our proposed approach is based on a generative framework in which the trained classifier is used for generating samples from the source classes. We learn the joint distribution of data by using energy-based modeling of the trained classifier, and at the same time a new classifier is adapted for the target domain. We perform various ablation analyses under different experimental setups and demonstrate that the proposed approach achieves better results than the baseline models in this novel scenario.
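The generative idea, treating the trained classifier as an energy model and drawing "source-like" samples by descending its energy, can be sketched as follows. The linear toy classifier, the quadratic prior that keeps the energy bounded below, and the finite-difference gradients are all illustrative simplifications; the paper's method uses deep networks and automatic differentiation.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-in for the trained source classifier: a linear model over
# 2-D inputs with 3 classes.
W = rng.normal(0, 1.0, (2, 3))
b = rng.normal(0, 0.1, 3)

def logits(x):
    return x @ W + b

def energy(x):
    # JEM-style energy -logsumexp(f(x)), plus a quadratic prior added here
    # so this linear toy has a finite minimum; low energy = "source-like".
    z = logits(x)
    m = z.max(axis=-1, keepdims=True)
    lse = m.squeeze(-1) + np.log(np.exp(z - m).sum(axis=-1))
    return 0.5 * (x ** 2).sum(axis=-1) - lse

def grad_energy(x, eps=1e-4):
    # Finite-difference gradient (a real implementation uses autograd).
    g = np.zeros_like(x)
    for i in range(x.shape[-1]):
        d = np.zeros_like(x)
        d[..., i] = eps
        g[..., i] = (energy(x + d) - energy(x - d)) / (2 * eps)
    return g

# Langevin dynamics: walk noise samples toward low-energy regions.
x = rng.normal(0, 1.0, (64, 2))
step = 0.1
for _ in range(100):
    noise = np.sqrt(2 * step) * 0.01 * rng.normal(size=x.shape)
    x = x - step * grad_energy(x) + noise

print(energy(x).mean())   # lower than the energy of fresh noise
```

The generated low-energy samples play the role of the missing source data while the target-domain classifier is adapted.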
30 Mar 2011
TL;DR: Novel techniques using basic concepts of facial geometry are proposed to locate the mouth, nose, and eye positions, and an algorithm using the H-plane of the HSV color space is proposed for detecting the eye pupil within the detected eye region.
Abstract: Automatic detection of facial features in an image is an important stage for various facial image interpretation tasks, such as face recognition, facial expression recognition, 3D face modeling, and facial feature tracking. Detection of facial features such as the eyes, pupils, mouth, nose, nostrils, lip corners, and eye corners under different facial expressions and illumination is a challenging task. In this paper, we present different methods for fully automatic detection of facial features. The Viola-Jones object detector with Haar-like cascade features is used to detect the face, eyes, and nose. Novel techniques using basic concepts of facial geometry are proposed to locate the mouth, nose, and eye positions. Estimating a detection region for features like the eyes, nose, and mouth significantly enhances detection accuracy. An algorithm using the H-plane of the HSV color space is proposed for detecting the eye pupil within the detected eye region. The FEI database of frontal face images is mainly used to test the algorithm. The proposed algorithm is tested on 100 frontal face images with two different facial expressions (neutral and smiling). The results obtained are 100% accurate for lip, lip corner, nose, and nostril detection; eye corner and eye pupil detection achieves approximately 95% accuracy.
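The H-plane idea can be sketched in NumPy: extract the HSV hue channel of an eye patch and take the centroid of pixels in a reddish-hue band as the pupil location. The synthetic patch and hue threshold below are illustrative assumptions, not the paper's tuned procedure.

```python
import numpy as np

def rgb_to_hue(img):
    # Hue channel of HSV from an RGB image in [0, 1]; a vectorized NumPy
    # port of the standard conversion (cv2.cvtColor does the same job).
    r, g, b = img[..., 0], img[..., 1], img[..., 2]
    mx = img.max(axis=-1)
    mn = img.min(axis=-1)
    diff = np.where(mx - mn == 0, 1.0, mx - mn)   # avoid division by zero
    h = np.zeros_like(mx)
    h = np.where(mx == r, (60 * (g - b) / diff) % 360, h)
    h = np.where(mx == g, 60 * (b - r) / diff + 120, h)
    h = np.where(mx == b, 60 * (r - g) / diff + 240, h)
    return np.where(mx == mn, 0.0, h)

# Synthetic 20x20 eye patch: bluish-white sclera, dark reddish-brown pupil.
patch = np.ones((20, 20, 3)) * np.array([0.9, 0.92, 0.95])
yy, xx = np.mgrid[0:20, 0:20]
pupil = (yy - 10) ** 2 + (xx - 12) ** 2 <= 9
patch[pupil] = (0.3, 0.15, 0.1)

h = rgb_to_hue(patch)
mask = h < 60                       # reddish hues mark the pupil region here
cy, cx = np.argwhere(mask).mean(axis=0)
print(round(cy), round(cx))         # near the pupil centre at (10, 12)
```

On real eye crops the hue band and any post-filtering would have to be chosen empirically, which is where a detection-region estimate from the geometric stage helps constrain the search.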
TL;DR: In this article, a review of deep learning-based object detection frameworks is provided, focusing on typical generic object detection architectures along with some modifications and useful tricks to improve detection performance further.
Abstract: Due to object detection’s close relationship with video analysis and image understanding, it has attracted much research attention in recent years. Traditional object detection methods are built on handcrafted features and shallow trainable architectures. Their performance easily stagnates by constructing complex ensembles that combine multiple low-level image features with high-level context from object detectors and scene classifiers. With the rapid development in deep learning, more powerful tools, which are able to learn semantic, high-level, deeper features, are introduced to address the problems existing in traditional architectures. These models behave differently in network architecture, training strategy, and optimization function. In this paper, we provide a review of deep learning-based object detection frameworks. Our review begins with a brief introduction on the history of deep learning and its representative tool, namely, the convolutional neural network. Then, we focus on typical generic object detection architectures along with some modifications and useful tricks to improve detection performance further. As distinct specific detection tasks exhibit different characteristics, we also briefly survey several specific tasks, including salient object detection, face detection, and pedestrian detection. Experimental analyses are also provided to compare various methods and draw some meaningful conclusions. Finally, several promising directions and tasks are provided to serve as guidelines for future work in both object detection and relevant neural network-based learning systems.
01 Jan 1990
TL;DR: An overview of the self-organizing map algorithm, on which the papers in this issue are based, is presented in this article.
Abstract: An overview of the self-organizing map algorithm, on which the papers in this issue are based, is presented in this article.
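The classic algorithm can be sketched in a few lines: a 1-D map of codebook vectors is trained by repeatedly pulling the best-matching unit and its grid neighbours toward a random sample, with a decaying learning rate and neighbourhood width. This is a plain Kohonen SOM on toy data, not the soft-threshold variant proposed in the AFERS paper above.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data: three Gaussian clusters in 2-D.
data = np.vstack([rng.normal(loc, 0.1, (50, 2))
                  for loc in ([0, 0], [1, 0], [0, 1])])

n_nodes = 10
w = rng.random((n_nodes, 2))                 # codebook vectors on a 1-D grid

n_iter = 200
for t in range(n_iter):
    lr = 0.5 * (1 - t / n_iter)              # decaying learning rate
    sigma = max(2.0 * (1 - t / n_iter), 0.5) # shrinking neighbourhood width
    x = data[rng.integers(len(data))]        # random training sample
    bmu = np.argmin(((w - x) ** 2).sum(axis=1))   # best-matching unit
    # Pull the BMU and its grid neighbours toward the sample.
    dist = np.abs(np.arange(n_nodes) - bmu)
    h = np.exp(-(dist ** 2) / (2 * sigma ** 2))
    w += lr * h[:, None] * (x - w)

print(w)   # codebook vectors now quantize the three clusters
```

Used as a classifier, each node would additionally carry a class label (e.g. by majority vote of the training samples it wins), which is the role the SOM layer plays in the papers above.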
TL;DR: The limitations of IoT for multimedia computing are explored and the relationship between the M-IoT and emerging technologies including event processing, feature extraction, cloud computing, Fog/Edge computing and Software-Defined-Networks (SDNs) is presented.
Abstract: The immense increase in multimedia-on-demand traffic that refers to audio, video, and images, has drastically shifted the vision of the Internet of Things (IoT) from scalar to Multimedia Internet of Things (M-IoT). IoT devices are constrained in terms of energy, computing, size, and storage memory. Delay-sensitive and bandwidth-hungry multimedia applications over constrained IoT networks require revision of IoT architecture for M-IoT. This paper provides a comprehensive survey of M-IoT with an emphasis on architecture, protocols, and applications. This article starts by providing a horizontal overview of the IoT. Then, we discuss the issues considering the characteristics of multimedia and provide a summary of related M-IoT architectures. Various multimedia applications supported by IoT are surveyed, and numerous use cases related to road traffic management, security, industry, and health are illustrated to show how different M-IoT applications are revolutionizing human life. We explore the importance of Quality-of-Experience (QoE) and Quality-of-Service (QoS) for multimedia transmission over IoT. Moreover, we explore the limitations of IoT for multimedia computing and present the relationship between the M-IoT and emerging technologies including event processing, feature extraction, cloud computing, Fog/Edge computing and Software-Defined-Networks (SDNs). We also present the need for better routing and Physical-Medium Access Control (PHY-MAC) protocols for M-IoT. Finally, we present a detailed discussion on the open research issues and several potential research areas related to emerging multimedia communication in IoT.
01 Jan 2010
TL;DR: This entry lists research interests spanning security and privacy, computer vision (domain adaptation, dictionary learning, object recognition, activity recognition, shape representation), machine learning, and signal/image processing (including sparse representation).
Abstract: Research Interests
Security and privacy: active authentication, biometrics template protection, biometrics recognition.
Computer vision: domain adaptation, dictionary learning, object recognition, activity recognition, shape representation.
Machine learning: dimensionality reduction, clustering, kernel methods, weakly-supervised learning.
Signal/image processing: sparse representation, compressive sampling, synthetic aperture radar imaging, millimeter-wave imaging.