Author

Umapada Pal

Other affiliations: University of Mysore
Bio: Umapada Pal is an academic researcher from the Indian Statistical Institute. The author has contributed to research in the topics of feature extraction and handwriting recognition. He has an h-index of 47 and has co-authored 478 publications receiving 9,925 citations. His previous affiliations include the University of Mysore.


Papers
Book Chapter
19 Oct 2020
TL;DR: In this article, the authors use different data representations and modelling techniques to capture as much information as possible for the specific task of digit recognition, paving the way for Brain-Computer Interfacing (BCI).
Abstract: After promising results of deep learning in numerous fields, researchers have started exploring Electroencephalography (EEG) data for human behaviour and emotion recognition tasks that have a wide range of practical applications. However, studying EEG data collected non-invasively remains a major challenge due to its heterogeneity, its vulnerability to various noise signals, and its variability across subjects and mental states. Although several methods have been applied to classify EEG data for these tasks, multi-class classification problems such as digit recognition have yet to show satisfactory results on this type of data. In this paper we address these issues using different data representations and modelling techniques to capture as much information as possible for the specific task of digit recognition, paving the way for Brain-Computer Interfacing (BCI). A public dataset collected using the MUSE headband with four electrodes (TP9, AF7, AF8, TP10) is used for categorising digits (0–9). Popular deep learning methodologies are evaluated: a CNN (Convolutional Neural Network) on DWT (Discrete Wavelet Transform) scalograms, a CNN on connectivity matrices (mutual information of one time series against another), an MLP (Multilayer Perceptron) on statistical features extracted from the EEG signals, and a 1D CNN on time-domain EEG signals. Classical methods such as SVC (Support Vector Classifier), Random Forest and AdaBoost on the extracted features are also showcased. The study thus provides insight into choosing the best-suited methodology for multi-class classification of EEG signals, such as digit recognition, for further studies.
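
As a rough illustration of the last methodology in the abstract, the following is a minimal PyTorch sketch of a 1D CNN over raw four-channel MUSE EEG windows. The layer sizes, the 256-sample window length, and the classifier head are assumptions for illustration, not the authors' configuration.

```python
# Minimal, illustrative sketch only: a 1D CNN over raw four-channel MUSE
# EEG windows (TP9, AF7, AF8, TP10). Layer sizes and the 256-sample
# window are assumptions, not the configuration used in the paper.
import torch
import torch.nn as nn

class EEG1DCNN(nn.Module):
    def __init__(self, n_channels=4, n_classes=10, window=256):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv1d(n_channels, 32, kernel_size=7, padding=3),
            nn.ReLU(),
            nn.MaxPool1d(2),
            nn.Conv1d(32, 64, kernel_size=5, padding=2),
            nn.ReLU(),
            nn.MaxPool1d(2),
        )
        # two pooling stages halve the window twice: window // 4 samples remain
        self.classifier = nn.Linear(64 * (window // 4), n_classes)

    def forward(self, x):                  # x: (batch, 4, window)
        return self.classifier(self.features(x).flatten(1))

model = EEG1DCNN()
logits = model(torch.randn(8, 4, 256))     # a batch of 8 EEG windows -> (8, 10)
```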
Journal Article
09 Oct 2021
TL;DR: In this paper, a hybrid model for the classification of action-oriented video images is presented; it reduces the complexity of the problem to improve text detection and recognition performance and integrates the outputs of three sub-models using a fully connected neural network.
Abstract: For video images with complex actions, achieving accurate text detection and recognition results is very challenging. This paper presents a hybrid model for the classification of action-oriented video images which reduces the complexity of the problem to improve text detection and recognition performance. We consider five categories of genres, namely concert, cooking, craft, teleshopping and yoga. For classifying action-oriented video images, we explore ResNet50 for learning general pixel-distribution-level information, one VGG16 network for learning features of Maximally Stable Extremal Regions, and another VGG16 for learning facial components obtained by a multitask cascaded convolutional network. The approach integrates the outputs of these three models using a fully connected neural network to classify the five action-oriented image classes. We demonstrate the efficacy of the proposed method by testing on our dataset and two other standard datasets: the Scene Text Dataset, which contains 10 classes of scene images with text information, and the Stanford 40 Actions dataset, which contains 40 action classes without text information. Our method outperforms related existing work and significantly enhances the class-specific performance of text detection and recognition.
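
A rough sketch of the three-stream fusion described above might look like the following, using PyTorch and torchvision backbones. The feature dimensions and the fusion head are assumptions, and the preprocessing that produces the MSER maps and MTCNN face crops fed to the second and third streams is omitted.

```python
# Illustrative three-stream fusion: ResNet50 on the full frame, one VGG16
# on an MSER map, one VGG16 on a face crop, concatenated into a fully
# connected classifier. Dimensions and head are assumptions.
import torch
import torch.nn as nn
from torchvision import models

class ThreeStreamClassifier(nn.Module):
    def __init__(self, n_classes=5):  # concert, cooking, craft, teleshopping, yoga
        super().__init__()
        self.global_stream = models.resnet50(weights=None)
        self.global_stream.fc = nn.Identity()                  # 2048-d frame features
        self.mser_stream = models.vgg16(weights=None).features # 512 channels
        self.face_stream = models.vgg16(weights=None).features # 512 channels
        self.pool = nn.AdaptiveAvgPool2d(1)
        self.fuse = nn.Sequential(                             # fully connected fusion head
            nn.Linear(2048 + 512 + 512, 256),
            nn.ReLU(),
            nn.Linear(256, n_classes),
        )

    def forward(self, frame, mser_map, face_crop):
        g = self.global_stream(frame)                          # (B, 2048)
        m = self.pool(self.mser_stream(mser_map)).flatten(1)   # (B, 512)
        f = self.pool(self.face_stream(face_crop)).flatten(1)  # (B, 512)
        return self.fuse(torch.cat([g, m, f], dim=1))

x = torch.randn(2, 3, 224, 224)
logits = ThreeStreamClassifier()(x, x, x)      # dummy inputs -> (2, 5)
```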
Posted Content
TL;DR: In this article, a mutual-information-optimization-based loss function for contrastive learning is proposed, where contrastive learning is modelled as a binary classification problem that predicts whether a pair is positive or not.
Abstract: Self-supervised contrastive learning is one of the domains that has progressed rapidly over the last few years. Most state-of-the-art self-supervised algorithms use a large number of negative samples, momentum updates, specific architectural modifications, or extensive training to learn good representations. Such arrangements make the overall training process complex and challenging to analyse. In this paper, we propose a mutual information optimization based loss function for contrastive learning, where we model contrastive learning as a binary classification problem that predicts whether a pair is positive or not. This formulation not only lets us treat the problem mathematically but also helps us outperform existing algorithms. Unlike existing methods that only maximize the mutual information in a positive pair, the proposed loss function optimizes the mutual information in both positive and negative pairs. We also present a mathematical expression for the parameter gradients flowing into the projector and for the displacement of the feature vectors in the feature space, which gives mathematical insight into the working principle of contrastive learning. An additive $L_2$ regularizer is also used to prevent the feature vectors from diverging and to improve performance. The proposed method outperforms state-of-the-art algorithms on benchmark datasets such as STL-10, CIFAR-10 and CIFAR-100. After only 250 epochs of pre-training, the proposed model achieves best accuracies of 85.44%, 60.75% and 56.81% on CIFAR-10, STL-10 and CIFAR-100, respectively.
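
A hedged sketch of the binary-classification view described above, in PyTorch: pairwise similarities between two augmented views are treated as logits for "is this pair positive?", trained with binary cross-entropy, plus an additive L2 penalty on the projector outputs. The temperature, the pairing scheme, and the regularizer weight here are illustrative assumptions, not the paper's exact loss.

```python
# Hedged sketch of contrastive learning modelled as binary classification
# over pairs, with an additive L2 regularizer on the projector outputs.
# Constants and pairing scheme are assumptions for illustration.
import torch
import torch.nn.functional as F

def binary_contrastive_loss(z1, z2, temperature=0.1, l2_weight=1e-3):
    """z1, z2: (B, D) projector outputs of two augmented views of a batch."""
    z1n, z2n = F.normalize(z1, dim=1), F.normalize(z2, dim=1)
    logits = z1n @ z2n.t() / temperature           # (B, B) pair similarities
    labels = torch.eye(len(z1), device=z1.device)  # diagonal = positive pairs
    bce = F.binary_cross_entropy_with_logits(logits, labels)
    # L2 regularizer keeps the (unnormalized) feature vectors from diverging
    reg = l2_weight * (z1.pow(2).sum(1).mean() + z2.pow(2).sum(1).mean())
    return bce + reg

loss = binary_contrastive_loss(torch.randn(32, 128), torch.randn(32, 128))
```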

Cited by
Christopher M. Bishop
01 Jan 2006
TL;DR: Probability distributions and linear models for regression and classification are presented, along with neural networks, kernel methods, graphical models, inference and sampling methods, and a discussion of combining models in the context of machine learning.
Abstract: Probability Distributions; Linear Models for Regression; Linear Models for Classification; Neural Networks; Kernel Methods; Sparse Kernel Machines; Graphical Models; Mixture Models and EM; Approximate Inference; Sampling Methods; Continuous Latent Variables; Sequential Data; Combining Models.

10,141 citations

01 Jan 1990
TL;DR: An overview of the self-organizing map algorithm, on which the papers in this issue are based, is presented in this article.
Abstract: An overview of the self-organizing map algorithm, on which the papers in this issue are based, is presented in this article.

2,933 citations
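
Since the entry above only names the algorithm, here is a minimal illustrative sketch of one Kohonen self-organizing map update step in Python; the grid size, learning rate and neighbourhood width are arbitrary choices, not taken from the cited overview.

```python
# One self-organizing map (Kohonen) training step: find the best-matching
# unit, then pull nearby nodes' weights toward the input. All constants
# here are illustrative.
import numpy as np

rng = np.random.default_rng(0)
grid = rng.random((10, 10, 3))        # 10x10 lattice of 3-d weight vectors

def som_step(grid, x, lr=0.1, sigma=1.5):
    d = np.linalg.norm(grid - x, axis=2)            # distance of each node to input
    bmu = np.unravel_index(d.argmin(), d.shape)     # best-matching unit
    rows, cols = np.indices(d.shape)
    # Gaussian neighbourhood on the map lattice, centred on the BMU
    h = np.exp(-((rows - bmu[0]) ** 2 + (cols - bmu[1]) ** 2) / (2 * sigma ** 2))
    grid += lr * h[..., None] * (x - grid)          # weight update
    return grid

grid = som_step(grid, rng.random(3))  # one update with a random sample
```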

Reference Entry
15 Oct 2004

2,118 citations