Author

Mohit Agarwal

Bio: Mohit Agarwal is an academic researcher from Indraprastha Institute of Information Technology. The author has contributed to research in topics: Automatic summarization & Object detection. The author has an h-index of 2 and has co-authored 4 publications receiving 13 citations.

Papers
Proceedings ArticleDOI
01 Jun 2019
TL;DR: A new multimodal dataset is presented, which provides data for deception detection with the aid of various modalities, such as video, audio, EEG and gaze data. The dataset explores the cognitive aspect of deception and combines it with vision.
Abstract: Deception detection is a pervasive issue in security. It has been widely studied using traditional modalities, such as video, audio and transcripts; however, there has been a lack of investigation into modalities such as EEG and gaze data due to the scarcity of a publicly available dataset. In this paper, a new multimodal dataset is presented, which provides data for deception detection with the aid of various modalities, such as video, audio, EEG and gaze data. The dataset explores the cognitive aspect of deception and combines it with vision. The presented dataset is collected in a realistic scenario and has 35 unique subjects providing 325 annotated data points with an even distribution of truth (163) and lie (162). The benefits provided by incorporating multiple modalities for fusion on the proposed dataset are also investigated. It is our assertion that the availability of this dataset will facilitate the development of better deception detection algorithms that are more relevant to real-world scenarios.

35 citations

Proceedings ArticleDOI
01 Sep 2019
TL;DR: A simple approach to video summarization using Kernel Temporal Segmentation (KTS) for shot segmentation and a global attention based modified memory network module with LSTM for shot score learning is presented.
Abstract: Videos are one of the most engaging and interesting mediums of effective information delivery and constitute the majority of the content generated online today. As human attention spans shrink, it is imperative to shorten videos while maintaining most of their information. The premier challenge is that summaries more intuitive to a human are difficult for machines to generalize. We present a simple approach to video summarization using Kernel Temporal Segmentation (KTS) for shot segmentation and a global-attention-based modified memory network module with LSTM for shot score learning. The modified memory network, termed the Global Attention Memory Module (GAMM), increases the learning capability of the model, and with the addition of LSTM it is further able to learn better contextual features. Experiments on the benchmark datasets TVSum and SumMe show that our method outperforms the current state of the art by about 15%.
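
The abstract above describes a two-stage scorer: a global attention read over all shot features followed by an LSTM that assigns each shot an importance score. The snippet below is a minimal sketch of that idea, not the authors' implementation; the class name, feature dimension, and the exact attention form are illustrative assumptions.

```python
# Minimal sketch (not the authors' code): shot features are re-weighted by a
# global attention read over all shots, then an LSTM scores each shot.
import torch
import torch.nn as nn

class GlobalAttentionScorer(nn.Module):
    def __init__(self, feat_dim=1024, hidden=256):
        super().__init__()
        self.query = nn.Linear(feat_dim, feat_dim)
        self.lstm = nn.LSTM(feat_dim, hidden, batch_first=True)
        self.score = nn.Linear(hidden, 1)

    def forward(self, shots):                      # shots: (B, T, feat_dim)
        q = self.query(shots)                      # per-shot queries
        attn = torch.softmax(q @ shots.transpose(1, 2) / shots.size(-1) ** 0.5, dim=-1)
        context = attn @ shots                     # global attention read over all shots
        out, _ = self.lstm(shots + context)        # contextualized shot features
        return torch.sigmoid(self.score(out)).squeeze(-1)  # per-shot importance in [0, 1]
```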

6 citations

Book ChapterDOI
08 Sep 2018
TL;DR: Experimental results on a primate dataset of over 80 identities show the effect of bias in this research problem and examine whether the knowledge of human faces and recent methods from human face detection and recognition can be extended to primate faces.
Abstract: Deforestation and loss of habitat have resulted in rapid decline of certain species of primates in forests. On the other hand, uncontrolled growth of a few species of primates in urban areas has led to safety issues and nuisance for the local residents. Hence, identifying individual primates has become the need of the hour - not only for conservation and effective mitigation in the wild but also in zoological parks and wildlife sanctuaries. Primate and human faces share many common features, such as the position and shape of the eyes, nose and mouth. It is worth exploring whether the knowledge of human faces and recent methods learned from human face detection and recognition can be extended to primate faces. However, challenges similar to those relating to bias in human faces will also occur in primates. The quality and orientation of primate images, along with the different species of primates - ranging from monkeys to gorillas and chimpanzees - will contribute to bias in effective detection and recognition. Experimental results on a primate dataset of over 80 identities show the effect of bias in this research problem.

5 citations

Proceedings ArticleDOI
01 Sep 2019
TL;DR: A novel Triplet Transform Learning (TTL) model for learning discriminative representations of primate faces is proposed, where it outperforms the existing approaches and attains state-of-the-art performance on the primates database.
Abstract: Automated primate face recognition has enormous potential in effective conservation of species facing endangerment or extinction. The task is characterized by a lack of training data, low inter-class variations, and large intra-class differences. Owing to the challenging nature of the problem, limited research has been performed to automate the process of primate face recognition. In this research, we propose a novel Triplet Transform Learning (TTL) model for learning discriminative representations of primate faces. The proposed model reduces the intra-class variations and increases the inter-class variations to obtain robust sparse representations for the primate faces. It is utilized to present a novel framework for primate face recognition, which is evaluated on the primate dataset comprising 80 identities, including monkeys, gorillas, and chimpanzees. Experimental results demonstrate the efficacy of the proposed approach, where it outperforms the existing approaches and attains state-of-the-art performance on the primates database.
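
As a hedged illustration of the triplet objective behind representations like those described in this abstract, the snippet below shows a generic triplet margin loss that pulls an anchor embedding toward a same-identity positive and pushes it away from a different-identity negative. This is not the authors' Triplet Transform Learning formulation; the toy embedding network, input size, and margin are assumptions for illustration only.

```python
# Generic triplet margin objective: anchor/positive share an identity,
# negative belongs to a different identity. Toy embedder for illustration.
import torch
import torch.nn as nn

embed = nn.Sequential(nn.Flatten(), nn.Linear(64 * 64, 128))  # toy face embedder
triplet_loss = nn.TripletMarginLoss(margin=0.2)

anchor   = embed(torch.randn(8, 1, 64, 64))   # faces of identity A
positive = embed(torch.randn(8, 1, 64, 64))   # other faces of identity A
negative = embed(torch.randn(8, 1, 64, 64))   # faces of identity B
loss = triplet_loss(anchor, positive, negative)
loss.backward()
```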

1 citation


Cited by
Journal ArticleDOI
05 May 2021
TL;DR: This work surveys research on bias and unfairness in several computer science domains, distinguishing between data management publications and other domains, and argues for a novel data-centered approach overcoming the limitations of current algorithmic-centered methods.
Abstract: The increasing use of data-driven decision support systems in industry and governments is accompanied by the discovery of a plethora of bias and unfairness issues in the outputs of these systems. Multiple computer science communities, and especially machine learning, have started to tackle this problem, often developing algorithmic solutions to mitigate biases to obtain fairer outputs. However, one of the core underlying causes for unfairness is bias in training data which is not fully covered by such approaches. Especially, bias in data is not yet a central topic in data engineering and management research. We survey research on bias and unfairness in several computer science domains, distinguishing between data management publications and other domains. This covers the creation of fairness metrics, fairness identification, and mitigation methods, software engineering approaches and biases in crowdsourcing activities. We identify relevant research gaps and show which data management activities could be repurposed to handle biases and which ones might reinforce such biases. In the second part, we argue for a novel data-centered approach overcoming the limitations of current algorithmic-centered methods. This approach focuses on eliciting and enforcing fairness requirements and constraints on data that systems are trained, validated, and used on. We argue for the need to extend database management systems to handle such constraints and mitigation methods. We discuss the associated future research directions regarding algorithms, formalization, modelling, users, and systems.

25 citations

Proceedings ArticleDOI
TL;DR: The Deception Detection and Physiological Monitoring (DDPM) dataset contains almost 13 hours of recordings of 70 subjects and over 8 million visible-light, near-infrared, and thermal video frames, along with the corresponding metadata, audio and pulse oximeter data.
Abstract: We present the Deception Detection and Physiological Monitoring (DDPM) dataset and initial baseline results on this dataset. Our application context is an interview scenario in which the interviewee attempts to deceive the interviewer on selected responses. The interviewee is recorded in RGB, near-infrared, and long-wave infrared, along with cardiac pulse, blood oxygenation, and audio. After collection, data were annotated for interviewer/interviewee, curated, ground-truthed, and organized into train / test parts for a set of canonical deception detection experiments. Baseline experiments found random accuracy for micro-expressions as an indicator of deception, but that saccades can give a statistically significant response. We also estimated subject heart rates from face videos (remotely) with a mean absolute error as low as 3.16 bpm. The database contains almost 13 hours of recordings of 70 subjects, and over 8 million visible-light, near-infrared, and thermal video frames, along with appropriate meta, audio and pulse oximeter data. To our knowledge, this is the only collection offering recordings of five modalities in an interview scenario that can be used in both deception detection and remote photoplethysmography research.
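
The abstract reports remote heart-rate estimation from face video with a mean absolute error as low as 3.16 bpm. As a rough, hedged illustration of how remote photoplethysmography baselines of this kind typically work (not the paper's method), the sketch below averages the green channel over a face crop, band-pass filters the signal, and reads off the dominant spectral peak; the frame rate and band limits are assumptions.

```python
# Textbook rPPG-style baseline, for illustration only.
import numpy as np
from scipy.signal import butter, filtfilt

def estimate_bpm(face_frames, fps=30.0):
    """face_frames: NumPy array (T, H, W, 3) of RGB face crops."""
    green = face_frames[..., 1].mean(axis=(1, 2))            # per-frame green-channel mean
    b, a = butter(3, [0.7 / (fps / 2), 4.0 / (fps / 2)], btype="band")
    pulse = filtfilt(b, a, green - green.mean())              # keep roughly 42-240 bpm band
    spectrum = np.abs(np.fft.rfft(pulse))
    freqs = np.fft.rfftfreq(len(pulse), d=1.0 / fps)
    return freqs[np.argmax(spectrum)] * 60.0                  # dominant frequency in bpm
```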

17 citations

Posted Content
TL;DR: This work presents a data-driven deep neural algorithm that is the first to detect deceptive behavior using non-verbal cues of gait and gesture; it trains an LSTM-based deep neural network to obtain deep features and introduces a novel dataset, DeceptiveWalk.
Abstract: We present a data-driven deep neural algorithm for detecting deceptive walking behavior using nonverbal cues like gaits and gestures. We conducted an elaborate user study, where we recorded many participants performing tasks involving deceptive walking. We extract the participants' walking gaits as series of 3D poses. We annotate various gestures performed by participants during their tasks. Based on the gait and gesture data, we train an LSTM-based deep neural network to obtain deep features. Finally, we use a combination of psychology-based gait, gesture, and deep features to detect deceptive walking with an accuracy of 88.41%. This is an improvement of 10.6% over handcrafted gait and gesture features and an improvement of 4.7% and 9.2% over classifiers based on the state-of-the-art emotion and action classification algorithms, respectively. Additionally, we present a novel dataset, DeceptiveWalk, that contains gaits and gestures with their associated deception labels. To the best of our knowledge, ours is the first algorithm to detect deceptive behavior using non-verbal cues of gait and gesture.
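
To make the pipeline in this abstract concrete, here is a minimal sketch, assuming hypothetical dimensions: an LSTM encodes a walking sequence of 3D poses into a deep gait feature, which is concatenated with handcrafted gait/gesture features before classification. It is illustrative only and does not reproduce the authors' network or feature set.

```python
# Sketch of an LSTM over 3D pose sequences fused with handcrafted features.
import torch
import torch.nn as nn

class DeceptiveGaitClassifier(nn.Module):
    def __init__(self, n_joints=16, handcrafted_dim=29, hidden=128):
        super().__init__()
        self.lstm = nn.LSTM(n_joints * 3, hidden, batch_first=True)
        self.head = nn.Linear(hidden + handcrafted_dim, 2)   # deceptive vs. natural

    def forward(self, poses, handcrafted):                    # poses: (B, T, n_joints * 3)
        _, (h, _) = self.lstm(poses)                          # deep gait feature: final hidden state
        fused = torch.cat([h[-1], handcrafted], dim=1)        # deep + handcrafted features
        return self.head(fused)
```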

15 citations

Journal ArticleDOI
TL;DR: A novel deep convolution neural network (DCNN) named LieNet is proposed to detect the multiscale variations of deception automatically, combining contact and noncontact-based approaches.
Abstract: Nowadays, automatic deception detection has received considerable attention in the machine learning community owing to its vast applications in the fields of social media, interviews, law enforcement, and the military. In this study, a novel deep convolution neural network (DCNN) named LieNet is proposed to precisely detect the multiscale variations of deception automatically. Our approach is a combination of contact and noncontact-based approaches. First, 20 frames from each video are fetched and concatenated to form a single image. Moreover, an audio signal is extracted from the video and treated as image input by plotting the signal onto a 2-D plane. Furthermore, 13 channels of electroencephalogram signals are plotted onto a 2-D plane and concatenated to generate an image. Second, the LieNet model extracts features from each modality separately. Third, scores are estimated using a softmax classifier for all the modalities. Finally, the three scores are combined using score-level fusion to obtain a final score, which gives support in favor of either deception or truth. LieNet is validated on the "Bag-of-Lies (BoL)," "Real-Life (RL) Trial," and "Miami University Deception Detection (MU3D)" databases by considering four evaluation indexes, viz., accuracy, precision, recall, and F1-score. Experimental outcomes show that LieNet outperforms earlier work on Set-A and Set-B of the BoL database with average accuracies of 95.91% and 96.04%, respectively. The accuracies obtained by LieNet are 97% and 98% on the RL Trial and MU3D databases, respectively.
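
The final step described above is score-level fusion of the per-modality softmax scores. The snippet below sketches that step only, using simple averaging of the three modality scores; the averaging rule and label ordering are assumptions, and LieNet's CNN branches are not reproduced.

```python
# Score-level fusion sketch: average per-modality softmax scores, then decide.
import torch

def fuse_scores(video_logits, audio_logits, eeg_logits):
    scores = [torch.softmax(l, dim=-1) for l in (video_logits, audio_logits, eeg_logits)]
    fused = torch.stack(scores).mean(dim=0)      # score-level fusion by averaging
    return fused.argmax(dim=-1)                  # 0 = truth, 1 = lie (assumed label order)

decision = fuse_scores(torch.randn(4, 2), torch.randn(4, 2), torch.randn(4, 2))
```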

13 citations

Journal ArticleDOI
TL;DR: Results show that attentional LSTM networks are able to adequately model the gaze and speech feature sequences, outperforming a reference Support Vector Machine (SVM)-based system with compact features, and suggest that gaze and speech carry complementary information for the task of deception detection that can be effectively exploited by using LSTMs.
Abstract: The automatic detection of deceptive behaviors has recently attracted the attention of the research community due to the variety of areas where it can play a crucial role, such as security or criminology. This work is focused on the development of an automatic deception detection system based on gaze and speech features. The first contribution of our research on this topic is the use of attention Long Short-Term Memory (LSTM) networks for single-modal systems with frame-level features as input. In the second contribution, we propose a multimodal system that combines the gaze and speech modalities into the LSTM architecture using two different combination strategies: Late Fusion and Attention-Pooling Fusion. The proposed models are evaluated over the Bag-of-Lies dataset, a multimodal database recorded in real conditions. On the one hand, results show that attentional LSTM networks are able to adequately model the gaze and speech feature sequences, outperforming a reference Support Vector Machine (SVM)-based system with compact features. On the other hand, both combination strategies produce better results than the single-modal systems and the multimodal reference system, suggesting that gaze and speech modalities carry complementary information for the task of deception detection that can be effectively exploited by using LSTMs.
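
One of the two combination strategies mentioned above, Attention-Pooling Fusion, can be sketched as follows: separate LSTMs encode the frame-level gaze and speech features, an attention layer pools each sequence into a single vector, and the pooled vectors are concatenated for classification. The layer sizes and the attention form below are illustrative assumptions, not the authors' configuration.

```python
# Sketch of attention-pooling fusion over gaze and speech LSTM outputs.
import torch
import torch.nn as nn

class AttnPoolFusion(nn.Module):
    def __init__(self, gaze_dim=8, speech_dim=40, hidden=64):
        super().__init__()
        self.gaze_lstm = nn.LSTM(gaze_dim, hidden, batch_first=True)
        self.speech_lstm = nn.LSTM(speech_dim, hidden, batch_first=True)
        self.attn = nn.Linear(hidden, 1)
        self.head = nn.Linear(2 * hidden, 2)      # truth vs. lie

    def pool(self, seq):                          # seq: (B, T, hidden)
        w = torch.softmax(self.attn(seq), dim=1)  # attention weights over time
        return (w * seq).sum(dim=1)               # attention-pooled sequence vector

    def forward(self, gaze, speech):              # (B, T, gaze_dim), (B, T, speech_dim)
        g, _ = self.gaze_lstm(gaze)
        s, _ = self.speech_lstm(speech)
        return self.head(torch.cat([self.pool(g), self.pool(s)], dim=1))
```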

10 citations