Visual Traffic Surveillance: A Concise Survey
01 Jan 2020, pp. 32-41
About: The article was published on 2020-01-01 and is currently open access. It has received 2 citations to date.
12 Mar 2021
TL;DR: A real-time architecture for pedestrian identification using a motion-controlled DNN is proposed, together with a feature selection method based on Bayesian modeling with LSVM; the identification accuracy shows the effectiveness of the proposed approach.
Abstract: In computer vision applications such as security surveillance and robotics, pedestrian identification has received much attention over the last decade. It is usually achieved through human biometrics, but it is sometimes necessary to identify pedestrians at a distance; this can be accomplished based on differences in whole-body appearance. Real-time pedestrian identification is a challenging task due to several factors, such as illumination effects, noise, changes in viewpoint, and video resolution. More recently, deep neural networks (DNNs) have shown strong performance in various real-world applications. In this article, we present a real-time architecture for pedestrian identification using a motion-controlled DNN. In the proposed architecture, motion vectors are first calculated using optical flow and then utilized in the next step, feature extraction. Two types of features are computed: HOG and DNN. The pre-trained VGG19 CNN model is employed and fine-tuned through transfer learning, and the deep features are extracted from two layers, fully connected layers 7 and 8. We also propose a feature selection method based on Bayesian modeling along with LSVM. The best selected HOG and DNN features are finally fused into one matrix, and a multi-class support vector machine classifier is used for the final identification. Videos were recorded in a real-time environment for the experimental process, achieving an average accuracy of 98.62%. Overall, the identification accuracy shows the effectiveness of the proposed approach.
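The fusion step described in the abstract (select the best HOG and DNN features, then combine them into one matrix) can be sketched as below. This is a minimal illustration, not the paper's method: the selection score here is plain variance ranking, standing in for the Bayesian-modeling + LSVM selection the abstract names but does not detail, and the feature dimensions (3780 for HOG, 4096 for VGG19 fc7) and `k_hog`/`k_dnn` cut-offs are assumptions.

```python
import numpy as np

def select_top_features(features, k):
    """Rank feature columns by variance and keep the top k.

    A simple stand-in for the paper's Bayesian-modeling + LSVM
    selection, which is not specified in the abstract.
    """
    variances = features.var(axis=0)
    top_idx = np.argsort(variances)[::-1][:k]
    return features[:, np.sort(top_idx)]

def fuse_features(hog_feats, dnn_feats, k_hog=64, k_dnn=128):
    """Select the best HOG and DNN features, then fuse them into
    one matrix by serial concatenation, as the abstract describes."""
    hog_sel = select_top_features(hog_feats, k_hog)
    dnn_sel = select_top_features(dnn_feats, k_dnn)
    return np.hstack([hog_sel, dnn_sel])

# Toy example: 10 samples, 3780-dim HOG and 4096-dim fc7 features
# (dimensions are illustrative assumptions).
rng = np.random.default_rng(0)
hog = rng.normal(size=(10, 3780))
dnn = rng.normal(size=(10, 4096))
fused = fuse_features(hog, dnn)
print(fused.shape)  # (10, 192)
```

The fused matrix would then be passed to a multi-class SVM for the final identification, as in the abstract.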
TL;DR: In this paper, Artificial Neural Network (ANN) and Recurrent Neural Network-Long Short-Term Memory (RNN-LSTM) based deep learning (DL) architectures are proposed for analysing the data.
Abstract: Attention is a human's mental awareness of a particular object or piece of information, and the attention level indicates how intense the focus on that object or instance is. In this study, several types of human attention level were observed. After introducing image segmentation and detection techniques for facial features, eyeball movement and gaze estimation were measured. Eye movements were assessed from video data, and a total of 10197 data instances were manually labelled for attention level. Artificial Neural Network (ANN) and Recurrent Neural Network-Long Short-Term Memory (RNN-LSTM) based deep learning (DL) architectures were then proposed for analysing the data. Next, the trained DL model was embedded in a robotic system capable of detecting various features, ultimately leading to the calculation of visual attention for reading, browsing, and writing purposes. The system checks the attention level of the participants and can also detect whether a participant is present. Based on a given level of visual focus of attention (VFOA), the system interacts with the person, generates awareness, and establishes verbal or visual communication with that person. The proposed ML techniques achieved almost 99.24% validation accuracy and 99.43% test accuracy. The comparative study also shows that, since the dataset volume is limited, the ANN is more suitable for attention-level calculation than the RNN-LSTM. We hope that the implemented robotic structure demonstrates the real-world implications of the proposed method.
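The ANN branch of the study (map per-frame gaze/eye features to an attention-level class) can be sketched as a small feed-forward network. This is an illustrative sketch only: the input features, layer sizes, and the three attention classes are assumptions, since the abstract does not specify the architecture, and the weights here are random rather than trained on the 10197 labelled instances.

```python
import numpy as np

def relu(x):
    return np.maximum(0.0, x)

def softmax(z):
    # Subtract the row max for numerical stability before exponentiating.
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

class AttentionANN:
    """Forward pass of a small feed-forward ANN for attention-level
    classification. Input/hidden/output sizes are hypothetical."""

    def __init__(self, n_in=6, n_hidden=32, n_classes=3, seed=0):
        rng = np.random.default_rng(seed)
        self.W1 = rng.normal(scale=0.1, size=(n_in, n_hidden))
        self.b1 = np.zeros(n_hidden)
        self.W2 = rng.normal(scale=0.1, size=(n_hidden, n_classes))
        self.b2 = np.zeros(n_classes)

    def predict_proba(self, X):
        # Hidden layer with ReLU, then softmax over attention classes.
        h = relu(X @ self.W1 + self.b1)
        return softmax(h @ self.W2 + self.b2)

    def predict(self, X):
        return self.predict_proba(X).argmax(axis=1)

# Toy usage: 4 frames of hypothetical gaze/eyeball features
# (e.g. gaze angles, eyeball displacement, blink indicators).
model = AttentionANN()
rng = np.random.default_rng(1)
X = rng.normal(size=(4, 6))
probs = model.predict_proba(X)
preds = model.predict(X)
print(probs.shape)  # (4, 3)
```

In the study, the trained version of such a model runs on the robot, which then reacts to the predicted attention class (e.g. re-engaging a distracted reader).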