Author

Yousef Atoum

Other affiliations: Yarmouk University, General Motors
Bio: Yousef Atoum is an academic researcher from Michigan State University. He has contributed to research on topics including facial recognition systems and motion estimation, with an h-index of 9 and 14 co-authored publications receiving 506 citations. Previous affiliations of Yousef Atoum include Yarmouk University and General Motors.

Papers
Proceedings ArticleDOI
TL;DR: A novel two-stream CNN-based approach for face anti-spoofing is proposed, extracting local features and holistic depth maps from face images; the local features help the CNN discriminate spoof patches independently of their location on the face.
Abstract: The face image is the most accessible biometric modality and is used in highly accurate face recognition systems, yet it is vulnerable to many types of presentation attacks. Face anti-spoofing is therefore a critical step before feeding a face image to a biometric system. In this paper, we propose a novel two-stream CNN-based approach for face anti-spoofing that extracts local features and holistic depth maps from face images. The local features help the CNN discriminate spoof patches independently of the spatial face region, while the holistic depth map examines whether the input image has a face-like depth. Extensive experiments are conducted on challenging databases (CASIA-FASD, MSU-USSA, and Replay-Attack), with comparison to the state of the art.

349 citations
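
The two-stream idea above lends itself to a compact sketch. The following PyTorch fragment is a minimal illustration only, not the authors' released model: the layer sizes, patch size, and the score-fusion rule at the end are all assumptions.

```python
# Minimal two-stream sketch (assumed layer sizes, not the authors' code):
# stream 1 scores local patches, stream 2 regresses a holistic depth map.
import torch
import torch.nn as nn

class PatchStream(nn.Module):
    """CNN scoring small face patches as live vs. spoof."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(64, 1))

    def forward(self, patches):              # (N, 3, 32, 32) face patches
        return self.net(patches)             # (N, 1) spoof logits

class DepthStream(nn.Module):
    """Fully convolutional CNN regressing a face-like depth map."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 1, 3, padding=1))

    def forward(self, faces):                # (N, 3, 128, 128) whole faces
        return self.net(faces)               # (N, 1, 128, 128) depth map

patch_logits = PatchStream()(torch.randn(8, 3, 32, 32))   # 8 random patches
depth_map = DepthStream()(torch.randn(1, 3, 128, 128))    # 1 whole face
# Assumed fusion rule: average patch score plus mean depth response.
liveness = patch_logits.mean() + depth_map.mean()
```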

Proceedings ArticleDOI
15 Jun 2019
TL;DR: In this article, a novel AutoEncoder framework is proposed to explicitly disentangle pose and appearance features from RGB imagery and the LSTM-based integration of pose features over time produces the gait feature.
Abstract: Gait, the walking pattern of individuals, is one of the most important biometric modalities. Most existing gait recognition methods take silhouettes or articulated body models as the gait features. These methods suffer from degraded recognition performance when handling confounding variables such as clothing, carrying condition, and view angle. To remedy this issue, we propose a novel AutoEncoder framework that explicitly disentangles pose and appearance features from RGB imagery; the LSTM-based integration of pose features over time produces the gait feature. In addition, we collect a Frontal-View Gait (FVG) dataset to focus on gait recognition from frontal-view walking, a challenging problem since it contains minimal gait cues compared to other views. FVG also includes other important variations, e.g., walking speed, carrying, and clothing. With extensive experiments on the CASIA-B, USF, and FVG datasets, our method demonstrates superior performance to the state of the art quantitatively, the ability to disentangle features qualitatively, and promising computational efficiency.

170 citations
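
The disentangle-then-aggregate pipeline can be sketched as follows. This is a toy illustration, not the paper's model: the flat MLP encoder, the latent dimensions, and the omission of the reconstruction decoder and disentanglement losses are all simplifications.

```python
# Sketch: split a frame embedding into pose and appearance codes, then
# aggregate the pose codes over time with an LSTM to get a gait feature.
import torch
import torch.nn as nn

POSE_DIM, APP_DIM, FEAT_DIM = 32, 64, 128   # assumed toy dimensions

class Encoder(nn.Module):
    def __init__(self):
        super().__init__()
        self.backbone = nn.Sequential(nn.Linear(3 * 64 * 64, 256), nn.ReLU())
        self.to_pose = nn.Linear(256, POSE_DIM)   # dynamic gait cues
        self.to_app = nn.Linear(256, APP_DIM)     # clothing/appearance cues

    def forward(self, frames):                    # (T, 3*64*64) flat frames
        h = self.backbone(frames)
        return self.to_pose(h), self.to_app(h)

encoder = Encoder()
lstm = nn.LSTM(POSE_DIM, FEAT_DIM, batch_first=True)

frames = torch.randn(30, 3 * 64 * 64)             # one 30-frame walking clip
pose, appearance = encoder(frames)
_, (h_n, _) = lstm(pose.unsqueeze(0))             # integrate pose over time
gait_feature = h_n[-1]                            # (1, FEAT_DIM) identity code
```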

Journal ArticleDOI
TL;DR: A multimedia analytics system that performs automatic online exam proctoring and collects multimedia (audio and visual) data from subjects performing various types of cheating while taking online exams is presented.
Abstract: Massive open online courses and other forms of remote education continue to increase in popularity and reach. The ability to efficiently proctor remote online examinations is an important limiting factor to the scalability of this next stage in education. Presently, human proctoring is the most common approach to evaluation, either by requiring the test taker to visit an examination center or by monitoring them visually and acoustically during exams via a webcam. However, such methods are labor intensive and costly. In this paper, we present a multimedia analytics system that performs automatic online exam proctoring. The system hardware includes one webcam, one wearcam, and a microphone for monitoring the visual and acoustic environment of the testing location. The system includes six basic components that continuously estimate the key behavior cues: user verification, text detection, voice detection, active window detection, gaze estimation, and phone detection. By combining the continuous estimation components and applying a temporal sliding window, we design higher-level features to classify whether the test taker is cheating at any moment during the exam. To evaluate our proposed system, we collect multimedia (audio and visual) data from 24 subjects performing various types of cheating while taking online exams. Extensive experimental results demonstrate the accuracy, robustness, and efficiency of our online exam proctoring system.

142 citations
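
The temporal-sliding-window step described above is straightforward to sketch. In the fragment below, the window length and the choice of per-window statistics (mean and variance of each cue) are assumptions; the paper designs its own higher-level features.

```python
# Sketch: six per-frame behavior cues are aggregated over a sliding
# window into a feature vector for a downstream cheating classifier.
import numpy as np

CUES = ["user_verified", "text", "voice", "active_window", "gaze", "phone"]

def window_features(cues: np.ndarray, win: int = 30) -> np.ndarray:
    """cues: (T, 6) per-frame estimates -> (T - win + 1, 12) features,
    the per-cue mean and variance over each sliding window (assumed)."""
    feats = []
    for t in range(cues.shape[0] - win + 1):
        w = cues[t:t + win]
        feats.append(np.concatenate([w.mean(axis=0), w.var(axis=0)]))
    return np.stack(feats)

per_frame = np.random.rand(300, len(CUES))   # 300 frames of cue estimates
X = window_features(per_frame)               # feed to any binary classifier
```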

Journal ArticleDOI
TL;DR: This paper introduces an efficient visual signal processing system to continuously control the feeding process of fish in aquaculture tanks, regulating the amount of feed at an optimal rate with a two-stage detection approach.
Abstract: This paper introduces an efficient visual signal processing system to continuously control the feeding process of fish in aquaculture tanks. The aim is to improve production profit in fish farms by controlling the amount of feed at an optimal rate. The automatic feeding control includes two components: 1) a continuous decision on whether the fish are actively consuming feed, and 2) automatic estimation of the amount of excess feed on the water surface of the tank using a two-stage approach. The feed is initially detected using a correlation filter applied to an optimal local region within the video frame, followed by an SVM-based refinement classifier to suppress falsely detected feed. Having both measures allows us to accurately control the feeding process in an automated manner. Experimental results show that our system can accurately and efficiently estimate both measures.

66 citations
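
The two-stage detector can be sketched with an FFT-based correlation filter followed by an SVM. Everything concrete below (the template, the peak threshold, the patch features, the placeholder training data) is an assumption for illustration, not the paper's trained filter or classifier.

```python
# Sketch: stage 1 proposes feed candidates via frequency-domain
# cross-correlation; stage 2 uses an SVM to reject false positives.
import numpy as np
from sklearn.svm import SVC

def correlation_peaks(region: np.ndarray, template: np.ndarray,
                      thresh: float = 0.9) -> np.ndarray:
    """Cross-correlate in the frequency domain; return peak coordinates."""
    F = np.fft.fft2(region)
    H = np.fft.fft2(template, s=region.shape)
    response = np.real(np.fft.ifft2(F * np.conj(H)))
    response /= response.max() + 1e-8          # normalize to [~0, 1]
    return np.argwhere(response > thresh)      # candidate feed locations

region = np.random.rand(240, 320)     # local region of the tank frame
template = np.random.rand(9, 9)       # feed-pellet template (assumed)
candidates = correlation_peaks(region, template)

# Stage 2: an SVM over 9x9 patch features keeps true feed detections.
# Random placeholder data stands in for labeled feed/non-feed patches.
svm = SVC().fit(np.random.rand(100, 81), np.random.randint(0, 2, 100))
```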


Cited by

Proceedings ArticleDOI
18 Jun 2018
TL;DR: A novel density-aware multi-stream densely connected convolutional neural network-based algorithm, called DID-MDN, for joint rain density estimation and de-raining, which achieves significant improvements over the recent state-of-the-art methods.
Abstract: Single image rain streak removal is an extremely challenging problem due to the presence of non-uniform rain densities in images. We present a novel density-aware multi-stream densely connected convolutional neural network-based algorithm, called DID-MDN, for joint rain density estimation and de-raining. The proposed method enables the network itself to automatically determine the rain-density information and then efficiently remove the corresponding rain-streaks guided by the estimated rain-density label. To better characterize rain-streaks with different scales and shapes, a multi-stream densely connected de-raining network is proposed which efficiently leverages features from different scales. Furthermore, a new dataset containing images with rain-density labels is created and used to train the proposed density-aware network. Extensive experiments on synthetic and real datasets demonstrate that the proposed method achieves significant improvements over the recent state-of-the-art methods. In addition, an ablation study is performed to demonstrate the improvements obtained by different modules in the proposed method. The code can be downloaded at https://github.com/hezhangsprinter/DID-MDN

535 citations
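
The density-guided design can be sketched as a small density classifier gating several de-raining streams. The soft gating rule and single-convolution streams below are assumptions; DID-MDN uses densely connected multi-scale streams and its own density-estimation network.

```python
# Sketch: predict a rain-density label, then use it to weight several
# de-raining streams whose residuals are subtracted from the input.
import torch
import torch.nn as nn

class DensityAwareDerain(nn.Module):
    def __init__(self, n_densities: int = 3):
        super().__init__()
        self.density_head = nn.Sequential(
            nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(16, n_densities))          # light / medium / heavy
        # One rain-residual stream per density level (assumed gating).
        self.streams = nn.ModuleList(
            nn.Conv2d(3, 3, k, padding=k // 2) for k in (3, 5, 7))

    def forward(self, rainy):                    # (N, 3, H, W) rainy image
        weights = self.density_head(rainy).softmax(dim=1)        # (N, 3)
        residuals = torch.stack([s(rainy) for s in self.streams], dim=1)
        residual = (weights[:, :, None, None, None] * residuals).sum(dim=1)
        return rainy - residual                  # estimated clean image

clean = DensityAwareDerain()(torch.randn(2, 3, 64, 64))
```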

Proceedings ArticleDOI
18 Jun 2018
TL;DR: This paper argues the importance of auxiliary supervision to guide the learning toward discriminative and generalizable cues, and introduces a new face anti-spoofing database that covers a large range of illumination, subject, and pose variations.
Abstract: Face anti-spoofing is crucial to protect face recognition systems from security breaches. Previous deep learning approaches formulate face anti-spoofing as a binary classification problem, and many of them struggle to grasp adequate spoofing cues and generalize poorly. In this paper, we argue for the importance of auxiliary supervision to guide the learning toward discriminative and generalizable cues. A CNN-RNN model is learned to estimate the face depth with pixel-wise supervision and to estimate rPPG signals with sequence-wise supervision. The estimated depth and rPPG are fused to distinguish live vs. spoof faces. Further, we introduce a new face anti-spoofing database that covers a large range of illumination, subject, and pose variations. Experiments show that our model achieves state-of-the-art results on both intra- and cross-database testing.

502 citations
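
The auxiliary-supervision architecture can be sketched as a depth-regressing CNN feeding an rPPG-regressing RNN. The shapes, layer sizes, and final fusion rule below are assumptions, not the paper's exact model.

```python
# Sketch: per-frame depth maps (pixel-wise supervision) feed an LSTM
# that regresses an rPPG signal (sequence-wise supervision); both cues
# are combined into a liveness score.
import torch
import torch.nn as nn

cnn = nn.Sequential(nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(),
                    nn.Conv2d(16, 1, 3, padding=1))    # depth-map head
rnn = nn.LSTM(input_size=32 * 32, hidden_size=64, batch_first=True)
rppg_head = nn.Linear(64, 1)                           # per-step rPPG value

frames = torch.randn(1, 8, 3, 32, 32)                  # (N=1, T=8, C, H, W)
depth = cnn(frames.flatten(0, 1))                      # (8, 1, 32, 32) maps
seq, _ = rnn(depth.view(1, 8, -1))                     # depth maps over time
rppg = rppg_head(seq).squeeze(-1)                      # (1, 8) rPPG estimate
# Assumed fusion: depth magnitude plus rPPG variability as liveness score.
score = depth.mean() + rppg.std()
```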

Journal ArticleDOI
TL;DR: The best model is the deep VGG16 model trained with transfer learning, which yields an overall accuracy of 90.4% on the hold-out test set.
Abstract: Automatic and accurate estimation of disease severity is essential for food security, disease management, and yield loss prediction. Deep learning, the latest breakthrough in computer vision, is promising for fine-grained disease severity classification, as the method avoids labor-intensive feature engineering and threshold-based segmentation. Using the apple black rot images in the PlantVillage dataset, which are further annotated by botanists with four severity stages as ground truth, a series of deep convolutional neural networks are trained to diagnose the severity of the disease. The performances of shallow networks trained from scratch and deep models fine-tuned by transfer learning are evaluated systematically in this paper. The best model is the deep VGG16 model trained with transfer learning, which yields an overall accuracy of 90.4% on the hold-out test set. The proposed deep learning model may have great potential in disease control for modern agriculture.

412 citations
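
The transfer-learning recipe described above is a standard torchvision pattern. The sketch below assumes torchvision >= 0.13 and a frozen-features fine-tuning strategy; the paper's exact training schedule may differ.

```python
# Sketch: load ImageNet-pretrained VGG16 and replace the final classifier
# layer with a 4-way head for the four disease-severity stages.
import torch.nn as nn
from torchvision.models import vgg16, VGG16_Weights

model = vgg16(weights=VGG16_Weights.IMAGENET1K_V1)
model.classifier[6] = nn.Linear(4096, 4)   # four severity stages
# Typical fine-tuning: freeze convolutional features, train the classifier.
for p in model.features.parameters():
    p.requires_grad = False
```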