Author

Domenick Poster

Other affiliations: Johns Hopkins University
Bio: Domenick Poster is an academic researcher from West Virginia University. The author has contributed to research on topics including overfitting and facial recognition systems. The author has an h-index of 3 and has co-authored 7 publications receiving 26 citations. Previous affiliations of Domenick Poster include Johns Hopkins University.

Papers
Proceedings ArticleDOI
01 Jan 2021
TL;DR: This paper presents extensive benchmark results and analysis on thermal face landmark detection and thermal-to-visible face verification by evaluating state-of-the-art models on the ARL-VTF dataset.
Abstract: Thermal face imagery, which captures the naturally emitted heat from the face, is limited in availability compared to face imagery in the visible spectrum. To help address this scarcity of thermal face imagery for research and algorithm development, we present the DEVCOM Army Research Laboratory Visible-Thermal Face Dataset (ARL-VTF). With over 500,000 images from 395 subjects, the ARL-VTF dataset represents, to the best of our knowledge, the largest collection of paired visible and thermal face images to date. The data was captured using a modern long wave infrared (LWIR) camera mounted alongside a stereo setup of three visible spectrum cameras. Variability in expressions, pose, and eyewear has been systematically recorded. The dataset has been curated with extensive annotations, metadata, and standardized protocols for evaluation. Furthermore, this paper presents extensive benchmark results and analysis on thermal face landmark detection and thermal-to-visible face verification by evaluating state-of-the-art models on the ARL-VTF dataset.

35 citations
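The face verification benchmarks described above ultimately reduce to thresholding similarity scores between gallery and probe embeddings. Below is a minimal sketch of such an evaluation, assuming cosine scoring over precomputed embeddings; the scoring rule and function names are illustrative, not the ARL-VTF protocol itself.

```python
import numpy as np

def cosine_scores(gallery, probes):
    """Pairwise cosine similarity between L2-normalized embeddings."""
    g = gallery / np.linalg.norm(gallery, axis=1, keepdims=True)
    p = probes / np.linalg.norm(probes, axis=1, keepdims=True)
    return g @ p.T

def equal_error_rate(scores, labels):
    """EER: the operating point where the false-accept rate equals the
    false-reject rate. scores: similarity scores for comparison pairs;
    labels: 1 for genuine pairs, 0 for impostor pairs."""
    order = np.argsort(-scores)                  # accept highest scores first
    labels = np.asarray(labels, dtype=float)[order]
    far = np.cumsum(1 - labels) / max((1 - labels).sum(), 1.0)
    frr = 1.0 - np.cumsum(labels) / max(labels.sum(), 1.0)
    i = np.argmin(np.abs(far - frr))
    return (far[i] + frr[i]) / 2.0
```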

Proceedings ArticleDOI
16 Jun 2019
TL;DR: The ability of modern landmark detection algorithms to cope with the adversarial conditions present in the thermal domain is analyzed by exploring the strengths and weaknesses of three deep-learning-based landmark detection architectures originally developed for visible images: the Deep Alignment Network (DAN), Multi-task Convolutional Neural Network (MTCNN), and a Multi-class Patch-based fully-convolutional neural network (PBC).
Abstract: Thermal-to-visible face recognition is an emerging technology for low-light and nighttime human identification, for which detection of fiducial landmarks is a critical step required for face alignment prior to recognition. However, thermal images, with their low contrast, low resolution, and lack of textural information, have proven a challenging obstacle for the detection of the fiducial landmarks used for image alignment. This paper analyzes the ability of modern landmark detection algorithms to cope with the adversarial conditions present in the thermal domain by exploring the strengths and weaknesses of three deep-learning-based landmark detection architectures originally developed for visible images: the Deep Alignment Network (DAN), Multi-task Convolutional Neural Network (MTCNN), and a Multi-class Patch-based fully-convolutional neural network (PBC). Our experiments yield a normalized mean squared error of 0.04 at an offset distance of 2.5 meters using the DAN architecture, indicating an ability for cascaded shape regression neural networks to adapt to thermal images. However, we find that even small alignment errors disproportionately reduce correct recognition rates: with images aligned using the best-performing model, the EER worsens by 8.2% compared with ground-truth alignments, leaving further room for improvement in this area.

16 citations
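Normalized mean error is the standard figure of merit behind landmark results like the 0.04 reported above. Here is a minimal sketch, assuming normalization by inter-ocular distance, a common convention; the paper's exact normalizing term is not stated here.

```python
import numpy as np

def normalized_mean_error(pred, gt, left_eye_idx=0, right_eye_idx=1):
    """Mean per-landmark Euclidean error normalized by inter-ocular distance.

    pred, gt: (num_landmarks, 2) arrays of predicted and ground-truth (x, y)
    coordinates. The eye indices are placeholders for whatever annotation
    scheme the dataset uses."""
    per_point = np.linalg.norm(pred - gt, axis=1)
    inter_ocular = np.linalg.norm(gt[left_eye_idx] - gt[right_eye_idx])
    return per_point.mean() / inter_ocular
```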

Journal ArticleDOI
TL;DR: In this article, a coupled convolutional network architecture was proposed to leverage visible face data when training a model for thermal-only face landmark detection, achieving a 65%–95% improvement on the DEVCOM ARL Multi-modal Thermal Face Dataset and a 4% improvement on the RWTH Aachen University Thermal Face Dataset over the baseline model.
Abstract: There has been increasing interest in face recognition in the thermal infrared spectrum. A critical step in this process is face landmark detection. However, landmark detection in the thermal spectrum presents a unique set of challenges compared to the visible spectrum: inherently lower spatial resolution due to longer wavelength, differences in phenomenology, and limited availability of labeled thermal face imagery for algorithm development and training. Thermal infrared imaging does have the advantage of being able to passively acquire facial heat signatures without the need for active or ambient illumination in low-light and nighttime environments. In such scenarios, thermal imaging must operate by itself without corresponding/paired visible imagery. Mindful of this constraint, we propose visible-to-thermal parameter transfer learning using a coupled convolutional network architecture as a means to leverage visible face data when training a model for thermal-only face landmark detection. This differentiates our approach from models trained either solely on thermal images or models which require a fusion of visible and thermal images at test time. In this work, we implement and analyze four types of parameter transfer learning methods in the context of thermal face landmark detection: Siamese (shared) layers, Linear Layer Regularization (LLR), Linear Kernel Regularization (LKR), and Residual Parameter Transformations (RPT). These transfer learning approaches are compared against a baseline version of the network and an Active Appearance Model (AAM), both of which are trained only on thermal data. We achieve a 65%–95% improvement on the DEVCOM ARL Multi-modal Thermal Face Dataset and a 4% improvement on the RWTH Aachen University Thermal Face Dataset over the baseline model. We show that LLR, LKR, and RPT all result in improved thermal face landmark detection performance compared to the baseline and AAM, demonstrating that transfer learning leveraging visible spectrum data improves thermal face landmarking.

13 citations
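One way to picture the layer-wise regularization variants (LLR/LKR) is as a penalty tying each thermal-branch layer to its visible-branch counterpart during training. The PyTorch sketch below follows that reading with an identity coupling to a frozen visible branch; the paper's actual LLR/LKR/RPT formulations may use learned transformations instead.

```python
import torch

def layer_coupling_penalty(thermal_net, visible_net, weight=1e-3):
    """L2 penalty pulling each thermal-branch parameter tensor toward the
    corresponding parameter of a pretrained visible branch. A sketch of
    visible-to-thermal parameter transfer, not the paper's exact terms."""
    penalty = torch.zeros(())
    for p_t, p_v in zip(thermal_net.parameters(), visible_net.parameters()):
        penalty = penalty + (p_t - p_v.detach()).pow(2).sum()
    return weight * penalty

# Illustrative training objective:
#   loss = landmark_loss(thermal_net(x_thermal), y_landmarks) \
#          + layer_coupling_penalty(thermal_net, visible_net)
```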

Proceedings ArticleDOI
01 Sep 2018
TL;DR: This approach builds upon existing hand-crafted image features and neural network architectures by optimally selecting and combining the most useful set of features into a single model, achieving a roughly four-fold performance improvement over the state of the art on three benchmark textured lens datasets.
Abstract: Distinguishing between images of irises wearing textured lenses versus those wearing transparent lenses or no lenses is a challenging problem due to the subtle and fine-grained visual differences. Our approach builds upon existing hand-crafted image features and neural network architectures by optimally selecting and combining the most useful set of features into a single model. We build multiple, parallel subnetworks corresponding to the various feature descriptors and learn the best subset of features through group sparsity. We avoid overfitting such a wide and deep model through a selective transfer learning technique and a novel group Dropout regularization strategy. This model achieves a roughly four-fold performance improvement over the state of the art on three benchmark textured lens datasets and equals the near-perfect state-of-the-art accuracy on two others. Furthermore, the generic nature of the architecture allows it to be extended to other image features, forms of spoofing attacks, or problem domains.

11 citations
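The feature-selection mechanism described above, learning the best subset of descriptors through group sparsity, is typically realized as a group-lasso penalty on the layer that fuses the parallel subnetworks. A minimal sketch under that assumption (the slicing scheme and names are illustrative, not the paper's code):

```python
import torch

def group_lasso_penalty(fusion_weight, group_slices, lam=1e-4):
    """L2,1 (group-lasso) penalty over the fusion layer's input columns.

    fusion_weight: (out_features, in_features) weight matrix of the layer
    combining the parallel feature subnetworks; group_slices: one slice of
    input columns per feature descriptor. Driving a group's norm to zero
    effectively deselects that descriptor."""
    return lam * sum(fusion_weight[:, s].norm(p=2) for s in group_slices)
```

The group Dropout strategy mentioned in the abstract can be read analogously: randomly zeroing entire descriptor groups during training rather than individual units.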

Posted Content
TL;DR: In this paper, an algorithm-agnostic meta-learning framework is proposed to improve existing UDA methods instead of proposing a new UDA strategy; it facilitates the adaptation process with fine updates without overfitting or getting stuck at local optima.
Abstract: Object detectors trained on large-scale RGB datasets are being extensively employed in real-world applications. However, these RGB-trained models suffer a performance drop under adverse illumination and lighting conditions. Infrared (IR) cameras are robust under such conditions and can be helpful in real-world applications. Though thermal cameras are widely used for military applications and increasingly for commercial applications, there is a lack of algorithms that can robustly exploit thermal imagery, owing to the limited availability of labeled thermal data. In this work, we aim to enhance object detection performance in the thermal domain by leveraging labeled visible-domain data in an Unsupervised Domain Adaptation (UDA) setting. We propose an algorithm-agnostic meta-learning framework to improve existing UDA methods instead of proposing a new UDA strategy. We achieve this by meta-learning the initial condition of the detector, which facilitates the adaptation process with fine updates without overfitting or getting stuck at local optima. However, meta-learning the initial condition for the detection scenario is computationally heavy due to long and intractable computation graphs. Therefore, we propose an online meta-learning paradigm which performs online updates, resulting in a short and tractable computation graph. We demonstrate the superiority of our method over many baselines in the UDA setting, producing a state-of-the-art thermal detector for the KAIST and DSIAC datasets.

4 citations
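The core idea above, meta-learning the detector's initial condition while keeping the computation graph short, can be sketched as a first-order inner/outer update. The following is a generic first-order sketch in that spirit, not the paper's implementation; all names are placeholders.

```python
import copy
import torch

def online_meta_step(model, loss_fn, support_batch, query_batch,
                     outer_opt, inner_lr=1e-3):
    """One first-order meta-update: adapt a clone of the model on the
    support batch, then apply the query-batch gradient (evaluated at the
    adapted weights) to the original model. Staying first-order keeps the
    computation graph short and tractable."""
    adapted = copy.deepcopy(model)
    inner_loss = loss_fn(adapted, support_batch)            # inner adaptation
    grads = torch.autograd.grad(inner_loss, list(adapted.parameters()))
    with torch.no_grad():
        for p, g in zip(adapted.parameters(), grads):
            p -= inner_lr * g
    outer_loss = loss_fn(adapted, query_batch)              # outer objective
    outer_grads = torch.autograd.grad(outer_loss, list(adapted.parameters()))
    outer_opt.zero_grad()
    for p, g in zip(model.parameters(), outer_grads):
        p.grad = g.clone()
    outer_opt.step()
```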


Cited by

Posted Content
TL;DR: SpeakingFaces is presented as a publicly-available large-scale multimodal dataset developed to support machine learning research in contexts that utilize a combination of thermal, visual, and audio data streams; examples include human–computer interaction, biometric authentication, recognition systems, domain transfer, and speech recognition.
Abstract: We present SpeakingFaces as a publicly-available large-scale multimodal dataset developed to support machine learning research in contexts that utilize a combination of thermal, visual, and audio data streams; examples include human-computer interaction, biometric authentication, recognition systems, domain transfer, and speech recognition. SpeakingFaces is comprised of aligned high-resolution thermal and visual spectra image streams of fully-framed faces synchronized with audio recordings of each subject speaking approximately 100 imperative phrases. Data were collected from 142 subjects, yielding over 13,000 instances of synchronized data (~3.8 TB). For technical validation, we demonstrate two baseline examples. The first baseline shows classification by gender, utilizing different combinations of the three data streams in both clean and noisy environments. The second example consists of thermal-to-visual facial image translation, as an instance of domain transfer.

22 citations

Journal ArticleDOI
16 May 2021-Sensors
TL;DR: The SpeakingFaces dataset, as discussed by the authors, is a large-scale multimodal collection of aligned high-resolution thermal and visual-spectrum image streams of fully-framed faces, synchronized with audio recordings of each subject speaking approximately 100 imperative phrases.
Abstract: We present SpeakingFaces as a publicly-available large-scale multimodal dataset developed to support machine learning research in contexts that utilize a combination of thermal, visual, and audio data streams; examples include human–computer interaction, biometric authentication, recognition systems, domain transfer, and speech recognition. SpeakingFaces is comprised of aligned high-resolution thermal and visual spectra image streams of fully-framed faces synchronized with audio recordings of each subject speaking approximately 100 imperative phrases. Data were collected from 142 subjects, yielding over 13,000 instances of synchronized data (∼3.8 TB). For technical validation, we demonstrate two baseline examples. The first baseline shows classification by gender, utilizing different combinations of the three data streams in both clean and noisy environments. The second example consists of thermal-to-visual facial image translation, as an instance of domain transfer.

20 citations
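The first baseline, gender classification from combinations of the three streams, suggests a simple late-fusion design: encode each stream separately and classify the concatenated embeddings. A hedged sketch follows; the encoders, dimensions, and fusion rule are placeholders, not the dataset authors' baseline.

```python
import torch
import torch.nn as nn

class LateFusionClassifier(nn.Module):
    """Concatenate per-stream embeddings (thermal, visual, audio) and
    classify. Subsets of streams can be evaluated by swapping in fewer
    encoders and resizing the head accordingly."""
    def __init__(self, enc_thermal, enc_visual, enc_audio,
                 emb_dim=128, num_classes=2):
        super().__init__()
        self.encoders = nn.ModuleList([enc_thermal, enc_visual, enc_audio])
        self.head = nn.Linear(3 * emb_dim, num_classes)

    def forward(self, thermal, visual, audio):
        feats = [enc(x) for enc, x in
                 zip(self.encoders, (thermal, visual, audio))]
        return self.head(torch.cat(feats, dim=1))
```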

Journal ArticleDOI
24 Sep 2019-Sensors
TL;DR: A system that utilizes a range of image processing algorithms to allow fully automated thermal face analysis under both laboratory and real-world conditions, achieving performance comparable to current stand-alone state-of-the-art methods for thermal face and landmark detection.
Abstract: We present a system that utilizes a range of image processing algorithms to allow fully automated thermal face analysis under both laboratory and real-world conditions. We implement methods for face detection, facial landmark detection, face frontalization and analysis, combining all of these into a fully automated workflow. The system is fully modular and allows one to implement additional algorithms for improved performance or specialized tasks. Our suggested pipeline contains a histogram of oriented gradients support vector machine (HOG-SVM) based face detector and different landmark detection methods implemented using feature-based active appearance models, deep alignment networks, and a deep shape regression network. Face frontalization is achieved by utilizing piecewise affine transformations. For the final analysis, we present an emotion recognition system that utilizes HOG features and a random forest classifier, as well as a respiratory rate analysis module that computes average temperatures from an automatically detected region of interest. Results show that our combined system achieves a performance comparable to current stand-alone state-of-the-art methods for thermal face and landmark detection, and a classification accuracy of 65.75% for four basic emotions.

18 citations
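The emotion-recognition stage described above, HOG features feeding a random forest, maps directly onto standard library components. A minimal sketch with generic hyperparameters; the paper's actual settings are not given here.

```python
import numpy as np
from skimage.feature import hog
from sklearn.ensemble import RandomForestClassifier

def hog_features(face_crops):
    """HOG descriptors for frontalized grayscale face crops."""
    return np.array([hog(img, orientations=9, pixels_per_cell=(8, 8),
                         cells_per_block=(2, 2)) for img in face_crops])

# Illustrative use: train on labeled crops, predict one of four emotions.
clf = RandomForestClassifier(n_estimators=200, random_state=0)
# clf.fit(hog_features(train_faces), train_labels)
# predictions = clf.predict(hog_features(test_faces))
```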

Journal ArticleDOI
01 Aug 2020
TL;DR: Three distinct models based on ensembles of convolutional and residual blocks are proposed to enrich heterogeneous (cross-sensor) iris recognition; the results indicate that the proposed approach captures vital, discriminative iris features and can recognize the micro-patterns present inside the iris region.
Abstract: Despite the prominent advancements in iris recognition, unconstrained image acquisition through heterogeneous sensors has been a major obstacle to applying it in large-scale applications. In recent years, deep convolutional networks have achieved remarkable performance in the field of computer vision and have been employed in iris applications. In this study, three distinct models based on ensembles of convolutional and residual blocks are proposed to enrich heterogeneous (cross-sensor) iris recognition. To analyze their quantitative performance, extensive experiments are carried out on two publicly available iris databases, the ND-iris-0405 dataset and the ND-CrossSensor-Iris-2013 dataset. Further, the model with the lowest error rate is selected and fused, using score-level fusion, with two preeminent feature extraction methods: scale-invariant feature transform (SIFT) and binarized statistical image features (BSIF). The resultant model is examined for cross-sensor iris recognition and yields the two best error rates of 1.01% and 1.12%. This indicates that the proposed approach captures vital, discriminative iris features and can recognize the micro-patterns present inside the iris region. Furthermore, a comparative study with the state of the art shows that the proposed approach obtains significantly improved performance.

16 citations
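Score-level fusion of the CNN matcher with the SIFT and BSIF matchers, as described above, is commonly implemented as a weighted sum of min-max-normalized scores. A generic sketch under that assumption; the paper's exact fusion rule and weights are not specified here.

```python
import numpy as np

def fuse_scores(score_lists, weights=None):
    """Weighted-sum score-level fusion after min-max normalization.

    score_lists: one array of comparison scores per matcher
    (e.g., CNN, SIFT, BSIF)."""
    normed = []
    for s in score_lists:
        s = np.asarray(s, dtype=float)
        normed.append((s - s.min()) / (s.max() - s.min() + 1e-12))
    if weights is None:
        weights = [1.0 / len(normed)] * len(normed)
    return sum(w * s for w, s in zip(weights, normed))
```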