
Showing papers by "Jana Kosecka published in 2021"


Proceedings ArticleDOI
01 Jan 2021
TL;DR: In this paper, a pose-guided pooling method was proposed for word-level sign recognition from American Sign Language (ASL) video; the method uses both motion and hand shape cues while being robust to variations in execution.
Abstract: Gestures in American Sign Language (ASL) are characterized by fast, highly articulate motion of the upper body, including arm movements with complex hand shapes and facial expressions. In this work, we propose a new method for word-level sign recognition from American Sign Language (ASL) video. Our method uses both motion and hand shape cues while being robust to variations in execution. We exploit knowledge of the body pose, estimated with an off-the-shelf pose estimator. Using the pose as a guide, we pool spatio-temporal feature maps from different layers of a 3D convolutional neural network. We train separate classifiers using pose-guided pooled features from different resolutions and fuse their prediction scores at test time. This leads to a significant improvement in performance on the WLASL benchmark dataset [25]. The proposed approach achieves 10%, 12%, 9.5%, and 6.5% performance gains on the WLASL100, WLASL300, WLASL1000, and WLASL2000 subsets, respectively. To demonstrate the robustness of the pose-guided pooling and the proposed fusion mechanism, we also evaluate our method by fine-tuning the model on another dataset. This yields a 10% performance improvement for the proposed method while using only 0.4% of the training data during the fine-tuning stage.
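The core idea of pose-guided pooling can be illustrated with a minimal sketch: instead of pooling a CNN feature map globally, features are aggregated only in small neighbourhoods around estimated body joints. The function name, window size, and toy data below are illustrative assumptions, not the paper's actual implementation.

```python
import numpy as np

def pose_guided_pool(feature_map, joints, window=1):
    """Average-pool features in a small window around each joint.

    feature_map: (H, W, C) array from some CNN layer.
    joints: list of (row, col) joint coordinates in feature-map space.
    window: half-size of the pooling neighbourhood (assumed here).
    Returns the per-joint descriptors concatenated into one vector.
    """
    H, W, C = feature_map.shape
    pooled = []
    for r, c in joints:
        r0, r1 = max(0, r - window), min(H, r + window + 1)
        c0, c1 = max(0, c - window), min(W, c + window + 1)
        pooled.append(feature_map[r0:r1, c0:c1].mean(axis=(0, 1)))
    return np.concatenate(pooled)

# Toy example: an 8x8 feature map with 4 channels and two "joints".
fmap = np.random.rand(8, 8, 4)
feat = pose_guided_pool(fmap, [(2, 3), (5, 6)])
```

In the paper this pooling is applied to spatio-temporal feature maps at several network resolutions, with one classifier per resolution fused at test time; the sketch above shows only the spatial pooling step.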

25 citations


Proceedings ArticleDOI
07 Apr 2021
TL;DR: In this article, the loss function is regulated by reverse knowledge distillation, forcing each new member to learn different features and map to a latent space safely distanced from those of existing members.
Abstract: This paper proposes an ensemble learning model that is resistant to adversarial attacks. To build resilience, we introduce a training process in which each member learns a radically distinct latent space. Member models are added to the ensemble one at a time. Simultaneously, the loss function is regulated by reverse knowledge distillation, forcing the new member to learn different features and map to a latent space safely distanced from those of existing members. We assessed the security and performance of the proposed solution on image classification tasks using the CIFAR10 and MNIST datasets and showed improvements in both security and performance compared to state-of-the-art defense methods.
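Standard knowledge distillation pulls a student's representation toward a teacher's; the "reverse" term described here instead pushes a new member's latent code away from existing members'. A minimal sketch of such a penalty, assuming cosine similarity as the distance measure (the paper's exact formulation is not given in this abstract):

```python
import math

def cosine(u, v):
    """Cosine similarity between two latent vectors (as plain lists)."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv + 1e-8)

def reverse_distillation_loss(z_new, z_olds, lam=1.0):
    """Opposite of distillation: penalise *similarity* between the new
    member's latent code and those of existing members, so minimising
    this term pushes the new latent space away from theirs."""
    return lam * sum(cosine(z_new, z) for z in z_olds) / len(z_olds)

# Identical latents are strongly penalised; orthogonal ones are not.
same = reverse_distillation_loss([1.0, 0.0], [[1.0, 0.0]])
diff = reverse_distillation_loss([1.0, 0.0], [[0.0, 1.0]])
```

During training this penalty would be added to each new member's classification loss, so gradient descent trades accuracy against latent-space separation from the rest of the ensemble.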

7 citations


Posted Content
TL;DR: In this article, the authors exploit additional predictions of semantic segmentation models and quantify their confidences, then classify object hypotheses as known vs. unknown, out-of-distribution objects.
Abstract: Recent efforts in deploying Deep Neural Networks for object detection in real-world applications, such as autonomous driving, assume that all relevant object classes have been observed during training. Quantifying the performance of these models when the test data is not represented in the training set has mostly focused on pixel-level uncertainty estimation techniques for models trained for semantic segmentation. This paper proposes to exploit additional predictions of semantic segmentation models and quantify their confidences, followed by classification of object hypotheses as known vs. unknown, out-of-distribution objects. We use object proposals generated by a Region Proposal Network (RPN) and adapt distance-aware uncertainty estimation for semantic segmentation using Radial Basis Function Networks (RBFN) for class-agnostic object mask prediction. The augmented object proposals are then used to train a classifier for known vs. unknown object categories. Experimental results demonstrate that the proposed method achieves performance on par with state-of-the-art methods for unknown object detection and can also be used effectively to reduce object detectors' false positive rate. Our method is well suited for applications where the prediction of non-object background categories obtained by semantic segmentation is reliable.
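The distance-aware idea behind RBF-network uncertainty can be sketched simply: confidence is an RBF kernel evaluated against known-class centroids in feature space, so features far from every centroid score low and can be flagged as unknown. The centroids, sigma, and threshold below are illustrative assumptions, not values from the paper.

```python
import math

def rbf_confidence(feature, centroids, sigma=1.0):
    """Distance-aware confidence: the best RBF kernel response over all
    known-class centroids. Out-of-distribution features lie far from
    every centroid and therefore receive near-zero confidence."""
    best = 0.0
    for c in centroids:
        d2 = sum((f - ci) ** 2 for f, ci in zip(feature, c))
        best = max(best, math.exp(-d2 / (2 * sigma ** 2)))
    return best

# Toy 2-D feature space with two known-class centroids (assumed).
centroids = [[0.0, 0.0], [5.0, 5.0]]
conf_known = rbf_confidence([0.1, -0.1], centroids)    # near a centroid
conf_unknown = rbf_confidence([20.0, 20.0], centroids)  # far from all
```

In the paper this kind of confidence is computed for RPN object proposals; proposals with low confidence become candidates for the unknown / out-of-distribution class.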

4 citations


Posted Content
TL;DR: SLAW as discussed by the authors balances learning between tasks by estimating the magnitude of each task's gradient without performing any extra backward passes; it matches the performance of the best existing methods while being much more efficient.
Abstract: Multi-task learning (MTL) is a subfield of machine learning with important applications, but the multi-objective nature of optimization in MTL leads to difficulties in balancing training between tasks. The best MTL optimization methods require individually computing the gradient of each task's loss function, which impedes scalability to a large number of tasks. In this paper, we propose Scaled Loss Approximate Weighting (SLAW), a method for multi-task optimization that matches the performance of the best existing methods while being much more efficient. SLAW balances learning between tasks by estimating the magnitude of each task's gradient without performing any extra backward passes. We provide theoretical and empirical justification for SLAW's estimation of gradient magnitudes. Experimental results on non-linear regression, multi-task computer vision, and virtual screening for drug discovery demonstrate that SLAW is significantly more efficient than strong baselines without sacrificing performance and is applicable to a diverse range of domains.
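The efficiency argument can be made concrete with a toy sketch: instead of one backward pass per task to measure gradient magnitudes, keep cheap running statistics of each task's scalar loss and use the loss standard deviation as a proxy, down-weighting tasks whose gradients are estimated to be large. This is a simplified stand-in for SLAW, not the paper's exact estimator; the class name, EMA scheme, and normalisation are assumptions.

```python
import math

class SlawLikeWeighter:
    """Toy re-weighting in the spirit of SLAW: exponential moving
    averages of each task's loss and squared loss give a variance
    estimate, whose square root serves as a gradient-magnitude proxy.
    Tasks are weighted inversely so no single task dominates, with no
    extra backward passes."""

    def __init__(self, n_tasks, beta=0.9, eps=1e-8):
        self.beta = beta
        self.eps = eps
        self.m1 = [0.0] * n_tasks  # EMA of loss
        self.m2 = [0.0] * n_tasks  # EMA of squared loss

    def step(self, losses):
        weights = []
        for i, loss in enumerate(losses):
            self.m1[i] = self.beta * self.m1[i] + (1 - self.beta) * loss
            self.m2[i] = self.beta * self.m2[i] + (1 - self.beta) * loss ** 2
            var = max(self.m2[i] - self.m1[i] ** 2, 0.0)
            weights.append(1.0 / (math.sqrt(var) + self.eps))
        # Normalise so the weights sum to the number of tasks.
        total = sum(weights)
        return [w * len(losses) / total for w in weights]

# Task 0 has stable losses; task 1 is volatile and gets down-weighted.
weighter = SlawLikeWeighter(2)
weights = None
for l0, l1 in [(1.0, 10.0), (1.2, 2.0), (0.9, 9.0)]:
    weights = weighter.step([l0, l1])
```

Each training step would then minimise the weighted sum of task losses, so the cost of balancing stays constant in the number of tasks rather than requiring one backward pass per task.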