Home
/
Authors
/
Mahdi Pourmirzaei

Author

Mahdi Pourmirzaei

Bio: Mahdi Pourmirzaei is an academic researcher. The author has contributed to research in topics: Supervised learning & Phosphorylation. The author has an hindex of 2, co-authored 5 publications receiving 8 citations.

Papers

PDF

Open Access

More filters

Posted Content•

Using Self-Supervised Co-Training to Improve Facial Representation.

[...]

Mahdi Pourmirzaei, Farzaneh Esmaili, Gholam Ali Montazer

13 May 2021

TL;DR: In this article, a Hybrid Learning (HL) framework was proposed for standard Supervised Learning (SL), which used Self-Supervised co-training with SL in Multi-Task Learning (MTL) manner.

...read moreread less

Abstract: In this paper, at first, the impact of ImageNet pre-training on Facial Expression Recognition (FER) was tested under different augmentation levels. It could be seen from the results that training from scratch could reach better performance compared to ImageNet fine-tuning at stronger augmentation levels. After that, a framework was proposed for standard Supervised Learning (SL), called Hybrid Learning (HL) which used Self-Supervised co-training with SL in Multi-Task Learning (MTL) manner. Leveraging Self-Supervised Learning (SSL) could gain additional information from input data like spatial information from faces which helped the main SL task. It is been investigated how this method could be used for FER problems with self-supervised pre-tasks such as Jigsaw puzzling and in-painting. The supervised head (SH) was helped by these two methods to lower the error rate under different augmentations and low data regime in the same training settings. The state-of-the-art was reached on AffectNet via two completely different HL methods, without utilizing additional datasets. Moreover, HL's effect was shown on two different facial-related problem, head poses estimation and gender recognition, which concluded to reduce in error rate by up to 9% and 1% respectively. Also, we saw that the HL methods prevented the model from reaching overfitting.

...read moreread less

6 citations

Posted Content•

Using Self-Supervised Auxiliary Tasks to Improve Fine-Grained Facial Representation

[...]

Mahdi Pourmirzaei, Gholam Ali Montazer, Farzaneh Esmaili¹•Institutions (1)

Tarbiat Modares University¹

13 May 2021-arXiv: Computer Vision and Pattern Recognition

TL;DR: In this paper, the impact of ImageNet pre-training on fine-grained facial expression recognition (FER) was tested and it was shown from the results that training from scratch is better than ImageNet fine-tuning at stronger augmentation levels.

...read moreread less

Abstract: Over the past few years, best SSL methods, gradually moved from the pre-text task learning to the Contrastive learning. But contrastive methods have some drawbacks which could not be solved completely, such as performing poor on fine-grained visual tasks compare to supervised learning methods. In this study, at first, the impact of ImageNet pre-training on fine-grained Facial Expression Recognition (FER) was tested. It could be seen from the results that training from scratch is better than ImageNet fine-tuning at stronger augmentation levels. After that, a framework was proposed for standard Supervised Learning (SL), called Hybrid Multi-Task Learning (HMTL) which merged Self-Supervised as auxiliary task to the SL training setting. Leveraging Self-Supervised Learning (SSL) can gain additional information from input data than labels which can help the main fine-grained SL task. It is been investigated how this method could be used for FER by designing two customized version of common pre-text techniques, Jigsaw puzzling and in-painting. The state-of-the-art was reached on AffectNet via two types of HMTL, without utilizing pre-training on additional datasets. Moreover, we showed the difference between SS pre-training and HMTL to demonstrate superiority of proposed method. Furthermore, the impact of proposed method was shown on two other fine-grained facial tasks, Head Poses estimation and Gender Recognition, which concluded to reduce in error rate by 11% and 1% respectively.

...read moreread less

3 citations

Posted Content•

How Self-Supervised Learning Can be Used for Fine-Grained Head Pose Estimation?

[...]

Mahdi Pourmirzaei, Gholam Ali Montazer, Farzaneh Esmaili

10 Aug 2021-arXiv: Computer Vision and Pattern Recognition

TL;DR: In this article, modified versions of jigsaw puzzling and rotation are used as SSL pre-text tasks and the best architecture for Hybrid Multi-Task Learning (HMTL) is found.

...read moreread less

Abstract: Recent progress of Self-Supervised Learning (SSL) demonstrates the capability of these methods in computer vision field. However, this progress could not show any promises for fine-grained tasks such as Head Pose estimation. In this article, we have tried to answer a question: How SSL can be used for Head Pose estimation? In general, there are two main approaches to use SSL: (1) Using pre-trained weights which can be done via SSL tasks, (2) Leveraging SSL as an auxiliary co-training task besides of Supervised Learning (SL) tasks at the same time. In this study, modified versions of jigsaw puzzling and rotation as SSL pre-text tasks are used and the best architecture for our proposed Hybrid Multi-Task Learning (HMTL) is found. Finally, the HopeNet method as a baseline SL is selected and the impact of SSL pre-training and ImageNet pre-training on both HMTL and SL are compared. The error rate reduced by the HMTL method up to 13% compare to the SL. Moreover, HMTL method showed that it was good with all kinds of initial weights: random, ImageNet and SSL pre-training weights. Also, it was observed, when puzzled images are used for SL alone, the average error rate placed between SL and HMTL, showed the importance of local spatial features compare to global spatial features.

...read moreread less

Posted Content•

Customizing an Affective Tutoring System Based on Facial Expression and Head Pose Estimation

[...]

Mahdi Pourmirzaei, Gholam Ali Montazer, Ebrahim Mousavi

21 Nov 2021-arXiv: Human-Computer Interaction

TL;DR: In this article, the authors designed, implemented, and evaluated a system to personalize the learning environment based on the facial emotions recognition, head pose estimation, and cognitive style of learners.

...read moreread less

Abstract: In recent years, the main problem in e-learning has shifted from analyzing content to personalization of learning environment by Intelligence Tutoring Systems (ITSs). Therefore, by designing personalized teaching models, learners are able to have a successful and satisfying experience in achieving their learning goals. Affective Tutoring Systems (ATSs) are some kinds of ITS that can recognize and respond to affective states of learner. In this study, we designed, implemented, and evaluated a system to personalize the learning environment based on the facial emotions recognition, head pose estimation, and cognitive style of learners. First, a unit called Intelligent Analyzer (AI) created which was responsible for recognizing facial expression and head angles of learners. Next, the ATS was built which mainly made of two units: ITS, IA. Results indicated that with the ATS, participants needed less efforts to pass the tests. In other words, we observed when the IA unit was activated, learners could pass the final tests in fewer attempts than those for whom the IA unit was deactivated. Additionally, they showed an improvement in terms of the mean passing score and academic satisfaction.

...read moreread less

Posted Content•

A Brief Review of Machine Learning Techniques for Protein Phosphorylation Sites Prediction.

[...]

Farzaneh Esmaili, Mahdi Pourmirzaei, Shahin Ramazi, Elham Yavari¹•Institutions (1)

Tarbiat Modares University¹

10 Aug 2021-arXiv: Quantitative Methods

TL;DR: In this paper, the authors comprehensively reviewed all reversible post-translational modification (PTM) data and showed that there are basically two main approaches for phosphorylation prediction by machine learning: end-to-end and conventional.

...read moreread less

Abstract: Reversible Post-Translational Modifications (PTMs) have vital roles in extending the functional diversity of proteins and effect meaningfully the regulation of protein functions in prokaryotic and eukaryotic organisms. PTMs have happened as crucial molecular regulatory mechanisms that are utilized to regulate diverse cellular processes. Nevertheless, among the most well-studied PTMs can say mainly types of proteins are containing phosphorylation and significant roles in many biological processes. Disorder in this modification can be caused by multiple diseases including neurological disorders and cancers. Therefore, it is necessary to predict the phosphorylation of target residues in an uncharacterized amino acid sequence. Most experimental techniques for predicting phosphorylation are time-consuming, costly, and error-prone. By the way, computational methods have replaced these techniques. These days, a vast amount of phosphorylation data is publicly accessible through many online databases. In this study, at first, all datasets of PTMs that include phosphorylation sites (p-sites) were comprehensively reviewed. Furthermore, we showed that there are basically two main approaches for phosphorylation prediction by machine learning: End-to-End and conventional. We gave an overview for both of them. Also, we introduced 15 important feature extraction techniques which mostly have been used for conventional machine learning methods

...read moreread less

Cited by

PDF

Open Access

More filters

Proceedings Article•DOI•

Facial expression and attributes recognition based on multi-task learning of lightweight neural networks

[...]

Andrey V. Savchenko¹•Institutions (1)

National Research University – Higher School of Economics¹

31 Mar 2021-arXiv: Computer Vision and Pattern Recognition

TL;DR: In this article, the multi-task learning of lightweight convolutional neural networks is studied for face identification and classification of facial attributes (age, gender, ethnicity) trained on cropped faces without margins.

...read moreread less

Abstract: In this paper, the multi-task learning of lightweight convolutional neural networks is studied for face identification and classification of facial attributes (age, gender, ethnicity) trained on cropped faces without margins The necessity to fine-tune these networks to predict facial expressions is highlighted Several models are presented based on MobileNet, EfficientNet and RexNet architectures It was experimentally demonstrated that they lead to near state-of-the-art results in age, gender and race recognition on the UTKFace dataset and emotion classification on the AffectNet dataset Moreover, it is shown that the usage of the trained models as feature extractors of facial regions in video frames leads to 45% higher accuracy than the previously known state-of-the-art single models for the AFEW and the VGAF datasets from the EmotiW challenges The models and source code are publicly available at this https URL

...read moreread less

105 citations

Proceedings Article•DOI•

Facial expression and attributes recognition based on multi-task learning of lightweight neural networks

[...]

Andrey V. Savchenko¹•Institutions (1)

National Research University – Higher School of Economics¹

16 Sep 2021

...read moreread less

Abstract: In this paper, the multi-task learning of lightweight convolutional neural networks is studied for face identification and classification of facial attributes (age, gender, ethnicity) trained on cropped faces without margins. The necessity to fine-tune these networks to predict facial expressions is highlighted. Several models are presented based on lightweight architectures, such as MobileNet, EfficientNet and RexNet. It was experimentally demonstrated that they lead to near state-of-the-art results in age, gender and race recognition on the UTKFace dataset and emotion classification on the AffectNet dataset. Moreover, it is shown that the usage of the trained models as feature extractors of facial regions in video frames leads to 4.5% higher accuracy than the previously known state-of-the-art single models for the AFEW and the VGAF datasets from the EmotiW challenges.

...read moreread less

61 citations

Proceedings Article•DOI•

Self-supervised Contrastive Learning of Multi-view Facial Expressions

[...]

Shuvendu Roy¹, Ali Etemad¹•Institutions (1)

Queen's University¹

15 Aug 2021-arXiv: Computer Vision and Pattern Recognition

TL;DR: This paper proposed Contrastive Learning of Multi-view facial Expressions (CL-MEx) to exploit facial images captured simultaneously from different angles towards FER, which achieved state-of-the-art performance on two multi-view FER datasets.

...read moreread less

Abstract: Facial expression recognition (FER) has emerged as an important component of human-computer interaction systems. Despite recent advancements in FER, performance often drops significantly for non-frontal facial images. We propose Contrastive Learning of Multi-view facial Expressions (CL-MEx) to exploit facial images captured simultaneously from different angles towards FER. CL-MEx is a two-step training framework. In the first step, an encoder network is pre-trained with the proposed self-supervised contrastive loss, where it learns to generate view-invariant embeddings for different views of a subject. The model is then fine-tuned with labeled data in a supervised setting. We demonstrate the performance of the proposed method on two multi-view FER datasets, KDEF and DDCF, where state-of-the-art performances are achieved. Further experiments show the robustness of our method in dealing with challenging angles and reduced amounts of labeled data.

...read moreread less

26 citations

Proceedings Article•DOI•

Self-supervised Contrastive Learning of Multi-view Facial Expressions

[...]

Shuvendu Roy¹, Ali Etemad¹•Institutions (1)

Queen's University¹

18 Oct 2021

...read moreread less

21 citations

Posted Content•

Comparing Facial Expression Recognition in Humans and Machines: Using CAM, GradCAM, and Extremal Perturbation

[...]

Serin Park¹, Christian Wallraven¹•Institutions (1)

Korea University¹

09 Oct 2021-arXiv: Computer Vision and Pattern Recognition

TL;DR: In this article, the authors compared the performance and attention patterns of humans and machines during a two-alternative forced-choice FER task, and found that humans outperformed machines quite significantly.

...read moreread less

Abstract: Facial expression recognition (FER) is a topic attracting significant research in both psychology and machine learning with a wide range of applications. Despite a wealth of research on human FER and considerable progress in computational FER made possible by deep neural networks (DNNs), comparatively less work has been done on comparing the degree to which DNNs may be comparable to human performance. In this work, we compared the recognition performance and attention patterns of humans and machines during a two-alternative forced-choice FER task. Human attention was here gathered through click data that progressively uncovered a face, whereas model attention was obtained using three different popular techniques from explainable AI: CAM, GradCAM and Extremal Perturbation. In both cases, performance was gathered as percent correct. For this task, we found that humans outperformed machines quite significantly. In terms of attention patterns, we found that Extremal Perturbation had the best overall fit with the human attention map during the task.

...read moreread less

1 citations