
Showing papers by "Majid Mirmehdi published in 2018"


Journal ArticleDOI
TL;DR: This study presents a method for estimating calorific expenditure from combined visual and accelerometer sensing, by way of an RGB-Depth camera and a wearable inertial sensor, and concludes that the proposed approach is suitable for home monitoring in a controlled environment.
Abstract: Deriving a person's energy expenditure accurately forms the foundation for tracking physical activity levels across many health and lifestyle monitoring tasks. In this study, the authors present a method for estimating calorific expenditure from combined visual and accelerometer sensors by way of an RGB-Depth camera and a wearable inertial sensor. The proposed individual-independent framework fuses information from both modalities, which leads to improved estimates beyond the accuracy of single-modality and manual metabolic equivalents of task (MET) lookup table based methods. For evaluation, the authors introduce a new dataset called SPHERE_RGBD + Inertial_calorie, for which visual and inertial data are simultaneously obtained with indirect calorimetry ground truth measurements based on gas exchange. Experiments show that the fusion of visual and inertial data reduces the estimation error by 8% and 18% compared with the use of the visual sensor only and the inertial sensor only, respectively, and by 33% compared with a MET-based approach. The authors conclude from their results that the proposed approach is suitable for home monitoring in a controlled environment.
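
The abstract describes late fusion of visual and inertial features for calorie regression but gives no implementation. A minimal sketch of one plausible late-fusion design follows; all layer sizes, feature dimensions, and the class name CalorieFusionNet are illustrative assumptions, not the authors' network.

```python
# Hedged late-fusion sketch (not the paper's code): fuse pre-extracted
# visual (RGB-D) and inertial feature vectors to regress kcal/min.
import torch
import torch.nn as nn

class CalorieFusionNet(nn.Module):
    def __init__(self, visual_dim=256, inertial_dim=64):
        super().__init__()
        self.visual_branch = nn.Sequential(nn.Linear(visual_dim, 128), nn.ReLU())
        self.inertial_branch = nn.Sequential(nn.Linear(inertial_dim, 32), nn.ReLU())
        self.head = nn.Sequential(nn.Linear(128 + 32, 64), nn.ReLU(), nn.Linear(64, 1))

    def forward(self, visual_feat, inertial_feat):
        v = self.visual_branch(visual_feat)      # visual modality embedding
        a = self.inertial_branch(inertial_feat)  # inertial modality embedding
        return self.head(torch.cat([v, a], dim=1))  # fused calorie estimate

model = CalorieFusionNet()
v = torch.randn(8, 256)   # batch of visual window features (assumed dim)
a = torch.randn(8, 64)    # batch of accelerometer window features (assumed dim)
print(model(v, a).shape)  # torch.Size([8, 1])
```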

15 citations


Journal ArticleDOI
TL;DR: A depth-based whole-body photoplethysmography approach is introduced that reduces motion artifacts in depth-based volume–time data and greatly improves the accuracy of depth-based computed measures, establishing the potential for unconstrained remote respiratory monitoring and diagnosis.
Abstract: Objective: We propose a novel depth-based photoplethysmography (dPPG) approach to reduce motion artifacts in respiratory volume–time data and improve the accuracy of remote pulmonary function testing (PFT) measures. Method: Following spatial and temporal calibration of two opposing RGB-D sensors, a dynamic three-dimensional model of the subject performing PFT is reconstructed and used to decouple trunk movements from respiratory motions. Depth-based volume–time data is then retrieved, calibrated, and used to compute 11 clinical PFT measures for forced vital capacity and slow vital capacity spirometry tests. Results: A dataset of 35 subjects (298 sequences) was collected and used to evaluate the proposed dPPG method by comparing depth-based PFT measures to the measures provided by a spirometer. Other comparative experiments between the dPPG and the single Kinect approach, such as Bland–Altman analysis, similarity measures performance, intra-subject error analysis, and statistical analysis of tidal volume and main effort scaling factors, all show the superior accuracy of the dPPG approach. Conclusion: We introduce a depth-based whole-body photoplethysmography approach, which reduces motion artifacts in depth-based volume–time data and greatly improves the accuracy of depth-based computed measures. Significance: The proposed dPPG method remarkably drops the $L_2$ error mean and standard deviation of FEF$_{50\%}$, FEF$_{75\%}$, FEF$_{25-75\%}$, IC, and ERV measures by half, compared to the single Kinect approach. These significant improvements establish the potential for unconstrained remote respiratory monitoring and diagnosis.
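
The core intuition behind using two opposing sensors can be shown with a toy signal model (an assumption for illustration, not the paper's pipeline): a rigid trunk sway moves the body towards one sensor and away from the other, so it cancels when the two surface signals are combined, while breathing, which expands both surfaces, survives.

```python
# Toy cancellation demo: front/back depth signals share the breathing
# component but carry the trunk-sway artifact with opposite signs.
import numpy as np

t = np.linspace(0, 20, 600)                       # 20 s at 30 fps
breathing = 0.5 * np.sin(2 * np.pi * 0.25 * t)    # ~15 breaths/min
sway = 0.3 * np.sin(2 * np.pi * 0.8 * t)          # trunk motion artifact

front = breathing + sway                          # front sensor: motion adds
back = breathing - sway                           # back sensor: motion subtracts
recovered = 0.5 * (front + back)                  # rigid motion cancels

print(np.allclose(recovered, breathing))          # True
```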

13 citations


Posted Content
TL;DR: In this article, a joint classification-regression recurrent model is proposed that predicts completion from a given frame and then integrates frame-level contributions to detect the sequence-level completion moment of an action.
Abstract: We introduce completion moment detection for actions - the problem of locating the moment of completion, when the action's goal is confidently considered achieved. The paper proposes a joint classification-regression recurrent model that predicts completion from a given frame, and then integrates frame-level contributions to detect the sequence-level completion moment. We introduce a recurrent voting node that predicts the frame's position relative to the completion moment by either classification or regression. The method is also capable of detecting incompletion; for example, it can detect a missed ball-catch, as well as the moment at which the ball is safely caught. We test the method on 16 actions from three public datasets, covering sports as well as daily actions. Results show that when combining contributions from frames prior to the completion moment as well as frames post completion, the completion moment is detected within one second in 89% of all tested sequences.
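
The frame-voting idea can be sketched as follows; the layer sizes, the class name CompletionVoter, and the simple histogram vote are illustrative assumptions rather than the authors' exact recurrent voting node.

```python
# Hedged sketch: a recurrent model regresses, per frame, the relative
# offset to the completion moment; each frame votes at (index + offset),
# and the peak of the accumulated votes is the sequence-level prediction.
import torch
import torch.nn as nn

class CompletionVoter(nn.Module):
    def __init__(self, feat_dim=128, hidden=64):
        super().__init__()
        self.gru = nn.GRU(feat_dim, hidden, batch_first=True)
        self.offset_head = nn.Linear(hidden, 1)   # relative offset in frames

    def forward(self, frames):                    # frames: (1, T, feat_dim)
        h, _ = self.gru(frames)
        return self.offset_head(h).squeeze(-1)    # (1, T) per-frame offsets

T = 50
model = CompletionVoter()
offsets = model(torch.randn(1, T, 128))[0]

votes = torch.zeros(T)
for i, off in enumerate(offsets):                 # integrate frame-level votes
    target = int(round(i + off.item()))
    if 0 <= target < T:
        votes[target] += 1
print("predicted completion frame:", int(votes.argmax()))
```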

11 citations


Proceedings Article
03 Sep 2018
TL;DR: A joint classification-regression recurrent model predicts completion from a given frame and integrates frame-level contributions to detect the sequence-level completion moment, using a recurrent voting node that predicts the frame's position relative to the completion moment by either classification or regression.
Abstract: We introduce completion moment detection for actions - the problem of locating the moment of completion, when the action's goal is confidently considered achieved. The paper proposes a joint classification-regression recurrent model that predicts completion from a given frame, and then integrates frame-level contributions to detect the sequence-level completion moment. We introduce a recurrent voting node that predicts the frame's position relative to the completion moment by either classification or regression. The method is also capable of detecting incompletion; for example, it can detect a missed ball-catch, as well as the moment at which the ball is safely caught. We test the method on 16 actions from three public datasets, covering sports as well as daily actions. Results show that when combining contributions from frames prior to the completion moment as well as frames post completion, the completion moment is detected within one second in 89% of all tested sequences.

8 citations


Posted Content
TL;DR: A novel deep fusion architecture, CaloriNet, for the online estimation of energy expenditure for free living monitoring in private environments, where RGB data is discarded and replaced by silhouettes, outperforming alternative, standard and single-modal techniques.
Abstract: We propose a novel deep fusion architecture, CaloriNet, for the online estimation of energy expenditure for free living monitoring in private environments, where RGB data is discarded and replaced by silhouettes. Our fused convolutional neural network architecture is trainable end-to-end, to estimate calorie expenditure, using temporal foreground silhouettes alongside accelerometer data. The network is trained and cross-validated on a publicly available dataset, SPHERE_RGBD + Inertial_calorie. Results show state-of-the-art minimum error on the estimation of energy expenditure (calories per minute), outperforming alternative, standard and single-modal techniques.
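
The abstract's privacy-preserving input, temporal foreground silhouettes in place of RGB, can be illustrated with a small preprocessing sketch; the function name, window length, and tensor layout are assumptions for illustration, not CaloriNet's actual input specification.

```python
# Illustrative preprocessing: stack binary foreground masks over a sliding
# window to form a temporal silhouette tensor suitable as CNN input.
import numpy as np

def silhouette_stack(masks, window=16):
    """masks: (T, H, W) binary silhouettes -> (T-window+1, window, H, W)."""
    T = masks.shape[0]
    return np.stack([masks[i:i + window] for i in range(T - window + 1)])

masks = (np.random.rand(64, 120, 160) > 0.5).astype(np.float32)
batch = silhouette_stack(masks)
print(batch.shape)  # (49, 16, 120, 160): window of silhouettes as channels
```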

7 citations


Proceedings ArticleDOI
01 Oct 2018
TL;DR: A vision-based, trunk-motion-tolerant approach that remotely estimates lung volume–time data in forced vital capacity (FVC) and slow vital capacity (SVC) spirometry tests; by filtering complex trunk motions and calibrating with only the tidal volume scaling factor, it reduces the state-of-the-art average normalised L2 error.
Abstract: We present a vision-based trunk-motion tolerant approach which estimates lung volume–time data remotely in forced vital capacity (FVC) and slow vital capacity (SVC) spirometry tests. After temporal modelling of trunk shape, generated using two opposing Kinects in a sequence, the chest-surface respiratory pattern is computed by performing principal component analysis on temporal geometrical features extracted from the chest and posterior shapes. We evaluate our method on a publicly available dataset of 35 subjects (300 sequences) and compare against the state-of-the-art. By filtering complex trunk motions, our proposed method calibrates the entire volume–time data using only the tidal volume scaling factor, which reduces the state-of-the-art average normalised L2 error from 0.136 to 0.05.
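
A minimal sketch of the PCA step described in the abstract, under assumptions: synthetic chest-surface features replace the paper's temporal geometrical features, and the tidal-volume scaling factor is a hypothetical constant rather than a calibrated value.

```python
# Hedged sketch: recover a 1-D respiratory pattern as the first principal
# component of chest-surface features, then apply a single scaling factor.
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)
T = 600
breathing = np.sin(2 * np.pi * 0.25 * np.linspace(0, 20, T))
# Synthetic stand-in for temporal geometrical features of the chest surface.
features = np.outer(breathing, rng.normal(size=30)) + 0.05 * rng.normal(size=(T, 30))

pattern = PCA(n_components=1).fit_transform(features)[:, 0]  # respiratory pattern
scale = 0.5 / (pattern.max() - pattern.min())  # hypothetical tidal-volume factor
volume_time = scale * (pattern - pattern.min())
print(volume_time.shape)  # (600,) calibrated volume-time signal
```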

5 citations


Proceedings ArticleDOI
26 Apr 2018
TL;DR: This paper proposes a method for deriving calorific expenditure based on deep convolutional neural network features (within a healthcare scenario) and shows that the proposed approach gives high accuracy in activity recognition and low normalised root mean square error in calorific expenditure prediction.
Abstract: Accurately estimating a person's energy expenditure is an important tool in tracking physical activity levels for healthcare and sports monitoring tasks, amongst other applications. In this paper, we propose a method for deriving calorific expenditure based on deep convolutional neural network features (within a healthcare scenario). Our evaluation shows that the proposed approach gives high accuracy in activity recognition (82.3%) and low normalised root mean square error in calorific expenditure prediction (0.41). It is compared against the current state-of-the-art calorific expenditure estimation method, based on a classical approach, and exhibits an improvement of 7.8% in the calorific expenditure prediction task. The proposed method is suitable for home monitoring in a controlled environment.
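
The abstract does not specify the backbone; a minimal sketch, assuming a standard torchvision ResNet-18 as the deep feature extractor (an assumption, not the paper's network), shows the pattern of pooled CNN features feeding a calorie regressor.

```python
# Hedged sketch: per-frame CNN features -> linear calorie regressor.
import torch
import torch.nn as nn
from torchvision import models

backbone = models.resnet18(weights=None)  # assumed backbone, untrained here
backbone.fc = nn.Identity()               # expose 512-d pooled features
regressor = nn.Linear(512, 1)             # calories per minute

frames = torch.randn(4, 3, 224, 224)      # batch of video frames
with torch.no_grad():
    feats = backbone(frames)              # (4, 512) deep CNN features
print(regressor(feats).shape)             # torch.Size([4, 1])
```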

5 citations


Proceedings ArticleDOI
01 Jan 2018
TL;DR: It is shown that cupboard door sensors, while being cheap and simple, produce useful data about access to certain non-mechanical processes and items, and that adding them improves the model's activity recognition performance.
Abstract: Smart home systems are becoming increasingly relevant with every passing year, but while the technology is more available than ever, other issues such as cost and intrusiveness are becoming more apparent. To this end, we consider the types of sensors which are most useful for fine-grained activity recognition in the kitchen in terms of cost, intrusiveness, durability and ease of installation. We install sensors into a conventional residence for testing, and propose a system which meets the design challenges such an environment presents. We show that cupboard door sensors produce useful data about access to certain non-mechanical processes and items, while being cheap and simple. We also show that adding them improves our model's activity recognition performance, while providing information that we can make use of in future studies.
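
A toy encoding of binary door-sensor events, an assumption for illustration only, shows how such cheap sensors can yield features ready to be appended to other modalities for kitchen activity classification.

```python
# Hypothetical feature encoding: summarise cupboard door open/close events
# within a time window as (open count, normalised activity span).
import numpy as np

def door_features(events, window_s=60.0):
    """events: list of (timestamp_s, is_open) tuples within one window."""
    opens = sum(1 for _, is_open in events if is_open)
    times = [t for t, _ in events]
    span = (max(times) - min(times)) if len(times) > 1 else 0.0
    return np.array([opens, span / window_s])

feats = door_features([(2.0, True), (6.5, False), (30.1, True), (33.0, False)])
print(feats)  # [2.  0.51666667] -> open count, normalised activity span
```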

4 citations



Posted Content
TL;DR: This work proposes to augment limited training data via sampling from a deep convolutional generative adversarial network (DCGAN), whose discriminator is constrained by a semantic classifier to explicitly control the domain specificity of the generation process.
Abstract: We present a deep person re-identification approach that combines semantically selective, deep data augmentation with clustering-based network compression to generate high performance, light and fast inference networks. In particular, we propose to augment limited training data via sampling from a deep convolutional generative adversarial network (DCGAN), whose discriminator is constrained by a semantic classifier to explicitly control the domain specificity of the generation process. Thereby, we encode information in the classifier network which can be utilized to steer adversarial synthesis, and which fuels our CondenseNet ID-network training. We provide a quantitative and qualitative analysis of the approach and its variants on a number of datasets, obtaining results that outperform the state-of-the-art on the LIMA dataset for long-term monitoring in indoor living spaces.
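
The semantically constrained discriminator can be sketched as a DCGAN-style network with an auxiliary semantic head; the architecture, layer sizes, and combined loss below are illustrative assumptions in the spirit of the abstract, not the authors' exact design.

```python
# Hedged sketch: discriminator with an adversarial real/fake head plus a
# semantic classification head, so generation can be steered by class.
import torch
import torch.nn as nn

class SemanticDiscriminator(nn.Module):
    def __init__(self, feat_dim=64, n_classes=10):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, feat_dim, 4, 2, 1), nn.LeakyReLU(0.2),
            nn.Conv2d(feat_dim, feat_dim * 2, 4, 2, 1), nn.LeakyReLU(0.2),
            nn.AdaptiveAvgPool2d(1), nn.Flatten())
        self.real_fake = nn.Linear(feat_dim * 2, 1)          # adversarial head
        self.semantic = nn.Linear(feat_dim * 2, n_classes)   # semantic head

    def forward(self, x):
        h = self.features(x)
        return self.real_fake(h), self.semantic(h)

D = SemanticDiscriminator()
imgs = torch.randn(8, 3, 64, 64)
adv_logit, cls_logit = D(imgs)
loss = nn.BCEWithLogitsLoss()(adv_logit, torch.ones(8, 1)) \
     + nn.CrossEntropyLoss()(cls_logit, torch.randint(0, 10, (8,)))
print(loss.item() > 0)  # combined adversarial + semantic objective
```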

3 citations


Posted Content
TL;DR: A list of some data quality problems (both known to exist in the dataset(s) and potential ones), their workarounds, and other information important to people working with the SPHERE data, software, and hardware is included.
Abstract: The SPHERE project has developed a multi-modal sensor platform for health and behavior monitoring in residential environments. So far, the SPHERE platform has been deployed for data collection in approximately 50 homes for durations of up to one year. This technical document describes the format and the expected content of the SPHERE dataset(s) under preparation. It includes a list of some data quality problems (both known to exist in the dataset(s) and potential ones), their workarounds, and other information important to people working with the SPHERE data, software, and hardware. This document does not aim to be an exhaustive descriptor of the SPHERE dataset(s); it also does not aim to discuss or validate the potential scientific uses of the SPHERE data.

Posted Content
TL;DR: This work proposes a two-stage convolutional neural network architecture for robust recognition of hand gestures, called HGR-Net, where the first stage performs accurate semantic segmentation to determine hand regions, and the second stage identifies the gesture.
Abstract: We propose a two-stage convolutional neural network (CNN) architecture for robust recognition of hand gestures, called HGR-Net, where the first stage performs accurate semantic segmentation to determine hand regions, and the second stage identifies the gesture. The segmentation stage architecture is based on the combination of a fully convolutional residual network and atrous spatial pyramid pooling. Although the segmentation sub-network is trained without depth information, it is particularly robust against challenges such as illumination variations and complex backgrounds. The recognition stage deploys a two-stream CNN, which fuses the information from the red-green-blue and segmented images by combining their deep representations in a fully connected layer before classification. Extensive experiments on public datasets show that our architecture achieves performance almost as good as the state of the art in segmentation and recognition of static hand gestures, at a fraction of the training time, run time, and model size. Our method can operate at an average of 23 ms per frame.
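
The recognition stage's two-stream fusion can be sketched as below; the stream depths, feature widths, and class name TwoStreamGesture are illustrative assumptions rather than HGR-Net's published layers.

```python
# Hedged sketch: fuse deep representations of the RGB image and its
# predicted segmentation mask in a fully connected layer before classifying.
import torch
import torch.nn as nn

def small_cnn(in_ch):
    return nn.Sequential(
        nn.Conv2d(in_ch, 16, 3, 2, 1), nn.ReLU(),
        nn.Conv2d(16, 32, 3, 2, 1), nn.ReLU(),
        nn.AdaptiveAvgPool2d(1), nn.Flatten())

class TwoStreamGesture(nn.Module):
    def __init__(self, n_gestures=10):
        super().__init__()
        self.rgb_stream = small_cnn(3)    # stream over the RGB image
        self.mask_stream = small_cnn(1)   # stream over the segmentation mask
        self.classifier = nn.Sequential(nn.Linear(64, 64), nn.ReLU(),
                                        nn.Linear(64, n_gestures))

    def forward(self, rgb, mask):
        fused = torch.cat([self.rgb_stream(rgb), self.mask_stream(mask)], dim=1)
        return self.classifier(fused)     # gesture logits from fused features

net = TwoStreamGesture()
out = net(torch.randn(2, 3, 96, 96), torch.randn(2, 1, 96, 96))
print(out.shape)  # torch.Size([2, 10])
```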

Book ChapterDOI
08 Sep 2018
TL;DR: In this article, a deep person re-identification approach that combines semantically selective, deep data augmentation with clustering-based network compression to generate high performance, light and fast inference networks is presented.
Abstract: We present a deep person re-identification approach that combines semantically selective, deep data augmentation with clustering-based network compression to generate high performance, light and fast inference networks. In particular, we propose to augment limited training data via sampling from a deep convolutional generative adversarial network (DCGAN), whose discriminator is constrained by a semantic classifier to explicitly control the domain specificity of the generation process. Thereby, we encode information in the classifier network which can be utilized to steer adversarial synthesis, and which fuels our CondenseNet ID-network training. We provide a quantitative and qualitative analysis of the approach and its variants on a number of datasets, obtaining results that outperform the state-of-the-art on the LIMA dataset for long-term monitoring in indoor living spaces.