Jointly Learning Energy Expenditures and Activities Using Egocentric Multimodal Signals

doi:10.1109/CVPR.2017.721

Open Access, Proceedings Article

Katsuyuki Nakamura, +3 more

- pp 6817-6826

TLDR

A model for reasoning on multimodal data to jointly predict activities and energy expenditures is proposed and heart rate signals are used as privileged self-supervision to derive energy expenditure in a training stage.

Abstract:

Physiological signals such as heart rate can provide valuable information about an individual's state and activity. However, existing work in computer vision has not yet explored leveraging these signals to enhance egocentric video understanding. In this work, we propose a model for reasoning on multimodal data to jointly predict activities and energy expenditures. We use heart rate signals as privileged self-supervision to derive energy expenditure in a training stage. A multitask objective is used to jointly optimize the two tasks. Additionally, we introduce a dataset that contains 31 hours of egocentric video augmented with heart rate and acceleration signals. This study can lead to new applications such as a visual calorie counter.
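
The abstract does not spell out the form of the multitask objective. As a minimal sketch, assuming activity recognition is trained with softmax cross-entropy and energy expenditure with squared error, a joint loss for a single sample could be a weighted sum of the two terms. The function names and the `ee_weight` parameter are hypothetical and are not taken from the paper.

```python
import math


def softmax(logits):
    """Convert raw class scores into probabilities (numerically stable)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def multitask_loss(activity_logits, activity_label, ee_pred, ee_target, ee_weight=1.0):
    """Joint loss for one sample (illustrative assumption, not the paper's exact objective).

    activity_logits: raw scores, one per activity class
    activity_label:  index of the true activity class
    ee_pred:         predicted energy expenditure
    ee_target:       target energy expenditure (e.g. derived from heart rate at training time)
    ee_weight:       hypothetical weight balancing the regression term
    """
    probs = softmax(activity_logits)
    classification_loss = -math.log(probs[activity_label])  # cross-entropy
    regression_loss = (ee_pred - ee_target) ** 2             # squared error
    return classification_loss + ee_weight * regression_loss


if __name__ == "__main__":
    # Two equally likely classes and a perfect regression: loss is ln(2).
    print(multitask_loss([0.0, 0.0], 0, 1.0, 1.0))
```

A single scalar weight is the simplest way to trade off the two tasks; how the tasks are actually balanced in the paper is not stated on this page.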

Citations

Proceedings Article

Toward Driving Scene Understanding: A Dataset for Learning Driver Behavior and Causal Reasoning

Vasili Ramanishka, +3 more

TL;DR: The Honda Research Institute Driving Dataset (HDD) is a dataset of 104 hours of real human driving in the San Francisco Bay Area, collected using an instrumented vehicle equipped with different sensors.

Journal Article

A Review of Multimodal Human Activity Recognition with Special Emphasis on Classification, Applications, Challenges and Future Directions

Santosh Kumar Yadav, +5 more

- 08 Jul 2021 -

Knowledge Based Systems

TL;DR: This paper presents a comprehensive review of multimodal human activity recognition methods, covering the different types of sensors used along with their analytical approaches and fusion methods, and classifies and discusses existing work within seven rational aspects.

Proceedings Article

Convolutional Relational Machine for Group Activity Recognition

Sina Mokhtarzadeh Azar, +3 more

TL;DR: An end-to-end deep convolutional neural network called the Convolutional Relational Machine (CRM) is proposed for recognizing group activities; it utilizes the information in spatial relations between individual persons in an image or video.

Journal Article

Deep Learning in Human Activity Recognition with Wearable Sensors: A Review on Advances

- 14 Feb 2022 -

Sensors

TL;DR: This paper gives a comprehensive analysis of the current advancements, developing trends, and major challenges for wearable-based human activity recognition (HAR), and presents cutting-edge frontiers and future directions for deep learning-based HAR.

Posted Content

Convolutional Relational Machine for Group Activity Recognition

Sina Mokhtarzadeh Azar, +3 more

- 05 Apr 2019 -

arXiv: Computer Vision and Pattern Recog...

TL;DR: An end-to-end deep convolutional neural network called CRM is proposed for recognizing group activities; it utilizes the information in spatial relations between individual persons in an image or video to produce an intermediate spatial representation based on individual and group activities.

References

Proceedings Article

ImageNet: A large-scale hierarchical image database

Jia Deng, +5 more

TL;DR: A new database called “ImageNet” is introduced, a large-scale ontology of images built upon the backbone of the WordNet structure, much larger in scale and diversity and much more accurate than the current image datasets.

Proceedings Article

Going deeper with convolutions

Christian Szegedy, +8 more

TL;DR: Inception is a deep convolutional neural network architecture that achieved a new state of the art for classification and detection in the ImageNet Large-Scale Visual Recognition Challenge 2014 (ILSVRC14).

Proceedings Article

Learning Spatiotemporal Features with 3D Convolutional Networks

Du Tran, +5 more

TL;DR: The learned features, namely C3D (Convolutional 3D), with a simple linear classifier outperform state-of-the-art methods on 4 different benchmarks and are comparable with current best methods on the other 2 benchmarks.

Proceedings Article

Large-Scale Video Classification with Convolutional Neural Networks

Andrej Karpathy, +5 more

TL;DR: This work studies multiple approaches for extending the connectivity of a CNN in time domain to take advantage of local spatio-temporal information and suggests a multiresolution, foveated architecture as a promising way of speeding up the training.

Related Papers (5)

Detecting activities of daily living in first-person camera views

Long short-term memory

Deep Residual Learning for Image Recognition

ImageNet: A large-scale hierarchical image database

Large-Scale Video Classification with Convolutional Neural Networks