
Object-based Activity Recognition with Heterogeneous Sensors on Wrist

14 Jul 2018
TL;DR: This paper explains how to recognize activities of daily living (ADLs) using a sensor device attached to a user's wrist that contains a camera, a microphone, and an accelerometer, and suggests a method to protect the user's privacy, since the camera and the microphone can record parts of the user's private life.
Abstract: Recent developments in wearable technology have opened great opportunities for human performance evaluation applications in various domains. To measure the physical activities of an individual, wrist-worn sensors embedded in smart-watches, fitness bands, and clip-on devices can be used to collect various types of data while the subject performs regular daily activities. In this paper we explain how to recognize activities of daily living (ADLs) using a sensor device attached to a user's wrist. This device contains a camera, a microphone, and an accelerometer. In this experiment we collect data from the sensors in our device and analyze it in order to recognize the type of activity. In this way we are able to recognize ADLs that involve the manual use of objects, such as making a drink or cooking. Finally, we show that the camera plays the major role in this experiment and that, without it, our goal would be difficult to achieve. We also suggest a method to protect the privacy of the user, as the camera and the microphone can record parts of the user's private life.
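Although the page carries no code, the pipeline described above (per-modality features from camera, microphone, and accelerometer, fused and fed to the AdaBoost and decision-tree classifiers that one of the citing surveys below attributes to this paper) can be sketched briefly. The Python sketch below is purely illustrative: the feature choices, window sizes, and synthetic data are assumptions, not the authors' actual design.

```python
# Illustrative sketch of object-based ADL recognition from heterogeneous
# wrist sensors: hand-crafted features per modality are concatenated and
# fed to AdaBoost and a decision tree. All shapes/features are assumptions.
import numpy as np
from sklearn.ensemble import AdaBoostClassifier
from sklearn.tree import DecisionTreeClassifier
from sklearn.model_selection import train_test_split

def accel_features(window):
    """Simple statistics over a (n_samples, 3) accelerometer window."""
    return np.concatenate([window.mean(axis=0), window.std(axis=0)])

def image_features(frame):
    """Coarse color histogram over an (H, W, 3) uint8 camera frame."""
    hist, _ = np.histogramdd(frame.reshape(-1, 3), bins=(4, 4, 4),
                             range=((0, 256),) * 3)
    return hist.ravel() / hist.sum()

def audio_features(clip):
    """Crude log-energy spectrum of a 1-D microphone clip (stand-in for MFCC)."""
    return np.log1p(np.abs(np.fft.rfft(clip))[:64])

rng = np.random.default_rng(0)
# Synthetic stand-ins for segmented sensor windows and their ADL labels.
X = np.stack([np.concatenate([
        accel_features(rng.normal(size=(128, 3))),
        image_features(rng.integers(0, 256, size=(32, 32, 3), dtype=np.uint8)),
        audio_features(rng.normal(size=(1024,)))])
     for _ in range(200)])
y = rng.integers(0, 5, size=200)  # 5 hypothetical ADL classes

X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
for clf in (AdaBoostClassifier(), DecisionTreeClassifier()):
    clf.fit(X_tr, y_tr)
    print(type(clf).__name__, clf.score(X_te, y_te))
```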


Citations
Proceedings ArticleDOI
21 Apr 2018
TL;DR: Through a series of evaluations, it is demonstrated Wall++ can enable robust room-scale interactive and context-aware applications and can track users' touch and gestures, as well as estimate body pose if they are close.
Abstract: Human environments are typified by walls: homes, offices, schools, museums, hospitals, and pretty much every indoor context one can imagine has walls. In many cases, they make up a majority of readily accessible indoor surface area, and yet they are static; their primary function is to be a wall, separating spaces and hiding infrastructure. We present Wall++, a low-cost sensing approach that allows walls to become a smart infrastructure. Instead of merely separating spaces, walls can now enhance rooms with sensing and interactivity. Our wall treatment and sensing hardware can track users' touch and gestures, as well as estimate body pose if they are close. By capturing airborne electromagnetic noise, we can also detect what appliances are active and where they are located. Through a series of evaluations, we demonstrate that Wall++ can enable robust room-scale interactive and context-aware applications.
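As a loose illustration of the airborne-EM-noise idea mentioned in the abstract (and not Wall++'s actual pipeline), one could match the spectrum of a captured trace against per-appliance spectral signatures. The appliance frequencies, sample rate, and matching rule below are all made-up assumptions.

```python
# Toy appliance identification from an EM-noise trace: each appliance is
# assumed to emit at a characteristic frequency, and a trace is matched to
# the nearest spectral signature. Everything here is illustrative.
import numpy as np

rng = np.random.default_rng(4)
fs, n = 1000, 1024  # assumed sample rate (Hz) and trace length

def spectrum(trace):
    s = np.abs(np.fft.rfft(trace))
    return s / s.sum()

# Hypothetical signatures: each appliance emits at a distinct frequency.
signatures = {name: spectrum(np.sin(2 * np.pi * f * np.arange(n) / fs))
              for name, f in {"lamp": 120, "tv": 60, "drill": 300}.items()}

trace = np.sin(2 * np.pi * 60 * np.arange(n) / fs) + 0.5 * rng.normal(size=n)
obs = spectrum(trace)
print(min(signatures, key=lambda k: np.linalg.norm(obs - signatures[k])))
```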

90 citations


Cites methods from "Object-based Activity Recognition w..."

  • ...[43] utilized wrist-worn cameras to detect what object was currently being used....

    [...]

Proceedings ArticleDOI
01 Jul 2017
TL;DR: A model for reasoning on multimodal data to jointly predict activities and energy expenditures is proposed and heart rate signals are used as privileged self-supervision to derive energy expenditure in a training stage.
Abstract: Physiological signals such as heart rate can provide valuable information about an individual's state and activity. However, existing work in computer vision has not yet explored leveraging these signals to enhance egocentric video understanding. In this work, we propose a model for reasoning on multimodal data to jointly predict activities and energy expenditures. We use heart rate signals as privileged self-supervision to derive energy expenditure in a training stage. A multitask objective is used to jointly optimize the two tasks. Additionally, we introduce a dataset that contains 31 hours of egocentric video augmented with heart rate and acceleration signals. This study can lead to new applications such as a visual calorie counter.
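A minimal sketch of the multitask objective described above: a shared encoder with an activity-classification head and an energy-regression head trained jointly, where the energy target would be derived from heart rate during training. The layer sizes, loss weight, and random stand-in features are illustrative assumptions, not the paper's model.

```python
# Hedged sketch of a two-head multitask model: one head classifies the
# activity, the other regresses energy expenditure; both share an encoder
# and are optimized with a single joint loss.
import torch
import torch.nn as nn

class MultiTaskNet(nn.Module):
    def __init__(self, in_dim=512, n_activities=10):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(in_dim, 256), nn.ReLU())
        self.activity_head = nn.Linear(256, n_activities)  # classification
        self.energy_head = nn.Linear(256, 1)               # regression

    def forward(self, x):
        h = self.encoder(x)
        return self.activity_head(h), self.energy_head(h).squeeze(-1)

model = MultiTaskNet()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
feats = torch.randn(8, 512)              # stand-in for per-clip video features
act_labels = torch.randint(0, 10, (8,))  # activity classes
energy = torch.rand(8)                   # privileged target from heart rate

act_logits, energy_pred = model(feats)
loss = nn.functional.cross_entropy(act_logits, act_labels) \
     + 0.5 * nn.functional.mse_loss(energy_pred, energy)  # joint objective
loss.backward()
opt.step()
```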

61 citations


Cites background or methods from "Object-based Activity Recognition w..."

  • ...[44] use a wrist-mounted camera and sensors to detect 15 classes of ADL recognition....

    [...]

  • ...Activity recognition in visual data is a widely studied problem in computer vision [22, 63, 35, 2], and a number of works have investigated this task in the domain of egocentric video [48, 8, 5, 30] and in combination with other wearable sensors [59, 44]....

    [...]

Journal ArticleDOI
TL;DR: The general architecture of a HAR system is presented, along with a description of its main components; the paper also discusses the challenges and issues of online versus offline processing, and of deep learning versus traditional machine learning, for human activity recognition based on accelerometer sensors.
Abstract: Human activity recognition (HAR) is an important area of machine learning research, as it has many uses in different areas such as sports training, security, entertainment, ambient-assisted living, and health monitoring and management. The study of human activity recognition shows that researchers are interested mostly in the daily activities of humans. Therefore, the general architecture of a HAR system is presented in this paper, along with a description of its main components. The state of the art in human activity recognition based on accelerometers is surveyed. According to this survey, most recent studies used deep learning for HAR, but they focused on CNNs even though other deep learning models have achieved satisfactory accuracy. The paper presents a two-level taxonomy organized by machine learning approach (traditional or deep learning) and processing mode (online or offline). Forty-eight studies are compared in terms of recognition accuracy, classifier, activity types, and devices used. Finally, the paper discusses the challenges and issues of online versus offline processing, and of deep learning versus traditional machine learning, for human activity recognition based on accelerometer sensors.

47 citations


Cites background or methods from "Object-based Activity Recognition w..."

  • ...Studies included in our survey mainly used triaxial acceleration signal, while some of them used additional signals to improve recognition accuracy such as [29], [32-34]....

    [...]

  • ...[33] KIT(7),DLY(1),HOS(1),SELF(1) MEN, ENG, FETP , FF AdaBoost, DT Precision 58....

    [...]

  • ...[33] compared two traditional algorithms, AdaBoost and DT, for examining the efficiency of classifier when collecting the data from heterogeneous sensors....

    [...]

  • ...Self-care Activities Applying makeup, Brushing hair, Shaving, Toileting, Flushing the toilet, Getting dressed, Brushing teeth, Washing hands, Washing face, Washing clothes, Drying hair, Taking medication [29] [32-35][53][60]...

    [...]

  • ...Daily Activities Ironing, Eating, Drinking, Using phone, Watching TV, Using computer, Reading book/magazine, Listening music/radio, Taking part in conversations, Getup bed, Sleeping, Note-pc, Carrying a box, Getting up [29][32-35][46][53][56][60][64][77]...

    [...]

Journal ArticleDOI
TL;DR: This paper introduces the concept of activity recognition and its taxonomy, familiarises the reader with the sub-classes of sensor-based AR, and presents a hierarchical taxonomy of human behaviour analysis tasks.

28 citations


Cites methods from "Object-based Activity Recognition w..."

  • ...As with every machine-learning approach, this HMM approach to AR requires training data, which has to be “acquired in each user’s environment because these sensor data are environment dependent” [67]....

    [...]

  • ...In [67] hybrid discriminative/generative approach with hidden Markov model (HMM) is used for object-based AR....

    [...]
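The excerpts above describe an HMM whose hidden states correspond to activities and whose observations are environment-dependent sensor readings such as detected objects. A self-contained toy Viterbi decoder along those lines is sketched below; the activities, objects, and all probabilities are invented for illustration.

```python
# Toy HMM for object-based activity recognition: hidden states are
# activities, observations are detected objects; Viterbi recovers the most
# likely activity sequence. All probabilities are illustrative.
import numpy as np

activities = ["make_drink", "cook", "brush_teeth"]
objects = ["cup", "pan", "toothbrush"]

start = np.array([0.4, 0.4, 0.2])        # P(activity at t=0)
trans = np.array([[0.8, 0.1, 0.1],       # P(activity_t | activity_{t-1})
                  [0.1, 0.8, 0.1],
                  [0.1, 0.1, 0.8]])
emit = np.array([[0.7, 0.2, 0.1],        # P(object | activity)
                 [0.2, 0.7, 0.1],
                 [0.1, 0.1, 0.8]])

def viterbi(obs):
    """Most likely activity sequence for a list of object indices."""
    T, S = len(obs), len(start)
    logp = np.log(start) + np.log(emit[:, obs[0]])
    back = np.zeros((T, S), dtype=int)
    for t in range(1, T):
        scores = logp[:, None] + np.log(trans) + np.log(emit[:, obs[t]])[None, :]
        back[t] = scores.argmax(axis=0)
        logp = scores.max(axis=0)
    path = [int(logp.argmax())]
    for t in range(T - 1, 0, -1):
        path.append(back[t, path[-1]])
    return [activities[s] for s in reversed(path)]

print(viterbi([objects.index(o) for o in ["cup", "cup", "pan", "pan"]]))
```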

Proceedings ArticleDOI
01 Jan 2018
TL;DR: A hierarchical framework for zero-shot human-activity recognition is presented that recognizes unseen activities as combinations of preliminarily learned basic actions and involved objects, which brings a competitive advantage to industry in terms of service-deployment cost.
Abstract: We present a hierarchical framework for zero-shot human-activity recognition that recognizes unseen activities as combinations of preliminarily learned basic actions and involved objects. The presented framework consists of a gaze-guided object recognition module, a Myo-armband-based action recognition module, and an activity recognition module that combines the results from the action and object modules to detect complex activities. Both the object and action recognition modules are based on deep neural networks. Unlike conventional models, the proposed framework does not need retraining to recognize an unseen activity, provided the activity can be represented as a combination of the predefined basic actions and objects. This framework brings a competitive advantage to industry in terms of service-deployment cost. The experimental results showed that the proposed model could recognize three types of activities with a precision of 77% and a recall of 82%, which is comparable to a baseline method based on supervised learning.
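The compositional idea is simple enough to sketch: an unseen activity is scored from independently learned action and object probabilities, so adding a new activity only requires a new (action, object) definition rather than retraining. The module outputs below are made-up numbers, not the paper's.

```python
# Illustrative zero-shot composition: score each activity as the product of
# its action probability and its object probability, then pick the best.
action_probs = {"stir": 0.6, "pour": 0.3, "wipe": 0.1}  # from action module
object_probs = {"pan": 0.7, "cup": 0.2, "table": 0.1}   # from gaze/object module

# Activities defined purely as action-object combinations; adding a new one
# needs no retraining of the underlying modules.
activities = {
    "cooking": ("stir", "pan"),
    "making_drink": ("pour", "cup"),
    "cleaning": ("wipe", "table"),
}

scores = {name: action_probs[a] * object_probs[o]
          for name, (a, o) in activities.items()}
print(max(scores, key=scores.get))  # -> "cooking"
```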

19 citations


Cites background or methods from "Object-based Activity Recognition w..."

  • ...Maekawa et al. (Maekawa et al., 2010) used a wrist-mounted camera and sensors to detect activities in daily living (ADL)....

    [...]

  • ...(Maekawa et al., 2010) used a wrist-mounted camera and sensors to detect activities in daily living (ADL)....

    [...]

References
Journal ArticleDOI
TL;DR: A new approach toward target representation and localization, the central component in visual tracking of nonrigid objects, is proposed, which employs a metric derived from the Bhattacharyya coefficient as similarity measure, and uses the mean shift procedure to perform the optimization.
Abstract: A new approach toward target representation and localization, the central component in visual tracking of nonrigid objects, is proposed. The feature histogram-based target representations are regularized by spatial masking with an isotropic kernel. The masking induces spatially-smooth similarity functions suitable for gradient-based optimization, hence, the target localization problem can be formulated using the basin of attraction of the local maxima. We employ a metric derived from the Bhattacharyya coefficient as similarity measure, and use the mean shift procedure to perform the optimization. In the presented tracking examples, the new method successfully coped with camera motion, partial occlusions, clutter, and target scale variations. Integration with motion filters and data association techniques is also discussed. We describe only a few of the potential applications: exploitation of background information, Kalman tracking using motion models, and face tracking.
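The similarity measure at the heart of this tracker is the Bhattacharyya coefficient rho(p, q) = sum_u sqrt(p_u * q_u) between target and candidate histograms, and the mean shift update is driven by per-pixel weights sqrt(target_bin / candidate_bin). A small numpy sketch with random stand-in histograms follows.

```python
# Bhattacharyya coefficient between two normalized histograms, and the
# per-pixel weights that drive the mean shift localization step.
import numpy as np

def bhattacharyya(p, q):
    """rho(p, q) = sum_u sqrt(p_u * q_u); 1.0 for identical histograms."""
    return np.sum(np.sqrt(p * q))

def mean_shift_weights(target, candidate, pixel_bins):
    """Weight sqrt(target[b(x_i)] / candidate[b(x_i)]) per pixel's bin."""
    w = np.sqrt(np.where(candidate > 0, target / np.maximum(candidate, 1e-12), 0.0))
    return w[pixel_bins]

rng = np.random.default_rng(1)
p = rng.random(16); p /= p.sum()  # target model histogram
q = rng.random(16); q /= q.sum()  # candidate histogram at current location
print("rho =", bhattacharyya(p, q))
print(mean_shift_weights(p, q, pixel_bins=rng.integers(0, 16, size=5)))
```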

4,996 citations


"Object-based Activity Recognition w..." refers methods in this paper


  • ...Some studies also achieve fast object recognition [2, 3] by comparing histograms and object models prepared in advance....

    [...]

Book ChapterDOI
15 Apr 1996
TL;DR: The mathematical foundations of the technique are described and the results of experiments which compare robustness and recognition rates for different local neighborhood operators and histogram similarity measurements are presented.
Abstract: This paper presents a technique to determine the identity of objects in a scene using histograms of the responses of a vector of local linear neighborhood operators (receptive fields). This technique can be used to determine the most probable objects in a scene, independent of the object's position, image-plane orientation and scale. In this paper we describe the mathematical foundations of the technique and present the results of experiments which compare robustness and recognition rates for different local neighborhood operators and histogram similarity measurements.
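A rough sketch of the idea: summarize an image by a joint histogram of local linear operator responses and compare histograms with a similarity measure such as histogram intersection. The derivative operators, bin counts, and random images below are illustrative stand-ins for the paper's receptive fields.

```python
# Receptive-field-histogram sketch: joint histogram of local derivative
# responses, compared across images by histogram intersection.
import numpy as np

def rf_histogram(img, bins=8):
    """Joint histogram of (dx, dy) derivative responses over a 2-D image."""
    dy, dx = np.gradient(img.astype(float))
    sample = np.stack([dx.ravel(), dy.ravel()], axis=1)
    hist, _ = np.histogramdd(sample, bins=bins,
                             range=((-128, 128), (-128, 128)))
    return hist.ravel() / hist.sum()

def intersection(h1, h2):
    """Histogram intersection similarity: 1.0 means identical histograms."""
    return np.minimum(h1, h2).sum()

rng = np.random.default_rng(2)
model = rf_histogram(rng.integers(0, 256, size=(64, 64)))
scene = rf_histogram(rng.integers(0, 256, size=(64, 64)))
print("similarity =", intersection(model, scene))
```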

286 citations


"Object-based Activity Recognition w..." refers background in this paper

  • ...Many studies try to detect objects from images while taking occlusion, rotation, scale, and blur into account [5, 6]....

    [...]


Dissertation
01 Jan 2004
TL;DR: This thesis investigates techniques to recognise environmental non-speech sounds and their direction, with the purpose of using these techniques in an autonomous mobile surveillance robot, and presents advanced methods to improve the accuracy and efficiency of these techniques.
Abstract: Sound is one of a human being's most important senses. After vision, it is the sense most used to gather information about the environment. Despite this, comparatively little research has been done in the field of sound recognition, and the research that has been done centres mainly on the recognition of speech and music. Our auditory environment is made up of many sounds other than speech and music, and this sound information can be tapped for the benefit of specific applications such as security systems. Currently, most researchers are ignoring this sound information. This thesis investigates techniques to recognise environmental non-speech sounds and their direction, with the purpose of using these techniques in an autonomous mobile surveillance robot. It also presents advanced methods to improve the accuracy and efficiency of these techniques. Initially, this report presents an extensive literature survey, looking at the few existing techniques for non-speech environmental sound recognition. This survey also, by necessity, investigates existing techniques used for sound recognition in speech and music, and examines techniques used for direction detection of sounds. The techniques that have been identified are then comprehensively compared to determine the most appropriate techniques for non-speech sound recognition. A comprehensive comparison is performed using non-speech sounds, and several runs are performed to ensure accuracy. These techniques are then ranked based on their effectiveness. The best technique is found to be either Continuous Wavelet Transform feature extraction with Dynamic Time Warping or Mel-Frequency Cepstral Coefficients with Dynamic Time Warping; both of these techniques achieve a 70% recognition rate. Once the best of the existing classification techniques is identified, the problem of uncountable sounds in the environment can be addressed. Unlike speech recognition, non-speech sound recognition requires recognition from a much wider library of sounds. Due to this near-infinite set of example sounds, the characteristics and complexity of non-speech sound recognition techniques increase. To address this problem, a systematic scheme needs to be developed for non-speech sound classification. Several different approaches are examined, including a new design for an environmental sound taxonomy based on an environmental sound alphabet. This taxonomy works over three levels and classifies sounds based on their physical characteristics. Its performance is compared with a technique that generates a structured tree automatically. These structured techniques are compared on different data sets and the results are analysed. Comparable results are achieved for these techniques with the same data set as previously used. In addition, the results and further information from these experiments are used to infer some information about the structure of environmental sounds in general. Finally, conclusions are drawn on both sets of techniques, and areas of future research stemming from this thesis are explored.
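The best-performing recipe reported above (MFCC or CWT features compared with Dynamic Time Warping) is easy to sketch. In the toy example below, random sequences stand in for real MFCC matrices (which would come from an audio library such as librosa), and classification is nearest-template by DTW distance.

```python
# Nearest-template sound classification with dynamic time warping over
# feature sequences; random arrays stand in for real MFCC features.
import numpy as np

def dtw_distance(a, b):
    """DTW between two feature sequences of shape (time, n_features)."""
    n, m = len(a), len(b)
    D = np.full((n + 1, m + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = np.linalg.norm(a[i - 1] - b[j - 1])
            D[i, j] = cost + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
    return D[n, m]

rng = np.random.default_rng(3)
templates = {name: rng.normal(size=(40, 13))  # one 13-dim sequence per class
             for name in ["door", "phone", "dog"]}
query = templates["phone"] + 0.1 * rng.normal(size=(40, 13))  # noisy copy

# Classify by the template with the smallest DTW distance.
print(min(templates, key=lambda k: dtw_distance(query, templates[k])))
```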

51 citations


"Object-based Activity Recognition w..." refers background in this paper

  • ...In [4], the Mel-Frequency Cepstral Coefficient (MFCC) is reported to be the best transformation scheme for environmental sound recognition....

    [...]