Proceedings Article

AnnoTainted: Automating Physical Activity Ground Truth Collection Using Smartphones

TL;DR: This work provides motivation for a zero-effort crowdsensing task: auto-annotated ground truth collection for physical activity recognition, and proposes a Generic Classifier with ≥ 95% accuracy for this purpose.
Abstract: In this work, we provide motivation for a zero-effort crowdsensing task: auto-annotated ground truth collection for physical activity recognition. Data obtained through smartphones for classification of human activities is prone to discrepancies, which reiterates the need for better and larger activity datasets. Artificial data generation algorithms fail to efficiently generate quality instances for minority data. In the proposed model, crowd-sourced sensor data is classified by a robust classifier built from the ground up by researchers. We nominate a Generic Classifier with ≥ 95% accuracy for this purpose. We describe data collection and distribution models which ensure that the crowd client receives non-skewed, quality data from locations with a higher degree of activity occurrence. Also integrated within our proposed model are Location-Specific Classifiers, which developers can use to optimize location-specific tasks. Effective validation of classified activities using diverse sensor data streams improves the proposed classifier systems and boosts ground-truth accuracy.
Citations
Proceedings Article

[...]

27 Sep 2018
TL;DR: In this article, the authors examined the physical activity recognition model to assess the relationship between sensors (a single tri-axial accelerometer and a single tri-axial gyroscope) and fitness recognition (sitting, standing, walking, and running).
Abstract: This experiment examined the physical activity recognition model to assess the relationship between sensors (a single tri-axial accelerometer and a single tri-axial gyroscope) and fitness recognition (sitting, standing, walking, and running). We experimented with sixteen students (62.5% male and 37.5% female, aged eighteen to twenty-three years) of the Informatics school at Walailak University. The experimental setup evaluated model performance for baseline models, boosting ensemble models, and bagging ensemble models. When we measured the models' performance, we found the following results. First, among the baseline models, KNN (k-Nearest Neighbor) with k = 9 had the highest accuracy, at 95.47%. Second, among the boosting ensemble models, C5.0 had the highest accuracy, at 95.63%. Third, among the bagging ensemble models, RF (Random Forest) had the highest accuracy, at 95.69%. Fourth, among the stacking ensemble models, KNN had the highest accuracy, at 95.52%. We therefore concluded that RF has the highest performance, with an accuracy of 95.69%. In future work, we plan to obtain a more accurate model by adding features from another sensor, heart rate. Mining data collected from sensors provides valuable results in the physical activity recognition area. Improvement in performance is required, especially in the healthcare field. As the use of wearable devices increases, the opportunities for data mining research broaden.
Proceedings Article

[...]

08 Oct 2018
TL;DR: My research interests lie in developing systems for event detection and pattern recognition using smartphone sensor data, and leveraging micro-events for deep context mining including human-human interactions and behavior.
Abstract: I am a fifth-year Ph.D. student (enrolled January '14) working in Data Science (Department of Mathematics) at Shiv Nadar University, India. I have an undergraduate major in Computer Science, and my research interests lie in two primary areas: (i) developing systems for event detection and pattern recognition using smartphone sensor data, and (ii) leveraging micro-events for deep context mining including human-human interactions and behavior. Having been an active part of the ubiquitous computing research community during my undergraduate studies as well as during my Ph.D. (expected completion Summer '19), I aspire to join as a post-doctoral researcher in the same field, to continue building upon my skills, and to contribute both as an active researcher and as a mentor for budding undergraduate students.
References
Journal Article

[...]

TL;DR: In this article, a method of over-sampling the minority class involves creating synthetic minority class examples, which is evaluated using the area under the Receiver Operating Characteristic curve (AUC) and the ROC convex hull strategy.
Abstract: An approach to the construction of classifiers from imbalanced datasets is described. A dataset is imbalanced if the classification categories are not approximately equally represented. Often real-world data sets are predominately composed of "normal" examples with only a small percentage of "abnormal" or "interesting" examples. It is also the case that the cost of misclassifying an abnormal (interesting) example as a normal example is often much higher than the cost of the reverse error. Under-sampling of the majority (normal) class has been proposed as a good means of increasing the sensitivity of a classifier to the minority class. This paper shows that a combination of our method of over-sampling the minority (abnormal) class and under-sampling the majority (normal) class can achieve better classifier performance (in ROC space) than only under-sampling the majority class. This paper also shows that a combination of our method of over-sampling the minority class and under-sampling the majority class can achieve better classifier performance (in ROC space) than varying the loss ratios in Ripper or class priors in Naive Bayes. Our method of over-sampling the minority class involves creating synthetic minority class examples. Experiments are performed using C4.5, Ripper and a Naive Bayes classifier. The method is evaluated using the area under the Receiver Operating Characteristic curve (AUC) and the ROC convex hull strategy.

11,512 citations
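
The abstract above says only that synthetic minority class examples are created; it does not spell out how. As a rough illustration, the sketch below assumes the commonly described approach of interpolating between a minority sample and one of its nearest minority neighbours. The function name `smote_like`, the parameters `k` and `seed`, and the random choice of base sample are assumptions made for this example, not details taken from the paper.

```python
import random


def smote_like(minority, n_new, k=3, seed=0):
    """Return n_new synthetic points built from the minority samples.

    Simplified sketch: each synthetic point lies on the line segment
    between a randomly chosen minority sample and one of its k nearest
    minority neighbours (an assumption, see the lead-in).
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)

    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Nearest neighbours of the base point among the other minority samples.
        others = [p for j, p in enumerate(minority) if j != i]
        neighbours = sorted(others, key=lambda p: sq_dist(base, p))[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # interpolation factor in [0, 1)
        synthetic.append(
            tuple(b + gap * (n - b) for b, n in zip(base, neighbour))
        )
    return synthetic


minority = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
new_points = smote_like(minority, n_new=5)
```

Because every synthetic point is an interpolation between two existing minority samples, the new points stay inside the region already occupied by the minority class.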

Journal Article

[...]

TL;DR: In this article, a method of over-sampling the minority class involves creating synthetic minority class examples, which is evaluated using the area under the Receiver Operating Characteristic curve (AUC) and the ROC convex hull strategy.
Abstract: An approach to the construction of classifiers from imbalanced datasets is described. A dataset is imbalanced if the classification categories are not approximately equally represented. Often real-world data sets are predominately composed of "normal" examples with only a small percentage of "abnormal" or "interesting" examples. It is also the case that the cost of misclassifying an abnormal (interesting) example as a normal example is often much higher than the cost of the reverse error. Under-sampling of the majority (normal) class has been proposed as a good means of increasing the sensitivity of a classifier to the minority class. This paper shows that a combination of our method of over-sampling the minority (abnormal) class and under-sampling the majority (normal) class can achieve better classifier performance (in ROC space) than only under-sampling the majority class. This paper also shows that a combination of our method of over-sampling the minority class and under-sampling the majority class can achieve better classifier performance (in ROC space) than varying the loss ratios in Ripper or class priors in Naive Bayes. Our method of over-sampling the minority class involves creating synthetic minority class examples. Experiments are performed using C4.5, Ripper and a Naive Bayes classifier. The method is evaluated using the area under the Receiver Operating Characteristic curve (AUC) and the ROC convex hull strategy.

11,077 citations

Book Chapter

[...]

21 Apr 2004
TL;DR: This is the first work to investigate performance of recognition algorithms with multiple, wire-free accelerometers on 20 activities using datasets annotated by the subjects themselves, and suggests that multiple accelerometers aid in recognition.
Abstract: In this work, algorithms are developed and evaluated to detect physical activities from data acquired using five small biaxial accelerometers worn simultaneously on different parts of the body. Acceleration data was collected from 20 subjects without researcher supervision or observation. Subjects were asked to perform a sequence of everyday tasks but not told specifically where or how to do them. Mean, energy, frequency-domain entropy, and correlation of acceleration data was calculated and several classifiers using these features were tested. Decision tree classifiers showed the best performance recognizing everyday activities with an overall accuracy rate of 84%. The results show that although some activities are recognized well with subject-independent training data, others appear to require subject-specific training data. The results suggest that multiple accelerometers aid in recognition because conjunctions in acceleration feature values can effectively discriminate many activities. With just two biaxial accelerometers - thigh and wrist - the recognition performance dropped only slightly. This is the first work to investigate performance of recognition algorithms with multiple, wire-free accelerometers on 20 activities using datasets annotated by the subjects themselves.

3,075 citations
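
The four features named in the abstract above (mean, energy, frequency-domain entropy, correlation) can be illustrated with a short sketch. The exact definitions are not given in the abstract, so the ones below are assumptions: energy is taken as the sum of squared DFT magnitudes divided by the window length, entropy as the Shannon entropy of the normalized DFT magnitudes, both with the DC component excluded, and correlation as the Pearson correlation between two axes. The helper names are hypothetical.

```python
import cmath
import math


def dft_magnitudes(signal):
    """Magnitudes of the discrete Fourier transform (naive O(n^2) version)."""
    n = len(signal)
    return [
        abs(sum(x * cmath.exp(-2j * math.pi * k * t / n)
                for t, x in enumerate(signal)))
        for k in range(n)
    ]


def axis_features(signal):
    """Mean, energy and frequency-domain entropy of one acceleration axis."""
    n = len(signal)
    mags = dft_magnitudes(signal)[1:]  # assumption: DC component excluded
    energy = sum(m * m for m in mags) / n
    total = sum(mags)
    probs = [m / total for m in mags if m > 0] if total > 0 else []
    entropy = -sum(p * math.log2(p) for p in probs)
    return {"mean": sum(signal) / n, "energy": energy, "entropy": entropy}


def pearson(x, y):
    """Pearson correlation between two axes (0.0 if either axis is constant)."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy) if sx > 0 and sy > 0 else 0.0


# Toy window: a periodic signal on one axis and its mirror image on another.
x_axis = [0.0, 1.0, 0.0, -1.0] * 4
y_axis = [-v for v in x_axis]
features = axis_features(x_axis)
features["corr_xy"] = pearson(x_axis, y_axis)
```

In a full pipeline, one such feature vector would be computed per window and per sensor axis, then passed to a classifier such as a decision tree.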


"AnnoTainted: Automating Physical Ac..." refers background in this paper


Journal Article

[...]

TL;DR: This work describes and evaluates a system that uses phone-based accelerometers to perform activity recognition, a task which involves identifying the physical activity a user is performing, and has a wide range of applications, including automatic customization of the mobile device's behavior based upon a user's activity.
Abstract: Mobile devices are becoming increasingly sophisticated and the latest generation of smart cell phones now incorporates many diverse and powerful sensors. These sensors include GPS sensors, vision sensors (i.e., cameras), audio sensors (i.e., microphones), light sensors, temperature sensors, direction sensors (i.e., magnetic compasses), and acceleration sensors (i.e., accelerometers). The availability of these sensors in mass-marketed communication devices creates exciting new opportunities for data mining and data mining applications. In this paper we describe and evaluate a system that uses phone-based accelerometers to perform activity recognition, a task which involves identifying the physical activity a user is performing. To implement our system we collected labeled accelerometer data from twenty-nine users as they performed daily activities such as walking, jogging, climbing stairs, sitting, and standing, and then aggregated this time series data into examples that summarize the user activity over 10-second intervals. We then used the resulting training data to induce a predictive model for activity recognition. This work is significant because the activity recognition model permits us to gain useful knowledge about the habits of millions of users passively, just by having them carry cell phones in their pockets. Our work has a wide range of applications, including automatic customization of the mobile device's behavior based upon a user's activity (e.g., sending calls directly to voicemail if a user is jogging) and generating a daily/weekly activity profile to determine if a user (perhaps an obese child) is performing a healthy amount of exercise.

2,069 citations
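
The aggregation step described in the abstract above, turning a raw accelerometer time series into examples that each summarize a 10-second interval, might look roughly like the following. The sampling rate (20 Hz), the use of non-overlapping windows, and the choice of per-axis mean and standard deviation as the summary are all assumptions for illustration; the abstract does not specify them, and `segment` and `summarize` are hypothetical names.

```python
from statistics import mean, pstdev


def segment(samples, rate_hz, window_s=10):
    """Split a list of (x, y, z) samples into non-overlapping windows.

    A trailing partial window is dropped.
    """
    size = int(rate_hz * window_s)
    return [samples[i:i + size]
            for i in range(0, len(samples) - size + 1, size)]


def summarize(window, label=None):
    """Collapse one window into a single training example."""
    xs, ys, zs = zip(*window)
    return {
        "mean": (mean(xs), mean(ys), mean(zs)),
        "std": (pstdev(xs), pstdev(ys), pstdev(zs)),
        "label": label,
    }


# 450 samples at an assumed 20 Hz give two full 10-second windows.
samples = [(float(i), 0.0, 1.0) for i in range(450)]
windows = segment(samples, rate_hz=20)
examples = [summarize(w, label="walking") for w in windows]
```

Each resulting dictionary corresponds to one labeled example that a standard classifier could be trained on.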


"AnnoTainted: Automating Physical Ac..." refers background or methods in this paper


Journal Article

[...]

TL;DR: A general method for combining the classifiers generated on the binary problems is proposed, and a general empirical multiclass loss bound is proved given the empirical loss of the individual binary learning algorithms.
Abstract: We present a unifying framework for studying the solution of multiclass categorization problems by reducing them to multiple binary problems that are then solved using a margin-based binary learning algorithm. The proposed framework unifies some of the most popular approaches in which each class is compared against all others, or in which all pairs of classes are compared to each other, or in which output codes with error-correcting properties are used. We propose a general method for combining the classifiers generated on the binary problems, and we prove a general empirical multiclass loss bound given the empirical loss of the individual binary learning algorithms. The scheme and the corresponding bounds apply to many popular classification learning algorithms including support-vector machines, AdaBoost, regression, logistic regression and decision-tree algorithms. We also give a multiclass generalization error analysis for general output codes with AdaBoost as the binary learner. Experimental results with SVM and AdaBoost show that our scheme provides a viable alternative to the most commonly used multiclass algorithms.

1,943 citations
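
To make the reduction described in the abstract above concrete, the sketch below builds coding matrices for the one-against-all and all-pairs schemes and combines the binary classifiers' outputs by picking the class whose row is closest to them. The decoding shown is a simple Hamming-style distance in which a zero matrix entry (class not involved in that binary problem) contributes one half; this is one possible combination rule, chosen here as an assumption, and all function names are hypothetical.

```python
from itertools import combinations


def one_vs_all_matrix(n_classes):
    """Row r is class r; column c is the binary problem 'class c vs. the rest'."""
    return [[1 if r == c else -1 for c in range(n_classes)]
            for r in range(n_classes)]


def all_pairs_matrix(n_classes):
    """One column per pair (i, j): +1 for class i, -1 for class j, 0 otherwise."""
    pairs = list(combinations(range(n_classes), 2))
    return [[1 if r == i else -1 if r == j else 0 for i, j in pairs]
            for r in range(n_classes)]


def sign(value):
    return (value > 0) - (value < 0)


def hamming_decode(matrix, outputs):
    """Return the class whose code row disagrees least with the binary outputs."""
    def distance(row):
        return sum((1 - sign(m * f)) / 2 for m, f in zip(row, outputs))
    return min(range(len(matrix)), key=lambda r: distance(matrix[r]))


# Three classes, real-valued (margin) outputs of the binary classifiers.
ova = one_vs_all_matrix(3)
predicted_ova = hamming_decode(ova, [-0.5, 2.0, -1.0])

pairs = all_pairs_matrix(3)  # columns: (0 vs 1), (0 vs 2), (1 vs 2)
predicted_pairs = hamming_decode(pairs, [-1.0, -1.0, 1.0])
```

Both toy calls select class 1: in the first case only the "class 1 vs. the rest" classifier is positive, and in the second case class 1 wins both of the pairwise problems it takes part in.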


"AnnoTainted: Automating Physical Ac..." refers background in this paper
