Preprocessing techniques for context recognition from accelerometer data

Abstract
The ubiquity of communication devices such as smartphones has led to the emergence of context-aware services that are able to respond to specific user activities or contexts. These services allow communication providers to develop new, added-value services for a wide range of applications such as social networking, elderly care and near-emergency early warning systems. At the core of these services is the ability to detect specific physical settings or the context a user is in, using either internal or external sensors. For example, using built-in accelerometers, it is possible to determine whether a user is walking or running at a specific time of day. By correlating this knowledge with GPS data, it is possible to provide specific information services to users with similar daily routines. This article presents a survey of the techniques for extracting this activity information from raw accelerometer data. The techniques that can be implemented in mobile devices range from classical signal processing techniques such as FFT to contemporary string-based methods. We present experimental results to compare and evaluate the accuracy of the various techniques using real data sets collected from daily activities.

This is an uncorrected proof of an article published in:
Personal and Ubiquitous Computing, vol.14, no.7, pp.645–662, 2010
Davide Figo · Pedro C. Diniz · Diogo R. Ferreira · João M. P. Cardoso
Keywords Activity detection · Context-aware applications · Mobile computing · Sensor data
Davide Figo, Pedro C. Diniz, Diogo R. Ferreira (*)
IST Technical University of Lisbon
Avenida Prof. Dr. Cavaco Silva, 2744-016 Porto Salvo, Portugal
(*) Corresponding author: diogo.ferreira@ist.utl.pt
João M. P. Cardoso
Faculty of Engineering, University of Porto (FEUP)
Rua Dr. Roberto Frias, 4200-465 Porto, Portugal
1 Introduction
Mobile communication devices such as the ubiquitous cellular phones and, more recently, smartphones have exploded in number and computing capabilities in recent years. Rather than supporting only voice communications, contemporary devices have sophisticated internal hardware architectures and possibly also an extended range of functions such as GPS location, e-mail, organizer and synchronization with external, often centralized services. Advanced units can even be equipped with a wide range of internal sensors, including three-dimensional accelerometers, as well as the ability to interface with external web-based sensor services such as traffic information.
Using sensor data, mobile devices can provide users with a wide range of added-value services. For example, by analyzing accelerometer data a device can understand that the user is performing some physical activity such as walking or running. This knowledge can be gathered over a period of time, say a week or even a month, to recognize trends or daily habits. Knowing that at a specific time of day a user might be jogging at a specific location, it is possible to send a message advertising a refreshment booth or a specific brand of running shoes.¹ Potential services are not limited to individual end-users. By correlating daily activity patterns, communication providers can also offer services to communities of users with similar weekly habits, thus promoting and enhancing the use of the underlying communication infrastructure.
The services are not restricted to leisure-oriented activities. Understanding the physical situation of a user can also be used for early-warning healthcare-related applications. Recognizing that an elderly person has fallen at home and has not moved in the last 30 seconds indicates a potential emergency situation for which a relative or a local emergency unit should be alerted. In civil protection scenarios, knowing the location and the state of readiness of the elements of an emergency response team could dramatically reduce dispatching time and thus the response lag.
A key enabler for the context-aware services that providers can now offer lies in the ability of mobile devices to acquire, manage, process, and obtain useful information from raw sensor data. From these data, devices must be able to accurately discover the characteristics or features of the signal coming from a given sensor. Sensors generate a high volume of raw data, possibly contaminated with environmental noise that needs to be filtered out. In addition, the device must generate a very low number of incorrectly recognized features to improve the accuracy of subsequent information processing stages where features are analyzed and organized into user context patterns.
The inclusion of sensors for context discovery in mobile devices is commonly organized as part of a software stack with a general architecture similar to the one depicted in Figure 1. At the lowest levels of the stack we have preprocessing phases where the device attempts to extract a set of basic features from the sensor signal. These features include specific short-term contexts or states such as the absence of light or quick movement, or more sophisticated contexts such as running. Based on these states, the next layer, the base-level classifier, determines higher-level user contexts such as jogging or exercising. Finally, a layer of application code and scripting uses the context history to infer daily or weekly activities or routines.
In contrast with other sensors, which provide an instant value that can be used directly for context inference, the signal coming from an accelerometer may require a fairly complex preprocessing stage in order to characterize the physical activity of the user within a certain time frame. Given the significance of this problem, a large number of techniques have been developed to address it. In this article we survey the most representative domain approaches and techniques, including spectral analysis techniques such as Fast Fourier Transforms (FFT), statistics-based metrics, and even string matching approaches. These approaches vary widely in their context-recognition accuracy and often require specific input representations.
Fig. 1 Layered architecture for context inference applications.
¹ For example, the Nike+ project (http://nikerunning.nike.com/nikeplus/) collects data captured by an accelerometer located on the user’s running shoes. The user can then upload the data to a personal computer and use an application that analyzes the running habits and physical effort to recommend training regimes.
In Section 2 we survey a wide range of techniques used to recognize user activities; these techniques are organized into several broad domain approaches. Then in Section 3 we present the results of applying these techniques to a set of experimental data in order to compare their benefits and computational cost. We conclude the article in Section 4.
2 Preprocessing Techniques: Domains and Approaches
The need to extract key signal features that enable advanced processing algorithms to discover useful context information has led to the development of a wide range of algorithmic approaches. These approaches rely on converting or transforming the input signals to and from different domains of representation. In each domain there are specific methods to abstract raw signal data and to provide, in addition to an early classification, some form of data compression that makes it possible in many cases to apply higher-level algorithms for context recognition.
As depicted in Figure 2, it is possible to classify the available sensor signal processing techniques into three broad domains, namely: the time domain, the frequency domain and what we call discrete representation domains. The following subsections describe the most representative techniques in each of these domains in order to compare their implementation complexity and accuracy in extracting signal features and identifying user activities.
2.1 Time Domain: Mathematical and Statistical Techniques
Simple mathematical and statistical metrics can be used to extract basic signal information from raw sensor data. In addition, these metrics are often used as preprocessing steps for metrics in other domains as a way to select key signal characteristics or features. These techniques are often used in practical activity recognition algorithms.

[Figure 2: tree diagram of preprocessing techniques. Time domain: mathematical/statistical functions (mean, median; variance, standard deviation; min, max, range; RMS; correlation, cross-correlation; integration) and other functions (differences; angular velocity; zero-crossings; SMA; SVM; DSVM). Frequency domain: wavelet transformations (coefficients sum) and Fourier transformations (coefficients sum; DC component; dominant frequency; energy; information entropy). Discrete domain: symbolic string representations (Euclidean-based distances; Levenshtein edit distance; dynamic time warping).]
Fig. 2 Classification of techniques applied to sensor signals for feature extraction.
2.1.1 Statistical Metrics: Mean, Variance and Standard Deviation
The mean over a window of data samples is a meaningful metric for almost every kind of sensor. This metric can be calculated at small computational cost [48] and computed on the fly with minimal memory requirements. The mean is usually applied in order to preprocess raw data by removing random spikes and noise (both mechanical and electrical) from sensor signals, smoothing the overall dataset.
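As a minimal sketch of the on-the-fly computation described above, the following Python fragment updates a mean incrementally and smooths a signal with a sliding-window mean. The names `RunningMean` and `moving_average` and the choice of window size are illustrative assumptions, not taken from the surveyed works.

```python
from collections import deque


class RunningMean:
    """Incremental mean: constant memory, one update per sample."""

    def __init__(self):
        self.count = 0
        self.mean = 0.0

    def update(self, sample):
        # mean_k = mean_{k-1} + (x_k - mean_{k-1}) / k
        self.count += 1
        self.mean += (sample - self.mean) / self.count
        return self.mean


def moving_average(samples, window):
    """Smooth a signal with the mean of the last `window` samples."""
    buf = deque(maxlen=window)
    total = 0.0
    smoothed = []
    for x in samples:
        if len(buf) == window:
            total -= buf[0]  # oldest sample is about to be evicted
        buf.append(x)
        total += x
        smoothed.append(total / len(buf))
    return smoothed
```

With a window of three samples, an isolated spike such as the 10 in `[1, 1, 10, 1, 1]` is spread over the neighbouring outputs instead of appearing at full amplitude, which is the smoothing effect mentioned in the text.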
There have been various early uses of the mean metric in activity recognition. Several researchers have used the mean to either directly or indirectly identify user posture (sitting, standing or lying) [11, 19, 22, 23] and also to discriminate the type of activity as either dynamic or static [60]. Others have used the mean as input to classifiers like Neural Networks [51, 59], Naive Bayes [27], Kohonen Self-Organizing Maps [29], Decision Trees [5], and even Fuzzy Inference [20]. Other applications of the mean value include, for example, axial calibration by finding the average value for all the different orientations [7] and the recognition of complex human gesture using Hidden Markov Models [8].
Another important statistical metric is the variance (σ²), defined as the average of the squared differences from the mean. The standard deviation (σ) is the square root of the variance and characterizes the variability of a data set or of a probability distribution. The standard deviation can give an indication of the stability of a signal. The measure is less useful if it is known that the signal can include spurious values, as even a single outlier can distort the result.
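The definitions above, and the sensitivity to a single spurious value, can be illustrated with a short Python sketch. The function names and the two sample windows are hypothetical; the variance follows the definition in the text (average of squared differences, i.e. the population form).

```python
import math


def variance(samples):
    """Population variance: average squared difference from the mean."""
    n = len(samples)
    mean = sum(samples) / n
    return sum((x - mean) ** 2 for x in samples) / n


def std_dev(samples):
    """Standard deviation: square root of the variance."""
    return math.sqrt(variance(samples))


# Hypothetical windows of samples, without and with one spurious value.
steady = [1.0, 1.0, 1.0, 1.0, 1.0]
spiked = [1.0, 1.0, 1.0, 1.0, 11.0]

print(std_dev(steady))  # perfectly stable signal
print(std_dev(spiked))  # one outlier dominates the result
```

A single outlier moves the standard deviation of the second window from 0 to 4, which is why a threshold on this metric can misfire on signals with occasional spikes.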
These two statistical metrics are often used as a signal feature in many activity recognition approaches where they have been used as an input to a classifier or to threshold-based algorithms [12, 15, 30].
In other approaches [22, 23] the variance and standard deviation were used to infer user movement. They have also served as the base metric for classifiers such as Naive Bayes [27], Dynamic Bayesian Networks [61] and Neural Networks [59], and were used in [41] to identify the mode of transport. The variance has also been used in [58] in combination with other metrics such as the mean and maximum.

2.1.2 Envelope Metrics: Median, Maximum, Minimum and Range
The median is the number that separates the higher half of data samples from the lower half. The value provided by the median is typically used to replace missing values in a sequence of discrete values (see e.g. [63]).
Despite their simplicity, these envelope metrics still offer some value in activity recognition (e.g. [2]), as an input to Neural Networks for identifying different inclination angles while walking, and for distinguishing between types of postures with threshold-based techniques [3].
The range (the difference between maximum and minimum sample values) was used in [11] together with other indicators to discriminate between walking and running. Application of the maximum and minimum values in accelerometer-based systems was explored to detect steps with the Twiddler keyboard [4], to detect gestures as mnemonical body shortcuts [13], and in activity recognition as an input to a Neural Network classifier [59].
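The envelope metrics of this subsection can be computed in a single pass over a window of samples. The sketch below uses Python's standard `statistics` module; the function names and the convention of marking missing samples with `None` are assumptions made for illustration.

```python
import statistics


def envelope_metrics(samples):
    """Return median, minimum, maximum and range of a window of samples."""
    lo, hi = min(samples), max(samples)
    return {
        "median": statistics.median(samples),
        "min": lo,
        "max": hi,
        "range": hi - lo,
    }


def fill_missing_with_median(samples):
    """Replace missing samples (None) with the median of the known ones."""
    known = [x for x in samples if x is not None]
    med = statistics.median(known)
    return [med if x is None else x for x in samples]
```

For example, `envelope_metrics([3, 1, 4, 1, 5])` reports a median of 3 and a range of 4, and `fill_missing_with_median([2, None, 4, 6])` fills the gap with 4.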
2.1.3 Root Mean Square (RMS) Metric
The root mean square (RMS) of a signal $x_i$ that represents a sequence of $n$ discrete values $\{x_1, x_2, \ldots, x_n\}$ is obtained using equation (1) and can be associated with meaningful context information.
$$x_{RMS} = \sqrt{\frac{x_1^2 + x_2^2 + \cdots + x_n^2}{n}} \qquad (1)$$
The RMS has been used to classify wavelet results by distinguishing walking patterns [53] and is present in works of activity recognition like [42] as an input to a classifier such as a Neural Network. It was also used as input to a multi-layer neural network in [8] for the recognition of a set of gestures and integration with video records.
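Equation (1) translates directly into code. The following one-function Python sketch is illustrative only; the name `rms` is an assumption, and the input is taken to be a non-empty window of samples from a single axis.

```python
import math


def rms(samples):
    """Root mean square of a sequence of n discrete values, as in equation (1)."""
    return math.sqrt(sum(x * x for x in samples) / len(samples))
```

Because every sample is squared, the RMS ignores the sign of the acceleration: an alternating window such as `[1, -1, 1, -1]` has a mean of 0 but an RMS of 1.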
2.1.4 Position and Velocity using Numeric Integration
The integration metric measures the signal area under the data curve and is commonly applied to accelerometer signals to obtain estimates of speed and distance [40].
Several approaches have explored this integration technique. In gesture recognition (e.g. [13]) researchers have used a double integration technique to compute the distance covered by a gesture. Others have used this technique to determine velocity values and thus identify gestures using a Nintendo Wiimote and a Neural Network classifier [63].
Using the integral of the RMS signal and a simple threshold technique, researchers have been able to distinguish between higher and lower states of activity intensity [15]. In other approaches Lee et al. [30] used integration to compute the angular velocity of data supplied by a gyroscope.
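A minimal sketch of single and double numeric integration follows, assuming uniformly sampled data and the trapezoidal rule (the surveyed works may use other integration schemes). The function name `integrate`, the sampling interval and the constant-acceleration input are hypothetical.

```python
def integrate(samples, dt, initial=0.0):
    """Cumulative trapezoidal integral of uniformly sampled data."""
    result = [initial]
    for prev, curr in zip(samples, samples[1:]):
        result.append(result[-1] + 0.5 * (prev + curr) * dt)
    return result


# Hypothetical example: constant 2 m/s^2 acceleration sampled at 10 Hz for 1 s.
dt = 0.1
accel = [2.0] * 11
velocity = integrate(accel, dt)     # single integration: speed estimate
distance = integrate(velocity, dt)  # double integration: distance estimate
```

For this idealized input the final values agree with the closed-form results v = a·t = 2 m/s and d = a·t²/2 = 1 m. With real accelerometer data, noise and bias are integrated as well, so the estimates drift over time.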
2.1.5 Signal Correlation and Correlation-Coefficient
Signal correlation is used to measure the strength and direction of a linear relationship between two signals. In activity recognition, correlation is especially useful in differentiating between activities that involve translation in a single dimension [43].
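As an illustration of measuring the strength and direction of a linear relationship between two axes, the sketch below computes the Pearson product-moment coefficient. Pearson's formula is an assumption here, chosen as one common definition; the function name `pearson` is likewise hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equally long signals.

    Raises ZeroDivisionError if either signal is constant.
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)
```

The coefficient lies between -1 and 1. Two axes that rise and fall together yield a value close to 1, while axes that move in opposite directions yield a value close to -1.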
In order to calculate the degree of correlation it is necessary to calculate the correlation coefficients between the signals for the various axes. These coefficients can be obtained by several statistical and geometrical formulas, depending on whether the situation involves
