What is the first step to calculating the norm ni for each individual sample?

For most metrics the first step is to compute the norm $n_i$ for each individual sample $(x_i, y_i, z_i)$, i.e. $n_i = \sqrt{x_i^2 + y_i^2 + z_i^2}$, and then apply the chosen metric to the resulting norm signal.
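The norm computation can be sketched as follows. This is a minimal illustration, assuming the samples arrive as (x, y, z) tuples; the function name `sample_norms` and the example readings are hypothetical and do not come from the article.

```python
import math


def sample_norms(samples):
    """Return n_i = sqrt(x_i^2 + y_i^2 + z_i^2) for each (x, y, z) sample."""
    return [math.sqrt(x * x + y * y + z * z) for x, y, z in samples]


# Hypothetical three-axis readings (illustrative values only).
readings = [(0.0, 0.0, 1.0), (3.0, 4.0, 0.0), (1.0, 2.0, 2.0)]
norms = sample_norms(readings)
```

Any of the single-signal metrics discussed in the article can then be applied to `norms` instead of to the three axes separately.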

What is the frequency at which the signal peak occurs?

If the signal-to-noise ratio (SNR) is greater than a fixed threshold value then the frequency at which this peak occurs is identified as the step rate.
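A rough sketch of this thresholding idea is given below. The answer above does not define how the SNR is computed, so the sketch assumes it is the ratio of the peak spectral magnitude to the mean magnitude of the remaining bins; that definition, the naive DFT, and the names `dft_magnitudes` and `step_rate` are all assumptions made for illustration.

```python
import cmath
import math


def dft_magnitudes(samples):
    """Magnitudes of DFT bins 1..n//2 (the DC bin is skipped), naive O(n^2)."""
    n = len(samples)
    mags = []
    for k in range(1, n // 2 + 1):
        coeff = sum(v * cmath.exp(-2j * math.pi * k * i / n)
                    for i, v in enumerate(samples))
        mags.append(abs(coeff))
    return mags


def step_rate(samples, sample_rate_hz, snr_threshold):
    """Return the peak frequency in Hz if the assumed SNR exceeds the threshold."""
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two samples")
    mags = dft_magnitudes(samples)
    peak_index = max(range(len(mags)), key=mags.__getitem__)
    peak = mags[peak_index]
    if peak == 0.0:
        return None  # no spectral content besides DC
    others = [m for j, m in enumerate(mags) if j != peak_index]
    noise = sum(others) / len(others) if others else 0.0
    # Assumed SNR definition: peak magnitude over mean of the other bins.
    if noise > 0.0 and peak / noise <= snr_threshold:
        return None
    # mags[0] corresponds to bin 1, hence the +1.
    return (peak_index + 1) * sample_rate_hz / n
```

For a clean 2 Hz sinusoid sampled at 16 Hz over 32 samples, the peak falls in the bin at 2 Hz, which would be reported as the step rate.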

Why is the wavelet transform used in the detection of human activity?

Because the wavelet transform can capture sudden changes in signals like the ones measured by an accelerometer, it is often chosen by several activity recognition approaches.

What is the key metric for the recognition of specific activities?

Several authors have used the summation of a set of spectral coefficients as a key metric for the recognition of specific activities.

What is the common approach to compute the spectral coefficients?

Once the spectral coefficients are computed a simple approach is to add all the coefficients thus generating a single metric value.

What is the common metric used to determine the correlation coefficient between two signals?

$$\mathrm{CrossCorrelation}(x, y) = \max_{d=1,\ldots,n-1} \left( \frac{1}{n} \sum_{i=1}^{n} x_i \cdot y_{i-d} \right) \qquad (3)$$

The typical implementation of this metric computes the cross-correlation coefficients for the pairs of signals corresponding to the three axes in a pairwise fashion (i.e. (x,y), (x,z) and (y,z)).
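Equation (3) can be sketched in code as follows. The equation does not say how terms with an index $i - d$ outside the signal are treated; the sketch assumes they are simply skipped (equivalent to zero padding). The function names are hypothetical.

```python
def cross_correlation(x, y):
    """Maximum over lags d = 1..n-1 of (1/n) * sum_i x_i * y_(i-d).

    Assumption: terms whose index i - d falls before the start of y are skipped.
    """
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("signals must have equal length of at least 2")
    best = float("-inf")
    for d in range(1, n):
        total = sum(x[i] * y[i - d] for i in range(d, n))
        best = max(best, total / n)
    return best


def pairwise_cross_correlations(xs, ys, zs):
    """Apply the metric to the axis pairs (x,y), (x,z) and (y,z)."""
    return {
        "xy": cross_correlation(xs, ys),
        "xz": cross_correlation(xs, zs),
        "yz": cross_correlation(ys, zs),
    }
```

The double loop makes the cost quadratic in the number of samples, which is worth keeping in mind on a mobile device.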

What is the cost of a more complicated control-flow implementation of the sorting algorithm?

For long signals with a large number of input samples, asymptotically more efficient algorithms such as merge sort, with O(n log n) complexity, could reduce the number of such operations at the expense of a more complicated control-flow implementation of the sorting algorithm.

What is the way to determine the threshold parameters for a given metric?

After the threshold parameters thr1 and thr2 have been determined, training is complete and it is time to apply the chosen metric to the test set in order to evaluate its real accuracy.

What is the energy metric used to identify the mode of transport of a user?

The energy metric has been used [41] to identify the mode of transport of a user with a single accelerometer, namely walking, cycling, running and driving.

What are the costs of the string-domain metrics?

As can be observed, string-domain metrics exhibit much lower costs for expensive operations such as sqrt or even multiplications when compared to metrics in the time and frequency domains.

(Open Access) Preprocessing techniques for context recognition from accelerometer data (2010) | Davide Figo

This is an uncorrected proof of an article published in:

Personal and Ubiquitous Computing, vol.14, no.7, pp.645–662, 2010

Preprocessing Techniques for Context Recognition from Accelerometer Data

Davide Figo · Pedro C. Diniz · Diogo R. Ferreira · João M. P. Cardoso


Abstract The ubiquity of communication devices such as smartphones has led to the emergence of context-aware services that are able to respond to specific user activities or contexts. These services allow communication providers to develop new, added-value services for a wide range of applications such as social networking, elderly care, and near-emergency early warning systems. At the core of these services is the ability to detect specific physical settings or the context a user is in, using either internal or external sensors. For example, using built-in accelerometers it is possible to determine if a user is walking or running at a specific time of day. By correlating this knowledge with GPS data it is possible to provide specific information services to users with similar daily routines. This article presents a survey of the techniques for extracting this activity information from raw accelerometer data. The techniques that can be implemented in mobile devices range from classical signal processing techniques such as FFT to contemporary string-based methods. We present experimental results to compare and evaluate the accuracy of the various techniques using real data sets collected from daily activities.

Keywords Activity detection · Context-aware applications · Mobile computing · Sensor data

1 Introduction

Mobile communication devices such as the ubiquitous cellular phones, and more recently smartphones, have exploded in number and computing capabilities in recent years. Rather than supporting only voice communications, contemporary devices have sophisticated internal hardware architectures and possibly also an extended range of functions such as GPS location, e-mail, organizer and synchronization with external, often centralized services. Advanced units can even be equipped with a wide range of internal sensors including three-dimensional accelerometers as well as the ability to interface with external web-based sensor services such as traffic information.

Davide Figo, Pedro C. Diniz, Diogo R. Ferreira
IST – Technical University of Lisbon
Avenida Prof. Dr. Cavaco Silva, 2744-016 Porto Salvo, Portugal
Corresponding author: diogo.ferreira@ist.utl.pt

João M. P. Cardoso
Faculty of Engineering, University of Porto (FEUP)
Rua Dr. Roberto Frias, 4200-465 Porto, Portugal

Using sensor data, mobile devices can provide users with a wide range of added-value services. For example, by analyzing accelerometer data a device can understand that the user is performing some physical activity such as walking or running. This knowledge can be gathered over a period of time, say a week or even a month, to recognize trends or daily habits. Knowing that at a specific time of day a user might be jogging at a specific location, it is possible to send a message advertising a refreshment booth or advertising a specific brand of running shoes. Potential services are not limited to individual end-users. By correlating daily activity patterns, communication providers can also offer services to communities of users with similar weekly habits, thus promoting and enhancing the use of the underlying communication infrastructure.

The services are not restricted to leisure-oriented activities. Understanding the physical situation of a user can also be used for early-warning healthcare related applications. Recognizing that an elderly person has fallen at his/her home and has not moved in the last 30 seconds indicates a potential emergency situation for which a relative or a local emergency unit should be alerted. In civil protection scenarios, knowing the location and the state of readiness of the elements of an emergency response team could dramatically reduce dispatching time and thus the response lag.

A key enabler for these context-aware services that providers can now offer lies in the ability of mobile devices to acquire, manage, process, and obtain useful information from raw sensor data. From these data, devices must be able to accurately discover the characteristics or features of the signal coming from a given sensor. Sensors generate a high volume of raw data, possibly contaminated with environment noise that needs to be filtered out. In addition, the device must generate a very low number of incorrectly recognized features to improve the accuracy of subsequent information processing stages where features are analyzed and organized into user context patterns.

The inclusion of sensors for context discovery in mobile devices is commonly organized as part of a software stack with a general architecture similar to the one depicted in Figure 1. At the lowest levels of the stack we have preprocessing phases where the device attempts to extract a set of basic features from the sensor signal. These features include specific short-term contexts or states such as absence of light, quick movement or more sophisticated contexts such as running. It is based on these states that the next layer – the base-level classifier – will determine higher-level user contexts such as activities like jogging or exercising. Finally, a layer of application code and scripting will use the context history to infer daily or weekly activities or routines.

In contrast with other sensors, which provide an instant value that can be used directly for context inference, the signal coming from an accelerometer may require the use of a fairly complex preprocessing stage in order to characterize the physical activity of the user within a certain time frame. Given the significance of this problem, a large number of techniques have been developed to address it. In this article we survey the most representative domain approaches and techniques, including spectral analysis techniques such as Fast Fourier Transforms (FFT), statistics-based metrics, and even string matching approaches. These approaches vary widely in their context-recognition accuracy and often require specific input representations.

For example, the Nike+ project (http://nikerunning.nike.com/nikeplus/) collects data captured by an accelerometer located on the user's running shoes. The user can then upload the data to a personal computer and use an application that analyzes the running habits and physical effort to recommend training regimes.

Fig. 1 Layered architecture for context inference applications.

In Section 2 we survey a wide range of techniques used to recognize user activities; these techniques are organized into several broad domain approaches. Then in Section 3 we present the results of applying these techniques to a set of experimental data to compare their benefits and computational cost. We conclude the article in Section 4.

2 Preprocessing Techniques: Domains and Approaches

The need to extract key signal features that enable advanced processing algorithms to discover useful context information has led to the development of a wide range of algorithmic approaches. These approaches rely on converting or transforming the input signals to and from different domains of representation. In each domain there are specific methods to abstract raw signal data and to provide, in addition to an early classification, some form of data compression that makes it possible in many cases to apply higher-level algorithms for context recognition.

As depicted in Figure 2 it is possible to classify the available sensor signal processing techniques into three broad domains, namely: the time domain, the frequency domain and what we call discrete representation domains. The following subsections describe the most representative techniques in each of these domains in order to compare their implementation complexity and accuracy in extracting signal features and identifying user activities.

2.1 Time Domain: Mathematical and Statistical Techniques

Simple mathematical and statistical metrics can be used to extract basic signal information from raw sensor data. In addition, these metrics are often used as preprocessing steps for metrics in other domains as a way to select key signal characteristics or features. These techniques are often used in practical activity recognition algorithms.

[Figure 2: tree diagram of preprocessing techniques.
Time Domain: Mathematical/Statistical Functions (Mean, Median; Variance, Std Deviation; Min, Max, Range; RMS; Correlation, Cross-Correlation; Integration) and Other Functions (Differences; Angular Velocity; Zero-Crossings; SMA; SVM; DSVM).
Frequency Domain: Fourier Transformations (Coefficients Sum; DC component; Dominant Frequency; Energy; Info. Entropy) and Wavelet Transformations (Coefficients Sum).
Discrete Domain: Symbolic String Representations (Euclidean-based Distances; Levenshtein Edit Distance; Dynamic Time Warping).]

Fig. 2 Classification of techniques applied to sensor signals for feature extraction.

2.1.1 Statistical Metrics: Mean, Variance and Standard Deviation

The mean over a window of data samples is a meaningful metric for almost every kind of sensor. This metric can be calculated with small computational cost [48] and can be computed on the fly with minimal memory requirements. The mean is usually applied in order to preprocess raw data by removing random spikes and noise (both mechanical and electrical) from sensor signals, smoothing the overall dataset.
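As an illustration of computing the mean on the fly, the sketch below keeps only a window of recent samples and a running total, so each new sample costs a constant amount of work. The function name `moving_average` and the window size are illustrative assumptions, not the implementation evaluated in the article.

```python
from collections import deque


def moving_average(samples, window):
    """Smooth a signal with a sliding-window mean, updated incrementally."""
    if window < 1:
        raise ValueError("window must be at least 1")
    recent = deque()
    total = 0.0
    smoothed = []
    for value in samples:
        recent.append(value)
        total += value
        if len(recent) > window:
            total -= recent.popleft()  # drop the oldest sample
        smoothed.append(total / len(recent))
    return smoothed


# A single spike of 10 is spread out and attenuated by the window.
smoothed = moving_average([1, 1, 10, 1, 1], window=3)
```

A larger window suppresses spikes more strongly, at the price of blurring short-lived events.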

There have been various early uses of the mean metric in activity recognition. Several researchers have used the mean to either directly or indirectly identify user posture (sitting, standing or lying) [11, 19, 22, 23] and also to discriminate the type of activity as either dynamic or static [60]. Others have used the mean as input to classifiers like Neural Networks [51, 59], Naive Bayes [27], Kohonen Self-Organizing Maps [29], Decision Trees [5], and even Fuzzy Inference [20]. Other applications of the mean value include, for example, axial calibration by finding the average value for all the different orientations [7] and the recognition of complex human gestures using Hidden Markov Models [8].

Another important statistical metric is the variance ($\sigma^2$), defined as the average of the squared differences from the mean. The standard deviation ($\sigma$) is the square root of the variance and represents both the variability of a data set and a probability distribution. The standard deviation can give an indication of the stability of a signal. The measure is less useful if it is known that the signal can include spurious values, as even a single value can distort the result.
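The two definitions translate directly into code. The sketch below uses the population form (division by n), which matches "the average of the squared differences from the mean"; the function names are hypothetical.

```python
import math


def variance(samples):
    """Average of the squared differences from the mean (population form)."""
    n = len(samples)
    if n == 0:
        raise ValueError("need at least one sample")
    mean = sum(samples) / n
    return sum((v - mean) ** 2 for v in samples) / n


def std_deviation(samples):
    """Square root of the variance."""
    return math.sqrt(variance(samples))


steady = [2, 4, 4, 4, 5, 5, 7, 9]
with_spike = steady + [100]  # one spurious value
```

Comparing `std_deviation(steady)` with `std_deviation(with_spike)` shows how a single outlier can dominate the result.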

These two statistical metrics are often used as a signal feature in many activity recognition approaches where they have been used as an input to a classifier or to threshold-based algorithms [12, 15, 30].

In other approaches [22, 23] the variance and standard deviation were used to infer user movement or have been used as the base metric for classifiers like the Naive Bayes [27], Dynamic Bayesian Networks [61], Neural Networks [59] and by [41] to identify the mode of transport. The variance has also been used in [58] in combination with other metrics such as mean and maximum.

2.1.2 Envelope Metrics: Median, Maximum, Minimum and Range

The median is the number that separates the higher half of data samples from the lower half. The value provided by the median is typically used to replace missing values from a sequence of discrete values (see e.g. [63]).

Despite their simplicity these envelope metrics still offer some value in activity recognition (e.g. [2]) or as an input to Neural Networks for identifying different inclination angles while walking, as well as to distinguish between types of postures with threshold-based techniques [3].

The range (the difference between maximum and minimum sample values) was used in [11] together with other indicators to discriminate between walking and running. Application of the maximum and minimum values in accelerometer-based systems was explored to detect steps with the Twiddler keyboard [4], to detect gestures as mnemonical body shortcuts [13], and in activity recognition as an input to a Neural Network classifier [59].
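The envelope metrics of this subsection can be sketched in a few lines. The median needs the window to be sorted, while minimum, maximum and range need only a single pass. The function names are illustrative.

```python
def median(samples):
    """Value separating the higher half of the samples from the lower half."""
    ordered = sorted(samples)
    n = len(ordered)
    if n == 0:
        raise ValueError("need at least one sample")
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    # Even count: take the midpoint of the two central values.
    return (ordered[mid - 1] + ordered[mid]) / 2


def value_range(samples):
    """Difference between the maximum and minimum sample values."""
    return max(samples) - min(samples)
```

For example, `median([4, 1, 3, 2])` is 2.5 and `value_range([3, 1, 2])` is 2.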

2.1.3 Root Mean Square (RMS) Metric

The root mean square (RMS) of a signal $x_i$ that represents a sequence of n discrete values $\{x_1, x_2, \ldots, x_n\}$ is obtained using equation (1) and can be associated with meaningful context information.

$$\mathrm{RMS} = \sqrt{\frac{x_1^2 + x_2^2 + \ldots + x_n^2}{n}} \qquad (1)$$

The RMS has been used to classify wavelet results by distinguishing walking patterns [53] and is present in works of activity recognition like [42] as an input to a classifier such as a Neural Network. It was also used as input to a multi-layer neural network in [8] for the recognition of a set of gestures and integration with video records.
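A minimal sketch of the RMS computation, assuming the usual definition (the square root of the mean of the squared samples); the function name `rms` is hypothetical.

```python
import math


def rms(samples):
    """Square root of the mean of the squared sample values."""
    n = len(samples)
    if n == 0:
        raise ValueError("need at least one sample")
    return math.sqrt(sum(v * v for v in samples) / n)
```

Because the samples are squared, a signal oscillating between -1 and 1 has an RMS of 1 even though its mean is 0, which is why the RMS can serve as an indicator of movement intensity.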

2.1.4 Position and Velocity using Numeric Integration

The integration metric measures the signal area under the data curve and is commonly applied to accelerometer signals to obtain estimates of speed and distance [40].

Several approaches have explored this integration technique. In gesture recognition (e.g. [13]) researchers have used a double integration technique to compute the distance covered by a gesture. Others have used this technique to determine velocity values and thus identify gestures using a Nintendo Wiimote and a Neural Network classifier [63].

Using the integral of the RMS signal and a simple threshold technique, researchers have been able to distinguish between higher and lower states of activity intensity [15]. In other approaches Lee et al. [30] used integration to compute the angular velocity of data supplied by a gyroscope.
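The single and double integration described above can be sketched with the trapezoidal rule. The article does not specify the numerical scheme, so the trapezoidal rule, the uniform sampling interval `dt`, and the zero initial velocity and position are assumptions made for this illustration.

```python
def cumulative_trapezoid(values, dt, initial=0.0):
    """Running integral of a uniformly sampled signal (trapezoidal rule)."""
    integral = [initial]
    for previous, current in zip(values, values[1:]):
        integral.append(integral[-1] + 0.5 * (previous + current) * dt)
    return integral


# Constant acceleration of 1 unit, sampled once per second (illustrative).
acceleration = [1.0] * 5
velocity = cumulative_trapezoid(acceleration, dt=1.0)   # first integration
distance = cumulative_trapezoid(velocity, dt=1.0)       # second integration
```

In practice any constant offset in the acceleration signal grows linearly in the velocity and quadratically in the distance, so such estimates are usually trusted only over short intervals.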

2.1.5 Signal Correlation and Correlation-Coefficient

Signal correlation is used to measure the strength and direction of a linear relationship between two signals. In activity recognition, correlation is especially useful in differentiating between activities that involve translation in a single dimension [43].
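One common way to quantify the strength and direction of a linear relationship is the Pearson product-moment coefficient. The sketch below uses that formula as an assumption; the text here does not state which coefficient the cited works use, and the function name is hypothetical.

```python
import math


def correlation_coefficient(x, y):
    """Pearson coefficient in [-1, 1] between two equally long signals."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("signals must have equal length of at least 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    covariance = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if spread_x == 0.0 or spread_y == 0.0:
        raise ValueError("correlation is undefined for a constant signal")
    return covariance / (spread_x * spread_y)
```

A value near 1 means the two axes move together, a value near -1 means they move in opposition, and a value near 0 indicates no linear relationship.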

In order to calculate the degree of correlation it is necessary to calculate the correlation coefficients between the signals for the various axes. These coefficients can be obtained by several statistical and geometrical formulas, depending on whether the situation involves
