Posture Recognition Based Fall Detection System For Monitoring An Elderly Person In A Smart Home Environment
Miao Yu, Adel Rhuma, Syed Mohsen Naqvi, Liang Wang and Jonathon Chambers
Abstract: We propose a novel computer vision based fall detection system for monitoring an elderly person in a home care application. Background subtraction is applied to extract the foreground human body and the result is improved by using certain post-processing. Information from ellipse fitting and a projection histogram along the axes of the ellipse is used as the features for distinguishing different postures of the human. These features are then fed into a directed acyclic graph support vector machine (DAGSVM) for posture classification, the result of which is then combined with derived floor information to detect a fall. From a dataset of 15 people, we show that our fall detection system can achieve a high fall detection rate (97.08%) and a very low false detection rate (0.8%) in a simulated home environment.

Index Terms: Health care, assistive living, fall detection, multi-class classification, DAGSVM, system integration.
I. INTRODUCTION
In this section, we will briefly review the existing fall detection
systems and describe our new computer vision based fall detection
system.
A. Current fall detection techniques
Nowadays, the trend in western countries is for populations to contain an increasing number of elderly people. As shown in [1], the old-age dependency ratio (the number of people 65 and over relative to those between 15 and 64) in the European Union (EU) is projected to double to 54 percent by 2050, which means that the EU will move from having four persons of working age for every elderly citizen to only two. The topic of home care for elderly people is therefore receiving more and more attention. Among such care, one important issue is to detect whether an elderly person has fallen [2].
According to [2], falls are the leading cause of death due to injury
among the elderly population and 87% of all fractures in this group
are caused by falls. Although many falls do not result in injury, 47% of non-injured fallers cannot get up without assistance, and the period of time spent immobile also affects their health. An efficient fall detection system is essential for monitoring an elderly person and can even save their life in some cases. When an elderly person falls, a fall detection system will detect the anomalous behavior and an alarm signal will be sent to certain caregivers (such as hospitals or health centers) or to the elderly person's family members by a modern communication method. Fig. 1 shows such a fall detection system.
Different methods have been proposed for detecting falls and
are mainly divided into two categories: non-computer vision based
methods and computer vision based methods.
Miao Yu, Adel Rhuma, Syed Mohsen Naqvi and Jonathon Chambers are
with the Advanced Signal Processing Group, School of Electronic, Electrical
and Systems Engineering, Loughborough University, UK, e-mails: (m.yu,
a.rhuma, s.m.r.naqvi, j.a.chambers)@lboro.ac.uk.
Liang Wang is with the National Laboratory of Pattern Recognition
(NLPR), Institute of Automation, Chinese Academy of Sciences, Beijing,
China, e-mail: wangliang@nlpr.ia.ac.cn.
1) Non-computer vision based methods: There are many non-
computer vision based methods for fall detection [3], [4], [5] and [6].
In these methods, different sensors (including acceleration sensors, acoustic sensors and floor vibration sensors) are used to capture sound, vibration and human body movement information, which is then applied to determine whether a fall has occurred.
Veltink et al. [3] were the first to utilize a single axis acceleration
sensor to distinguish dynamic and static activities. In their work,
acceleration sensors were placed over the chest and at the feet to
observe the changes, and a threshold based algorithm was applied on
the measured signals for fall detection. Kangas et al. [4] proposed an improved scheme in which a single three-axis acceleration sensor was attached to the subject's body in different positions, and the dynamic and static acceleration components measured from the sensor were compared with appropriate thresholds to determine a fall. Experimental results confirmed that a simple threshold based algorithm was appropriate for detecting certain falls. Some researchers have
also used acoustic sensors for fall detection. In [5], an acoustic
fall detection system (FADE) that would automatically signal a fall
to the monitoring caregiver was designed. A circular microphone
array was applied to capture and enhance sounds in a room for
the classification of ‘fall’ or ‘non-fall’, and the height information
of the sound source was used to reduce the false alarm rate. The
authors evaluated the performance of FADE using simulated fall and
nonfall sounds performed by three stunt actors trained to behave like
elderly people under different environmental conditions, and good performance was obtained (a 100% fall detection rate and a 3% false detection rate on a dataset consisting of 120 falls and 120 non-falls).
Zigel et al. [6] proposed a fall detection system based on floor
vibration and sound sensing. Temporal and spectral features were
extracted from signals and a Bayes’ classifier was applied to classify
fall and nonfall activities. In their work, a doll which mimicked a
human was used to simulate falls and their system detected such
falls with a fall detection rate of 97.5% and a false detection rate of
1.4%.
Although non-computer vision based methods may appear to
be suitable for wide application in the fall detection field, several
problems do exist; they are either inconvenient (elderly people have
to wear acceleration sensors) or easily affected by noise in the
environment (acoustic sensors and floor vibration sensors). In order
to overcome these problems, computer vision based fall detection
techniques are adopted. Infringement of personal privacy is a concerning issue for computer vision based fall detection systems, and elderly people may worry that they are being ‘watched’ by cameras.
However, in most computer vision based fall detection systems,
only the alarm signal (sometimes with a short video clip as further
confirmation of whether an elderly person has fallen or not) will be
sent to the caregivers or family members when a fall is detected;
additionally, the original video recordings of an elderly person’s
normal activities will not be stored, nor transmitted.
2) Computer vision based methods: In the last ten years, there have been many advances in computer vision and in camera, video and image processing techniques exploiting the real-time movement of the subject, which has opened up a new branch of methods for fall detection.

Fig. 1. Schematic representation of a fall detection system.

For computer vision based fall detection methods, some researchers have extracted information from the captured video and
a simple threshold method has been applied to determine whether
there is a fall or not; representative ones due to Rougier et al. are [7]
and [8]. In these two papers, the head’s velocity information and the
shape change information were extracted and appropriate thresholds
were set manually to differentiate fall and non-fall activities. However, these two methods produce high false detection rates (such as when a fast sitting activity was misclassified as a fall activity [7]) and the performance is strongly dependent on the set threshold. Another threshold based method was proposed in [9], in which calibrated cameras were used to reconstruct the three-dimensional shape of people.
Fall events were detected by analyzing the volume distribution along
the vertical axis, and an alarm was triggered when the major part
of this distribution was abnormally near the floor over a predefined
period of time. The experimental results showed good performance
of this system (achieving 99.7% fall detection rate or better with four
cameras or more) and a graphic processing unit (GPU) was applied
for efficient computation.
With the recent rapid development of pattern recognition techniques, many researchers have exploited such methods in fall detection. Posture recognition based fall detection methods are proposed in
[10], [11], [12] and [13]; in [10], the researchers used a neural fuzzy
network for posture classification, and when the detected posture
changed from ‘stand’ to ‘lie’ in a short time, a fall activity was
detected. A similar idea was proposed in [11] except that the classifier
was replaced with a more common k-nearest neighbour classifier;
moreover, statistical hypothesis testing was applied to obtain the
critical time difference to differentiate a fall incident event from a
lying down event, and a correct detection rate of 84.44% was obtained
according to their experimental results. In [12] and [13], Mihailidis
et al. used a single camera to classify fall and non-fall activities.
Carefully engineered features, such as silhouette features, lighting
features and flow features were extracted to achieve robustness in the
system to lighting, environment and the presence of multiple moving
objects. In [13] three pattern recognition methods were compared
(logistic regression, neural network and support vector machine) and
the neural network achieved the best performance with a fall detection
rate of 92% and a false detection rate of 5%.
Some other researchers classified fall and non-fall activities based
on the features extracted from short video clips. The representative
papers are [14] and [15]. For [14], a bounding box and motion
information were extracted from consecutive silhouettes as features.
These were then used to train a hidden Markov model (HMM) for
classifying fall and non-fall activities. In [15], a person’s three-
dimensional orientation information was extracted from multiple
uncalibrated cameras, and an improved version of the HMM, the layered hidden Markov model (LHMM), was used for fall detection. Although theoretically elegant, insufficient experimental results were provided in this paper (it only considered two kinds of activities: walking and falling).
There are also some other computer vision based methods for
fall detection. Nait-Charif and McKenna [16] proposed a method for
automatically extracting motion trajectory and providing a human
readable summary of activity and detection of unusual inactivity
in a smart home. A fall was detected as a deviation from usual
activity according to the particle filter-based tracking results. This
method exploited an unsupervised approach to detect abnormal events (mainly falls) and, as is common with unsupervised methods, has the disadvantage that a long training period is required. In [17], D.
Anderson proposed a fuzzy logic based linguistic summarization of
video for fall detection. A hierarchy of fuzzy logic was used, where
the output from each level was summarized and fed into the next
level for inference. Corresponding fuzzy rules were designed under
the supervision of nurses to ensure that they reflect the manner in
which elders perform their activities. This system was tested on a
dataset which contained 14 fall activities and 32 non-fall activities;
all the fall activities were correctly detected and only two non-fall
activities were mistaken as fall activities, which shows an acceptable
level of performance.
In this paper, we propose a new computer vision based fall detection
system which is based on posture recognition using a single camera
to monitor an elderly person who lives alone at home. An efficient
codebook background subtraction algorithm is applied to extract
the human body foreground and some post-processing is applied
to improve the results. From the extracted foreground silhouette,
we extract features from the fitted ellipse and projection histogram,
which are used for classification purposes. These features are fed into
the DAGSVM (which is trained from a dataset containing features
extracted from different postures in different orientations) and the
extracted foreground silhouette is classified as one of four different
postures (bend, lie, sit and stand). The classification results, together
with the detected floor information, are then used to determine fall
or non-fall activities. The flow chart of the proposed fall detection
system is shown in Fig. 2. In the next sections, we will describe
different blocks of this flow chart in detail.
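Before detailing the blocks, the per-frame logic of Fig. 2 can be summarized in code. This is an illustrative sketch only: each block of the flow chart is injected as a callable, and every name below is a placeholder of ours, not the authors' implementation.

```python
def process_frame(frame, prev_frame, subtract, post_process, extract, classify, in_floor):
    """One pass of the fall detection pipeline of Fig. 2 (illustrative sketch).
    Each stage of the flow chart is passed in as a callable."""
    fg_mask = subtract(frame)                            # Section II-A1: codebook subtraction
    fg_mask = post_process(fg_mask, frame, prev_frame)   # Section II-A2: blob post-processing
    posture = classify(extract(fg_mask))                 # Section II-B: features -> DAGSVM posture
    # A 'lie' posture whose blob lies in the detected floor region signals a fall.
    return posture == "lie" and in_floor(fg_mask)
```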
II. METHODS
A. Human body extraction
1) Background subtraction: In visual surveillance, a common
approach for discriminating moving objects from the background
is detection by background subtraction. Currently, there are many background subtraction algorithms; these include the single-mode model background subtraction methods [18], [19], the mixture of Gaussians (MoG) background subtraction method [20], the non-parametric density estimation based method [21] and the codebook background subtraction method [22]. In this fall detection system,
we use the codebook method because of its advantages. There is
no parametric assumption on the codebook model and it shows the
following merits as proposed in [22]: (1) resistance to artifacts of
acquisition, digitization and compression, (2) capability of coping
with illumination changes, (3) adaptive and compressed background models that can capture structural background motion over a long period of time under limited memory, and (4) unconstrained training that allows moving foreground objects in the scene during the initial training period.

Fig. 2. The flow chart of the proposed fall detection system.
The codebook method is available for both colour and gray-scale images; it is a pixel-based approach in which a codebook is initially constructed for each pixel during a training phase. Assuming the training dataset $I$ contains $N$ images, $I = \{imag_1, \ldots, imag_N\}$, then a single pixel $(x, y)$ has $N$ training samples $imag_1(x, y), \ldots, imag_N(x, y)$. From these $N$ training samples, a codebook is constructed for this pixel, which includes a certain number of codewords. Each codeword, denoted by $c$, consists of an RGB vector $v = (R, G, B)$ and a 6-tuple $aux = (\hat{I}, \check{I}, f, \lambda, p, q)$. The meanings of the six parameters in $aux$ are described as follows:
$\hat{I}$: the maximum intensity that has been represented by the codeword.
$\check{I}$: the minimum intensity that has been represented by the codeword.
$f$: the number of times that the codeword has been used.
$\lambda$: the maximum negative run-length (MNRL) in number of frames.
$p$: the first frame in which this codeword was used.
$q$: the last frame in which this codeword was used.
The details of the training procedure are given in [22], and the trained codebooks of the pixels are then used for background subtraction. For an incoming colour frame $f$, each pixel $f(x, y) = (R(x, y), G(x, y), B(x, y))$ (a 3-dimensional vector) is determined to be a foreground or background pixel by comparing $f(x, y)$ with the codewords in the codebook of this pixel. If $f(x, y)$ does not match any codeword, it is a foreground pixel. For a particular codeword $c$, we say that $c$ matches $f(x, y)$ if the following two conditions are met:

$$colordist(f(x, y), c) \leq \varepsilon, \qquad brightness(I, \hat{I}, \check{I}) = true \qquad (1)$$

where $\varepsilon$ is a preset threshold value for comparison, $I$ represents the norm of $f(x, y)$, and $\hat{I}$ and $\check{I}$ are the first two parameters of the 6-tuple $aux$ vector of the codeword $c$.
The function $colordist(f(x, y), c)$ measures the chromatic difference between two colour vectors, and can be calculated as:

$$colordist(f(x, y), c) = \sqrt{\|f(x, y)\|^2 - \frac{\langle f(x, y), v \rangle^2}{\|v\|^2}} \qquad (2)$$

where $v$ represents the RGB vector $v = (R, G, B)$ of codeword $c$, and $\|\cdot\|$ and $\langle\cdot,\cdot\rangle$ denote respectively the Euclidean norm and dot product operations.
The function $brightness(I, \hat{I}, \check{I})$ is defined as:

$$brightness(I, \hat{I}, \check{I}) = \begin{cases} true & \text{if } I_{low} \leq \|f(x, y)\| \leq I_{hi} \\ false & \text{otherwise} \end{cases} \qquad (3)$$

where $I_{low} = \alpha \hat{I}$ and $I_{hi} = \min\{\beta \hat{I}, \check{I}/\alpha\}$. In our experiment, $\alpha$ and $\beta$ are fixed to 0.5 and 2 respectively for background subtraction.
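For illustration, the matching test of (1)-(3) can be written directly in Python/NumPy. This is a minimal sketch: the codeword layout (a dict with keys v, I_hat, I_check) and the value of the threshold $\varepsilon$ are our own assumptions, not the authors' implementation.

```python
import numpy as np

ALPHA, BETA = 0.5, 2.0   # alpha and beta as fixed in the paper
EPSILON = 10.0           # preset colour threshold; assumed value

def colordist(pixel, v):
    """Chromatic distance of equation (2) between pixel f(x,y) and codeword vector v."""
    pixel, v = np.asarray(pixel, float), np.asarray(v, float)
    proj2 = np.dot(pixel, v) ** 2 / np.dot(v, v)       # squared projection onto v
    return np.sqrt(max(np.dot(pixel, pixel) - proj2, 0.0))

def brightness(I, I_hat, I_check):
    """Brightness test of equation (3)."""
    I_low = ALPHA * I_hat
    I_hi = min(BETA * I_hat, I_check / ALPHA)
    return I_low <= I <= I_hi

def matches(pixel, codeword):
    """Matching condition of equation (1): both tests must pass."""
    I = np.linalg.norm(np.asarray(pixel, float))
    return (colordist(pixel, codeword["v"]) <= EPSILON
            and brightness(I, codeword["I_hat"], codeword["I_check"]))

def is_foreground(pixel, codebook):
    """A pixel is foreground if it matches no codeword in its per-pixel codebook."""
    return not any(matches(pixel, cw) for cw in codebook)
```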
An important problem in background subtraction is background model updating, because the background does not remain constant (consider gradual light changes or movement of the furniture).
The codebook background subtraction method therefore provides a
background model updating scheme. The matched codeword accord-
ing to (1) is updated as shown in [22]. Moreover, an additional cache
model is introduced. If one codeword in this model is matched with
the incoming pixel values for a period longer than a time threshold
(which means this codeword is a new background codeword), it is
added to the original codebook. Conversely, a codeword which has not matched incoming pixels for longer than a time threshold (which means it is no longer a background codeword) is deleted from the codebook. Through the background model updating
scheme, we can cope with change of the background in an indoor
environment.
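A minimal sketch of this cache-based updating, reusing matches() from the sketch above; the per-codeword frame counters (first_match, last_match), the codeword layout and the two thresholds are illustrative assumptions, not values from the paper.

```python
import numpy as np

T_ADD, T_DELETE = 100, 200  # assumed frame-count thresholds for promotion and deletion

def new_codeword(pixel, frame_idx):
    """Create a fresh cache codeword from one pixel sample (illustrative layout)."""
    I = float(np.linalg.norm(np.asarray(pixel, float)))
    return {"v": np.asarray(pixel, float), "I_hat": I, "I_check": I,
            "first_match": frame_idx, "last_match": frame_idx}

def update_codebook(pixel, codebook, cache, frame_idx):
    """Promote stable cache codewords to the background codebook and
    delete background codewords that have stopped matching."""
    hit = next((cw for cw in cache if matches(pixel, cw)), None)
    if hit is None:
        cache.append(new_codeword(pixel, frame_idx))   # start tracking a candidate
    else:
        hit["last_match"] = frame_idx
        if frame_idx - hit["first_match"] > T_ADD:     # stable long enough:
            cache.remove(hit)                          # it is a new background codeword
            codebook.append(hit)
    # remove background codewords that have not matched for too long
    codebook[:] = [cw for cw in codebook
                   if frame_idx - cw["last_match"] <= T_DELETE]
```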
2) Post-processing: The result of the codebook background subtraction is not perfect and needs to be improved to obtain a more accurate human silhouette. As shown in the example in Fig. 3 (d) (the original background subtraction result), there are two types of problems: 1) many noise-like pixel regions (very small areas of less than 50 pixels, marked in blue); 2) ghost foreground regions produced by occasional movement of furniture (marked in yellow in Fig. 3 (d)), where the furniture at its new position can also be taken as foreground (marked in green in Fig. 3 (d)). Both problems deteriorate the result of the human body extraction, so certain post-processing is applied.
As proposed in [23], connected foreground pixels form a region termed a blob. By using the OpenCV blob library [24], we obtain blobs in a binary image format, and small blobs with a size of less than 50 pixels are removed. In this way, the noise-like regions are removed.
The background updating scheme can cope to some extent with the large ghosting errors caused by movement of furniture, and with furniture appearing at a new position, through absorption into the background model [22]. However, there are two problems if we rely solely on the background updating scheme: 1) it takes time for ghosting and furniture to be absorbed into the background model by background updating; 2) the background updating scheme will wrongly absorb a foreground human body into the background model if he/she is static for a long time. In order to solve these two problems, we use a novel three-step blob operation strategy as follows:
Step 1. Blob merging: If the distance between two blobs is less
than a threshold, these two blobs will be merged (as shown in Fig.
3 (d), the blobs B2 and B3 contain several separate blobs which are
near to each other). The distance between two blobs is defined as the
minimum 4-distance [23] between two rectangles which enclose the
blobs as given by:
$$Distance(B1, B2) = \min_{p1 \in R1, \, p2 \in R2} d_4(p1, p2) \qquad (4)$$
where B1 and B2 are two blobs, R1 and R2 are two rectangles
which enclose them, and p1 and p2 are points belonging to R1 and
R2. Fig. 4 shows examples of the distance between two blobs with
respect to their positions.
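For axis-aligned enclosing rectangles, the minimum 4-distance of equation (4) has a closed form, so no scan over point pairs is needed. A sketch follows, with the rectangle tuple layout (x, y, w, h) and the merge threshold as our own assumptions:

```python
def rect_distance_4(r1, r2):
    """Minimum 4-distance (|dx| + |dy|) between two axis-aligned rectangles,
    each given as (x, y, w, h); zero when they touch or overlap."""
    x1, y1, w1, h1 = r1
    x2, y2, w2, h2 = r2
    dx = max(x1 - (x2 + w2), x2 - (x1 + w1), 0)  # horizontal gap
    dy = max(y1 - (y2 + h2), y2 - (y1 + h1), 0)  # vertical gap
    return dx + dy

MERGE_THRESHOLD = 20  # pixels; assumed value for the blob merging test
```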
Step 2. Active blob determination: If the number of blobs after
blob merging is more than one, it suggests some furniture has been
moved (and we assume that the elderly person lives alone so that
normally there should be only one human moving object). In this
case, we determine which blob is the moving blob by using the frame difference technique [23].

Fig. 3. The background subtraction and the human body blob determination: (a) background image; (b) image with human object; (c) frame difference result obtained from two consecutive frames; (d) original background subtraction result, in which there are three large blobs (B1, B2 and B3) after the blob merging operation, marked red, green and yellow, with blue marking the small noise-like blobs; (e) the final obtained human body blob.

Frame differencing is applied between
consecutive frames to obtain the moving pixels (shown in Fig. 3 (c)), and the blob with the greatest number of moving pixels is taken as the moving blob (human body blob). From Fig. 3, we can see that the blob B1 contains the most moving pixels, so B1 is finally taken as the human body blob.
Step 3. Selective updating: The non-active blobs are removed (as shown in Fig. 3 (e), B2 and B3 are removed from the final background subtraction result) and their pixel values form new codewords which are added to the background codebook immediately for background model updating. No updating is performed for the pixels in the active blob. In this way, ghosting and furniture at a new position are absorbed into the background model immediately, while the foreground human body is not absorbed into the background model even if he/she remains static for a long time.
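Step 2 can be sketched as follows, assuming a cleaned foreground mask, the previous and current grayscale frames, and at least one blob present; the function name and the differencing threshold are illustrative.

```python
import cv2
import numpy as np

DIFF_THRESHOLD = 25  # assumed intensity threshold for frame differencing

def select_active_blob(fg_mask, prev_gray, curr_gray):
    """Step 2: return a mask containing only the blob with the most moving pixels."""
    moving = cv2.absdiff(curr_gray, prev_gray) > DIFF_THRESHOLD
    n, labels = cv2.connectedComponents(fg_mask)
    best_label, best_count = 0, -1
    for i in range(1, n):                     # assumes at least one blob is present
        count = np.count_nonzero(moving & (labels == i))
        if count > best_count:
            best_label, best_count = i, count
    active = (labels == best_label).astype(np.uint8) * 255
    # Step 3 would then absorb the pixels of the non-active blobs
    # (fg_mask minus active) into the background codebook immediately.
    return active
```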
Fig. 4. Four cases of the distance between two blobs with respect to their
relative positions
3) Background model retraining: The trained background codebook model can be invalidated in various ways, such as by a dramatic global illumination change when the light is suddenly turned on. In this situation the codebook needs to be retrained, because the previous codebook is no longer valid. A dramatic global illumination change can be detected from the frame differencing results: if the percentage of active pixels in an image is larger than a threshold (we set 50%), we assume that a dramatic global illumination change has occurred and the background model is retrained.
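This trigger reduces to a single test on the frame difference mask; a sketch under the same assumed names as above:

```python
import numpy as np

RETRAIN_FRACTION = 0.5  # 50% of the pixels active, as set in the paper

def needs_retraining(moving_mask):
    """Detect a dramatic global illumination change from a boolean motion mask."""
    return np.count_nonzero(moving_mask) > RETRAIN_FRACTION * moving_mask.size
```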
Next, having extracted a silhouette representation of the human,
we consider feature extraction to describe the posture of the person.
B. Feature extraction
After human body region extraction, the next step is to extract useful features from the human body region. We extract two kinds of features: global features (which roughly describe the shape of the human body) and local features (which encapsulate the detailed information of the posture of the human body).
To obtain the global features, we apply ellipse fitting [25] to the binary image. The moments of a binary image $f(x, y)$ are given as:

$$m_{pq} = \sum_{x,y} x^p y^q f(x, y), \quad p, q = 0, 1, 2, 3, \ldots \qquad (5)$$
By using the zero and first order spatial moments, we can compute the center of the ellipse as $\bar{x} = m_{10}/m_{00}$ and $\bar{y} = m_{01}/m_{00}$. The angle between the major axis of the person and the horizontal axis $x$ gives the orientation of the ellipse, and it is computed as:

$$\Theta = \frac{1}{2} \arctan\left(\frac{2u_{11}}{u_{20} - u_{02}}\right) \qquad (6)$$
where the central moments can be calculated as:

$$u_{pq} = \sum_{x,y} (x - \bar{x})^p (y - \bar{y})^q f(x, y), \quad p, q = 0, 1, 2, 3, \ldots \qquad (7)$$
The major semi-axis $a$ and the minor semi-axis $b$ can be obtained by calculating the greatest and least moments of inertia, denoted here as $I_{max}$ and $I_{min}$. They can be calculated by evaluating the eigenvalues of the covariance matrix:

$$J = \begin{pmatrix} u_{20} & u_{11} \\ u_{11} & u_{02} \end{pmatrix} \qquad (8)$$
These are calculated as:

$$I_{max} = \frac{u_{20} + u_{02} + \sqrt{(u_{20} - u_{02})^2 + 4u_{11}^2}}{2} \qquad (9)$$

$$I_{min} = \frac{u_{20} + u_{02} - \sqrt{(u_{20} - u_{02})^2 + 4u_{11}^2}}{2} \qquad (10)$$
Finally, according to [8], we can calculate the major semi-axis $a$ and the minor semi-axis $b$ as:

$$a = \left(\frac{4}{\pi}\right)^{1/4} \left[\frac{(I_{max})^3}{I_{min}}\right]^{1/8} \qquad (11)$$

$$b = \left(\frac{4}{\pi}\right)^{1/4} \left[\frac{(I_{min})^3}{I_{max}}\right]^{1/8} \qquad (12)$$
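Equations (5)-(12) map directly onto OpenCV's image moments; the sketch below computes the global features (the $4/\pi$ factor follows the standard moment-based ellipse-fitting formulas):

```python
import cv2
import numpy as np

def fit_ellipse(binary_mask):
    """Fit an ellipse to a binary silhouette via image moments, equations (5)-(12).
    Returns (cx, cy, theta, a, b): center, orientation and semi-axes."""
    m = cv2.moments(binary_mask, binaryImage=True)
    cx, cy = m["m10"] / m["m00"], m["m01"] / m["m00"]        # ellipse center
    u20, u02, u11 = m["mu20"], m["mu02"], m["mu11"]          # central moments
    theta = 0.5 * np.arctan2(2 * u11, u20 - u02)             # orientation, eq. (6)
    root = np.sqrt((u20 - u02) ** 2 + 4 * u11 ** 2)
    i_max = (u20 + u02 + root) / 2                           # greatest inertia, eq. (9)
    i_min = (u20 + u02 - root) / 2                           # least inertia, eq. (10)
    a = (4 / np.pi) ** 0.25 * (i_max ** 3 / i_min) ** 0.125  # major semi-axis, eq. (11)
    b = (4 / np.pi) ** 0.25 * (i_min ** 3 / i_max) ** 0.125  # minor semi-axis, eq. (12)
    return cx, cy, theta, a, b

# The two global features are then the orientation theta and the ratio a / b.
```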
An ellipse fitting result is depicted in Fig. 5, where we compare the ellipse fitting result with the rectangle fitting result used in [10]. The ellipse fitting is clearly better at describing the human posture in the presence of noise (such as the line underneath the person's feet due to poor segmentation, as shown in Fig. 5). After ellipse fitting, the orientation of the ellipse and the ratio between $a$ and $b$ are taken as the global features, which have been found experimentally to be sufficient to describe the overall posture of a human body.
Such global features are, however, insufficient to describe the
postures in detail, and sometimes it is hard to differentiate two
postures by using only the global information (such as a sit posture
and a sit-like bend posture). We need to use more information (local

References

V. N. Vapnik, Statistical Learning Theory. Wiley, 1998.
C. M. Bishop, Pattern Recognition and Machine Learning. Springer, 2006.
C. Stauffer and W. E. L. Grimson, "Adaptive background mixture models for real-time tracking," in Proc. IEEE Conf. Computer Vision and Pattern Recognition (CVPR), 1999.
C. R. Wren, A. Azarbayejani, T. Darrell and A. P. Pentland, "Pfinder: real-time tracking of the human body," IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 19, no. 7, pp. 780-785, 1997.