University of Zurich
Zurich Open Repository and Archive
Winterthurerstr. 190
CH-8057 Zurich
http://www.zora.uzh.ch
Year: 2008
An address-event fall detector for assisted living applications
Fu, Z; Delbruck, T; Lichtsteiner, P; Culurciello, E
Fu, Z; Delbruck, T; Lichtsteiner, P; Culurciello, E (2008). An address-event fall detector for assisted living
applications. IEEE Transactions on Biomedical Circuits and Systems, 2(2):88-96.
Postprint available at:
http://www.zora.uzh.ch
Posted at the Zurich Open Repository and Archive, University of Zurich.
http://www.zora.uzh.ch
Originally published at:
IEEE Transactions on Biomedical Circuits and Systems 2008, 2(2):88-96.


88 IEEE TRANSACTIONS ON BIOMEDICAL CIRCUITS AND SYSTEMS, VOL. 2, NO. 2, JUNE 2008
An Address-Event Fall Detector for
Assisted Living Applications
Zhengming Fu, Student Member, IEEE, Tobi Delbruck, Senior Member, IEEE, Patrick Lichtsteiner, Member, IEEE,
and Eugenio Culurciello, Member, IEEE
Abstract—In this paper, we describe an address-event vision
system designed to detect accidental falls in elderly home care
applications. The system raises an alarm when a fall hazard is
detected. We use an asynchronous temporal contrast vision sensor
which features sub-millisecond temporal resolution. The sensor
reports a fall at ten times higher temporal resolution than a
frame-based camera and shows 84% higher bandwidth efficiency
as it transmits fall events. A lightweight algorithm computes an
instantaneous motion vector and reports fall events. We are able
to distinguish fall events from normal human behavior, such as
walking, crouching down, and sitting down. Our system is robust
to the monitored person’s spatial position in a room and presence
of pets.
Index Terms—Address-event, AER, assisted living, CMOS
image sensor, elderly home care, fall detection, motion detection,
temporal-difference, vision sensor.
I. INTRODUCTION
Human society has been experiencing tremendous demographic aging
since the turn of the 20th century. The
current life expectancy in the US is 77.85 years and continues to
extend as medical care improves. According to a report by the
U.S. Census Bureau, the population aged 65 and over will increase
by 210% within the next 50 years [1].
The substantial increase in the aging population confronts society
with two challenges: a growing number of elderly people will require
more investment in elderly care services, while a shrinking working
population will cause a shortage of skilled caregivers for elders.
This imbalance between the number of elderly people and the number
of caregivers will only be exacerbated as life expectancies increase.
Intelligent elderly care systems offer one way to reduce the workload
of caregivers without compromising the quality of service.
In the past, various solutions based on emerging technologies have
been proposed. Video monitoring is commonly used in nursing
institutions, but considerable human resources are required to
monitor the activity streams, and patients' privacy is compromised
while they are monitored. Another
common solution is to have patients raise alarms when they
Manuscript received October 1, 2007. First published July 25, 2008;
current version published September 10, 2008. This paper was recommended
by Associate Editor R. Etienne-Cummings.
Z. Fu and E. Culurciello are with the Department of Electrical Engineering,
Yale University, New Haven, CT 06511 USA (e-mail: zhengming.fu@yale.edu;
eugenio.culurciello@yale.edu).
T. Delbruck and P. Lichtsteiner are with the Institute for Neuroinformatics
(INI), Zurich CH-8057, Switzerland.
Color versions of one or more of the figures in this paper are available online
at http://ieeexplore.ieee.org.
Digital Object Identifier 10.1109/TBCAS.2008.924448
are in trouble by pushing a button on a wearable or pendant
device [2]. This solution depends on the patient's capability
and willingness to raise an alarm. For example, a fall may result in
unconsciousness, and a dementia patient may not be able or willing
to push the button when necessary [3]. Both scenarios limit this
"push-the-button" solution in practice. Other solutions include
wearable devices such as motion detectors and accelerometers
[4]-[9]. These stay with patients at all times, continuously
collecting and streaming out physical parameters. An alarm is
raised when predefined conditions on these signatures are
satisfied. The effectiveness of wearable sensors, however, is also
restricted by the willingness of patients to wear them.
Falls are a major health hazard for the elderly when they live
independently [10]. Approximately 30% of people aged 65 fall each
year, and the rate is higher in medical service institutions.
Although fewer than one fall in ten results in an injury, a fifth of
fall incidents require medical attention. Another recent publication
indicates that 50% of patients in nursing institutions fall each
year, and 40% of them fall more than once [11]. How to effectively
assess, respond to, and assist elderly patients in trouble has
become an important research topic in medical elderly care services
[12].
Elderly care systems aim to effectively evaluate and respond
to the behavior of elderly people when they live alone. These
systems have the following requirements.
1) The sensor systems should be non-intrusive to patients' lives.
The impact of elderly care systems on patients' lives should be
reduced to a minimum. From the system's perspective, elderly care
systems should be small enough to be placed in appropriate
locations. An ideal elderly care system operates with zero
maintenance.
2) The sensor systems should preserve patient privacy. Most
people under care expect that their privacy is respected. No
private information should be released until an emergency
is detected. Many elders are against using commercial-off-the-shelf
(COTS) cameras or microphones in their home, because they feel
monitored and their privacy compromised. In elderly care sensor
nodes, most
information analysis and decision-making should occur
within the detection nodes. This eliminates the necessity
to transmit information outside the detector and protects
patient privacy.
Fig. 1 illustrates the fall detector setup. The detectors take
multiple side-views of the scene in order to detect accidental
activities and raise alarms. The vision systems are mounted on
the wall at a height of 0.8 m, approximately the same height as a
light switch. Our approach is innovative for two
1932-4545/$25.00 © 2008 IEEE
Authorized licensed use limited to: MAIN LIBRARY UNIVERSITY OF ZURICH. Downloaded on March 7, 2009 at 11:17 from IEEE Xplore. Restrictions apply.

FU et al.: AN ADDRESS-EVENT FALL DETECTOR FOR ASSISTED LIVING APPLICATIONS 89
Fig. 1. Address-event fall detectors are used for assisted living applications.
The detectors are mounted on the wall at a height of 0.8 m,
approximately the same height as a light switch.
reasons: First, an asynchronous temporal contrast vision sensor
reports pixel changes with a latency on the order of millisec-
onds. Second, a lightweight computation algorithm plus a fast
readout allow us to compute an instantaneous motion vector and
report fall events. This cannot be done with a frame-based
temporal-difference image sensor, because the frame rate is constant
and redundant information in the images saturates the transmission
bandwidth. Note that in this paper we refer to motion detection
performed with temporal-difference image sensors only.
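The instantaneous motion vector computation referred to above is not specified in detail in this section. The sketch below is one plausible realization under stated assumptions: a sliding window of recent address-events, a least-squares vertical velocity, and a fall threshold. The window length, threshold value, and coordinate convention (y grows upward, so a fall gives a negative velocity) are all illustrative assumptions, not the authors' parameters.

```python
from collections import deque

class MotionVectorEstimator:
    """Sliding-window motion estimate from (t, x, y) address-events.

    Window length and fall threshold are illustrative placeholders,
    not the parameters used in the paper.
    """
    def __init__(self, window_s=0.1, fall_thresh=-200.0):
        self.window_s = window_s        # seconds of events retained
        self.fall_thresh = fall_thresh  # pixels/s; downward is negative here
        self.events = deque()           # (timestamp_s, x, y)

    def add_event(self, t, x, y):
        self.events.append((t, x, y))
        # Drop events older than the sliding window.
        while self.events and t - self.events[0][0] > self.window_s:
            self.events.popleft()

    def centroid(self):
        """Mean (x, y) position of the events in the window."""
        n = len(self.events)
        cx = sum(e[1] for e in self.events) / n
        cy = sum(e[2] for e in self.events) / n
        return cx, cy

    def vertical_velocity(self):
        """Least-squares slope of y over time (pixels/s)."""
        if len(self.events) < 2:
            return 0.0
        t0 = self.events[0][0]
        ts = [e[0] - t0 for e in self.events]
        ys = [e[2] for e in self.events]
        n = len(ts)
        mt, my = sum(ts) / n, sum(ys) / n
        denom = sum((t - mt) ** 2 for t in ts)
        if denom == 0.0:
            return 0.0
        return sum((t - mt) * (y - my) for t, y in zip(ts, ys)) / denom

    def is_fall(self):
        """Flag a fall when the downward velocity exceeds the threshold."""
        return self.vertical_velocity() < self.fall_thresh
```

A usage sketch: feed each incoming address-event to `add_event` and poll `is_fall()`; a rapid downward drift of the event cloud drives the fitted velocity below the threshold.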
This paper is divided into eight sections. Section II describes
the design overview for elderly home-care systems. Sec-
tion III describes the temporal contrast (motion-detection) vi-
sion sensor and the test platform used in the fall detection work.
In Section IV, we evaluate the asynchronous temporal contrast
vision sensor in tracking fast movement. In Section V and Sec-
tion VI, a lightweight moving-average algorithm to compute
centroid events is presented. This algorithm is then evaluated
as a fall detector. Section VII describes the design concerns in
a fall detector system. Section VIII concludes the paper.
II. AN ATC VISION SENSOR
The core technology used in our research is an asynchronous
temporal contrast (ATC) vision sensor. A temporal contrast vi-
sion sensor extracts changing pixels (motion events) from the
background [13] and reports temporal contrast, which is equivalent
to image reflectance change when lighting is constant. A temporal
contrast vision sensor can extract motion information because, in
normal lighting conditions, the intensity of a significant number of
pixels changes as a subject moves in the scene [14]-[16]. In the ATC
vision sensor used here, every pixel re-
ports a change in illumination above a certain threshold with an
asynchronous event, i.e., pixels are not scanned with a regular
frame rate but every pixel is self-timed. In case of an event, the
corresponding pixel address is transmitted. After the event is ac-
knowledged by an external receiver, the pixel resets itself.
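The self-timed pixel behavior described above (fire an asynchronous event when the change exceeds a threshold, then reset) can be captured in a toy model. This is a sketch for intuition only: the log-intensity threshold value and the sampled-input interface are assumptions, since the real pixel operates in continuous time.

```python
import math

class TemporalContrastPixel:
    """Toy model of one self-timed ATC pixel.

    The pixel emits an event whenever log intensity has changed by more
    than `theta` since its last event, then resets its reference level.
    `theta` is an illustrative value, not the chip's actual threshold.
    """
    def __init__(self, x, y, theta=0.1):
        self.x, self.y = x, y
        self.theta = theta
        self.ref = None  # log intensity at the last event (or first sample)

    def sample(self, intensity, t):
        """Feed one intensity sample; return an event tuple or None."""
        log_i = math.log(intensity)
        if self.ref is None:
            self.ref = log_i
            return None
        if abs(log_i - self.ref) > self.theta:
            polarity = 1 if log_i > self.ref else -1
            self.ref = log_i  # the pixel resets itself after the event
            return (t, self.x, self.y, polarity)
        return None  # change below threshold: stay silent
```

Because each pixel keeps only its own reference level, no global scan is needed, which mirrors the frame-free readout described in the text.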
A key feature of ATC is the temporal contrast response: the sensor
reports scene reflectance changes (caused, e.g., by moving objects)
while discarding local absolute illumination. A major advantage of
this ATC image sensor is that it pushes information to the receiver
once a predefined condition is satisfied. This feature is important
in high-speed vision systems
Fig. 2. The 64 × 64 address-event temporal contrast vision sensor used in the
fall detector system [17], [18].
Fig. 3. (b) Temporal contrast image from the ATC image sensor and (a) one
intensity frame. The subject is swaying left to right. The ATC imaging system
is placed in front of the subject at a distance of 3 m and a height of 0.8 m.
because a pixel sends information of interest immediately, in-
stead of waiting for its polling sequence. A pixel generates a
higher rate of events when it experiences larger changes in light
intensity.
Fig. 2 shows the ATC image sensor system [17]-[20] used
in the fall detection experiment. The temporal contrast vision
sensor contains a 64 × 64 array of pixels and responds to relative
changes in light intensity. The imaging system streams a
series of time-stamped address-events from the vision chip and
sends them to a PC via a USB interface. The data is reported in
the address-event format with 12 bits (6 for the X address and 6 for
the Y address in a 64 × 64 image sensor). The silhouette of a moving
subject can be reconstructed on a PC (the address-event vision
reconstruction software is available from http://www.jaer.wiki.
sourceforge.net/). The vision system uses a Rainbow S8 mm
1:1.3 lens, and the lens format is 2/3". Fig. 3 shows an image
from the ATC image sensor and its targeted scene. The imaging
system is placed in front of the subject at a distance of 3 m
and a height of 0.8 m. The image sensor features a high dynamic
range of 120 dB. The sensor consumes 30 mW of power at 3.3 V,
which is comparable to most low-power COTS image chips on
the market [21]-[23]. The power consumption is approximately
120 mW for the USB device in Fig. 2. Notice that the camera
is used as a line-powered fixed device in home and laboratory
installations. We assume that the system will be professionally
installed by caregivers.
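The 12-bit address-event word described above (6 bits each for the X and Y addresses) can be packed and unpacked as in the sketch below. The bit ordering (Y in the high 6 bits) is an assumption; the paper does not specify which address occupies which bits.

```python
def pack_event(x, y):
    """Pack an (x, y) pixel address into a 12-bit address-event word.

    Assumes Y occupies the high 6 bits; the real device's bit
    ordering may differ.
    """
    assert 0 <= x < 64 and 0 <= y < 64
    return (y << 6) | x

def unpack_event(word):
    """Recover (x, y) from a 12-bit address-event word."""
    return word & 0x3F, (word >> 6) & 0x3F
```

With this layout a 64 × 64 sensor needs exactly 12 bits per event, which is the figure the bandwidth comparison in Section III relies on.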

90 IEEE TRANSACTIONS ON BIOMEDICAL CIRCUITS AND SYSTEMS, VOL. 2, NO. 2, JUNE 2008
III. COMPARISON BETWEEN ATC AND FRAME-BASED
TEMPORAL DIFFERENCE VISION SENSORS
In this section we compare an ATC image sensor with a
frame-based system (a COTS web-camera) and characterize
them by tracking an object in free fall. We demonstrate that the
ATC image sensor performs better in high-speed tracking for
two reasons: Firstly, the ATC image sensor has higher temporal
resolution and delivers timely information on motion events.
Secondly, the ATC image sensor ranks data based on impor-
tance and selectively sends informative data on motion only.
This reduces information size and communication bandwidth.
A COTS camera samples images at a low rate (30 fps), yielding at
most a few temporal-difference frames of information for each fall
event. This is too little data to compute velocity and acceleration
measurements accurate enough to distinguish a fall, and it is a
major impediment to using a COTS camera for this application. The
image data also suffers from motion blur at this frame rate.
In order to use a COTS camera to detect motion, some image
data manipulation is necessary. For comparison purposes, we
wrote a real-time temporal-difference image emulator using a
COTS camera [24]-[26]. The software can be downloaded from
http://www.eng.yale.edu/elab/FallDetect.html. Image frames
from the COTS camera are down-sampled to 64 × 64 pixels
and pairwise subtracted to mimic a temporal difference imager.
Using the same frame twice in two subsequent differences
is not necessary: the event resolution is not increased, while
the overall number of events increases, at the expense
of more computation after readout. For this manipulation,
8192 8-bit subtraction and thresholding operations are performed
by a PC (64 × 64 = 4096 subtractions and an identical number of
thresholding operations). The threshold of the COTS emulator was
set to match that of the ATC (10 for an 8-bit pixel output).
The temporal difference frames are then converted into an
address-event stream in order to compare them to the ATC
output. This is performed by reporting only the addresses of the
pixels that have changed by more than the threshold. This comparison
is fair because address-event is the most efficient way to report
sparse matrices of events.
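The emulator pipeline described above — down-sample to 64 × 64, pairwise subtract, threshold at 10, and emit only the changed-pixel addresses — can be sketched as follows. The resolution and threshold follow the text; the block-averaging down-sampling method is an assumption, as the paper does not state which method its emulator used.

```python
import numpy as np

def downsample_64(frame):
    """Block-average a grayscale frame (sides must be multiples of 64)
    down to 64 x 64. The averaging method is an assumption."""
    h, w = frame.shape
    return frame.reshape(64, h // 64, 64, w // 64).mean(axis=(1, 3))

def frames_to_events(prev, curr, threshold=10):
    """Pairwise-subtract two 64 x 64 frames and report (x, y) addresses
    of pixels whose absolute change exceeds the threshold (10 for
    8-bit pixels, matching the ATC setting)."""
    diff = np.abs(curr.astype(np.int16) - prev.astype(np.int16))
    ys, xs = np.nonzero(diff > threshold)
    return list(zip(xs.tolist(), ys.tolist()))
```

Note the cast to a signed type before subtracting: subtracting unsigned 8-bit frames directly would wrap around instead of producing negative differences.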
Fig. 4 shows the measured responses as the ATC vision sensor
tracks a box in free fall. The sensor communicates motion events
at 1330 events/s while it monitors the object's fall. The event
rate drops to 221 events/s in the quiet period when no motion is
present in the scene. These noise events are due to source/drain
junction leakage in the pixel transistors, and they are sparse and
uncorrelated in space and time. The noise events are represented by
circles in Fig. 4. The noise events could be filtered out using the
fact that they are spatio-temporally uncorrelated [27], but we chose
not to do so in order to keep the computational model closely
matched to cheap embedded architectures. In this experiment, 1590
events were collected during the 1.1 s fall; 94% of them describe
the fall, and the rest are noise.
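For illustration, the spatio-temporal correlation filter cited above [27] — which the authors deliberately do not apply — could look like the following sketch: an event survives only if the same pixel or an 8-connected neighbor fired recently. The 3 × 3 neighborhood and the 10 ms support window are illustrative assumptions, not parameters from the cited work.

```python
def make_correlation_filter(dt_max=0.01):
    """Return a predicate that passes an event only if the pixel itself
    or one of its 8-connected neighbors produced an event within the
    last dt_max seconds. Isolated leakage noise fails this test."""
    last = {}  # (x, y) -> timestamp of the most recent event there

    def keep(t, x, y):
        supported = any(
            t - last.get((x + dx, y + dy), float("-inf")) <= dt_max
            for dx in (-1, 0, 1) for dy in (-1, 0, 1)
        )
        last[(x, y)] = t  # record this event regardless of the verdict
        return supported

    return keep
```

Because fall events arrive in dense spatio-temporal clusters while leakage noise is isolated, such a filter passes most genuine motion events and rejects most noise.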
Fig. 5 shows the measured address-event outputs from the
frame-based image emulator as it tracks a box's fall. The event
rate is 150 events/s on average, ten times less than the ATC
vision sensor's. Every frame contains much redundant information
due to the unchanged background. Fig. 5 reports only 232
Fig. 4. (a) Measured responses while the ATC vision sensor tracks an
object thrown in the air and then falling. The object is 3 m away and the
camera is installed at 0.8 m. (b) Distribution of noise events (circles) and
fall events (dots) as the ATC vision sensor tracks the object's free fall.
events during the 1 s fall, with no added noise. Notice also the
spread of events along the Y axis for each frame: it is up to 15
pixels out of a total of 64, a 23% spread, for the COTS camera. On
the other hand, it is only 3 pixels in the reconstructed ATC frame,
for a spread of 4.6% (see Fig. 4(b), specifically the fall events
between 3 and 3.2 s). This data shows that the ATC system can
perform vertical velocity calculations at least 5 times more precise
than a COTS sensor's. Notice that we can generate an ATC
frame for comparison purposes by collecting events for 30 ms
and then generating a histogram frame.
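The histogram-frame reconstruction mentioned above can be sketched as follows: address-events falling inside a 30 ms window are binned into a 64 × 64 count image. The (t, x, y) event tuple layout is an assumption about the stream format.

```python
import numpy as np

def events_to_frame(events, t_start, window_s=0.03, size=64):
    """Accumulate (t, x, y) address-events with t in
    [t_start, t_start + window_s) into a size x size histogram frame."""
    frame = np.zeros((size, size), dtype=np.uint16)
    for t, x, y in events:
        if t_start <= t < t_start + window_s:
            frame[y, x] += 1  # row = y address, column = x address
    return frame
```

Such a frame is only a visualization aid for comparing against a frame-based camera; the detection algorithm itself operates on the raw event stream.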
ATC vision sensors have two main advantages when compared
to frame-based image sensors. First, the ATC vision sensor has a
higher temporal resolution in high-speed tracking applications: in
the experiment, the ATC vision sensor shows a 10 times higher
event rate as it tracks the free fall, whereas the uniform frame
rate of the COTS camera imposes an upper limit on the temporal
difference sampling rate. Second, the ATC vision sensor has a
higher bandwidth efficiency because it selectively sends
information. In this experimental setting, with an image resolution
of 64 × 64, the ATC vision sensor saves over 84% of the bandwidth
for transmission of the image data (1590 12-bit address
events in 1.1 s, i.e., about 19,000 bits, in the ATC vision sensor versus
