
Human Sensing Using Visible Light Communication
Tianxing Li, Chuankai An, Zhao Tian, Andrew T. Campbell, and Xia Zhou
Department of Computer Science, Dartmouth College, Hanover, NH
{tianxing, chuankai, tianzhao, campbell, xia}@cs.dartmouth.edu
ABSTRACT
We present LiSense, the first-of-its-kind system that enables both data communication and fine-grained, real-time human skeleton reconstruction using Visible Light Communication (VLC). LiSense uses shadows created by the human body from blocked light and reconstructs 3D human skeleton postures in real time. We overcome two key challenges to realize shadow-based human sensing. First, multiple lights on the ceiling lead to diminished and complex shadow patterns on the floor. We design light beacons enabled by VLC to separate light rays from different light sources and recover the shadow pattern cast by each individual light. Second, we design an efficient inference algorithm to reconstruct user postures using 2D shadow information with a limited resolution collected by photodiodes embedded in the floor. We build a 3 m × 3 m LiSense testbed using off-the-shelf LEDs and photodiodes. Experiments show that LiSense reconstructs the 3D user skeleton at 60 Hz in real time with 10° mean angular error for five body joints.
Categories and Subject Descriptors
C.2.1 [Network Architecture and Design]: Wireless communica-
tion
Keywords
Visible light communication; sensing; skeleton reconstruction
1. INTRODUCTION
Light plays a multifaceted role (e.g., illumination, energy source) in our life. Advances in Visible Light Communication (VLC) [30, 59] add a new dimension to the list: data communication. VLC encodes data into light intensity changes at a high frequency imperceptible to human eyes. Unlike conventional RF radio systems that require complex signal processing, VLC uses low-cost, energy-efficient Light Emitting Diodes (LEDs) to transmit data. Any devices equipped with light sensors (photodiodes) can recover data by
monitoring light changes. VLC has a number of appealing properties. It reuses existing lighting infrastructure, operates on an unregulated spectrum band with bandwidth 10K times greater than the RF spectrum, and, importantly, is secure (i.e., does not penetrate walls, resisting eavesdropping), energy-efficient, and free of electromagnetic interference.
MobiCom'15, September 7–11, 2015, Paris, France.
© 2015 ACM. ISBN 978-1-4503-3619-2/15/09. DOI: http://dx.doi.org/10.1145/2789168.2790110.
Figure 1: Shadow cast by varying manikin postures under an LED light (CREE XM-L). In this scaled-down table-top testbed, the manikin is 33 cm in height, and the LED light is 22 cm above the manikin.
In this paper, we push the envelope further and ask: Can light turn into a ubiquitous sensing medium that tracks what we do and senses how we behave? Envision a smart space (e.g., home, office, gym) that takes advantage of the ubiquity of light as a medium that integrates data communication and human sensing [71]. Smart devices (e.g., smart glasses, smart watches, smartphones) equipped with photodiodes communicate using VLC. More importantly, light also serves as a passive sensing medium. Users can continuously gesture and interact with appliances and objects in a room (e.g., a wall-mounted display, computers, doors, windows, coffee machine), similar to using the Kinect [57] or Wii in front of a TV, but with no cameras (high-fidelity sensors with privacy concerns) monitoring users, nor any on-body devices or sensors that users have to constantly wear or carry [27, 62], just LED lights on the ceiling and photodiodes on the floor.
The key idea driving light sensing is strikingly simple: shadows. Any opaque object (e.g., a human body) obstructs a light beam, resulting in a silhouette behind the object. Because the wavelength of visible light is measured in nanometers, any macroscopic object can completely block the light beam, a much more pronounced effect than at radio frequencies [16, 50, 65, 70]. The shadow cast on the floor is essentially a two-dimensional projection of the 3D object. As the object moves and changes its shape, different light beams are blocked and the projected shadow changes at light speed, the same principle as the shadow puppet. Thus, by analyzing a continuous stream of shadows cast on the floor, we can infer a user's posture and track her behavior. As a simple illustration, Figure 1 shows the shadow shapes for varying manikin postures under a single light.
We present LiSense, the first-of-its-kind system that enables fine-grained, real-time user skeleton reconstruction¹ at a high frame rate (60 Hz) using visible light communication. LiSense consists of VLC-enabled LED lights on the ceiling and low-cost photodiodes on the floor.² LiSense aggregates the light intensity data from photodiodes, recovers the shadow cast by each individual LED light, and continuously reconstructs a user's skeleton posture in real time.
¹We define skeleton reconstruction as calculating the vectors of the skeleton body segments in the 3D space.
LiSense's ability to perform 3D skeleton reconstruction in real time places few constraints on the range of gestures and behaviors that LiSense can sense, which sets a key departure from existing work that either targets a limited set of gestures [3, 17, 46] or only tracks a user's 2D movements [2, 4, 10]. More importantly, by integrating both data communication and human sensing into the ubiquitous light, LiSense also fundamentally differs from vision-based skeleton tracking systems (e.g., Kinect) that are built solely for the sensing purpose. In addition, these systems rely on cameras to capture high-resolution video frames, which brings privacy concerns as the raw camera data can be leaked to an adversary [52, 64]. While prior vision methods [55, 56] have leveraged shadows to infer human gestures, they work strictly under a single light source and do not apply in a natural indoor setting with multiple light sources.
LiSense overcomes two key challenges to realize shadow-based light sensing: 1) Shadow Acquisition: Acquiring shadows using low-cost photodiodes is challenging in practice. In the presence of multiple light sources, light rays from different directions cast a diluted composite shadow, which is more complex than a shadow cast by a single light source. A shadow can also be greatly influenced by ambient light (e.g., sunlight). Both factors limit the ability of photodiodes to detect the light intensity drop inside a shadow. To address this challenge, LiSense leverages the fact that each light is an active transmitter using VLC and designs light beacons to separate light rays from individual LEDs and ambient light. Each LED emits light beacons by transmitting (i.e., flashing) at a unique frequency. LiSense transforms the light intensity perceived by each photodiode over time to the frequency domain. By monitoring frequency power changes, LiSense detects whether the photodiode is suddenly blocked from an LED and aggregates the detection results from all photodiodes to recover the shadow map cast by each light.
2) Shadow-based Skeleton Reconstruction: Shadow maps measured by photodiodes are 2D projections with a limited resolution (constrained by the photodiode density). Such low-resolution, imperfect shadow images pose significant challenges to reconstructing a user's 3D skeleton. Existing computer vision algorithms [11, 19, 21, 24, 42, 51, 66] cannot be directly applied to this problem because they all deal with video frames in a higher resolution and are often augmented with depth information. LiSense overcomes this challenge by combining shadows cast by light sources in different directions and inferring the 3D vectors of key body segments that best match the shadow maps. LiSense fine-tunes the inferences using a Kalman filter to take into account movement continuity and to further reduce the skeleton reconstruction errors.
LiSense Testbed. We build a 3 m × 3 m LiSense testbed (Figure 9), using five commercial LED lights, 324 low-cost, off-the-shelf photodiodes, 29 micro-controllers, and a server. We implement light beacons by programming the micro-controllers that modulate the LEDs. We implement the blockage detection and 3D skeleton reconstruction algorithms on the server, which generates a stream of shadow maps and continuously tracks user gestures. The reconstruction results are visualized in real time using an animated user skeleton (Figure 11). We test our system with 20 gestures and seven users in diverse settings. Our key findings are as follows:
• LiSense reconstructs a user's 3D skeleton with an average angular error of 10° for five key body joints;
• LiSense generates shadow maps in real time. It is able to produce shadow maps of all LEDs every 11.8 ms, reaching the same level as capturing video frames yet without using cameras;
• LiSense tracks user gestures in real time. It reconstructs the user skeleton within 16 ms based on five shadow maps, thus generating 60 reconstructed postures per second, each of which consists of the 3D vectors of five key body segments. The reconstructed skeleton is displayed in real time (60 FPS), similar to playing a video at a high frame rate;
• LiSense is robust in diverse ambient light settings (morning, noon, and night) and across users with different body sizes and shapes.
²Engineering photodiodes on the floor sounds labor-intensive today, but it can be eased by smart fabric [1, 45] (see more in § 7).
Contributions. We make the following contributions:
• We propose for the first time the concept of continuous user skeleton reconstruction based on visible light communication, which enables light to be a medium for both communication and passive human sensing;
• We design algorithms to extract the shadow of each individual light source and reconstruct 3D human skeleton postures continuously using only a stream of low-resolution shadow information;
• We build the first testbed implementing real-time human skeleton reconstruction based on VLC, using off-the-shelf, low-cost LEDs, photodiodes, and micro-controllers in an indoor environment;
• Using our testbed, we test our system with diverse gestures and demonstrate that it can reconstruct a user skeleton continuously in real time with small reconstruction angular errors.
Our work takes the first step to go beyond the conventional radio spectrum and demonstrates the potential of using the visible light spectrum for both communication and fine-grained human sensing. We believe that with its unbounded bandwidth, light holds great potential to mitigate the spectrum crunch crisis. By expanding the applications VLC can enable, we hope that our work can trigger radical new thinking on VLC applications. Our work examines the interplay between wireless networking, computer vision, and HCI, opening the gate to new paradigms of user interaction design.
2. LIGHT SHADOW EFFECT
Shadow is a common phenomenon we observe every day. It is easily recognizable under a single light source by unaided human eyes. Our goal is to understand whether off-the-shelf, low-cost photodiodes can reliably detect the light intensity drop in a shadow. If so, we can deploy them on the floor and aggregate their light intensity data to obtain the shadow cast by a human body. In this section, we first study the impact of a blocking object on light propagation using low-cost photodiodes. We then examine the challenges of shadow-based analysis in the presence of multiple lights.
2.1 Experiments on Blocking the Light
Consider a single photodiode on the floor. We hypothesize that if any opaque object stands in the direct path between the point light source and the photodiode, the photodiode will not be able to perceive any light coming from this point light source. Thus, the photodiode observes a light intensity drop compared to the case when there is no object blocking its direct path to the light source.
To confirm our hypothesis, we build a scaled-down table-top testbed (Figure 2) using commercial LED lights (CREE XM-L) and low-cost photodiodes (Honeywell SD3410-001). We set up a single LED chipset as the point light source at 55 cm height and place the photodiode directly below the light. By default we calibrate the photodiode's location using a plumb bob to ensure a 0° light incidence angle. The photodiode has a 90° field of view (FoV), i.e., it

Figure 2: Experiment setup with an LED and a photodiode (PD), both attached to micro-controllers: (a) default setup (left) and setup with a reflector (right); (b) LED and photodiode (PD) with a resistor on Arduino boards.
Figure 3: Experiments on blocking the light using our scaled-down table-top testbed: (a) light intensity vs. Arduino reading; (b) light source light intensity (LED duty cycle); (c) blocking object distance; (d) incident angle; (e) ambient light; (f) multi-path (reflector material). (a) shows that the measured Arduino reading is directly proportional to the perceived light intensity, where the slope decreases after the photodiode enters its saturation range. (b)-(f) show the impact of blockage on the measured Arduino reading under varying settings.
can sense incoming light with an incidence angle within 45°. To fetch the signal continuously from the photodiode, we cascade the photodiode and a resistor (10 kΩ) and measure the resistor voltage using a micro-controller (Arduino DUE in Figure 2(b)). It maps the measured voltage to an integer between 0 and 1023. Since the photodiode's output current is directly proportional to the perceived light intensity, the resistor voltage (and thus the Arduino reading) reflects the perceived light intensity. Using a light meter (EXTECH 401036) we have verified that the Arduino reading is directly proportional to the perceived light intensity (Figure 3(a)). To understand the impact of blockage, we place a 10 cm × 10 cm × 2 cm wood plate between the photodiode and the LED, and compare the Arduino readings before and after placing the wood plate³. We aim to answer the following key questions:
Q1: Is the shadow dependent on the light intensity of the light source? We first examine how the light source brightness affects the photodiode's sensing data upon blockage. We connect the LED to an Arduino UNO board to vary the LED's duty cycle from 10% to 90% (Figure 2(b)), resulting in light intensities from 5 to 30 lux perceived by the photodiode. For a given duty cycle, we record the average Arduino reading before and after blocking the LED and plot the results in Figure 3(b). We observe that upon blockage, the Arduino reading reports only the ambient light in all duty cycle settings, meaning that an opaque object completely blocks the light rays regardless of the brightness of the light source.
Q2: Does the distance between the blocking object and the light source matter? Next, we test whether the relative distance between the blocking object and the light source affects the photodiode's sensed light intensity. To do so, we fix the LED's duty cycle to 90%, move the wood plate along the line between the LED and the photodiode, and record the Arduino data at each distance. Figure 3(c) shows that as long as the object stays in the direct path between the LED and the photodiode, the light beam is completely blocked regardless of the relative distance of the blocking object.
Q3: How does the light incidence angle come into play? Because a photodiode has a limited viewing angle and can perceive incoming light only within its FoV, we further examine whether it can detect blockage under varying light incidence angles. We move the photodiode horizontally at 10-cm intervals and record the Arduino reading before and after blockage. As expected, the perceived light intensity gradually drops as the photodiode moves further away from the LED (Figure 3(d)). More importantly, at all locations (incidence angles), the light beam blockage results in a significant drop in the Arduino reading. The drop is less significant when the incidence angle approaches half of the photodiode's FoV. This is because the photodiode can barely sense any light coming in at the edge of its FoV, and thus blocking the light beam has a negligible impact.
³We also measured the blockage impact using different body parts (e.g., arms, hands) of a manikin and observed similar results.
Q4: What is the impact of ambient light? We also perform our measurements at different times of day as the ambient light varies. In Figure 3(e), we plot the Arduino reading before and after blockage as the ambient light intensity increases from 2 to 100 lux. In all conditions, we observe a significant drop in the Arduino reading. Because the photodiode senses a combination of the ambient light and the light from the LED, its perceived light intensity increases as the ambient light intensity increases.
Q5: How significant is the light multi-path effect? Would it diminish the shadow? Visible light is diffusive in nature. While an object blocks the direct path between the photodiode and the LED, light rays can bounce off surrounding objects and reach the photodiode from multiple directions. Since the photodiode perceives a combination of light rays coming in all directions, this multi-path effect can potentially reduce the light intensity drop caused by blocking the direct path. To examine the impact of the multi-path effect, we place a flat board vertically close to the LED to increase the reflected light rays (Figure 2(a), right) and record the Arduino reading with and without blocking the direct path to the LED. Among all types of material we have tested, the significant drop in the Arduino reading is consistent (Figure 3(f)). Thus, light in the direct path dominates the perceived light intensity. The tin reflector yields the highest light intensity because of its minimal energy loss during reflection.
Overall, our experimental results confirm that opaque objects can effectively block light in diverse settings and the blockage can be detected by low-cost photodiodes under a single point light source.
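The single-light blockage test above reduces to comparing the reading against the ambient-only level. A minimal sketch, where the reading values and noise margin are illustrative assumptions rather than measurements from the paper:

```python
def is_blocked(reading, ambient_reading, margin=20):
    """Under a single point light, blocking the direct path leaves only
    ambient light, so the reading collapses to roughly the ambient-only
    level (cf. Figure 3(b)-(e)); `margin` absorbs sensor noise."""
    return reading <= ambient_reading + margin

# Hypothetical Arduino readings: unblocked = ambient + LED contribution.
ambient = 60
assert not is_blocked(reading=420, ambient_reading=ambient)  # direct path clear
assert is_blocked(reading=65, ambient_reading=ambient)       # wood plate in path
```

As Section 2.2 shows, this simple threshold breaks down once multiple lights fill in the shadow, which motivates the frequency-domain approach of Section 3.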

Figure 4: Shadow cast by multiple LEDs on the table-top testbed ((a) 2 LEDs, (b) 3 LEDs, (c) 4 LEDs, (d) 5 LEDs). We further measure the light intensity change caused by blockage using an off-the-shelf photodiode ((e) setup with PD). The light intensity drop caused by blockage is less significant under more LEDs ((f) sensed light intensity).
2.2 Where is My Shadow?
Detecting a shadow is relatively straightforward under a single point light source. However, when multiple light sources are present, shadow detection becomes much more challenging. This is because light rays from different sources result in a composite shadow, which comprises shadow components created and fused by multiple light sources. Figure 4 illustrates the resulting shadow of the manikin as we switch on more LED lights in our table-top testbed. We make two key observations. First, the shadow is gradually diluted as more LEDs are switched on. This is because there are more light rays coming from different directions, and hence blocking a beam from one LED does not necessarily block the light beams from other LEDs, leaving a fading shadow. Second, the shadow shape becomes more complex under more LEDs as a result of superimposing the different shadows cast by individual LEDs. As a result, it becomes harder to infer the manikin's posture based upon the resulting tangled shape pattern.
While visually less noticeable, a shadow is also much harder to detect under these conditions using off-the-shelf photodiodes. In our experiments, we place the photodiode in a shadow region caused by the manikin's posture, gradually switch on more LEDs, and compare the Arduino reading with and without the manikin's blockage (Figure 4(e)). We observe that as more LEDs are switched on, more light rays coming in different directions hit the shadow region and thus the perceived light intensity level rises. Furthermore, once three or more LED lights are switched on, the photodiode enters the saturation region (Figure 3(a)), and thus blocking light rays from a single LED has a negligible impact on the Arduino reading. As a result, detecting shadows using these low-cost photodiodes is very challenging in practice under multiple lights.
In the next two sections, we introduce LiSense, which disambiguates composite shadows using VLC and continuously tracks user posture in real time.
3. DISAMBIGUATING SHADOWS
To disambiguate composite shadows created by multiple lights, LiSense recovers the shadow shape, referred to as the shadow map, resulting from each individual light source. Specifically, the shadow map associated with light source L_k is the shadow that would result if only light source L_k were present. We describe this as disambiguating a composite shadow. The key challenge is that each photodiode perceives a combination of light rays coming from different light sources and cannot separate light rays purely based on the perceived light intensity (Figure 5(a)(b)).
To overcome this technical barrier, we leverage the fact that each LED light is an active transmitter using VLC. We instrument each light source to emit a unique light beacon, implemented by modulating the light intensity changes at a given frequency. By assigning a different frequency to each light source, we enable photodiodes to differentiate lights from different sources. This allows LiSense to recover the shadow cast by each individual light.
In this section, we first describe our design of light beacons, followed by the mechanism to detect blockage (shadow) and infer shadow maps.
3.1 Separating Light Using Light Beacons
The design of light beacons is driven by the observation that while the perceived light intensity represents the sum of all incoming light rays, these light rays can be separated in the frequency domain if they flash at different frequencies. That is, if we transform a time series of perceived light intensity to the frequency domain using the Fast Fourier Transform (FFT), we can observe frequency power peaks at the frequencies at which these light rays flash. Figure 5 shows an example with two LED lights flashing at 2.5 kHz and 1.87 kHz, respectively, in an unsynchronized manner. The light intensity perceived by the photodiode is a combination of these two light pulse waves, and yet the FFT can decompose the light mixture and generate peaks at the two flashing frequencies. Thus, a light beacon can be implemented by programming each light source to flash at a unique frequency.
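This frequency-domain separation can be sketched numerically. The parameters below are illustrative, not the paper's implementation: a 20 kHz sampling rate and a 0.1 s window, chosen so both flashing frequencies fall exactly on FFT bins (bin width 10 Hz):

```python
import numpy as np

fs = 20_000                      # sampling rate (Hz), an assumed value
n = 2_000                        # 0.1 s window, so the FFT bin width is 10 Hz
t = np.arange(n) / fs

# Two 50%-duty pulse waves, as in Figure 5.
l1 = np.sign(np.sin(2 * np.pi * 1_870 * t))   # L1 flashes at 1.87 kHz
l2 = np.sign(np.sin(2 * np.pi * 2_500 * t))   # L2 flashes at 2.5 kHz
mixture = l1 + l2                # the photodiode sees the sum of both

power = np.abs(np.fft.rfft(mixture))
freqs = np.fft.rfftfreq(n, d=1 / fs)

# High-pass filter: drop ambient/DC components below 1 kHz.
hpf = freqs >= 1_000
top2 = sorted(freqs[hpf][np.argsort(power[hpf])[-2:]].tolist())
print(top2)                      # -> [1870.0, 2500.0]
```

The two strongest peaks above 1 kHz land exactly at the two flashing frequencies, since the main component of each pulse wave dominates its harmonics.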
Benefits. Light beacons bring three key benefits when considering blockage detection. First, by examining the resulting frequency power peaks after applying the FFT, we can separate light rays from different light sources. The frequency power at frequency f_i is approximately directly proportional to the intensity of light rays flashing at f_i. Thus, the changes in power peaks allow the photodiode to determine which lights are blocked. Second, light beacons also allow us to avoid interference from ambient light sources by applying a high-pass filter (HPF). This is because the change of ambient light is random and generates frequency components close to zero in the frequency domain. Third, by separating light rays from different sources, we observe a much more significant drop in the frequency power caused by blocking a light, which is the key to achieving robust detection of blockage, especially when the photodiode perceives a weak light intensity because of a long distance or a large incidence angle.
Light Beacon Frequency Selection. Designing robust light beacons, however, is nontrivial, mainly because selecting the flashing frequency for each light source is challenging. Specifically, assume an LED flashes as a pulse wave with a duty cycle of D and a flashing frequency of f; the Fourier series expansion of this pulse wave is:

f(t) = D + Σ_{n=1}^{∞} (2 / (nπ)) · sin(πnD) · cos(2πnft).    (1)

It indicates that the power emitted by the pulse wave is decomposed into the main frequency power, which is the first AC component when n = 1, and an infinite number of harmonics (components

Figure 5: Experiments with a photodiode (PD) and two LEDs (L_1 and L_2, 50% duty cycle) that flash at different frequencies (L_1 at 1.87 kHz, L_2 at 2.5 kHz). (a)-(b) show the PD's readings when only one LED is on and when both are on. The PD perceives a combination of light rays, which however can be separated in the frequency domain after applying the FFT (c). The frequency power at the flashing frequency f_i reflects the perceived intensity of light rays flashing at f_i. Thus the power peak at 2.5 kHz disappears after L_2 is blocked (d).
Algorithm 1: Selecting the candidates of flashing frequency for light beacons.
input: 1) R, signal sampling rate; 2) A, the number of FFT points; 3) f_flicker, the threshold to avoid flickering; 4) f_interval, the minimal interval between adjacent flashing frequencies
output: f_candidate, the flashing frequency candidates for all LEDs

f_candidate = { (R/A) × ⌈(f_flicker × A) / R⌉ }
for k ← ⌈(f_flicker × A) / R⌉ + 1 to A/2 do
    f_k = (R/A) × k
    valid = true
    for f_s ∈ f_candidate do
        if (f_k mod f_s == 0) OR (|f_s − f_k| < f_interval) then
            valid = false
            break
        end
    end
    if valid then f_candidate ← f_candidate ∪ {f_k}
end
with n > 1). Hence, an LED light L_i flashing at frequency f_i leads to not only a global power peak (the main frequency power) at frequency f_i, but also small local power peaks at all the harmonic frequencies (Figure 5(c)). In other words, if the perceived light intensity from light L_i changes, it will affect not only the main frequency power at f_i, but also the power peaks at the harmonics. To separate out the light rays and avoid interference across lights, we need to ensure that the harmonics do not overlap with the main frequencies of other lights. Tracking all harmonics is infeasible. In our design, we focus on the top-ten harmonic frequency components. This is because the harmonic frequency power drops significantly as n increases. We observe it becomes negligible once n > 10.
Furthermore, the flashing frequencies need to satisfy three additional constraints. First, since the lights are also used for illumination, the flashing frequencies need to be above a threshold f_flicker (1 kHz in our implementation) to avoid the flickering problem [32, 36, 48]. Second, the highest flashing frequency is limited by the sampling rate of the micro-controller fetching data from the photodiodes. The Nyquist-Shannon sampling theorem [25] says that it has to be no larger than R/2, where R is the sampling rate. Finally, adjacent frequencies have to be at least f_interval apart to ensure robust detection of frequency power peaks. We set f_interval = 200 Hz based on a prior study [32]. Algorithm 1 details the procedure to select all candidate flashing frequencies satisfying all the above constraints.
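A runnable sketch of Algorithm 1, assuming the harmonic check means rejecting any bin that is an integer multiple of an already-chosen candidate. The values R = 20 kHz, f_flicker = 1 kHz, and f_interval = 200 Hz come from the text; A = 128 is an illustrative FFT size:

```python
import math

def beacon_frequency_candidates(R, A, f_flicker, f_interval):
    """Select flashing-frequency candidates on FFT bin centers (Algorithm 1).

    A candidate must sit above the flicker threshold, at or below the
    Nyquist frequency R/2, at least f_interval away from every earlier
    candidate, and must not be a harmonic (integer multiple) of one.
    """
    bin_width = R / A
    k_start = math.ceil(f_flicker * A / R)      # first bin above f_flicker
    candidates = [bin_width * k_start]
    for k in range(k_start + 1, A // 2 + 1):    # up to the Nyquist bin
        f_k = bin_width * k
        if all(f_k % f_s != 0 and abs(f_k - f_s) >= f_interval
               for f_s in candidates):
            candidates.append(f_k)
    return candidates

cands = beacon_frequency_candidates(R=20_000, A=128, f_flicker=1_000, f_interval=200)
print(cands[:4])   # -> [1093.75, 1406.25, 1718.75, 2031.25]
```

With a bin width of 156.25 Hz, every other bin is skipped by the 200 Hz spacing constraint, and bins such as 2187.5 Hz (twice the first candidate) are rejected as harmonics.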
We then assign the candidate flashing frequencies to all LED lights, such that the lights within each photodiode's viewing angle (FoV) flash at different frequencies. Since photodiodes have a limited FoV (90° for the Honeywell SD3410-001), each photodiode perceives only a small subset of all lights. Thus we do not need a large number of candidate frequencies to cover all lights, and the system can easily scale up to more lights. Supporting denser lights requires more candidate flashing frequencies, which can be achieved by increasing the signal sampling rate R.
Light Beacon Overhead. Light beacons can be seamlessly integrated into existing VLC systems, enabling light to fulfill a dual role of data communication and human sensing. For VLC systems [5, 14, 36, 48] that use Frequency Shift Keying (FSK) to modulate data, all data packets serve as light beacons, as long as the LED lights within the FoV of a photodiode use different flashing frequencies to modulate data. For VLC systems that use other modulation schemes [34, 35], we can instrument each LED light to emit light beacons periodically, in the same way that Wi-Fi access points periodically transmit beacons. For an ADC sampling rate of 20 kHz and a modulation window of 128 points, a light beacon lasting for 6.4 ms is sufficient for the photodiode to separate light rays. Thus, the overhead of transmitting light beacons is negligible given that a data packet typically lasts for hundreds of ms based on the IEEE 802.15.7 standard [8].
3.2 Blockage Detection
We detect blockage by transforming the time series of light intensity values of light beacons to the frequency domain and examining the frequency power changes. Specifically, the intensity of light rays from light L_i flashing at frequency f_i is represented by the frequency power of f_i. If an opaque object blocks the direct path from light L_i to a photodiode, the frequency power of f_i changes (Figure 5(d)).
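Extracting the frequency power of one light's flashing frequency from a beacon window is a standard FFT operation, and can be sketched as below. The function name and the plain (windowless) FFT are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def beacon_power(samples, flash_freq, rate=20_000):
    """Power of one light's flashing frequency within a beacon window.

    `samples` is a photodiode intensity time series (e.g. a 128-point
    modulation window sampled at a 20 kHz ADC rate).
    """
    spectrum = np.abs(np.fft.rfft(samples))                  # magnitude spectrum
    freqs = np.fft.rfftfreq(len(samples), d=1.0 / rate)      # bin frequencies (Hz)
    return spectrum[np.argmin(np.abs(freqs - flash_freq))]   # nearest FFT bin
```

With a 128-point window at 20 kHz, the frequency resolution is 156.25 Hz, so flashing frequencies that land on bin centers give the sharpest power peaks.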
To examine the impact of blockage on the frequency power, we mount five commercial LED lights (Cree CXA25) on an office ceiling (2.65 m in height) and attach them to an Arduino UNO board, which modulates the Pulse Width Modulation (PWM) of each light to allow each LED to emit light beacons at a given frequency (Table 1, generated by running Algorithm 1). We place photodiodes (Figure 2(b)) at 324 locations in a 3 m x 3 m area on the floor. Each photodiode can perceive light rays from all LED lights. We then measure the readings of the Arduino controllers connected to the photodiodes for 6.4 ms before and after blocking each LED light.
Figure 6 shows the CDF of the relative frequency power change. Assume P_ij(t) is the frequency power of f_i (the flashing frequency of the light beacons from light L_i) at time t perceived by the photodiode at location p_j. Its relative frequency power change ΔP_ij(t) is defined as:

    ΔP_ij(t) = | (P_ij^nonBlock − P_ij(t)) / P_ij^nonBlock |,    (2)
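Eq. (2), together with the per-photodiode blockage indicator it feeds (s_ij(t) = 1 when ΔP_ij(t) ≥ τ, which the paper aggregates into the shadow map S_i(t)), can be sketched as follows. The threshold value τ = 0.5 here is illustrative, not the calibrated value used in the paper.

```python
def relative_power_change(p_nonblock, p_now):
    """ΔP_ij(t) from Eq. (2): frequency-power change relative to the
    unblocked baseline P_ij^nonBlock."""
    return abs((p_nonblock - p_now) / p_nonblock)

def shadow_map(baselines, powers, tau=0.5):
    """Shadow map S_i(t) for one light L_i: s_ij = 1 iff ΔP_ij(t) >= tau.

    `baselines[j]` and `powers[j]` are the unblocked and current powers
    of f_i at photodiode location p_j; tau is an illustrative threshold.
    """
    return [1 if relative_power_change(b, p) >= tau else 0
            for b, p in zip(baselines, powers)]
```

A body blocking the direct path sharply drops the perceived power at f_i, so the corresponding s_ij flips to 1 while unblocked locations stay at 0.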
Frequently Asked Questions (18)
Q1. What are the contributions mentioned in the paper "Human sensing using visible light communication"?

The authors present LiSense, the first-of-its-kind system that enables both data communication and fine-grained, real-time human skeleton reconstruction using Visible Light Communication (VLC).

But based on their experiences building the LiSense testbed the authors also recognize several limitations of the existing system and potential applications that motivate future work. The authors are convinced that the complexity can be eased in the future. The authors plan to study these realistic settings as part of their future work. In the near future, the authors plan to add a cover made of thin, durable plastic glass (e.g., polycarbonate plastic) over the photodiodes floor so users can stand on the "glass floor" allowing us to experiment with a larger, more realistic set of leg movements – further advancing the gestures they can infer with LiSense.

Since lights are also used for illumination, the flashing frequencies need to be above a threshold f_flicker (1 kHz in their implementation) to avoid the flickering problem [32, 36, 48].

To disambiguate composite shadows created by multiple lights, LiSense recovers the shadow shape, referred to as the shadow map, resulting from each individual light source. 

For VLC systems that use other modulation schemes [34, 35], the authors can instrument each LED light to emit light beacons periodically, in the same way that Wi-Fi access points periodically transmit beacons. 

With denser LEDs on the ceiling, LiSense can extract more details on the human gesture by examining the shadows cast from different viewing angles. 

By aggregating the blockage detection results from all photodiodes, the authors can recover the shadow map cast by each light L_i. Specifically, assuming N photodiodes on the floor, which can sense K LED lights within their FoVs, the authors define the shadow map S_i(t) cast by LED light L_i at time t as: S_i(t) = {s_ij(t) | 0 < j ≤ N}, where s_ij(t) indicates whether the direct path from location p_j to light L_i is blocked at time t, i.e., s_ij(t) = 1 if ΔP_ij(t) ≥ τ, and s_ij(t) = 0 otherwise.

In other words, if the perceived light intensity from light L_i changes, it will affect not only the main frequency power at f, but also the power peaks at harmonics.

Photodiode-embedded fabric would also allow a much denser deployment of photodiodes – the key to realizing the reconstruction of finer-grained gestures (e.g., finger movements).

Since the photodiode perceives a combination of light rays coming in all directions, this multipath effect can potentially reduce the light intensity drop caused by blocking the direct path. 

The authors then measure the readings of the Arduino controllers connected to the photodiodes for 6.4 ms before and after blocking each LED light.

In particular, the authors observe three key factors that affect LiSense’s skeleton reconstruction accuracy under a given photodiode density: 1) Body part size: LiSense better tracks larger body parts (e.g., backbone joint that corresponds to the user’s main body). 

Two factors affect the latency of inferring a user posture based on five shadow maps, which are the shadow size (i.e., the number of photodiodes inside the shadow) and the movement complexity. 

Using their testbed, the authors test their system with diverse gestures and demonstrate that it can reconstruct a user skeleton continuously in real time with small reconstruction angular errors. 

The authors connect the LED to an Arduino UNO board to vary the LED’s duty cycle from 10% to 90% (Figure 2(b)), resulting in light intensities from 5 to 30 lux perceived by the photodiode. 

For the same reason, the angular errors of left-side joints are smaller than the right joints for some two-hand gestures (e.g., boxing, fighting), since the right-handed user moves the left hand slightly more slowly. 

LiSense overcomes two key challenges to realize shadow-based light sensing: 1) Shadow Acquisition: Acquiring shadows using low-cost photodiodes is challenging in practice. 

As shown in their prior experiment (Figure 3(c)), the blockage at a single photodiode is independent of its relative distance to the blocking object.