
ON LINE PREDICTIVE APPEARANCE-BASED TRACKING

Namita Gupta¹, Pooja Mittal¹, Kaustubh S. Patwardhan², Sumantra Dutta Roy², Santanu Chaudhury³, Subhashis Banerjee⁴

¹ Dept of Maths, IIT Delhi, New Delhi 110016
² Dept of EE, IIT Bombay, Mumbai 400076
³ Dept of EE, IIT Delhi, New Delhi 110016
⁴ Dept of CSE, IIT Delhi, New Delhi 110016
ABSTRACT
We present a novel predictive statistical framework to improve the performance of an EigenTracker. In addition, we use fast and efficient eigenspace updates to learn new views of the object being tracked on the fly. We also incorporate a new Importance Sampling mechanism which increases the robustness of the EigenTracker, and enables it to track non-convex objects better. Our EigenTracker is flexible: it is possible to use it symbiotically with other trackers. We show its successful application in hand gesture analysis, and in face and person tracking.
1. INTRODUCTION
An appearance-based tracker (EigenTracker [1]) can track moving objects undergoing appearance changes. Existing extensions of the EigenTracker framework include tracking flexible objects [2], and incorporating the notion of shape in an eigenspace: Active Appearance Models (AAMs) [3]. The Isard and Blake CONDENSATION algorithm [4] can represent multiple simultaneous hypotheses. In [5], they propose the idea of Importance Sampling in a CONDENSATION tracker to improve sample efficacy. We enhance the capabilities of an EigenTracker in three ways. We augment it with a CONDENSATION-based predictive framework to increase its efficiency. We also formulate a novel uniformity predicate as an Importance function to make it more robust. Our predictive EigenTracker learns and tracks unknown views of an object on the fly with an on-line eigenspace update mechanism. Our predictive EigenTracker framework is flexible: it can be used to symbiotically augment other trackers with appearance information. The rest of the paper is organized as follows. Section 2 discusses our prediction scheme, eigenspace updates, tracker initialization issues, and the Importance Sampling mechanism. Here, we also describe an interesting extension of our on-line EigenTracker: Symbiotic Tracking. In Section 3, we show applications of the proposed method in hand gesture analysis, and in face and person tracking.

(Author for correspondence: sumantra@ee.iitb.ac.in)
Fig. 1. Our Predictive EigenTracker efficiently tracks a gesticulating hand undergoing appearance changes, in spite of background clutter (frames 142, 156, 165, 170, 179, 191).
2. ON-LINE PREDICTIVE EIGENTRACKER
2.1. The Prediction Mechanism
One of the main factors for the inefficiency of the EigenTracker is the absence of a predictive framework. The EigenTracker estimates the affine and reconstruction coefficients after every frame, requiring a good seed value for the non-linear optimization. The predictive framework helps generate better seed values for diverse object dynamics.

An EigenTracker approximates the object motion by an affine model. We use the six affine coefficients as the elements of the state vector X. A commonly used model for state dynamics is a second-order AR process (t represents time): X_t = D_2 X_{t-2} + D_1 X_{t-1} + w_t, where w_t is a zero-mean, white, Gaussian random vector. The measurement is the set of six affine parameters obtained from the image, Z_t = a. Similar to [4], the observation model has Gaussian peaks around each observation, and constant density otherwise.

We use a pyramidal approach for the CONDENSATION-based predictive EigenTracker. We start at the coarsest level.
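The second-order AR prediction step can be sketched as follows. The dynamics matrices D1, D2 and the noise level here are illustrative assumptions (a constant-velocity AR(2) choice), not the values used in the paper:

```python
import numpy as np

rng = np.random.default_rng(0)
STATE_DIM = 6  # the six affine parameters form the state vector X

def predict_samples(s_t1, s_t2, D1, D2, noise_std):
    """Propagate each particle through X_t = D2 X_{t-2} + D1 X_{t-1} + w_t,
    with w_t a zero-mean, white, Gaussian random vector."""
    w = rng.normal(scale=noise_std, size=s_t1.shape)
    return s_t2 @ D2.T + s_t1 @ D1.T + w

# Illustrative AR(2) coefficients: X_t = 2 X_{t-1} - X_{t-2} (constant velocity).
D1 = 2.0 * np.eye(STATE_DIM)
D2 = -1.0 * np.eye(STATE_DIM)

n_particles = 100
s_t1 = rng.normal(size=(n_particles, STATE_DIM))   # samples at time t-1
s_t2 = s_t1 - 0.1                                  # samples at time t-2
pred = predict_samples(s_t1, s_t2, D1, D2, noise_std=0.05)
```

Each predicted particle then seeds the non-linear optimization at the current pyramid level.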

ALGORITHM PREDICTIVE EIGENTRACKER
1. Delineate object of interest
REPEAT FOR ALL frames:
2. Get image MEASUREMENT optimizing
   affine parameters a and
   reconstruction coefficients c
3. IF using Importance Sampling THEN
   optimize ã and c̃ parameters in the
   'importance' eigenspace to compute
   importance MEASUREMENT
4. ESTIMATE new affine parameters
   using output of steps 2 and 3
5. FOR EACH eigenspace:
   IF reconstruction error in (T_1, T_2]
   THEN update eigenspace
6. IF ANY reconstruction error very large
   THEN construct eigenspace afresh
7. PREDICT a for next frame
Fig. 2. Our On-line Predictive EigenTracker: An Overview
Here, we estimate the values of the affine coefficients based on their predicted values and the measurements made at this level. These estimates serve as seeds for the next level of the pyramid. For every frame, we thus get a sampled version of the conditional state density (S_t), and the corresponding weights (Π_t) for CONDENSATION. The state estimate at the finest level is used to generate the predictions for the next frame.
2.2. Initialization and On-line Eigenspace Updates
Accurate tracker initialization is a difficult problem because of multiple moving objects and background clutter. Our system performs fully automatic initialization under certain conditions. In general, one may use motion cues (dominant motion detection), but depending on the particular application, other cues can be used to advantage: in our hand gesture tracker, for example, we augment motion cues with skin colour cues [6] to segment out the moving hand. In most tracking problems, the object of interest undergoes changes in appearance over time. In a hand gesture-based system, for example, it is not feasible to learn all possible hand poses and shapes off-line. Therefore, one needs to learn and update the relevant eigenspaces on the fly. Since a naive O(mN^3) algorithm (for N images having m pixels each) is time-consuming, we use an efficient, scale-space variant of the O(mNk) algorithm (for the k most significant singular values) of Chandrasekaran et al. [7].
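A minimal sketch of a rank-k eigenspace update in this spirit is given below. This is the generic incremental-SVD idea, not the authors' scale-space variant of [7]: folding one new image into an existing basis costs a small (k+1)x(k+1) SVD instead of a recomputation over all N images.

```python
import numpy as np

def update_eigenspace(U, S, x, k):
    """Fold one new image x (an m-vector) into an existing eigenspace with
    orthonormal basis U (m x k) and singular values S (k,), keeping the k
    most significant singular values. Cost: one small (k+1)x(k+1) SVD."""
    coeff = U.T @ x              # coordinates of x in the current basis
    resid = x - U @ coeff        # component of x outside the basis
    rnorm = np.linalg.norm(resid)
    if rnorm < 1e-10:            # x already well represented; nothing to add
        return U, S
    e = resid / rnorm
    # Augmented core matrix: old singular values plus the new image's column.
    k0 = len(S)
    M = np.zeros((k0 + 1, k0 + 1))
    M[:k0, :k0] = np.diag(S)
    M[:k0, -1] = coeff
    M[-1, -1] = rnorm
    Uq, Sq, _ = np.linalg.svd(M)
    U_new = np.hstack([U, e[:, None]]) @ Uq
    return U_new[:, :k], Sq[:k]

# Toy run: a rank-3 eigenspace over 50-pixel "images", updated with one new image.
rng = np.random.default_rng(1)
m, k = 50, 3
U0, S0, _ = np.linalg.svd(rng.normal(size=(m, 5)), full_matrices=False)
U0, S0 = U0[:, :k], S0[:k]
U1, S1 = update_eigenspace(U0, S0, rng.normal(size=m), k)
```

The returned basis stays orthonormal because the residual direction e is, by construction, orthogonal to the columns of U.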
2.3. An Importance Sampling Mechanism
An Importance function augments a tracker operating with one type of measurement with information from an auxiliary measurement source [5]. Each measurement source has its own characteristics and limitations. When combined in an Importance Sampling framework, the two measurement sources complement each other and together enhance the reliability of the tracker. We propose a new uniformity predicate-based Importance Sampling mechanism. Consider a non-convex shape being tracked (Figure 3). We propose the use of an 'Importance eigenspace': this represents an object view sans its background. We optimize the ã and c̃ parameters of the Importance eigenspace to obtain the Importance measurement. An on-line EigenTracker may have problems with changing backgrounds in the bounding parallelogram (and might otherwise end up tracking the background). A combination of the two in an Importance Sampling framework results in more reliable tracking.
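The mixing of the two measurement sources can be sketched as below, following the ICONDENSATION idea of [5]: a fraction of the particles is drawn from the auxiliary (importance) cue, and their weights are corrected by the prior-to-importance density ratio so the combined estimate remains consistent. The mixing fraction and the 1-D densities here are placeholders, not the paper's actual models.

```python
import numpy as np
from math import exp, pi, sqrt

rng = np.random.default_rng(2)

def gauss(x, mu, sigma=1.0):
    return exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * sqrt(2 * pi))

def mixed_sample(prior_samples, importance_sampler, importance_density,
                 prior_density, likelihood, frac_importance=0.3):
    """Draw frac_importance of the particles from the importance function and
    the rest from the dynamics prior; weight every particle by its image
    likelihood, and correct the importance-drawn ones by prior/importance."""
    n = len(prior_samples)
    n_imp = int(frac_importance * n)
    imp = importance_sampler(n_imp)                     # auxiliary cue samples
    pri = prior_samples[rng.choice(n, size=n - n_imp)]  # CONDENSATION prior
    samples = np.concatenate([imp, pri])
    w = np.array([likelihood(x) for x in samples])
    w[:n_imp] *= np.array([prior_density(x) / max(importance_density(x), 1e-12)
                           for x in imp])
    return samples, w / w.sum()

# 1-D toy: prior centred at 0, importance cue centred at 1, observation at 0.5.
prior = rng.normal(size=200)
samples, weights = mixed_sample(
    prior,
    importance_sampler=lambda k: rng.normal(loc=1.0, size=k),
    importance_density=lambda x: gauss(x, 1.0),
    prior_density=lambda x: gauss(x, 0.0),
    likelihood=lambda x: gauss(x, 0.5),
)
```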
2.4. The Overall Tracking Scheme
Figure 2 outlines our overall tracking scheme. In the first frame, we initialize the tracker (Section 2.2). For all subsequent frames, the next step is obtaining the measurements: optimizing the predicted values of the affine coefficients a and the reconstruction coefficients c. We then obtain the Importance measurement, ã and c̃, independent of the measurements of step 2. The measurements of steps 2 and 3 are combined in the Importance Sampling framework to give the final state estimates. We then calculate the reconstruction error (using the robust error norm [1]), and update the eigenspaces if required (steps 5 and 6). Finally, we predict the affine coefficient values for the next frame.
2.5. Symbiotically Augmenting Other Trackers
We extend our EigenTracking framework for use in conjunction with other trackers. The other tracker supplies its affine parameters; ours then optimizes these parameters and returns shape parameters: a tighter-fitting bounding parallelogram. We thus take advantage of the other tracker tracking the same object using a different measurement process or tracking principle. Such a synergistic combination endows the combined tracker with the benefits of both: the EigenTracker as well as the other one, tracking the view changes of an object in a predictive manner. We have experimented using a CONDENSATION tracker and an EigenTracker for cases of restricted affine motion (rotation, translation and scaling; details in Section 3.1.2).

3. APPLICATIONS
We present two important applications of our approach: gesture analysis, and face and person tracking. Our tracker runs on a 700 MHz PIII machine running Linux. In [8], we present some preliminary results of predictive EigenTracking for tracking a moving hand. (Videos: http://www.ee.iitb.ac.in/sumantra/icip04a)
3.1. Hand Gesture Tracking
Figure 1 shows the successful application of our tracker to a hand undergoing extensive shape changes in a typical gesture sequence, filmed against a cluttered background. For the sequence shown in Figures 3(b) and 3(d), the average number of iterations decreases from 7.44 to 4.67 due to prediction. For the face tracking example in Figure 5(b), the improvement is from 12.8 to 12.3.
3.1.1. Incorporating our Importance Sampling Mechanism
The authors in [9] show that human skin occupies a small portion of the entire colour space. For a colour C = [C_b C_r]^T in the YC_bC_r colour space, we learn two likelihood functions, P(C|skin) and P(C|not skin). We then calculate n, a number based on the colour C of a pixel, as n = P(skin|C) / P(not skin|C) [6]. We consider those pixels corresponding to the top p% values of n as skin-coloured pixels. These pixels are used in forming the Importance eigenspace (Section 2.3), as shown in Figure 3(c). The effect of Importance Sampling is evident for cases such as in Figure 3, where the background constitutes a large component of the image of the open hand (a non-convex object) being tracked. The entire hand is better tracked in the latter case (Figure 3(d)).
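The skin-colour importance computation can be sketched as follows. The Gaussian skin likelihood and the uniform non-skin likelihood below are toy stand-ins for the likelihood functions learned from data in [9] and [6]:

```python
import numpy as np

def skin_importance_mask(cb, cr, p_skin, p_not_skin, top_percent=10.0):
    """Compute n = P(skin|C)/P(not skin|C) per pixel (with equal priors the
    posterior ratio reduces to the likelihood ratio) and keep the top p%."""
    n = p_skin(cb, cr) / np.maximum(p_not_skin(cb, cr), 1e-12)
    thresh = np.percentile(n, 100.0 - top_percent)
    return n >= thresh

# Toy likelihoods: a Gaussian blob in Cb-Cr for skin, uniform for non-skin.
def p_skin(cb, cr):
    return np.exp(-((cb - 100.0) ** 2 + (cr - 150.0) ** 2) / (2 * 15.0 ** 2))

def p_not_skin(cb, cr):
    return np.full_like(cb, 0.1)

rng = np.random.default_rng(3)
cb = rng.uniform(16, 240, size=(48, 64))   # chrominance channels of one frame
cr = rng.uniform(16, 240, size=(48, 64))
mask = skin_importance_mask(cb, cr, p_skin, p_not_skin, top_percent=10.0)
```

The resulting mask is what feeds the construction of the Importance eigenspace.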
3.1.2. Synergistic Conjunction with Other Trackers: Restricted Affine Motion (Section 2.5)
We now show experimental results of using the on-line, multi-resolution EigenTracker with a modified version of the skin colour-based CONDENSATION tracker described in [6]. The latter uses a 4-element state vector consisting of the rectangular bounding window parameters. We first compute the principal axis of the pixel distribution of the best fitting blob. We then align the principal axis with the vertical Y-axis and compute the new width, height and centroid. These parameters give us the restricted affine matrix (scaling, rotation, translation): A_restricted = Inv(SRT). We use these parameters as an input to our EigenTracker. The EigenTracker then refines these parameters and computes the reconstruction error. In Figure 4 we show results of successful symbiotic tracking. This scheme allows tracking of large rotations, as evident in Figure 4. It also yields a better fitting window and fewer background pixels, leading to lower eigenspace reconstruction error.
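One plausible reading of this restricted-affine computation is sketched below; the paper does not spell out the exact composition convention for S, R and T, so the ordering and conventions here are assumptions made for illustration:

```python
import numpy as np

def restricted_affine(points):
    """From a blob's pixel coordinates (n x 2), estimate centroid, principal
    axis and extents, then compose a 3x3 restricted affine matrix as
    Inv(S R T): scaling S, rotation R (aligning the principal axis with the
    vertical Y-axis), and translation T to the centroid."""
    c = points.mean(axis=0)                       # blob centroid
    cov = np.cov((points - c).T)
    evals, evecs = np.linalg.eigh(cov)
    major = evecs[:, np.argmax(evals)]            # principal axis direction
    theta = np.arctan2(major[0], major[1])        # angle to the vertical axis
    R = np.array([[np.cos(theta), -np.sin(theta), 0.0],
                  [np.sin(theta),  np.cos(theta), 0.0],
                  [0.0, 0.0, 1.0]])
    aligned = (points - c) @ R[:2, :2].T          # axis-aligned blob
    w, h = aligned.max(axis=0) - aligned.min(axis=0)
    S = np.diag([w, h, 1.0])                      # new width and height
    T = np.array([[1.0, 0.0, -c[0]],
                  [0.0, 1.0, -c[1]],
                  [0.0, 0.0, 1.0]])
    return np.linalg.inv(S @ R @ T)

# Toy blob: anisotropic point cloud around (50, 50).
pts = (np.random.default_rng(5).normal(size=(200, 2))
       @ np.array([[3.0, 1.0], [0.0, 1.0]]) + 50.0)
A_restricted = restricted_affine(pts)
```

The resulting parameters would then seed the EigenTracker's refinement step.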
Fig. 3. Our Importance Sampling mechanism enables the Predictive EigenTracker to track non-convex objects better. (a) Non-predictive EigenTracker (frames 023, 049, 069); (b) Predictive EigenTracker (frames 023, 049, 080); (c) Building the Importance eigenspace (details in text); (d) Predictive EigenTracker with Importance Sampling (frames 023, 061, 076, 086, 101, 113).
3.2. Face Tracking, Person Tracking
In this section, we show examples of our Importance Sampling method for tracking faces and persons across frames in video sequences. In Figures 5(a) and 5(b), we use a skin colour-based importance function for face tracking. The object to be tracked (a face) undergoes motion as well as considerable change in appearance. The on-line predictive EigenTracker with Importance Sampling tracks correctly for all 44 frames: a great improvement over the simple EigenTracker. Additionally, our on-line predictive EigenTracker has learnt different eigenspace views of the object being tracked; this information can be used to recognize the person in other film clips as well.

Fig. 4. Synergistic tracking (Section 2.5), shown at frames 001, 060, 090, 180, 230, 235.

In Figure 5 (person tracking), the uniformity predicate is based on the colour of the person's shirt. Tracking succeeds even though the person moves against a background of a similar colour. In this case, a simple EigenTracker based on a colour predicate alone would have failed, because the background colour is similar to that of the object being tracked. However, the same cue in an Importance Sampling framework enables the person to be tracked correctly. While we have used colour, one may use texture or any other uniformity predicate for the region of interest.
4. REFERENCES
[1] M. J. Black and A. D. Jepson, "EigenTracking: Robust Matching and Tracking of Articulated Objects Using a View-Based Representation," International Journal of Computer Vision, vol. 26, no. 1, pp. 63-84, 1998.
[2] F. De la Torre, J. Vitria, P. Radeva, and J. Melenchon, "Eigenfiltering for Flexible Eigentracking (EFE)," in Proc. International Conference on Pattern Recognition (ICPR), 2000, pp. III:1118-1121.
[3] T. Cootes, G. J. Edwards, and C. Taylor, "Active Appearance Models," in Proc. European Conference on Computer Vision (ECCV), 1998.
[4] M. Isard and A. Blake, "CONDENSATION - Conditional Density Propagation For Visual Tracking," International Journal of Computer Vision, vol. 28, no. 1, pp. 5-28, 1998.
[5] M. Isard and A. Blake, "ICONDENSATION: Unifying Low-level and High-level Tracking in a Stochastic Framework," in Proc. European Conference on Computer Vision (ECCV), 1998, pp. 893-908.
[6] J. Mammen, S. Chaudhuri, and T. Agrawal, "Tracking of both hands by estimation of erroneous observations," in Proc. British Machine Vision Conference (BMVC), 2001.
[7] S. Chandrasekaran, B. S. Manjunath, Y. F. Wang, J. Winkeler, and H. Zhang, "An Eigenspace Update Algorithm for Image Analysis," Graphical Models and Image Processing, vol. 59, no. 5, pp. 321-332, September 1997.
[8] N. Gupta, P. Mittal, S. Dutta Roy, S. Chaudhury, and S. Banerjee, "Developing a gesture-based interface," IETE Journal of Research: Special Issue on Visual Media Processing, 2002.
[9] R. Kjeldsen and J. Kender, "Finding Skin in Color Images," in Proc. Intl. Conf. on Automatic Face and Gesture Recognition, 1996, pp. 312-317.

Fig. 5. Using our Importance Sampling mechanism (Section 2.3) for two applications: face tracking in a sports video (a: without Importance Sampling, failure at the 10th frame, frames 002, 006, 010; b: using Importance Sampling, frames 002, 006, 010, 020, 025, 044), and person tracking in a movie sequence (frames 051, 057, 063, 069, 073, 080).