Hierarchical fuzzy logic based approach for object tracking

Nuno Vieira Lopes (a,b), Pedro Couto (b), Aranzazu Jurio (c), Pedro Melo-Pinto (b)
(a) School of Technology and Management, Polytechnic Institute of Leiria, Morro do Lena, Alto do Vieiro, Apartado 4163, 2411-901 Leiria, Portugal
(b) CITAB - Centre for the Research and Technology of Agro-Environmental and Biological Sciences, University of Trás-os-Montes e Alto Douro, Quinta de Prados, 5000-911 Vila Real, Portugal
(c) Department of Automatic and Computation, Public University of Navarra, Campus Arrosadía, 31006 Pamplona, Spain
Abstract
In this paper a novel tracking approach based on fuzzy concepts is introduced. A methodology for both single and multiple object tracking is presented. The aim of this methodology is to use these concepts as a tool to reduce the complexity usually involved in object tracking problems while maintaining the needed accuracy. Several dynamic fuzzy sets are constructed according to both kinematic and non-kinematic properties that distinguish the object to be tracked. While the kinematic-related fuzzy sets model the object's motion pattern, the non-kinematic fuzzy sets model the object's appearance. The tracking task is performed through the fusion of these fuzzy models by means of an inference engine. This way, the object detection and matching steps are performed exclusively using inference rules on fuzzy sets. In the multiple object methodology, each object is associated with a confidence degree and a hierarchical implementation is performed based on that confidence degree.
Keywords:
Dynamic fuzzy sets, inference engine, hierarchical, multiple object tracking
1. Introduction
Object tracking plays an important role in computer vision. In recent years, extensive research has been conducted in this field and many types and applications of object tracking systems have been proposed in the literature, such as automated surveillance, vehicle navigation, human-computer interaction and traffic analysis [1, 2, 3, 4, 5, 6, 7]. Tracking is essential to many applications, and robust tracking algorithms are still a huge challenge. Difficulties can arise due to noise in images, quick changes in lighting conditions, abrupt or complex object motion, changing appearance patterns of the object and the scene, non-rigid object structures, object-to-object and object-to-scene occlusions, camera motion and real-time processing requirements. Typically, assumptions are made to constrain the tracking problem in the context of particular applications. For instance, almost all tracking algorithms assume that the object motion is smooth, or impose constraints on the object motion so that it is constant in velocity or acceleration. Multiple-view image tracking or prior knowledge about objects, such as size, number or shape, can also be used to simplify the process. In this work, the word "object" refers to the template image pattern being tracked (e.g. a person's hair, a briefcase, etc.).
Normally, tracking is seen as a main task involving several subtasks such as image segmentation for object detection, object matching and object position estimation. A myriad of algorithms has been developed to implement these subtasks, but each one has its strengths and weaknesses and, in recent years, extensive research has been carried out in this field to find optimal tracking systems for specific applications. Many tracking techniques have been proposed in the literature; however, they are not completely accurate for all kinds of scenarios and only provide good results when a certain number of assumptions hold. Moreover, tracking methodologies that are not designed for particular applications, where specific and well-established assumptions or constraints can easily be imposed, tend to be very complex. These reasons are the motivation to study and implement new tracking approaches where the introduction of soft computing techniques, such as fuzzy logic, is intended for:
• Reducing the tracking task complexity by endowing the methodology with the capability of incorporating reasoning in the same sense that human reasoning simplifies real tracking problems (e.g. most tracking problems are not complex for humans; they are indeed trivial in most situations).
• Endowing the methodology with the needed scalability in order to cope with the specific needs of different tracking problems by easily adding, changing or adapting the fuzzy sets used while maintaining its general framework.
The presented methodology is intended to be an easy and feasible general framework for object tracking that can easily be adapted to specific applications or problems.
Email addresses: nuno.lopes@ipleiria.pt (Nuno Vieira Lopes), pcouto@utad.pt (Pedro Couto), aranzazu.jurio@unavarra.es (Aranzazu Jurio), pmelo@utad.pt (Pedro Melo-Pinto)
The remainder of this paper is organized as follows. In Section 2 the definition and a review of object tracking are presented. A general description of fuzzy set theory is presented in Section 3. The proposed approach is presented in Section 4. A possible implementation of the proposed approach is given in Section 5. Section 6 shows the experimental results to illustrate the effectiveness of the proposed approach, and a comparative study with well-known tracking approaches is performed. Finally, Section 7 presents the final conclusions and future directions.
Preprint submitted to Elsevier September 6, 2013
2. Object Tracking
Object tracking can be described as the problem of estimating the trajectory of an object as it moves around a scene. Although this general concept is almost consensual, the specific definition of tracking can change in the literature. Nevertheless, tracking systems must address two basic processes: figure-ground segmentation and temporal correspondences [8]. Figure-ground segmentation is the process of extracting the objects of interest from the video frame. Segmentation methods are applied as the first step in many tracking systems and therefore they are a crucial task. Object detection can be based on motion [9, 10], appearance [11, 12, 13], etc. Temporal correspondence concerns the association of the detected objects in the current frame with those in the previous frames, defining temporal trajectories [14, 15].
In [16], tracking is described as a motion problem and a matching problem. In that work, the motion problem is related to the prediction of the object location in the next frame. The second step is similar to the temporal correspondence explained above. However, [1] and [17] present a wider description of tracking with three steps: detection of interesting objects, tracking of such objects and analysis of object tracks to recognize their behavior. In [18] this behavior analysis is seen as a further interpretation of tracking results.
The selection of the most suitable feature to track plays a critical role in tracking systems. The uniqueness of such a feature provides an easy way to distinguish the object in the scene over time. Properties such as intensity, color, gradient, texture or motion are commonly used to perform object tracking.
According to their properties, object tracking approaches can be categorized into three groups: point, kernel and motion based approaches.
2.1. Point based tracking
Point based tracking approaches are suitable for tracking objects that occupy small regions in an image or that can be represented by several distinctive points. These points must be representative of the object and invariant to changes in illumination, object orientation and camera viewpoint. Points denoting a significant gradient in intensity are preferred and commonly used by different detectors such as Harris [19], KLT [20] and SIFT [21]. To deal with the point correspondence problem between frames, deterministic constraints such as proximity, maximum velocity and small velocity change can be used. An alternative is to use statistical methods such as Kalman or particle filters. The KLT and SIFT approaches provide internal methodologies to address the correspondence problem. The scale-invariant feature transform (SIFT) is a well-known algorithm for object recognition and tracking. Interesting points are extracted from the object to provide a set of descriptors. These descriptors must be detected in the new image even under clutter, partial occlusion and uniform object scaling and rotation. In order to reduce computation time, a search region in the next frame is defined according to the last known location or based on a motion model of the object. However, this method would typically not work with deformable or articulated objects, since the relative positions between the descriptors differ from the original representation. To overcome this limitation an update scheme can be used, in which the object descriptors are recomputed after a predefined elapsed time.
2.2. Kernel based tracking
This approach requires a template or an appearance model of the object. Template tracking consists of searching in the current image for a region similar to the object template. The position and, consequently, the object matching between two consecutive frames is achieved by computing a similarity measure such as the cross-correlation. The cross-correlation concept is presented in [13]. Instead of templates, other object representations can be used for matching, for instance, color, color statistics, texture or histogram based information. The mean shift tracking algorithm is an efficient approach to tracking objects whose appearance can be described using histograms [22]. This iterative method maximizes the appearance similarity by comparing the histograms of the object and the region around the predicted object location. The Bhattacharyya and Kullback-Leibler distances are commonly employed to measure the similarity between the template and the current target region. The method fails in the case of occlusions and quick appearance changes.
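As an illustration of the two histogram measures mentioned above, the following is a minimal sketch, not taken from the paper. It assumes normalized histograms stored as NumPy arrays, uses one common form of the Bhattacharyya distance (the square root of one minus the Bhattacharyya coefficient), and the function names and the `eps` smoothing constant are hypothetical choices.

```python
import numpy as np

def normalize(hist):
    """Scale a non-negative histogram so that its bins sum to 1."""
    hist = np.asarray(hist, dtype=float)
    return hist / hist.sum()

def bhattacharyya_distance(p, q):
    """sqrt(1 - BC), where BC is the Bhattacharyya coefficient of p and q."""
    bc = np.sum(np.sqrt(p * q))  # coefficient lies in [0, 1]
    return float(np.sqrt(max(0.0, 1.0 - bc)))

def kl_divergence(p, q, eps=1e-12):
    """Kullback-Leibler divergence D(p || q); eps (an assumption) avoids log(0)."""
    p = p + eps
    q = q + eps
    return float(np.sum(p * np.log(p / q)))

# Hypothetical 4-bin grey-level histograms.
template = normalize([4, 10, 6, 0])
same = normalize([8, 20, 12, 0])    # same distribution, different pixel count
other = normalize([0, 2, 6, 12])    # clearly different region

d_same = bhattacharyya_distance(template, same)
d_other = bhattacharyya_distance(template, other)
```

Both measures are close to zero for regions with the same grey-level distribution and grow as the candidate region departs from the template, which is the property a histogram based tracker relies on.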
2.3. Motion based tracking
This group of approaches performs tracking based on the displacement or optical flow of image pixels. The optical flow of a pixel is a motion vector represented by the translation between a pixel in one frame and its corresponding pixel in the following frame. This computation has proved to be difficult to achieve due to issues such as the brightness constancy assumption and the aperture problem. The classic Kanade-Lucas-Tomasi (KLT) tracking algorithm was first proposed by Lucas and Kanade in 1981, refined by Tomasi and Kanade in 1991 and explained in detail by Shi and Tomasi in 1994 [20]. The method proposed by Lucas and Kanade computes the optical flow for each pixel of an image, while the method proposed by Tomasi and Kanade, known as KLT, extracts optimal points in the image and then computes the optical flow on the subsequent images only for this subset of points. The KLT is a complete method that provides a solution for two problems in computer vision: the problem of optimal selection of suitable points in an image and the problem of determining the correspondence between points in consecutive frames. It has little tolerance to image brightness variation and difficulty in detecting rapid object movements. Tracking moving objects can also be achieved by constructing a reference representation of the environment, called the background model, and then finding deviations between this model and each incoming frame. A significant change between the background model and an image region denotes a moving object. This process is referred to as background subtraction and is a popular method, especially in situations with a relatively static background. An alternative approach to detect changes and, consequently, movement between two consecutive intensity image frames I(x, y, t) and I(x, y, t − 1), taken at times t and t − 1, respectively, is to perform a pixelwise difference operation. Frame differencing is very adaptive to dynamic environments, but generally does a poor job of extracting all the relevant pixels, i.e., there may be holes left inside slowly moving objects.
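The pixelwise difference operation, and the holes it can leave, can be sketched as follows. This is an illustrative example only: the function name, the threshold value and the synthetic 6 × 6 frames are assumptions, not part of the paper.

```python
import numpy as np

def frame_difference_mask(prev_frame, curr_frame, threshold=25):
    """Boolean mask of pixels whose intensity changed by more than `threshold`."""
    # Cast to a signed type so that subtracting uint8 images cannot wrap around.
    diff = np.abs(curr_frame.astype(np.int16) - prev_frame.astype(np.int16))
    return diff > threshold

# Synthetic frames: a uniform 3x3 bright object moves one pixel to the right.
prev_frame = np.full((6, 6), 50, dtype=np.uint8)
curr_frame = prev_frame.copy()
prev_frame[1:4, 1:4] = 200   # object at time t - 1
curr_frame[1:4, 2:5] = 200   # object at time t

mask = frame_difference_mask(prev_frame, curr_frame)
```

Only the column the object left and the column it entered are flagged; the overlapping interior of the slowly moving object is unchanged and therefore missing from the mask, which is the hole effect described above.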
Since its emergence, fuzzy logic theory has been successfully applied in a wide range of areas such as process control systems, automotive navigation systems, information retrieval systems and image processing. As presented above, a tracking system can be seen as a multi-stage process that comprises figure-ground segmentation and temporal correspondences. Hence, fuzzy logic can be used in these two different stages. [23] assigned a membership degree to each pixel using the relationship between its grey value and the mean grey value of the region to which it belongs. For each grey level a fuzzy set is constructed, and the optimal threshold value is the grey level associated with the fuzzy set with the lowest entropy. [24] proposed a segmentation approach using an extension of fuzzy set theory, the so-called Atanassov's Intuitionistic Fuzzy Sets, for representing the uncertainty of the expert in determining whether a pixel belongs to the background or to the object. The optimal threshold value is associated with the intuitionistic fuzzy set of lowest entropy. In [25, 26] an automatic histogram threshold approach based on fuzziness measures is presented. In [27] an active sonar system to track submarines using a Kalman filter and a posterior fuzzy rule logic association is presented. A fuzzy approach to assign one or several blobs to a track for automatic surveillance in airport areas is described in [28]. In previous work [29, 30, 31], a multi-feature tracking approach using dynamic fuzzy sets was introduced; however, no hierarchical scheme was implemented.
3. Fuzzy Sets Theory
In 1965, fuzzy sets were introduced by Zadeh [32] to repre-
sent or manipulate data and information containing nonstatisti-
cal uncertainties. This theory was specifically created to math-
ematically represent uncertainty and vagueness and to provide
tools for dealing with the imprecision intrinsic to many prob-
lems.
A classical (crisp) set is defined as a collection of elements $x \in X$ where each single element can either belong to or not belong to a set $A$, $A \subseteq X$. However, fuzzy sets have more flexible membership requirements, allowing the elements to have partial memberships between 0 and 1 rather than only the memberships 0 and 1 as in classical sets.
Let $X = \{x_1, ..., x_n\}$ be an ordinary finite non-empty set. A fuzzy set $\tilde{A}$ in $X$ is a set of ordered pairs
$$\tilde{A} = \{(x, \mu_{\tilde{A}}(x)) \mid x \in X\},$$
where $\mu_{\tilde{A}} : X \to [0, 1]$ represents the membership function.
A fuzzy set $\tilde{A}$ is said to be empty, written $\tilde{A} = \emptyset$, if and only if
$$\mu_{\tilde{A}}(x) = 0, \quad \forall x \in X \tag{1}$$
Two fuzzy sets $\tilde{A}$ and $\tilde{B}$ in $X$ are equal, written $\tilde{A} = \tilde{B}$, if and only if
$$\mu_{\tilde{A}}(x) = \mu_{\tilde{B}}(x), \quad \forall x \in X \tag{2}$$
Instead of writing $\mu_{\tilde{A}}(x) = \mu_{\tilde{B}}(x), \forall x \in X$, it can be written, more simply, $\mu_{\tilde{A}} = \mu_{\tilde{B}}, \forall x \in X$.
The membership function $\mu_{\tilde{A}}$ is also called grade of membership, degree of compatibility or degree of truth. The range of this function is a subset of the nonnegative real numbers whose supremum is finite, normally 1. The basic operations in fuzzy set theory are the complement, intersection and union. Since the membership function is the crucial component of a fuzzy set, it is not surprising that operations with fuzzy sets are defined via their membership functions. These concepts, first suggested in [32], constitute a consistent framework for the theory of fuzzy sets. However, they are not unique, since Zadeh and other authors have suggested consistent alternative or additional definitions for fuzzy set operations. The complement of a fuzzy set $\tilde{A}$ in $X$, written $\neg\tilde{A}$, is the fuzzy set
$$\neg\tilde{A} = \{(x, \mu_{\neg\tilde{A}}(x) = 1 - \mu_{\tilde{A}}(x)) \mid x \in X\}. \tag{3}$$
The intersection of two fuzzy sets $\tilde{A}$ and $\tilde{B}$ in $X$, written $\tilde{A} \cap \tilde{B}$, is the fuzzy set
$$\tilde{A} \cap \tilde{B} = \{(x, \mu_{\tilde{A} \cap \tilde{B}}(x) = \wedge(\mu_{\tilde{A}}(x), \mu_{\tilde{B}}(x))) \mid x \in X\}, \tag{4}$$
where $\wedge$ is the minimum operator.
The union of two fuzzy sets $\tilde{A}$ and $\tilde{B}$ in $X$, written $\tilde{A} \cup \tilde{B}$, is the fuzzy set
$$\tilde{A} \cup \tilde{B} = \{(x, \mu_{\tilde{A} \cup \tilde{B}}(x) = \vee(\mu_{\tilde{A}}(x), \mu_{\tilde{B}}(x))) \mid x \in X\}, \tag{5}$$
where $\vee$ is the maximum operator.
When dealing exclusively with fuzzy sets, the symbol $\sim$ could be omitted.
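Equations (3) to (5) translate directly into element-wise operations on arrays of membership degrees. The sketch below is illustrative only; it assumes a finite universe whose membership degrees are stored in NumPy arrays, and the function names and sample values are hypothetical.

```python
import numpy as np

def fuzzy_complement(mu_a):
    """Eq. (3): membership of the complement is 1 - mu_A(x)."""
    return 1.0 - mu_a

def fuzzy_intersection(mu_a, mu_b):
    """Eq. (4): element-wise minimum of the two membership functions."""
    return np.minimum(mu_a, mu_b)

def fuzzy_union(mu_a, mu_b):
    """Eq. (5): element-wise maximum of the two membership functions."""
    return np.maximum(mu_a, mu_b)

# Membership degrees of four elements x1..x4 in two fuzzy sets.
mu_a = np.array([0.0, 0.3, 0.8, 1.0])
mu_b = np.array([0.5, 0.3, 0.2, 1.0])

mu_and = fuzzy_intersection(mu_a, mu_b)   # [0.0, 0.3, 0.2, 1.0]
mu_or = fuzzy_union(mu_a, mu_b)           # [0.5, 0.3, 0.8, 1.0]
mu_not_a = fuzzy_complement(mu_a)         # [1.0, 0.7, 0.2, 0.0]
```

With these definitions the De Morgan relation holds: the complement of the union equals the intersection of the complements.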
General operators for the intersection and union of fuzzy sets are referred to as triangular norms (t-norms) and triangular conorms (t-conorms or s-norms), respectively. A function
$$t : [0, 1] \times [0, 1] \to [0, 1], \tag{6}$$
satisfying, for each $a, b, c, d \in [0, 1]$, the following properties:
P1. it has 1 as the unit element: $t(a, 1) = a$;
P2. it is monotone: $t(a, b) \le t(c, d)$ if $a \le c$ and $b \le d$;
P3. it is commutative: $t(a, b) = t(b, a)$;
P4. it is associative: $t[t(a, b), c] = t[a, t(b, c)]$.
is called a t-norm.
Some relevant examples of t-norms are given in [33]:
1. the minimum: $t(a, b) = a \wedge b = \min(a, b)$, which was proposed by [32].
2. the algebraic product: $t(a, b) = a \cdot b$
3. the Łukasiewicz t-norm: $t(a, b) = \max(0, a + b - 1)$
A function
$$s : [0, 1] \times [0, 1] \to [0, 1], \tag{7}$$
satisfying, for each $a, b, c, d \in [0, 1]$, the following properties:
P1. it has 0 as the unit element: $s(a, 0) = a$;
P2. it is monotone: $s(a, b) \le s(c, d)$ if $a \le c$ and $b \le d$;
P3. it is commutative: $s(a, b) = s(b, a)$;
P4. it is associative: $s[s(a, b), c] = s[a, s(b, c)]$.
is called a t-conorm or s-norm.
Some relevant examples of t-conorms are also given in [33]:
1. the maximum: $s(a, b) = a \vee b = \max(a, b)$, which was proposed by [32].
2. the probabilistic sum: $s(a, b) = a + b - ab$
3. the Łukasiewicz s-norm: $s(a, b) = \min(a + b, 1)$
Note that a t-norm is dual to an s-norm in that:
$$s(a, b) = 1 - t(1 - a, 1 - b). \tag{8}$$
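The duality in Eq. (8) can be checked numerically for the three pairs listed above. The following sketch is illustrative only; the function names are hypothetical and the check is made on a small grid of values rather than proved in general.

```python
def t_min(a, b):
    return min(a, b)

def t_product(a, b):
    return a * b

def t_lukasiewicz(a, b):
    return max(0.0, a + b - 1.0)

def s_max(a, b):
    return max(a, b)

def s_probabilistic_sum(a, b):
    return a + b - a * b

def s_lukasiewicz(a, b):
    return min(a + b, 1.0)

def dual_s_norm(t):
    """Build the s-norm dual to the t-norm `t` using Eq. (8)."""
    return lambda a, b: 1.0 - t(1.0 - a, 1.0 - b)

PAIRS = [(t_min, s_max),
         (t_product, s_probabilistic_sum),
         (t_lukasiewicz, s_lukasiewicz)]
GRID = [0.0, 0.25, 0.5, 0.75, 1.0]

# Largest deviation between each listed s-norm and the dual of its t-norm.
max_error = max(abs(dual_s_norm(t)(a, b) - s(a, b))
                for t, s in PAIRS for a in GRID for b in GRID)
```

On this grid `max_error` is zero up to floating-point rounding, consistent with each listed s-norm being the dual of the t-norm with the same position in its list.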
4. Proposed Methodology
The implementation of this approach is based on some underlying assumptions. These assumptions are commonly used in most tracking systems:
1. The object has constant grey-level intensity;
2. The object presents smooth motion;
3. For the sake of simplicity, the motion between two consecutive frames can be described using a linear motion model;
4. The area occupied by the object is small when compared with the total image area;
5. The size of the object is preserved during the sequence.
In this approach object brightness constancy is assumed. This situation can be described as
$$I(x, y, t) \approx I(x + \delta x, y + \delta y, t + \delta t), \tag{9}$$
where $\delta x$ and $\delta y$ are the displacements of the local region at $(x, y, t)$ after time $\delta t$.
Nevertheless, small changes in illumination, camera sensor noise and other factors that cause variations in the intensity of the object are tolerated.
The smoothness of the movement concerns the continuity of
the object movement. The object movement is assumed to be
continuous and, therefore, using a typical acquisition frame rate
and assuming there are no occlusions or misdetections, the next
position of the object lies inside a neighborhood of its previous
position.
It is also assumed that the object movement between two consecutive frames can be represented by a linear motion model with constant acceleration. The object can move along both the x and y axes and, therefore, the position $p(t)$ can be obtained from the previous position $p(t - 1)$ using the following equation:
$$p(t) = p(t - 1) + v\,\Delta t + \frac{1}{2} a\,\Delta t^{2}, \tag{10}$$
where $p(t) = [x, y]^{\top}$ is the object position at instant $t$, $p(t - 1) = [x_0, y_0]^{\top}$ is the object position at instant $t - 1$, $\Delta t$ is the elapsed time from instant $t - 1$ to instant $t$, and $v = [v_x, v_y]^{\top}$ and $a = [a_x, a_y]^{\top}$ are, respectively, the observed velocity and acceleration along both axes during $\Delta t$.
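Equation (10) amounts to a one-step position prediction. A minimal sketch, assuming positions, velocities and accelerations are given as 2-element vectors in pixel units and that one frame interval corresponds to an elapsed time of 1 (the function name and sample values are hypothetical):

```python
import numpy as np

def predict_position(p_prev, v, a, dt=1.0):
    """Constant-acceleration prediction: p(t) = p(t-1) + v*dt + 0.5*a*dt^2."""
    p_prev, v, a = (np.asarray(x, dtype=float) for x in (p_prev, v, a))
    return p_prev + v * dt + 0.5 * a * dt ** 2

# Object at (10, 20), moving 2 px/frame in x and -1 px/frame in y,
# accelerating by 0.5 px/frame^2 in x only.
p_next = predict_position(p_prev=[10.0, 20.0], v=[2.0, -1.0], a=[0.5, 0.0])
# p_next is [12.25, 19.0]
```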
The size of the object is considerably small when compared with the total image area. Assuming this, the object can be represented as a point or by a small A × B matrix, and strategies similar to the ones used in point correspondence can be developed for object matching. In the examples presented in this work, for simplicity's sake, we take A = B = 3.
4.1. Single-object Tracking
In this methodology, the visual descriptors used to track objects are divided into two distinct groups: kinematic and non-kinematic descriptors. Non-kinematic descriptors are used to describe the object's appearance and kinematic descriptors are used to describe the object's motion. At the beginning of the process, these two sources of information are treated separately. At the final stage of the process they are combined by a fuzzy inference engine that ultimately provides the estimated position of the tracked object (Fig. 1).

Figure 1: Single-object tracking methodology scheme.
The methodology scheme presented in Fig. 1 can be divided into three main stages:
• Fuzzification of all the descriptors used. This way the methodology is better able to deal with the uncertainty and imprecision present in the images (and consequently in the descriptors).
• A fuzzy operations stage where descriptors can be combined according to their precision (higher memberships are more likely to prevail over weaker memberships).
• An inference engine that is responsible for fusing the information provided by all the descriptors used (both kinematic and non-kinematic). Through this inference engine, the methodology is able to incorporate reasoning, in the same sense as human reasoning, in the tracking process by providing answers based on the existing knowledge base (fuzzified descriptors).
The non-kinematic properties used can be chosen according to the specificity of the problem at hand. There is a myriad of possibilities, such as color, shape, texture and size, among others, that can be used as visual descriptors of the object appearance. A fuzzy set is constructed to model each one of the chosen properties. These fuzzy sets represent the membership of all the candidate image positions to the object being tracked. The set of fuzzy operations performed on these fuzzy sets (in their simplest form, fuzzy unions and intersections) is used to combine them and, as a consequence, reduce their number. This dimensionality reduction is an additional advantage, since the lower the number of fuzzy sets at the output of this fuzzy operations block (Fig. 1), the less complex the inference engine block will be (i.e. fewer rules will be necessary).
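To make the fuzzification of a non-kinematic property concrete, the sketch below builds a fuzzy set over all candidate image positions from grey-level similarity. It is an illustration under stated assumptions, not the construction used by the authors: the triangular membership shape, the `tolerance` parameter and the function name are all hypothetical.

```python
import numpy as np

def grey_level_membership(image, object_grey, tolerance=30.0):
    """Triangular membership: 1 at the object's grey level, 0 beyond `tolerance`."""
    distance = np.abs(image.astype(float) - object_grey)
    return np.clip(1.0 - distance / tolerance, 0.0, 1.0)

# Hypothetical 3x3 image; the tracked object has grey level 200.
image = np.array([[10,  10,  10],
                  [10, 200, 185],
                  [10,  10, 120]], dtype=np.uint8)

mu_grey = grey_level_membership(image, object_grey=200)
# mu_grey[1, 1] is 1.0 (exact match), mu_grey[1, 2] is 0.5 (slightly darker),
# and every other position has membership 0.0.
```

A membership that decays gradually, rather than a hard threshold, is one way to tolerate small intensity variations of the object.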
The kinematic properties undergo a similar process in the methodology (Fig. 1). As with the non-kinematic properties, the kinematic properties can be chosen according to the problem at hand. Object velocity, acceleration and other motion patterns can be used as kinematic properties.
The final processing block of the methodology consists of an inference engine whose complexity (number of rules) depends on the number of inputs. The design of the inference engine should enable the process to model the system so as to achieve a good balance between the information provided by the object's kinematic and non-kinematic properties. This way, depending on the problem at hand, through the design of this inference engine, the method is able to incorporate some reasoning, valuing either the kinematic or the non-kinematic object descriptors depending on their importance. Also, the certainty one has regarding the information provided by each one of these sources (kinematic and non-kinematic) can be incorporated in the design of this engine (human reasoning).
The output of the inference engine must always be a single fuzzy set. The position of the tracked object is then obtained by maximizing the membership function of this fuzzy set.
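The final step, fusing the two sources into a single fuzzy set and taking the position of maximum membership, can be sketched as follows. This is a deliberately simplified stand-in: a single rule that combines both inputs with the minimum t-norm replaces the inference engine, whose actual rules are not given in this section. The linear kinematic membership, the `radius` parameter and all names are assumptions.

```python
import numpy as np

def kinematic_membership(shape, predicted, radius=3.0):
    """Membership decreasing linearly with distance from the predicted position."""
    rows, cols = np.indices(shape)
    dist = np.hypot(rows - predicted[0], cols - predicted[1])
    return np.clip(1.0 - dist / radius, 0.0, 1.0)

def estimate_position(mu_appearance, mu_kinematic):
    """Fuse both fuzzy sets (minimum t-norm) and return the argmax position."""
    fused = np.minimum(mu_appearance, mu_kinematic)
    index = np.unravel_index(np.argmax(fused), fused.shape)
    return tuple(int(i) for i in index), fused

# Hypothetical appearance fuzzy set with two candidate positions.
mu_appearance = np.zeros((5, 5))
mu_appearance[1, 1] = 0.9   # similar-looking distractor, far from the prediction
mu_appearance[3, 3] = 0.8   # the tracked object

mu_kin = kinematic_membership((5, 5), predicted=(3, 2))
position, fused = estimate_position(mu_appearance, mu_kin)
# position is (3, 3)
```

Although the distractor has the higher appearance membership, the kinematic fuzzy set lowers its fused membership, so the maximum falls on the candidate that agrees with both sources.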
4.2. Multi-object Tracking
Based on the previously described methodology for single-object tracking, a multiple-object tracking methodology was developed. For each tracked object, the methodology presented in Fig. 1 is applied and its results (a fuzzy set for each object) are the input of a hierarchical matching system responsible for establishing the correspondences between objects from frame to frame. This hierarchical matching system (Fig. 2) is mainly constituted by a confidence assessment scheme that assigns objects confidence degrees according to the correspondence situation from which the object's position is estimated. To correctly establish these correspondences, there are situations where the input fuzzy set is not sufficient and it is crucial to know the rule used to create this fuzzy set (dashed arrows in Fig. 2). These correspondence situations are described in the remainder of this section.

Figure 2: Multi-object tracking methodology scheme.
In multiple object tracking several correspondence situations can occur. Fig. 3 depicts these situations, where × denotes the object position at frame t and the other marker denotes the object position at frame t − 1. The question mark (?) represents the absence of matching at frame t. The first situation, depicted in Fig. 3a, indicates that each object is matched with a different candidate position in the next frame, and the current position of the object, at frame t, will be the position of the corresponding candidate. Sometimes different objects in frame t − 1 will be assigned to the same point in frame t. When two moving objects pass close to each other, when one object occludes another, or due to the representation of a 3D world in a 2D plane, they can appear as just one region in the image. This situation could be seen as a merging of objects or an inter-object occlusion case (Fig. 3b). The opposite situation is also considered, i.e., several united objects could have different motion directions, and one single region representing multiple objects could result in multiple matching positions. It could be seen as a split of objects (Fig. 3c). Finally, if at some instant the situation depicted in Fig. 3d occurs, i.e., there is no candi-

References
Fuzzy sets (book)
Distinctive Image Features from Scale-Invariant Keypoints (journal article)
A Computational Approach to Edge Detection (journal article)