
Towards Autonomous Bootstrapping for
Life-long Learning Categorization Tasks
Stephan Kirstein, Heiko Wersing and Edgar Körner
Abstract—We present an exemplar-based learning approach
for incremental and life-long learning of visual categories. The
basic concept of the proposed learning method is to subdivide
the learning process into two phases. In the first phase we utilize
supervised learning to generate an appropriate category seed,
while in the second phase this seed is used to autonomously
bootstrap the visual representation. This second learning phase
is especially useful for assistive systems like a mobile robot,
because the visual knowledge can be enhanced even if no
tutor is present. Although for this autonomous bootstrapping
no category labels are provided, we argue that contextual
information is beneficial for this process. Finally we investigate
the effect of the proposed second learning phase with respect
to the overall categorization performance.
I. INTRODUCTION
In recent decades a wide variety of category learning paradigms have been proposed, ranging from generative [10], [14] to discriminative models [6], [18]. However, most research on this topic has so far focused on supervised learning. The major advantage of supervised over unsupervised learning is the higher categorization performance, while the time-consuming and costly collection of accurately labeled training data is its fundamental drawback. In the context of assistive systems this means that whenever the system should enhance its category representation, a tutor has to specify the corresponding labels. Although we consider the interaction with a tutor a necessary part of the early learning phase, we want to enable the system to bootstrap its acquired category representation more and more autonomously. Therefore, in this paper we investigate the combination of semi-supervised and life-long learning to reduce the necessity of tutor interactions.
The basic idea of semi-supervised learning is to combine supervised with unsupervised learning [12], [2]. The advantage of this combination is typically a considerably higher performance compared to purely data-driven unsupervised methods, while the labeling effort can be strongly reduced. Typically for semi-supervised learning the initial representation is trained based on the labeled portion of the training data. Afterwards this initial representation is utilized to estimate the correct class labels for the unlabeled portion of the training data. Commonly, only unlabeled training examples with high classifier confidence are used for the bootstrapping. This guarantees a low amount of errors in the estimated labels, but such data is most probably less useful for enhancing the classifier performance, because it is already well represented [17]. To overcome this limitation, semi-supervised learning can be extended by active learning [13], [15], where the learning system requests the tutor-driven labeling for the currently worst represented training data.

Stephan Kirstein is with the Honda Research Institute Europe GmbH, Carl-Legien-Strasse 30, 63073 Offenbach, Germany (email: stephan.kirstein@honda-ri.de). Heiko Wersing is with the Honda Research Institute Europe GmbH (email: heiko.wersing@honda-ri.de). Edgar Körner is with the Honda Research Institute Europe GmbH (email: edgar.koerner@honda-ri.de).
In contrast to this, we propose to use temporal context information to overcome this limitation, rather than requesting additional user interactions. To use the temporal context, object views that belong to the same physical object have to be identified first. In offline experiments this can typically be achieved easily. For an autonomous system this requires tracking the object over a longer period, so that the corresponding views most probably belong to the same physical object. Based on this list of object views, a majority voting can be applied. The advantage of such voting is that not only already well-represented views are added to the training ensemble, but also currently wrongly categorized views of the same object. We believe that such a combination has the highest potential effect with respect to increasing the categorization performance.
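For illustration, the majority vote over the tracked views of one physical object might look as follows; the function name and the string labels are invented for this sketch and are not the paper's actual implementation:

```python
from collections import Counter

def majority_vote_labels(view_predictions):
    """Assign the majority label over all tracked views of one physical
    object to every view, so that currently misclassified views also
    enter the training ensemble with the (probably) correct label."""
    majority, _ = Counter(view_predictions).most_common(1)[0]
    return [majority] * len(view_predictions)

# Views of one tracked object: most views are recognized as "cup",
# two are misclassified; after voting, all views carry the label "cup".
preds = ["cup", "cup", "can", "cup", "bottle", "cup"]
print(majority_vote_labels(preds))
```

This is precisely how wrongly categorized views of a tracked object receive a usable label without any tutor interaction.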
Although semi-supervised learning is a common learning technique (see [19] for an overview), it has so far received much less interest in the context of incremental and life-long learning. We consider the ability to increase the visual knowledge in a life-long learning fashion a basic requirement for an autonomous system. Nevertheless, combining semi-supervised with life-long learning is more challenging compared to typical semi-supervised learning approaches. This is because for life-long learning tasks the learning method commonly has access to only a limited amount of training data, so that the bootstrapping is normally based purely on the unlabeled training views and their autonomously assigned label information. This is in contrast to typical semi-supervised approaches, where the labeled and unlabeled training views are combined into one single training set. Furthermore, to cope with the “stability-plasticity dilemma” [1] of life-long learning tasks, on the one hand stability considerations are required to avoid the “catastrophic forgetting effect” [3] in the learned representation, while for the plasticity the allocation of new network resources is necessary. It is obvious that this resource allocation is considerably more difficult if the label information is unreliable, as is the case for the unsupervised training data.
The paper is structured in the following way. In the next Section II we briefly explain our category learning vector quantization (cLVQ) framework. Afterwards the modifications of the basic cLVQ approach and the context-dependent estimation of category labels are described in Section III. In Section IV the experimental results are summarized, and they are discussed in Section V.

Fig. 1. Illustration of the Category Learning Framework. The learning with our proposed category learning vector quantization (cLVQ) approach is based on a limited and changing training set. Based on the currently available training vectors x^i and the corresponding target labels t^i, the cLVQ incrementally allocates new representation nodes and category-specific features. The selected feature sets for each category c enable an efficient separation of co-occurring categories (e.g. if an object belongs to several categories, which is the standard setting in our experiments) and the definition of various metrical “views” on a single node w^k. The categorization decision itself is based on the allocated cLVQ nodes w^k and the low-dimensional category-specific feature spaces. (The figure maps high-dimensional feature vectors, with positive and negative representatives of several objects, through the cLVQ nodes w^1, ..., w^K onto low-dimensional subspaces for categories 1, ..., C.)
II. CATEGORY LEARNING VECTOR QUANTIZATION
Our proposed category learning approach [8] enables interactive and life-long learning and therefore can be utilized for autonomous systems, but so far we have only considered supervised learning based on interactions with a human tutor. In the following we briefly describe the learning framework as illustrated in Fig. 1. In the present paper we utilize this framework for creating the category seed in a purely supervised fashion. The proposed learning approach is based on an exemplar-based incremental learning network combined with a forward feature selection method, to enable incremental and life-long learning of arbitrary categories. Both parts are optimized together to find a balance between the insertion of features and the allocation of representation nodes, while using as few resources as possible. In the following we refer to this architecture as category learning vector quantization (cLVQ).
To achieve the interactive and incremental learning capability, the exemplar-based network part of the cLVQ method is used to approach the “stability-plasticity dilemma” of life-long learning problems. Thus we define a node insertion rule that automatically determines the number of required representation nodes. The final number of allocated nodes w^k and the assigned category labels u^k correspond to the difficulty of the different categories itself, but also to the within-category variance. Finally, the long-term stability of these incrementally learned nodes is considered based on an individual node learning rate Θ^k, as proposed in [7].
Additionally, a category-specific forward feature selection method is used to enable the separation of co-occurring categories, because it defines category-specific metrical “views” on the representation nodes of the exemplar-based network. During the learning process it selects low-dimensional subsets of features by predominantly choosing features that occur almost exclusively for this particular category. Furthermore, only these selected category-specific features are used to decide whether a particular category is present or not, as illustrated in Fig. 1. For guiding this selection process a feature scoring value h_cf is calculated for each category c and feature f. This scoring value is based only on previously seen exemplars of a certain category, and it can strongly change if further information is encountered. Therefore a continuous update of the h_cf values is required to follow this change.

A. Distance Computation and Learning Rule
The learning in the cLVQ architecture is based on a set of high-dimensional and sparse feature vectors x^i = (x^i_1, ..., x^i_F), where F denotes the total number of features. Each x^i is assigned to a list of category labels t^i = (t^i_1, ..., t^i_C). We use C to denote the current number of represented color and shape categories, whereas each t^i_c ∈ {−1, 0, +1} labels x^i as a positive or negative example of category c. The third state t^i_c = 0 is interpreted as unknown category membership, which means that all x^i with t^i_c = 0 have no influence on the representation of category c.
The cLVQ representative nodes w^k with k = 1, ..., K are built up incrementally, where K denotes the current number of allocated vectors w. Each w^k is attached to a label vector u^k, where u^k_c ∈ {−1, 0, +1} is the model target output for category c, representing positive, negative, and missing label output, respectively. The winning nodes w^{k_min(c)}(x^i) are calculated independently for each category c, where k_min(c) is determined in the following way:

k_min(c) = argmin_k Σ_{f=1}^{F} λ_cf (x^i_f − w^k_f)²,  ∀k with u^k_c ≠ 0,  (1)
where the category-specific weights λ_cf are updated continuously, inspired by the generalized relevance LVQ proposed by [4]. We denote the set of selected features for an active category c ∈ C as S_c. We choose λ_cf = 0 for all f ∉ S_c, and otherwise adjust it according to a scoring procedure explained later. Each w^{k_min(c)}(x^i) is updated based on the standard LVQ learning rule [9], but restricted to feature dimensions f ∈ S_c:

w_f^{k_min(c)} := w_f^{k_min(c)} + µ Θ^{k_min(c)} (x^i_f − w_f^{k_min(c)}),  ∀f ∈ S_c,  (2)
where µ = 1 if the categorization decision for x^i was correct; otherwise µ = −1 and the winning node w^{k_min(c)} is shifted away from x^i. Additionally, Θ^{k_min(c)} is the node-dependent learning rate as proposed by [7]:

Θ^{k_min(c)} = Θ_0 exp(−a^{k_min(c)} / σ).  (3)

Here Θ_0 is a predefined initial value, σ is a fixed scaling factor, and a^k is an iteration-dependent age factor. The age factor a^k is incremented every time the corresponding w^k becomes the winning node.
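Under the definitions above, Eqs. (1)–(3) can be sketched in a few lines; the vectors, weights, and parameter values below are invented for illustration and only mimic the described computation:

```python
import math

def winning_node(x, nodes, u_c, lam_c):
    """Eq. (1): among the nodes with a nonzero label for category c,
    return the index minimizing the lambda-weighted squared distance."""
    best_k, best_d = None, float("inf")
    for k, w in enumerate(nodes):
        if u_c[k] == 0:                       # node carries no label for c
            continue
        d = sum(lam_c[f] * (x[f] - w[f]) ** 2 for f in range(len(x)))
        if d < best_d:
            best_k, best_d = k, d
    return best_k

def update_winner(w, x, S_c, mu, theta0, age, sigma):
    """Eqs. (2)-(3): move the winner toward (mu=+1) or away from (mu=-1)
    x with the node-dependent learning rate, restricted to f in S_c."""
    theta = theta0 * math.exp(-age / sigma)   # Eq. (3)
    for f in S_c:
        w[f] += mu * theta * (x[f] - w[f])    # Eq. (2)
    return w

x = [1.0, 0.0, 0.5]
nodes = [[1.0, 0.0, 0.4], [0.0, 1.0, 0.0], [0.8, 0.1, 0.5]]
u_c = [+1, 0, -1]                  # node 1 is excluded (u^k_c = 0)
lam_c = [1.0, 0.0, 1.0]            # only features in S_c have nonzero weight
k = winning_node(x, nodes, u_c, lam_c)
print(k)                           # node 0 has the smallest weighted distance
update_winner(nodes[k], x, S_c=[0, 2], mu=+1, theta0=0.5, age=0, sigma=10.0)
print(nodes[k])                    # moved toward x on features 0 and 2 only
```

Note how feature 1 (λ = 0, f ∉ S_c) influences neither the distance nor the update, which is exactly the category-specific metrical “view” described above.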
B. Feature Scoring and Category Initialization
The learning dynamics of the cLVQ learning approach is organized in training epochs, where at each epoch only a limited amount of objects and their corresponding views are visible to the learning method. After each epoch some of the training vectors x^i and their corresponding target category values t^i are removed and replaced by the vectors of a new object. Therefore, for each training epoch the scoring values h_cf, used for guiding the feature selection process, are updated in the following way:

h_cf = H_cf / (H_cf + H̄_cf).  (4)
The variables H_cf and H̄_cf are the numbers of previously seen positive and negative training examples of category c for which the corresponding feature f was active (x^i_f > 0). For each newly inserted object view, the counter value H_cf is updated in the following way:

H_cf := H_cf + 1  if x^i_f > 0 and t^i_c = +1,  (5)

whereas H̄_cf is updated as follows:

H̄_cf := H̄_cf + 1  if x^i_f > 0 and t^i_c = −1.  (6)

The score h_cf defines the metrical weighting in the cLVQ representation space. We then choose λ_cf = h_cf for all f ∈ S_c and λ_cf = 0 otherwise.
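A minimal sketch of this counting and scoring scheme (Eqs. 4–6); the toy activity vectors are invented for illustration:

```python
def update_counts(H, Hbar, x, t_c):
    """Eqs. (5)-(6): per-feature counters of positive (H) and negative
    (Hbar) examples of one category in which the feature was active."""
    for f, value in enumerate(x):
        if value > 0:
            if t_c == +1:
                H[f] += 1
            elif t_c == -1:
                Hbar[f] += 1
    return H, Hbar

def score(H, Hbar, f):
    """Eq. (4): h_cf = H_cf / (H_cf + Hbar_cf)."""
    total = H[f] + Hbar[f]
    return H[f] / total if total > 0 else 0.0

H, Hbar = [0, 0], [0, 0]
update_counts(H, Hbar, x=[0.7, 0.0], t_c=+1)   # feature 0 active, positive view
update_counts(H, Hbar, x=[0.2, 0.9], t_c=-1)   # both features active, negative view
print(score(H, Hbar, 0))  # 1 / (1 + 1) = 0.5
```

A feature active almost exclusively in positive examples drives h_cf toward 1, which is why the selection process described above favors such features.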
For our learning architecture we assume that not all categories are known from the beginning, so that new categories can occur in each training epoch. Therefore, if a category c with the category label t^i_c = +1 occurs for the first time in the current training epoch, we initialize this category c with a single feature and one cLVQ node. We select the feature v_c = argmax_f (h_cf) with the largest scoring value and initialize S_c = {v_c}. As the initial cLVQ node we select the training vector x^i in which the selected feature v_c has the highest activation, i.e. w^{K+1} = x^q with x^q_{v_c} ≥ x^i_{v_c} for all i. The attached label vector is chosen as u^{K+1}_c = +1 and zero for all other categories.
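The initialization of a newly occurring category can be sketched as follows; the helper name and the toy views and scores are assumptions for this illustration:

```python
def init_category(views, scores):
    """Sketch of the category initialization: pick the feature with the
    largest score h_cf as v_c, and the view with the highest activation
    of v_c as the initial representative node (w^{K+1} = x^q)."""
    v_c = max(range(len(scores)), key=lambda f: scores[f])
    q = max(range(len(views)), key=lambda i: views[i][v_c])
    S_c = {v_c}                       # the new category starts with one feature
    return v_c, S_c, list(views[q])   # and one node w^{K+1}

views = [[0.1, 0.9], [0.8, 0.3]]      # positive views of the new category
scores = [0.2, 0.7]                   # h_cf values for the two features
v_c, S_c, new_node = init_category(views, scores)
print(v_c, new_node)  # feature 1 is selected; view 0 has its highest activation
```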
C. Learning Dynamics
All changes of the cLVQ network are based only on the limited and changing set of training vectors x^i. During a single learning epoch of the cLVQ method, an optimization loop is performed iteratively, as illustrated in Fig. 2. The basic concept behind this optimization loop is to apply small changes to the representation of erroneous categories by testing new features v_c and representation nodes w^k that may lead to a considerable performance increase for the current set of training vectors. A single run through the optimization loop is composed of the following processing steps:
Step 1: Feature Testing. For each category c with remaining errors, a new feature is temporarily added and tested. If a category c is not present in the current training set or is error-free, then no modification of its representation is applied. The feature selection itself is based on the observable training vectors x^i, the feature scoring values h_cf, and the e^+_cf values. The value e^+_cf is defined as the ratio of active feature entries (x^i_f > 0.0) for feature f among the positive training errors E^+_c of category c. The set E^+_c is calculated in the following way:

E^+_c = {i | t^i_c = +1 ∧ t^i_c ≠ u^{k_min(c)}_c(x^i)},  (7)

where t^i_c ∈ {−1, 0, +1} is defined as the target signal for x^i and u^{k_min(c)}_c is the label assigned to the winning node w^{k_min(c)}(x^i) of category c.
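As an illustration of Eq. (7) and the e^+_cf ratio, with invented labels and feature activations:

```python
def positive_errors_and_activity(T_c, winner_labels, X, f):
    """Eq. (7) and the e^+_cf ratio: indices of positive training errors
    for one category, and the fraction of them in which feature f is
    active (x^i_f > 0)."""
    E_plus = [i for i, (t, u) in enumerate(zip(T_c, winner_labels))
              if t == +1 and t != u]
    if not E_plus:
        return E_plus, 0.0
    active = sum(1 for i in E_plus if X[i][f] > 0.0)
    return E_plus, active / len(E_plus)

T_c = [+1, +1, -1, +1]             # target labels t^i_c
winner = [+1, -1, -1, -1]          # labels of the winning nodes
X = [[0.0], [0.9], [0.5], [0.7]]   # activity of one feature per view
E_plus, e_cf = positive_errors_and_activity(T_c, winner, X, f=0)
print(E_plus, e_cf)  # views 1 and 3 are positive errors; feature 0 active in both
```

A feature that is active in many of the remaining errors (high e^+_cf) is exactly the kind of candidate Eq. (8) below prefers.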
Fig. 2. Illustration of the cLVQ Optimization Loop. The basic idea of this optimization loop is to make small modifications to the representation of categories where categorization errors on the available training vectors occur. If the gain in categorization performance, based on all available training examples of category c, is above the insertion threshold, the modification is kept; otherwise it is retracted. (Flowchart: errors for category c occurred → start learning; select an erroneous vector; select and add a new feature → keep it if gain > ε_1, else delete it; insert the vector as a new node → keep it if gain > ε_2, else delete it; repeat until all errors are solved or no features are left; all errors solved for category c → stop learning.)

For the feature testing, a candidate v_c should be added to the category-specific feature set S_c that potentially improves the categorization performance of category c by having a high scoring value h_cf. Additionally, the feature candidate should also be very active in the remaining training errors of this category, in order to quickly resolve all remaining errors of this particular category. Therefore we choose:
v_c = argmax_{f ∉ S_c} (e^+_cf + h_cf),  (8)

and add S_c := S_c ∪ {v_c}. The added feature dimension modifies the cLVQ metrics by changing the decision boundaries of all Voronoi clusters assigned to category c, which potentially reduces the remaining categorization errors. Thus, based on all training vectors x^i, we calculate the actual categorization performance of the erroneous categories. If the performance increase for category c is larger than the prespecified threshold ε_1, the feature v_c is permanently added; otherwise it is removed and excluded from further training iterations of this epoch.
Furthermore, in rare cases the removal of already selected features is also possible. This is done if the total number of negative errors #E^−_c > #E^+_c, where E^−_c is defined analogously to E^+_c:

E^−_c = {i | t^i_c = −1 ∧ t^i_c ≠ u^{k_min(c)}_c(x^i)}.  (9)

The only difference is that in this case a feature f ∈ S_c is removed from the set of selected features S_c, and the performance gain is computed for the final decision on the removal.
Step 2: LVQ Node Testing. Similar to Step 1, we test new LVQ nodes only for erroneous categories. In contrast to the node insertion rule proposed in [7], where nodes are inserted for training vectors with the smallest distance to wrong winning nodes, we propose to insert new LVQ nodes based on the training vectors x^i with the most categorization errors. This leads to a more compact representation, because a single node typically improves the representation of several categories. In this optimization step we insert new representation nodes w^k until at least one new node is inserted for each erroneous category c. As categorization labels u^k for these nodes, only the correct target labels for the categorization errors are assigned. For all other categories c the corresponding u^k_c = 0, keeping all error-free categories unchanged.

Again we calculate the performance increase based on all currently available training vectors. If this increase for category c is above the threshold ε_2, we make no modifications to the LVQ node labels of the newly inserted nodes. Otherwise we set the labels u^k_c of this set of newly inserted nodes w^k to zero. If due to this evaluation step all u^k_c become zero, then we remove the corresponding w^k.

Step 3: Stop condition. If all remaining categorization errors for the current training set are resolved, or all possible features f of erroneous categories c have been tested, then we start the next training epoch. Otherwise we continue this optimization loop and test further feature candidates and LVQ representation nodes.
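The keep-or-retract logic shared by Steps 1 and 2 can be condensed into a small sketch; the toy "performance" measure and the function names are invented here, not the paper's actual evaluation:

```python
def try_modification(apply_fn, undo_fn, evaluate_fn, epsilon):
    """Keep-or-retract logic of the optimization loop: apply a tentative
    modification (new feature or new node), measure the performance gain
    on the current training set, and retract it if the gain is too small."""
    before = evaluate_fn()
    apply_fn()
    gain = evaluate_fn() - before
    if gain > epsilon:
        return True        # keep the feature/node
    undo_fn()
    return False           # retract it

# Toy state: a feature set whose "performance" is simply its size / 10.
features = set()
kept = try_modification(
    apply_fn=lambda: features.add("f3"),
    undo_fn=lambda: features.discard("f3"),
    evaluate_fn=lambda: len(features) / 10.0,
    epsilon=0.05,
)
print(kept, features)  # gain 0.1 > 0.05, so the tentative feature is kept
```

The same gain test gates both feature insertion (threshold ε_1) and node insertion (threshold ε_2), which is what keeps the allocated resources small.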
III. UNSUPERVISED BOOTSTRAPPING OF CATEGORY
REPRESENTATIONS
Our focus is the life-long learning of visual representations. For such learning tasks it is normally unsuitable to store all previously seen training vectors. Thus we decided that the learning during the bootstrapping phase is based only on the unlabeled training views and their estimated category labels, which is distinct from most commonly used semi-supervised learning methods. Before the cLVQ modifications are described in more detail, we first define the majority voting scheme used for the autonomous estimation of category labels for the unlabeled training views.
A. Autonomous Estimation of Category Labels
For the autonomous estimation of category labels we first
measure the network response for all available unlabeled
training views based on the previously supervised trained
category seed. For each individual object o in this current
training set we calculate the detection rates d
+
oc
= D
+
oc
/Q
o
and d
oc
= D
oc
/Q
o
, where the Q
o
is defined as the number
of unlabeled training views of object o. The measures d
+
oc
indicates how reliable the category c can be detected in the
views of object o, while the rate d
oc
indicates how probable
the category c is not present in these views. Furthermore we
count the number of object views indicating the presence
(D
+
oc
) and absence (D
oc
) of category c in the following way:
D
+
oc
:= D
+
oc
+ 1 if u
k
min
c
(x
i
) = +1 (10)
and
D
oc
:= D
oc
+ 1 if u
k
min
c
(x
i
) = 1, (11)
where the sum of D
+
oc
+ D
oc
= Q
o
.
Based on these detection rates and the predetermined thresholds ε^+ and ε^−, the target values t^i_c ∈ {−1, 0, +1} are estimated for all views of the same object. The assignment of the target values is done in the following way:

t^i_c = +1 if d^+_oc > ε^+;  t^i_c = −1 if d^+_oc ≤ ε^+ and d^−_oc > ε^−;  t^i_c = 0 otherwise.  (12)
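A compact sketch of this label estimation (Eqs. 10–12); the per-view winner outputs and the threshold values are invented for illustration:

```python
def estimate_label(predictions, eps_plus, eps_minus):
    """Eqs. (10)-(12): from the per-view winner outputs (+1/-1) for one
    object and category, compute the detection rates and the estimated
    target label t_c shared by all views of that object."""
    Q = len(predictions)
    D_plus = sum(1 for p in predictions if p == +1)    # Eq. (10)
    D_minus = sum(1 for p in predictions if p == -1)   # Eq. (11)
    d_plus, d_minus = D_plus / Q, D_minus / Q
    if d_plus > eps_plus:
        return +1
    if d_minus > eps_minus:        # reached only when d_plus <= eps_plus
        return -1
    return 0                       # unknown: views have no effect

# 8 of 10 tracked views detect the category: confident positive label.
print(estimate_label([+1] * 8 + [-1] * 2, eps_plus=0.7, eps_minus=0.7))
```

Inconclusive objects (neither rate above its threshold) yield t_c = 0, so their views simply do not touch the representation of category c.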
The selection of ε^+ and ε^− is crucial with respect to the potential performance gain of this bootstrapping phase. If these values are chosen too conservatively, many t^i_c become zero and the corresponding object views have no effect on the representation. On the contrary, the possibility of mislabeling increases if these values are low. In general our cLVQ approach is robust with respect to a smaller amount of mislabeled training vectors, because additional network resources are only allocated if the performance gain is above the insertion thresholds ε_1 and ε_2. Nevertheless, if the number of wrongly labeled training views becomes too large, the categorization performance can also decrease.
B. Modification of the cLVQ Learning Approach
For our first evaluation of the unsupervised bootstrapping of visual category representations, we keep the incremental learning approach as in [8]. Thus, in this bootstrapping phase the learning process is also subdivided into epochs, and the overall cLVQ learning dynamics is reused. This means the category representation is enhanced by making small changes to it, either by selecting new category-specific features or by allocating additional representation nodes. Furthermore, the same learning parameters are used: the learning rate Θ, the feature insertion threshold ε_1, and the node insertion threshold ε_2.
Although the same learning parameters are utilized, we still want to express the reliability of the autonomously estimated category labels. This means that if the reliability is low, only small changes with respect to the modification of existing nodes and the allocation of new category-specific features and representation nodes should be applied. To achieve this effect, all learning parameters are modulated based on the parameter r^i_oc ∈ [0, 1], defined as follows:

r^i_oc = d^+_oc if t^i_c = +1;  r^i_oc = d^−_oc if t^i_c = −1;  r^i_oc = 0 if t^i_c = 0.  (13)
The r^i_oc value is assigned to each unlabeled object view and is equal for all views of one physical object o.

For both insertion thresholds ε_1 and ε_2, this r^i_oc modulates the measurement of the performance gain after the insertion of a new feature v_c or representation node w^k. In the basic cLVQ, each erroneous training view that could be resolved by such a slight modification of the representation is counted as 1.0. In contrast, in the modified version of the cLVQ each resolved erroneous training view is counted as r^i_oc only. This means that the amount of training vectors necessary to reach the insertion threshold is inversely proportional to the corresponding r^i_oc values (e.g. if r^i_oc = 0.8 for all current training views, a factor of 1.25 more views is required compared to the basic cLVQ). The fundamental effect of the modulation of ε_1 and ε_2 is that it becomes distinctly more difficult to allocate new resources the more unreliable the corresponding estimated category labels become. Therefore the allocation of category-unspecific or even erroneous network resources should be strongly reduced.
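The reliability-weighted counting can be illustrated as follows; analogous to the factor-1.25 example in the text, the sketch uses an invented reliability of r = 0.5, giving a factor of 2:

```python
def weighted_gain(resolved_views, r):
    """Reliability-weighted counting: each resolved erroneous view is
    counted as r_oc instead of 1.0, so views with unreliable estimated
    labels need proportionally more evidence to reach the threshold."""
    return sum(r[i] for i in resolved_views)

# Five resolved views, all from objects with reliability 0.5, count
# like 2.5 fully labeled views in the basic cLVQ (factor 1/0.5 = 2).
r = {0: 0.5, 1: 0.5, 2: 0.5, 3: 0.5, 4: 0.5}
print(weighted_gain([0, 1, 2, 3, 4], r))  # 2.5
```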
Also for the adaptation of the representation nodes w^k, the original cLVQ learning rule (see Eq. 2) is multiplied by r^i_oc. Besides the node-dependent learning rate Θ^{k_min(c)}, this modification guarantees the stability of the learned visual category representation. The update step for the winning node w^{k_min(c)} of category c is calculated as follows:

w_f^{k_min(c)} := w_f^{k_min(c)} + r^i_oc µ Θ^{k_min(c)} (x^i_f − w_f^{k_min(c)}),  ∀f ∈ S_c,  (14)

where r^i_oc is the reliability factor and µ indicates the correctness of the categorization decision.
Besides this reliability-weighted modulation of the learning parameters, the continuous update of the scoring values h_cf was deactivated for this bootstrapping phase, because these values are the most fragile with respect to errors in the estimation process of category labels. A larger amount of such errors could interfere strongly and globally with the previously trained category representations. This can cause a global performance decrease for all categories, whereas all other modifications, due to the allocation of new features and representation nodes, have only a local effect.
IV. EXPERIMENTAL RESULTS
A. Image Ensemble
As experimental setup we use an image database composed of 44 training and 33 test objects, as shown in Fig. 3. This image ensemble contains objects assigned to five different color and ten shape categories. Each object was rotated around the vertical axis in front of a black background. For each of the training and test objects, 300 views were collected. The views of all training objects are furthermore subdivided into labeled and unlabeled views, as illustrated at the bottom of Fig. 3. Out of all 300 views, 200 are used to train the seed of the category representation in a supervised manner, while the remaining 100 object views (view ranges 50–100 and 150–200) are used for the unsupervised bootstrapping of this representation. This separation into labeled and unlabeled object views means that for the autonomous bootstrapping the cLVQ has to generalize to a quite large unseen angular range of object views. Compared to a random sampling of the unlabeled object views this is more challenging, because for randomly selected views the appearance difference to already seen labeled views would be considerably smaller.
B. Feature Representation
For the representation of visual categories we combine simple color histograms with a parts-based feature representation, but we do not utilize this a priori separation for our category learning approach. Therefore for each object view all extracted features are concatenated into a single

References (partial list, as extracted)
- M. J. Swain and D. H. Ballard, “Color indexing,” International Journal of Computer Vision, 1991.
- G. A. Carpenter and S. Grossberg, “ART 2: self-organization of stable category recognition codes for analog input patterns,” Applied Optics, 1987.
- R. M. French, “Catastrophic forgetting in connectionist networks,” Trends in Cognitive Sciences, 1999.