Towards autonomous bootstrapping for life-long learning categorization tasks
Summary
Introduction
- In recent decades a wide variety of category learning paradigms have been proposed, ranging from generative [10], [14] to discriminative models [6], [18].
- The major advantage of supervised over unsupervised learning is its higher categorization performance, while the time-consuming and costly collection of accurately labeled training data is its fundamental drawback.
- In the context of incremental and life-long learning, it has so far gained much less interest.
- Afterwards, the modifications of the basic cLVQ approach and the context-dependent estimation of category labels are described in Section III.
A. Distance Computation and Learning Rule
- The authors use C to denote the current number of represented color and shape categories, where each t^i_c ∈ {−1, 0, +1} labels an x^i as a positive or negative example of category c.
- Each w^k is attached to a label vector u^k, where u^k_c ∈ {−1, 0, +1} is the model target output for category c, representing positive, negative, and missing label output, respectively.
- S_c, and otherwise adjust it according to a scoring procedure explained later.
- The age factor a^k is incremented every time the corresponding w^k becomes the winning node.
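The winner selection implied by the bullets above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names, the Euclidean metric, and the restriction of competition to nodes with a non-zero label u^k_c are assumptions, since the summary does not state the exact distance definition.

```python
import numpy as np

def find_winner(x, prototypes, labels, c, selected_features):
    """Return the index of the winning node for category c.

    Assumed setup (not given exactly in the summary):
    - prototypes: array of shape (K, F) holding the w^k vectors
    - labels: array of shape (K, C) holding the u^k label vectors
    - selected_features: indices forming the feature subset S_c
    Only nodes carrying a label for category c (u^k_c != 0) compete,
    and the distance is computed on the selected features only.
    """
    candidates = [k for k in range(len(prototypes)) if labels[k, c] != 0]
    dists = [np.linalg.norm(x[selected_features] - prototypes[k][selected_features])
             for k in candidates]
    return candidates[int(np.argmin(dists))]

# toy example: two nodes, one positive and one negative for category 0
protos = np.array([[0.9, 0.1, 0.0], [0.1, 0.8, 0.2]])
labs = np.array([[+1], [-1]])
k_win = find_winner(np.array([0.85, 0.2, 0.0]), protos, labs, c=0,
                    selected_features=np.array([0, 1]))
```

In this toy setup the query view lies close to the first prototype on the selected features, so node 0 wins and its age factor a^0 would be incremented.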
B. Feature Scoring and Category Initialization
- The learning dynamics of the cLVQ approach are organized in training epochs, where in each epoch only a limited number of objects and their corresponding views are visible to the learning method.
- After each epoch some of the training vectors x^i and their corresponding target category values t^i are removed and replaced by vectors of a new object.
- Therefore, for each training epoch the scoring values h_cf, used for guiding the feature selection process, are updated in the following way: h_cf = H_cf / (H_cf + H̄_cf).
- Therefore, if a category c with the category label t^i_c = +1 occurs for the first time in the current training epoch, the authors initialize this category c with a single feature and one cLVQ node.
- The attached label vector is chosen as u^{K+1}_c = +1 and zero for all other categories.
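The counter and scoring updates above can be sketched in a few lines. The H_cf rule (increment when the feature is active and the view is a positive example) follows the summary's Eq. (5); the symmetric update of H̄_cf for negative examples is an assumption, since the summary truncates that part.

```python
import numpy as np

def update_scores(H, H_bar, x, t):
    """Update co-occurrence counters and return the scoring values h_cf.

    H[c, f] counts views where feature f is active (x_f > 0) and the
    view is a positive example of category c (t_c = +1), as in Eq. (5).
    The H_bar update for negative examples (t_c = -1) is an assumed
    completion of the truncated rule in the summary.
    """
    active = x > 0
    for c, tc in enumerate(t):
        if tc == +1:
            H[c, active] += 1
        elif tc == -1:
            H_bar[c, active] += 1
    # h_cf = H_cf / (H_cf + H̄_cf); guard against division by zero
    return H / np.maximum(H + H_bar, 1)

H = np.zeros((1, 3))
H_bar = np.zeros((1, 3))
h = update_scores(H, H_bar, x=np.array([0.7, 0.0, 0.3]), t=[+1])
```

After one positive view, the two active features obtain score 1 for the category while the inactive feature stays at 0, so the scores directly rank how strongly each feature co-occurs with positive examples.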
C. Learning Dynamics
- All changes of the cLVQ network are based only on the limited and changing set of training vectors x^i.
- A single run through the optimization loop is composed of the following processing steps: Step 1: Feature Testing.
- S_c is removed from the set of selected features S_c, and the performance gain is computed for the final decision on the removal.
- If all remaining categorization errors for the current training set are resolved, or all possible features f of erroneous categories c have been tested, then the authors start the next training epoch.
- Otherwise, the authors continue this optimization loop and test further feature candidates and LVQ representation nodes.
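The control flow of one epoch of this optimization loop might look as follows. This is a schematic sketch under stated assumptions: `gain(f)` stands for the performance gain of tentatively adding feature f, and the single-threshold test against the insertion threshold is illustrative; the actual loop also tests node insertions.

```python
def optimization_epoch(errors, candidate_features, gain, eps1):
    """One pass of the cLVQ optimization loop, as a control-flow sketch.

    errors()  -- assumed predicate: True while categorization errors
                 remain on the current training set
    gain(f)   -- assumed callable returning the performance gain of
                 tentatively adding feature f
    eps1      -- feature insertion threshold
    """
    selected = []
    for f in candidate_features:
        if not errors():          # all remaining errors resolved
            break                 # -> start the next training epoch
        if gain(f) > eps1:        # keep only features whose gain
            selected.append(f)    #    exceeds the insertion threshold
    return selected

# toy run: one erroneous view remains until feature 1 resolves it
remaining = [3]
chosen = optimization_epoch(
    errors=lambda: bool(remaining),
    candidate_features=[0, 1, 2],
    gain=lambda f: (remaining.clear() or 0.2) if f == 1 else 0.0,
    eps1=0.1,
)
```

In the toy run, feature 0 yields no gain and is skipped, feature 1 resolves the error and is kept, and the loop then stops because no errors remain, mirroring the epoch-termination condition in the bullets above.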
A. Autonomous Estimation of Category Labels
- For the autonomous estimation of category labels, the authors first measure the network response for all available unlabeled training views based on the category seed previously trained in a supervised manner.
- The measure d^+_oc indicates how reliably category c can be detected in the views of object o, while the rate d^−_oc indicates how probable it is that category c is not present in these views.
- If these values are chosen too conservatively, many t^i_c become zero and the corresponding object views have no effect on the representation.
- Conversely, the possibility of mislabeling increases if these values are low.
- In general, their cLVQ approach is robust with respect to a small amount of mislabeled training vectors, because additional network resources are only allocated if the performance gain is above the insertion thresholds ǫ1 and ǫ2.
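A minimal sketch of the label estimation from these detection rates, using the threshold values ǫ+ = 0.5 and ǫ− = 0.9 quoted later in the summary; the exact comparison rule is an assumed reading of the text, not the authors' stated formula:

```python
def estimate_label(d_plus, d_minus, eps_plus=0.5, eps_minus=0.9):
    """Estimate the label t_oc of category c for object o.

    d_plus  (d^+_oc) -- fraction of the object's views in which the
                        category is detected
    d_minus (d^-_oc) -- fraction of views in which it is rejected
    Thresholds follow the values quoted in the summary; the
    thresholding scheme itself is an assumption.
    """
    if d_plus >= eps_plus:
        return +1   # reliably detected -> positive example
    if d_minus >= eps_minus:
        return -1   # reliably rejected -> negative example
    return 0        # ambiguous -> view has no effect on the representation

labels = [estimate_label(0.6, 0.1),   # clear detection
          estimate_label(0.1, 0.95),  # clear rejection
          estimate_label(0.3, 0.5)]   # ambiguous, left unlabeled
```

The third case shows the conservative regime described above: ambiguous views receive t^i_c = 0 and therefore do not influence the representation.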
B. Modification of the cLVQ Learning Approach
- For their first evaluation of the unsupervised bootstrapping of visual category representations, the authors keep the incremental learning approach as in [8].
- In contrast, for the modified version of the cLVQ each resolved erroneous training view is counted as r^i_oc only.
- Besides the node-dependent learning rate Θ^{k_min(c)}, this modification guarantees the stability of the learned visual category representation.
- This can cause a global performance decrease of all categories, while all other modifications due to the allocation of new features and representation nodes have only a local effect.
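The reliability-weighted update of the winning node, Eq. (14) in the FAQ below the summary, can be written out as a short sketch. The sign convention for μ (+1 for a correct, −1 for an incorrect decision) is the usual LVQ convention and an assumption here.

```python
import numpy as np

def update_winner(w, x, selected_features, r_oc, mu, theta):
    """Reliability-weighted LVQ update of the winning node (Eq. 14).

    w      -- prototype vector w^{k_min(c)} of the winning node
    r_oc   -- reliability factor r^i_oc of the estimated label
    mu     -- +1 / -1 for a correct / incorrect categorization
              (assumed LVQ sign convention)
    theta  -- node-dependent learning rate Theta^{k_min(c)}
    Only the selected features f in S_c are updated.
    """
    w = w.copy()
    f = selected_features
    w[f] += r_oc * mu * theta * (x[f] - w[f])
    return w

w_new = update_winner(np.array([0.5, 0.5, 0.5]), np.array([1.0, 0.0, 0.0]),
                      selected_features=np.array([0, 1]),
                      r_oc=1.0, mu=+1, theta=0.2)
```

A low reliability r^i_oc scales the step size down, which is exactly how the modification limits the influence of potentially mislabeled bootstrapped views on the learned representation.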
- The views of all training objects are furthermore subdivided into labeled and unlabeled views as illustrated at the bottom of Fig.
B. Feature Representation
- For the representation of visual categories the authors combine simple color histograms with a parts-based feature representation, but they do not utilize this a priori separation for their category learning approach.
- Therefore, for each object view all extracted features are concatenated into a single structureless feature vector.
- The authors use color histograms because they combine robustness against view and scale changes with computational efficiency [16].
- The parts-based shape feature extraction [5] is based on a learned set of category-specific feature detectors that are based on SIFT descriptors [11].
- This especially allows the representation of less structured categories.
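The concatenation step can be sketched as follows; the input shapes are illustrative stand-ins, since the summary only states that all extracted features are joined without retaining the color/shape separation:

```python
import numpy as np

def build_feature_vector(color_hist, part_responses):
    """Concatenate color and shape features into one structureless vector.

    color_hist     -- flattened color histogram of the object view [16]
    part_responses -- responses of the learned parts-based (SIFT-based)
                      feature detectors [5], [11]
    Both input shapes are assumptions for illustration only.
    """
    return np.concatenate([np.ravel(color_hist), np.ravel(part_responses)])

# e.g. a 4x4 color histogram plus three part-detector responses
x = build_feature_vector(np.zeros((4, 4)), np.array([0.2, 0.0, 0.7]))
```

Because the result is a flat vector, the downstream cLVQ learning never needs to know which entries came from color and which from shape; the feature scoring selects the informative dimensions per category.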
C. Categorization Performance
- As already mentioned, for the experimental evaluation of their semi-supervised category learning framework the training is split into two phases.
- In the second training phase the categories are bootstrapped based on the incremental presentation of the unlabeled training set.
- These additionally allocated shape features are most probably the cause of the slight performance decrease of the color categories.
- In this experiment the authors selected the optimal detection thresholds ǫ+ = 0.5 and ǫ− = 0.9 for the shape categories and investigate the effect of a continuously increasing set of additional object views on the categorization performance.
- [14] A. McCallum and K. Nigam, "Employing EM and pool-based active learning for text classification," in Proc. of the Fifteenth International Conference on Machine Learning, pp. 350–358, 1998.
Frequently Asked Questions (10)
Q2. What is the learning in the cLVQ architecture?
The learning in the cLVQ architecture is based on a set of high-dimensional and sparse feature vectors x^i = (x^i_1, ..., x^i_F), where F denotes the total number of features.
Q3. What are the learning parameters used for the cLVQ?
Furthermore, the same learning parameters are used, such as the learning rate Θ, the feature insertion threshold ǫ1, and the node insertion threshold ǫ2.
Q4. What is the performance of the unlabeled object views?
Additionally, the fluctuations in the feature responses of the extracted parts-based features are larger during object rotation than those of the color features, so that the unlabeled object views contain further information with respect to the representation of shape categories.
Q5. What is the description of the proposed learning approach?
Their proposed category learning approach [8] enables interactive and life-long learning and therefore can be utilized for autonomous systems, but so far the authors only considered supervised learning based on interactions with a human tutor.
Q6. Why did the authors select a smaller range for the threshold?
The authors selected a distinctly smaller range for the threshold ǫ− because, due to the selection of low-dimensional feature sets, the rejection of categories is typically nearly perfect.
Q7. What is the update step for the winning node of category c?
The update step for the winning node w^{k_min(c)} of category c is calculated as follows: w^{k_min(c)}_f := w^{k_min(c)}_f + r^i_oc μ Θ^{k_min(c)} (x^i_f − w^{k_min(c)}_f) ∀f ∈ S_c, (14) where r^i_oc is the reliability factor and μ indicates the correctness of the categorization decision.
Q8. Why was the continuous update of the scoring values deactivated?
Besides this modulation of the learning parameters, weighted with reliability, the continuous update of the scoring values hcf was deactivated for this bootstrapping phase, because these values are most fragile with respect to errors in the estimation process of category labels.
Q9. What is the value of the counter value for each new object view?
For each newly inserted object view, the counter value H_cf is updated in the following way: H_cf := H_cf + 1 if x^i_f > 0 and t^i_c = +1, (5) where H̄_cf is updated in a corresponding way.
Q10. What is the scoring value for each training epoch?
Therefore, for each training epoch the scoring values h_cf, used for guiding the feature selection process, are updated in the following way: h_cf = H_cf / (H_cf + H̄_cf).