Fine-Grained Crowdsourcing for Fine-Grained Recognition
Citations
3D Object Representations for Fine-Grained Categorization
Part-Based R-CNNs for Fine-Grained Category Detection
Co-teaching: Robust Training of Deep Neural Networks with Extremely Noisy Labels
ReferItGame: Referring to Objects in Photographs of Natural Scenes
Evaluation of Output Embeddings for Fine-Grained Image Classification
References
LIBLINEAR: A Library for Large Linear Classification
Locality-constrained Linear Coding for image classification
Labeling images with a computer game
Evaluating Color Descriptors for Object and Scene Recognition
Caltech-UCSD Birds 200
Frequently Asked Questions (18)
Q2. How can the game guarantee that bubbles contain discriminative features?
Through proper setup of reward, the game can guarantee that bubbles selected by a successful human player contain discriminative features.
Q3. What is the reason for the bubble detectors?
The authors' intuition is that since each bubble contains discriminative features for recognition, it suffices to detect such patterns in a test image.
Q4. What is the way to represent a bubble?
Since each bubble is usually a small area, it can be represented by a single descriptor such as SIFT, or a concatenation of simple descriptors.
Q5. How do the authors set the penalty on wrong answers?
The authors set the penalty on wrong answers very large: for example, 100 points for a correct identification but −300 for an incorrect one.
Q6. How do the authors design the reward of the game?
The authors design the reward of the game such that a player can only earn high scores if she identifies the categories correctly and uses bubbles parsimoniously.
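A minimal sketch of how such a reward could be scored, assuming the 100 / −300 example values quoted above. The linear cost on revealed area (`area_cost`) and both function names are hypothetical; the text only says that bubbles should be used parsimoniously, not how that is charged.

```python
def game_score(correct: bool, area_revealed: float,
               reward: float = 100.0, penalty: float = -300.0,
               area_cost: float = 100.0) -> float:
    """Score one round of the game.

    reward / penalty follow the example values in the text.
    area_cost is a hypothetical linear charge on the fraction of the
    image revealed through bubbles (0.0 = nothing, 1.0 = everything).
    """
    if not 0.0 <= area_revealed <= 1.0:
        raise ValueError("area_revealed must be a fraction in [0, 1]")
    base = reward if correct else penalty
    return base - area_cost * area_revealed


def expected_guess_score(p_correct: float,
                         reward: float = 100.0,
                         penalty: float = -300.0) -> float:
    """Expected score of answering when the player is right with probability p_correct."""
    return p_correct * reward + (1.0 - p_correct) * penalty
```

With these example values, a coin-flip guess between two categories has an expected score of −100, and answering only breaks even once the player is right 75% of the time, which is why a large penalty discourages guessing.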
Q7. How do the authors run the bubble detectors?
To run the bubble detectors, the authors resize an image to a max dimension of 300 pixels and extract the same SIFT and color descriptors on dense patches at every 2 pixels at multiple resolutions.
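The resizing and dense sampling described above can be sketched as follows. This only computes the resized dimensions and the patch grid for a single resolution; descriptor extraction is omitted. The `patch_size` parameter is hypothetical (the text does not give one), and the sketch assumes the image is always scaled so that its longer side equals 300 pixels.

```python
def resize_to_max_dim(width: int, height: int, max_dim: int = 300) -> tuple[int, int]:
    """Return (width, height) scaled so the longer side equals max_dim.

    Assumption: smaller images are scaled up as well as larger ones down.
    """
    scale = max_dim / max(width, height)
    return max(1, round(width * scale)), max(1, round(height * scale))


def dense_grid(width: int, height: int, patch_size: int, stride: int = 2) -> list[tuple[int, int]]:
    """Top-left corners of all patches that fit in the image, sampled every `stride` pixels.

    patch_size is a hypothetical parameter; stride=2 follows the text.
    """
    return [(x, y)
            for y in range(0, height - patch_size + 1, stride)
            for x in range(0, width - patch_size + 1, stride)]
```

For example, a 600 × 400 image would be resized to 300 × 200 before the grid is laid out.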
Q8. What is the pooling region for the bubble detectors?
The authors specify the pooling region for each detector to be a 0.5 × 0.5 rectangle centered at the original bubble location, after normalizing all (x, y) coordinates to be in [0, 1] × [0, 1].
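A small sketch of this pooling region in normalized coordinates. Clipping the rectangle to the image boundary is an assumption; the text does not say how regions that extend past the edge are handled. Function names are illustrative.

```python
def normalize_point(px: float, py: float, width: float, height: float) -> tuple[float, float]:
    """Map pixel coordinates to the unit square [0, 1] x [0, 1]."""
    return px / width, py / height


def pooling_region(x: float, y: float, size: float = 0.5) -> tuple[float, float, float, float]:
    """Return (x0, y0, x1, y1) of a size x size rectangle centered at (x, y).

    Coordinates are normalized; the rectangle is clipped to the unit
    square (an assumption, not stated in the text).
    """
    half = size / 2.0
    return (max(0.0, x - half), max(0.0, y - half),
            min(1.0, x + half), min(1.0, y + half))
```

A bubble at the image center, (0.5, 0.5), pools over the rectangle from (0.25, 0.25) to (0.75, 0.75).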
Q9. How many human-selected bubbles are needed to outperform CFAF?
Using only 1634 human-selected bubbles (5% of the entire set), the authors already outperform CFAF [36] (51.05% versus 44.73%).
Q10. What is the way to extend to multiple classes?
Extending to multiple classes is straightforward: the authors can simply obtain bubbles for all pairs of categories and then use all of them to form the BubbleBank.
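The pairwise construction can be sketched as below. The dictionary layout and function names are hypothetical; the text only states that bubbles from all category pairs are combined into one BubbleBank.

```python
from itertools import combinations


def category_pairs(categories):
    """All unordered pairs of categories; n categories give n * (n - 1) / 2 pairs."""
    return list(combinations(categories, 2))


def build_bubble_bank(bubbles_by_pair):
    """Concatenate the bubbles collected for every pair into a single bank.

    bubbles_by_pair: hypothetical mapping {(class_a, class_b): [bubble, ...]}.
    """
    bank = []
    for pair in sorted(bubbles_by_pair):
        bank.extend(bubbles_by_pair[pair])
    return bank
```

The number of pairs grows quadratically: 14 classes already give 91 pairs, and 200 classes give 19,900.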
Q11. How can the authors create a sense of time pressure?
To further enhance the experience, the authors can create a sense of time pressure by adding a countdown timer and “freezing” the bubbles for a few seconds once a certain amount of area has been revealed.
Q12. What is the definition of a fine-grained crowdsourcing approach?
It is “fine-grained” in two senses: (1) the crowd not only provides class labels indicating what the object is, but also provides detailed information on how humans achieve fine-grained recognition; (2) the learning algorithm not only optimizes the classification accuracy but also incorporates the “finer-grained” hints from the crowd, which would help avoid overfitting and lead to better generalization performance.
Q13. How do the authors address the issue of blurring in games?
To address this issue, the authors start with a small amount of blurring and increase it gradually in new games until the use of bubbles becomes necessary.
Q14. What is the way to use bubbles to differentiate a class from another class?
It is likely that a bubble useful for differentiating a class from another very confusing class is also helpful for discriminating the same class against less similar ones.
Q15. How many classes are used in the experiment?
The authors experiment on the full dataset as well as a subset of 14 classes from the Vireo and Woodpecker family (CUB-14) that have been used in previous work [13, 36, 38].
Q16. How does the performance of the bubbles game compare with the previous best?
With random bubbles, the performance is similar to the 44.73% achieved by CFAF [36], which also uses random templates but further boosts performance by a bagging technique.
Q17. How do the authors obtain the confusion matrix of the KDES method?
The authors first obtain the confusion matrix of the KDES method [2] via cross-validation on training data (no test data is used) and then pick the top 763 most confusing pairs.
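One way to pick the most confusing pairs from a confusion matrix is sketched below. Scoring a pair by the sum of errors in both directions is an assumption; the text does not specify how confusion between two classes is measured. The matrix would come from cross-validation on training data, which is not shown here.

```python
def most_confusing_pairs(confusion, k):
    """Return the k class pairs (i, j), i < j, with the highest mutual confusion.

    confusion[i][j] counts samples of true class i predicted as class j.
    Assumption: a pair's score is confusion[i][j] + confusion[j][i].
    Ties keep their original (i, j) order because the sort is stable.
    """
    n = len(confusion)
    scored = []
    for i in range(n):
        for j in range(i + 1, n):
            scored.append((confusion[i][j] + confusion[j][i], i, j))
    scored.sort(key=lambda item: -item[0])
    return [(i, j) for _, i, j in scored[:k]]
```

In the setting described above, `k` would be 763.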
Q18. How much of the area is revealed in successful games?
Fig. 5 plots the cumulative distribution of the area revealed in successful games — over 90% of the games reveal less than 10% of the object bounding box.
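A point on such a cumulative distribution is simply the fraction of games whose revealed area falls below a threshold. The sketch below uses a hypothetical helper name, and any input values passed to it here are illustrative, not data from the paper.

```python
def fraction_below(areas, threshold):
    """Empirical CDF value: share of games revealing less than `threshold` of the bounding box.

    areas: revealed area per successful game, as a fraction of the object bounding box.
    """
    if not areas:
        raise ValueError("areas must not be empty")
    return sum(1 for a in areas if a < threshold) / len(areas)
```

The statement in the answer above corresponds to `fraction_below(areas, 0.10)` exceeding 0.9 on the authors' game logs.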