Showing papers by "Andrew Rabinovich" published in 2014


Posted Content
TL;DR: A deep convolutional neural network architecture codenamed Inception is proposed that achieves the new state of the art for classification and detection in the ImageNet Large-Scale Visual Recognition Challenge 2014 (ILSVRC14).
Abstract: We propose a deep convolutional neural network architecture codenamed "Inception", which was responsible for setting the new state of the art for classification and detection in the ImageNet Large-Scale Visual Recognition Challenge 2014 (ILSVRC 2014). The main hallmark of this architecture is the improved utilization of the computing resources inside the network. This was achieved by a carefully crafted design that allows for increasing the depth and width of the network while keeping the computational budget constant. To optimize quality, the architectural decisions were based on the Hebbian principle and the intuition of multi-scale processing. One particular incarnation used in our submission for ILSVRC 2014 is called GoogLeNet, a 22 layers deep network, the quality of which is assessed in the context of classification and detection.

2,567 citations


Posted Content
TL;DR: The authors propose a generic way to handle noisy and incomplete labeling by augmenting the prediction objective with a notion of consistency: a prediction is consistent if the same prediction is made given similar percepts, where similarity is measured between deep network features computed from the input data.
Abstract: Current state-of-the-art deep learning systems for visual object recognition and detection use purely supervised training with regularization such as dropout to avoid overfitting. The performance depends critically on the amount of labeled examples, and in current practice the labels are assumed to be unambiguous and accurate. However, this assumption often does not hold; e.g. in recognition, class labels may be missing; in detection, objects in the image may not be localized; and in general, the labeling may be subjective. In this work we propose a generic way to handle noisy and incomplete labeling by augmenting the prediction objective with a notion of consistency. We consider a prediction consistent if the same prediction is made given similar percepts, where the notion of similarity is between deep network features computed from the input data. In experiments we demonstrate that our approach yields substantial robustness to label noise on several datasets. On MNIST handwritten digits, we show that our model is robust to label corruption. On the Toronto Face Database, we show that our model handles well the case of subjective labels in emotion recognition, achieving state-of-the-art results, and can also benefit from unlabeled face images with no modification to our method. On the ILSVRC2014 detection challenge data, we show that our approach extends to very deep networks, high resolution images and structured outputs, and results in improved scalable detection.

398 citations
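
The abstract above describes the consistency idea only in words and does not state the objective itself. As a minimal sketch of one way such an objective could look, the code below blends the given (possibly noisy) label with the model's own predicted distribution before computing cross-entropy. The function name `soft_bootstrap_loss` and the mixing weight `beta` are assumptions made for illustration, not taken from the listing.

```python
import math

def soft_bootstrap_loss(probs, label_onehot, beta=0.95):
    """Cross-entropy against a blend of the given label and the model's prediction.

    probs        -- predicted class probabilities (assumed to sum to 1)
    label_onehot -- the possibly noisy training label, one-hot encoded
    beta         -- hypothetical mixing weight; beta=1.0 is plain cross-entropy
    """
    eps = 1e-12  # guards against log(0)
    loss = 0.0
    for q, t in zip(probs, label_onehot):
        target = beta * t + (1.0 - beta) * q  # blended training target
        loss -= target * math.log(q + eps)
    return loss

if __name__ == "__main__":
    # The model is confident in class 1, but the label says class 0.
    print(soft_bootstrap_loss([0.1, 0.9], [1, 0], beta=1.0))
    print(soft_bootstrap_loss([0.1, 0.9], [1, 0], beta=0.8))
```

With `beta` below 1, a confident prediction that disagrees with the label is penalized less than under plain cross-entropy, which is the intuition behind tolerating some label noise.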


Patent
Anand Pillai, Andrew Rabinovich
16 Apr 2014
TL;DR: A computer-implemented method for selecting a representative image of an entity is disclosed. The method includes: accessing a collection of images of the entity; clustering, based on similarity of one or more similarity features, images from the collection to form a plurality of similarity clusters; and selecting the representative image from one of said similarity clusters.
Abstract: Methods and systems for selecting a representative image of an entity are disclosed. According to one embodiment, a computer-implemented method for selecting a representative image of an entity is disclosed. The method includes: accessing a collection of images of the entity; clustering, based on similarity of one or more similarity features, images from the collection to form a plurality of similarity clusters; and selecting the representative image from one of said similarity clusters. Further, based on cluster size of said similarity clusters popular clusters can be determined, and the selection of the representative image can be from the popular clusters. In addition, the method can further include assigning a headshot score based upon a portion of the respective image covered by the entity to respective images in said popular clusters, and further selecting the representative image based upon the headshot score.

35 citations
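
The steps in the patent abstract above (cluster by similarity, keep the popular clusters, rank by headshot score) can be sketched as follows. This is an illustration under stated assumptions, not the patented implementation: the greedy cosine-similarity clustering, the `threshold` and `min_cluster_size` parameters, and the dictionary keys `features`, `entity_area`, and `image_area` are all hypothetical choices.

```python
import math

def cosine(a, b):
    """Cosine similarity between two feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def cluster_images(images, threshold=0.9):
    """Greedy clustering (an assumption): join the first cluster whose
    first member is similar enough, otherwise start a new cluster."""
    clusters = []
    for img in images:
        for cluster in clusters:
            if cosine(img["features"], cluster[0]["features"]) >= threshold:
                cluster.append(img)
                break
        else:
            clusters.append([img])
    return clusters

def select_representative(images, threshold=0.9, min_cluster_size=2):
    """Pick the image with the highest headshot score among popular clusters."""
    if not images:
        return None
    clusters = cluster_images(images, threshold)
    # "Popular" clusters are determined by cluster size.
    popular = [c for c in clusters if len(c) >= min_cluster_size]
    if not popular:
        popular = [max(clusters, key=len)]
    candidates = [img for c in popular for img in c]
    # Headshot score: fraction of the image covered by the entity.
    return max(candidates, key=lambda img: img["entity_area"] / img["image_area"])

if __name__ == "__main__":
    images = [
        {"id": "a", "features": [1.0, 0.0], "entity_area": 20, "image_area": 100},
        {"id": "b", "features": [0.99, 0.05], "entity_area": 60, "image_area": 100},
        {"id": "c", "features": [0.0, 1.0], "entity_area": 90, "image_area": 100},
    ]
    print(select_representative(images)["id"])
```

In the example, image `c` has the largest entity coverage but sits alone in its cluster, so it is excluded before the headshot score is applied.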


Posted Content
TL;DR: This paper proposes a method for augmenting a trained neural network classifier with auxiliary capacity in a manner designed to significantly improve upon an already well-performing model, while minimally impacting its computational footprint.
Abstract: We study the problem of large scale, multi-label visual recognition with a large number of possible classes. We propose a method for augmenting a trained neural network classifier with auxiliary capacity in a manner designed to significantly improve upon an already well-performing model, while minimally impacting its computational footprint. Using the predictions of the network itself as a descriptor for assessing visual similarity, we define a partitioning of the label space into groups of visually similar entities. We then augment the network with auxiliary hidden layer pathways with connectivity only to these groups of label units. We report a significant improvement in mean average precision on a large-scale object recognition task with the augmented model, while increasing the number of multiply-adds by less than 3%.

9 citations


Patent
11 Mar 2014
TL;DR: A hierarchy of clusters is determined, where each leaf of the hierarchy corresponds to one of the images in a group, and each cluster in the hierarchy identifies images in the group that are deemed similar to one another.
Abstract: A hierarchy of clusters is determined, where each leaf of the hierarchy corresponds to one of the images in a group, and each cluster in the hierarchy identifies images in the group that are deemed similar to one another. The hierarchy identifies a similarity between each of the plurality of clusters.

7 citations
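
A cluster hierarchy of the kind described in the abstract above, with one leaf per image and a similarity recorded for each cluster, could be built with agglomerative clustering. The sketch below is only an illustration: single linkage, Euclidean distance, and the function name `build_hierarchy` are assumptions, since the abstract does not say how the hierarchy is computed.

```python
import math
from itertools import combinations

def build_hierarchy(features):
    """Merge the two closest clusters until one remains.

    Returns (tree, merges): tree is a nested tuple whose leaves are image
    indices; merges lists (distance, member_indices) for every cluster formed.
    """
    if not features:
        return None, []
    # Each entry is (subtree, member image indices); start with one leaf per image.
    clusters = [(i, [i]) for i in range(len(features))]
    merges = []
    while len(clusters) > 1:
        best = None
        for (i, (_, mi)), (j, (_, mj)) in combinations(enumerate(clusters), 2):
            # Single linkage (an assumption): distance of the closest member pair.
            d = min(math.dist(features[a], features[b]) for a in mi for b in mj)
            if best is None or d < best[0]:
                best = (d, i, j)
        d, i, j = best
        (ti, mi), (tj, mj) = clusters[i], clusters[j]
        merges.append((d, sorted(mi + mj)))
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(((ti, tj), mi + mj))
    return clusters[0][0], merges

if __name__ == "__main__":
    tree, merges = build_hierarchy([[0.0], [0.1], [5.0]])
    print(tree)
    for dist, members in merges:
        print(round(dist, 3), members)
```

Each recorded distance says how similar the two merged sub-clusters were, so a smaller value near the leaves means a tighter group of images.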