scispace - formally typeset
Author

Serge Belongie

Other affiliations: University of California, San Diego, Google, Wuhan University  ...read more
Bio: Serge Belongie is a academic researcher at Cornell University who has co-authored 329 publication(s) receiving 90315 citation(s). The author has an hindex of 99. Previous affiliations of Serge Belongie include University of California, San Diego & Google. The author has done significant research in the topic(s): Image segmentation & Object detection.

...read more

Papers
  More

Open accessBook ChapterDOI: 10.1007/978-3-319-10602-1_48
Tsung-Yi Lin1, Michael Maire2, Serge Belongie1, James Hays  +4 moreInstitutions (4)
06 Sep 2014-
Abstract: We present a new dataset with the goal of advancing the state-of-the-art in object recognition by placing the question of object recognition in the context of the broader question of scene understanding. This is achieved by gathering images of complex everyday scenes containing common objects in their natural context. Objects are labeled using per-instance segmentations to aid in precise object localization. Our dataset contains photos of 91 objects types that would be easily recognizable by a 4 year old. With a total of 2.5 million labeled instances in 328k images, the creation of our dataset drew upon extensive crowd worker involvement via novel user interfaces for category detection, instance spotting and instance segmentation. We present a detailed statistical analysis of the dataset in comparison to PASCAL, ImageNet, and SUN. Finally, we provide baseline performance analysis for bounding box and segmentation detection results using a Deformable Parts Model.

...read more

Topics: Object detection (54%)

18,843 Citations


Open accessProceedings ArticleDOI: 10.1109/CVPR.2017.106
Tsung-Yi Lin1, Piotr Dollár2, Ross Girshick2, Kaiming He2  +2 moreInstitutions (2)
21 Jul 2017-
Abstract: Feature pyramids are a basic component in recognition systems for detecting objects at different scales. But pyramid representations have been avoided in recent object detectors that are based on deep convolutional networks, partially because they are slow to compute and memory intensive. In this paper, we exploit the inherent multi-scale, pyramidal hierarchy of deep convolutional networks to construct feature pyramids with marginal extra cost. A top-down architecture with lateral connections is developed for building high-level semantic feature maps at all scales. This architecture, called a Feature Pyramid Network (FPN), shows significant improvement as a generic feature extractor in several applications. Using a basic Faster R-CNN system, our method achieves state-of-the-art single-model results on the COCO detection benchmark without bells and whistles, surpassing all existing single-model entries including those from the COCO 2016 challenge winners. In addition, our method can run at 5 FPS on a GPU and thus is a practical and accurate solution to multi-scale object detection. Code will be made publicly available.

...read more

Topics: Feature (computer vision) (60%), Pyramid (58%), Object detection (56%) ...read more

7,876 Citations


Journal ArticleDOI: 10.1109/34.993558
Abstract: We present a novel approach to measuring similarity between shapes and exploit it for object recognition. In our framework, the measurement of similarity is preceded by: (1) solving for correspondences between points on the two shapes; (2) using the correspondences to estimate an aligning transform. In order to solve the correspondence problem, we attach a descriptor, the shape context, to each point. The shape context at a reference point captures the distribution of the remaining points relative to it, thus offering a globally discriminative characterization. Corresponding points on two similar shapes will have similar shape contexts, enabling us to solve for correspondences as an optimal assignment problem. Given the point correspondences, we estimate the transformation that best aligns the two shapes; regularized thin-plate splines provide a flexible class of transformation maps for this purpose. The dissimilarity between the two shapes is computed as a sum of matching errors between corresponding points, together with a term measuring the magnitude of the aligning transform. We treat recognition in a nearest-neighbor classification framework as the problem of finding the stored prototype shape that is maximally similar to that in the image. Results are presented for silhouettes, trademarks, handwritten digits, and the COIL data set.

...read more

Topics: Shape analysis (digital geometry) (64%), Heat kernel signature (58%), Shape context (58%) ...read more

6,426 Citations


Open accessPosted Content
Tsung-Yi Lin1, Piotr Dollár2, Ross Girshick2, Kaiming He2  +2 moreInstitutions (2)
Abstract: Feature pyramids are a basic component in recognition systems for detecting objects at different scales. But recent deep learning object detectors have avoided pyramid representations, in part because they are compute and memory intensive. In this paper, we exploit the inherent multi-scale, pyramidal hierarchy of deep convolutional networks to construct feature pyramids with marginal extra cost. A top-down architecture with lateral connections is developed for building high-level semantic feature maps at all scales. This architecture, called a Feature Pyramid Network (FPN), shows significant improvement as a generic feature extractor in several applications. Using FPN in a basic Faster R-CNN system, our method achieves state-of-the-art single-model results on the COCO detection benchmark without bells and whistles, surpassing all existing single-model entries including those from the COCO 2016 challenge winners. In addition, our method can run at 5 FPS on a GPU and thus is a practical and accurate solution to multi-scale object detection. Code will be made publicly available.

...read more

Topics: Feature (computer vision) (60%), Pyramid (58%), Object detection (56%) ...read more

5,416 Citations


Open access
Catherine Wah1, Steve Branson1, Peter Welinder1, Pietro Perona2  +1 moreInstitutions (2)
01 Jul 2011-
Abstract: CUB-200-2011 is an extended version of CUB-200 [7], a challenging dataset of 200 bird species. The extended version roughly doubles the number of images per category and adds new part localization annotations. All images are annotated with bounding boxes, part locations, and at- tribute labels. Images and annotations were filtered by mul- tiple users of Mechanical Turk. We introduce benchmarks and baseline experiments for multi-class categorization and part localization.

...read more

2,875 Citations


Cited by
  More

Open accessProceedings ArticleDOI: 10.1109/CVPR.2016.90
Kaiming He1, Xiangyu Zhang1, Shaoqing Ren1, Jian Sun1Institutions (1)
27 Jun 2016-
Abstract: Deeper neural networks are more difficult to train. We present a residual learning framework to ease the training of networks that are substantially deeper than those used previously. We explicitly reformulate the layers as learning residual functions with reference to the layer inputs, instead of learning unreferenced functions. We provide comprehensive empirical evidence showing that these residual networks are easier to optimize, and can gain accuracy from considerably increased depth. On the ImageNet dataset we evaluate residual nets with a depth of up to 152 layers—8× deeper than VGG nets [40] but still having lower complexity. An ensemble of these residual nets achieves 3.57% error on the ImageNet test set. This result won the 1st place on the ILSVRC 2015 classification task. We also present analysis on CIFAR-10 with 100 and 1000 layers. The depth of representations is of central importance for many visual recognition tasks. Solely due to our extremely deep representations, we obtain a 28% relative improvement on the COCO object detection dataset. Deep residual nets are foundations of our submissions to ILSVRC & COCO 2015 competitions1, where we also won the 1st places on the tasks of ImageNet detection, ImageNet localization, COCO detection, and COCO segmentation.

...read more

Topics: Deep learning (53%), Residual (53%), Convolutional neural network (53%) ...read more

93,356 Citations


Open accessJournal ArticleDOI: 10.1136/BMJ.323.7325.1375/A

[...]

08 Dec 2001-BMJ
Abstract: There is, I think, something ethereal about i —the square root of minus one. I remember first hearing about it at school. It seemed an odd beast at that time—an intruder hovering on the edge of reality. Usually familiarity dulls this sense of the bizarre, but in the case of i it was the reverse: over the years the sense of its surreal nature intensified. It seemed that it was impossible to write mathematics that described the real world in …

...read more

30,199 Citations


Open accessProceedings ArticleDOI: 10.1109/CVPR.2015.7298594
Christian Szegedy1, Wei Liu2, Yangqing Jia1, Pierre Sermanet1  +5 moreInstitutions (3)
07 Jun 2015-
Abstract: We propose a deep convolutional neural network architecture codenamed Inception that achieves the new state of the art for classification and detection in the ImageNet Large-Scale Visual Recognition Challenge 2014 (ILSVRC14). The main hallmark of this architecture is the improved utilization of the computing resources inside the network. By a carefully crafted design, we increased the depth and width of the network while keeping the computational budget constant. To optimize quality, the architectural decisions were based on the Hebbian principle and the intuition of multi-scale processing. One particular incarnation used in our submission for ILSVRC14 is called GoogLeNet, a 22 layers deep network, the quality of which is assessed in the context of classification and detection.

...read more

29,453 Citations


Open accessProceedings ArticleDOI: 10.1109/CVPR.2005.177
Navneet Dalal1, Bill Triggs1Institutions (1)
20 Jun 2005-
Abstract: We study the question of feature sets for robust visual object recognition; adopting linear SVM based human detection as a test case. After reviewing existing edge and gradient based descriptors, we show experimentally that grids of histograms of oriented gradient (HOG) descriptors significantly outperform existing feature sets for human detection. We study the influence of each stage of the computation on performance, concluding that fine-scale gradients, fine orientation binning, relatively coarse spatial binning, and high-quality local contrast normalization in overlapping descriptor blocks are all important for good results. The new approach gives near-perfect separation on the original MIT pedestrian database, so we introduce a more challenging dataset containing over 1800 annotated human images with a large range of pose variations and backgrounds.

...read more

Topics: Histogram of oriented gradients (62%), Local binary patterns (57%), GLOH (56%) ...read more

28,803 Citations


Open accessJournal ArticleDOI: 10.1007/S11263-015-0816-Y
Olga Russakovsky1, Jia Deng2, Hao Su1, Jonathan Krause1  +8 moreInstitutions (4)
Abstract: The ImageNet Large Scale Visual Recognition Challenge is a benchmark in object category classification and detection on hundreds of object categories and millions of images. The challenge has been run annually from 2010 to present, attracting participation from more than fifty institutions. This paper describes the creation of this benchmark dataset and the advances in object recognition that have been possible as a result. We discuss the challenges of collecting large-scale ground truth annotation, highlight key breakthroughs in categorical object recognition, provide a detailed analysis of the current state of the field of large-scale image classification and object detection, and compare the state-of-the-art computer vision accuracy with human accuracy. We conclude with lessons learned in the 5 years of the challenge, and propose future directions and improvements.

...read more

25,260 Citations


Performance
Metrics

Author's H-index: 99

No. of papers from the Author in previous years
YearPapers
202123
202023
201923
201825
201729
201623

Top Attributes

Show by:

Author's top 5 most impactful journals

Network Information
Related Authors (5)
Piotr Dollár

107 papers, 99.1K citations

87% related
Chad Carson

13 papers, 3.6K citations

84% related
Guandao Yang

18 papers, 650 citations

83% related
Vincent Rabaud

13 papers, 10.2K citations

83% related
David J. Kriegman

221 papers, 40.8K citations

83% related