Author

Brian Fulkerson

Bio: Brian Fulkerson is an academic researcher from Intel. The author has contributed to research in topics: Image segmentation & Social robot. The author has an h-index of 9 and has co-authored 14 publications receiving 4,487 citations. Previous affiliations of Brian Fulkerson include University of California, Los Angeles.

Papers
Proceedings ArticleDOI
25 Oct 2010
TL;DR: VLFeat is an open and portable library of computer vision algorithms that includes rigorous implementations of common building blocks such as feature detectors, feature extractors, (hierarchical) k-means clustering, randomized kd-tree matching, and super-pixelization.
Abstract: VLFeat is an open and portable library of computer vision algorithms. It aims at facilitating fast prototyping and reproducible research for computer vision scientists and students. It includes rigorous implementations of common building blocks such as feature detectors, feature extractors, (hierarchical) k-means clustering, randomized kd-tree matching, and super-pixelization. The source code and interfaces are fully documented. The library integrates directly with MATLAB, a popular language for computer vision research.
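As a hedged illustration of the building blocks listed above, the sketch below recreates the feature-extraction, clustering, and matching pipeline with common Python libraries (OpenCV, scikit-learn, SciPy) standing in for VLFeat's own C/MATLAB interface; the input file name is hypothetical, and the function names belong to those libraries, not to VLFeat.

```python
# Hedged sketch: the kind of pipeline VLFeat provides (feature detection and
# extraction, k-means vocabulary, kd-tree matching), recreated with common
# Python libraries rather than VLFeat's own MATLAB/C interface.
import cv2                     # pip install opencv-python
import numpy as np
from sklearn.cluster import KMeans
from scipy.spatial import cKDTree

img = cv2.imread("image.jpg", cv2.IMREAD_GRAYSCALE)  # hypothetical input file

# Feature detection + extraction: SIFT keypoints with 128-D descriptors.
sift = cv2.SIFT_create()
keypoints, descriptors = sift.detectAndCompute(img, None)

# k-means clustering of descriptors into a small visual vocabulary.
vocab = KMeans(n_clusters=100, n_init=10).fit(descriptors.astype(np.float64))

# kd-tree over the cluster centers for fast nearest-word lookup.
tree = cKDTree(vocab.cluster_centers_)
_, word_ids = tree.query(descriptors, k=1)  # nearest visual word per descriptor
```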

3,417 citations

Proceedings ArticleDOI
01 Sep 2009
TL;DR: A method to identify and localize object classes in images by using superpixels as the basic unit of a class segmentation or pixel localization scheme and constructing a classifier on the histogram of local features found in each superpixel.
Abstract: We propose a method to identify and localize object classes in images. Instead of operating at the pixel level, we advocate the use of superpixels as the basic unit of a class segmentation or pixel localization scheme. To this end, we construct a classifier on the histogram of local features found in each superpixel. We regularize this classifier by aggregating histograms in the neighborhood of each superpixel and then refine our results further by using the classifier in a conditional random field operating on the superpixel graph. Our proposed method exceeds the previously published state-of-the-art on two challenging datasets: Graz-02 and the PASCAL VOC 2007 Segmentation Challenge.
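A minimal sketch of the paper's core representation, assuming the superpixel label map, per-pixel visual-word ids, and superpixel adjacency lists are produced by earlier stages not shown here; the helper names are hypothetical, not the authors' code.

```python
# Hedged sketch: a histogram of quantized local features per superpixel,
# regularized by pooling with neighbor histograms before classification.
import numpy as np

def superpixel_histograms(labels, words, n_superpixels, n_words):
    """labels, words: (H, W) int arrays. Returns (n_superpixels, n_words)."""
    hist = np.zeros((n_superpixels, n_words))
    np.add.at(hist, (labels.ravel(), words.ravel()), 1)  # count words per region
    return hist

def aggregate_neighborhoods(hist, adjacency):
    """adjacency: list of neighbor-index lists; pools each superpixel's
    histogram with its neighbors' to regularize the classifier input."""
    pooled = hist.copy()
    for i, nbrs in enumerate(adjacency):
        pooled[i] += hist[nbrs].sum(axis=0)
    return pooled / np.maximum(pooled.sum(axis=1, keepdims=True), 1)  # L1-normalize
```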

743 citations

Book ChapterDOI
20 Oct 2008
TL;DR: This work presents an approach to determine the category and location of objects in images that reduces the size of a large generic dictionary to the low hundreds while increasing classification performance compared to k-means, and performs feature-based categorization efficiently on a dense grid.
Abstract: We present an approach to determine the category and location of objects in images. It performs very fast categorization of each pixel in an image, a brute-force approach made feasible by three key developments: First, our method reduces the size of a large generic dictionary (on the order of ten thousand words) to the low hundreds while increasing classification performance compared to k-means. This is achieved by creating a discriminative dictionary tailored to the task by following the information bottleneck principle. Second, we perform feature-based categorization efficiently on a dense grid by extending the concept of integral images to the computation of local histograms. Third, we compute SIFT descriptors densely in linear time. We compare our method to the state of the art and find that it excels in accuracy and simplicity, performing better while assuming less.
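The second development, extending integral images to local histograms, is the easiest to sketch: build one integral image per histogram bin, after which any window's histogram costs four lookups per bin, independent of window size. The functions below are illustrative, not the paper's implementation.

```python
# Hedged sketch of integral histograms: per-bin integral images make the
# histogram of any axis-aligned window O(#bins) regardless of its size.
import numpy as np

def integral_histogram(words, n_words):
    """words: (H, W) array of visual-word ids. Returns (H+1, W+1, n_words)
    cumulative counts, i.e. one integral image per histogram bin."""
    H, W = words.shape
    onehot = np.zeros((H, W, n_words))
    onehot[np.arange(H)[:, None], np.arange(W)[None, :], words] = 1
    return np.pad(onehot, ((1, 0), (1, 0), (0, 0))).cumsum(0).cumsum(1)

def window_histogram(ih, top, left, bottom, right):
    """Histogram of the half-open window [top:bottom, left:right] via the
    standard four-corner integral-image identity."""
    return (ih[bottom, right] - ih[top, right]
            - ih[bottom, left] + ih[top, left])
```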

206 citations

Book ChapterDOI
10 Sep 2010
TL;DR: Variants of the implementation which use global memory and texture caching are presented, and the paper shows that a method backed by texture caching can produce a 10-50X speedup for practical images, making computation of super-pixels possible at 5-10Hz on modest sized images.
Abstract: The paper presents an exact GPU implementation of the quick shift image segmentation algorithm. Variants of the implementation which use global memory and texture caching are presented, and the paper shows that a method backed by texture caching can produce a 10-50X speedup for practical images, making computation of super-pixels possible at 5-10Hz on modest-sized (256x256) images.
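For context, a naive CPU sketch of the quick shift linking rule that the paper ports to the GPU: each pixel estimates a Parzen density over a local window, then links to its nearest neighbor with higher density. This version costs O(pixels x window^2) per pass, which is exactly the work the texture-cached GPU variant parallelizes; all names here are illustrative.

```python
# Hedged CPU sketch of quick shift: link every pixel to its nearest
# higher-density neighbor within a search window, forming a forest whose
# trees become the segments.
import numpy as np

def quickshift_links(feat, sigma=2.0, window=5):
    """feat: (H, W, C) pixel features (e.g. x, y, color). Returns (H, W, 2)
    parent coordinates; roots (no higher-density neighbor) point to themselves."""
    H, W, _ = feat.shape
    offsets = [(dy, dx) for dy in range(-window, window + 1)
                        for dx in range(-window, window + 1)]
    # Parzen density estimate over the local window.
    density = np.zeros((H, W))
    for y in range(H):
        for x in range(W):
            for dy, dx in offsets:
                ny, nx = y + dy, x + dx
                if 0 <= ny < H and 0 <= nx < W:
                    d2 = np.sum((feat[y, x] - feat[ny, nx]) ** 2)
                    density[y, x] += np.exp(-d2 / (2 * sigma ** 2))
    # Each pixel starts as its own parent, then links uphill.
    parents = np.stack(np.meshgrid(np.arange(H), np.arange(W), indexing="ij"), -1)
    for y in range(H):
        for x in range(W):
            best = np.inf
            for dy, dx in offsets:
                ny, nx = y + dy, x + dx
                if 0 <= ny < H and 0 <= nx < W and density[ny, nx] > density[y, x]:
                    d2 = np.sum((feat[y, x] - feat[ny, nx]) ** 2)
                    if d2 < best:
                        best, parents[y, x] = d2, (ny, nx)
    return parents
```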

82 citations

Journal ArticleDOI
TL;DR: Using phylogenetic comparative methods, the authors find strong support for convergence on 1:3 olfactory-to-respiratory proportions in aquatic taxa and near the inverse in terrestrial taxa, indicating that aquatic mustelids and pinnipeds independently acquired similar proportions of olfactory to respiratory turbinates.
Abstract: Inside the mammalian nose lies a labyrinth of bony plates covered in epithelium collectively known as turbinates. Respiratory turbinates lie anteriorly and aid in heat and water conservation, while more posterior olfactory turbinates function in olfaction. Previous observations on a few carnivorans revealed that aquatic species have relatively large, complex respiratory turbinates and greatly reduced olfactory turbinates compared with terrestrial species. Body heat is lost more quickly in water than air and increased respiratory surface area likely evolved to minimize heat loss. At the same time, olfactory surface area probably diminished due to a decreased reliance on olfaction when foraging under water. To explore how widespread these adaptations are, we documented scaling of respiratory and olfactory turbinate surface area with body size in a variety of terrestrial, freshwater, and marine carnivorans, including pinnipeds, mustelids, ursids, and procyonids. Surface areas were estimated from high-resolution CT scans of dry skulls, a novel approach that enabled a greater sampling of taxa than is practical with fresh heads. Total turbinate, respiratory, and olfactory surface areas correlate well with body size (r² ≥ 0.7), and are relatively smaller in larger species. Relative to body mass or skull length, aquatic species have significantly less olfactory surface area than terrestrial species. Furthermore, the ratio of olfactory to respiratory surface area is associated with habitat. Using phylogenetic comparative methods, we found strong support for convergence on 1:3 proportions in aquatic taxa and near the inverse in terrestrial taxa, indicating that aquatic mustelids and pinnipeds independently acquired similar proportions of olfactory to respiratory turbinates. Constraints on turbinate surface area in the nasal chamber may result in a trade-off between respiratory and olfactory function in aquatic mammals.
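The scaling analysis above is the standard allometric one: regress log surface area on log body mass, so the slope is the exponent b in area ≈ a·mass^b, and b < 1 means relatively smaller areas in larger species. The sketch below uses made-up numbers for illustration, not the paper's measurements.

```python
# Hedged sketch of an allometric scaling fit: slope of the log-log
# regression is the scaling exponent; r^2 measures fit quality.
# All values are hypothetical, not the paper's data.
import numpy as np

mass = np.array([5.0, 20.0, 80.0, 300.0])      # kg, hypothetical species
area = np.array([40.0, 110.0, 300.0, 750.0])   # cm^2, hypothetical

b, log_a = np.polyfit(np.log(mass), np.log(area), 1)  # slope = exponent b
pred = log_a + b * np.log(mass)
ss_res = np.sum((np.log(area) - pred) ** 2)
ss_tot = np.sum((np.log(area) - np.log(area).mean()) ** 2)
r2 = 1 - ss_res / ss_tot
print(f"scaling exponent b = {b:.2f}, r^2 = {r2:.2f}")  # b < 1: negative allometry
```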

81 citations


Cited by
Proceedings ArticleDOI
27 Jun 2016
TL;DR: In this article, the authors propose a residual learning framework to ease the training of networks that are substantially deeper than those used previously; the resulting model won 1st place on the ILSVRC 2015 classification task.
Abstract: Deeper neural networks are more difficult to train. We present a residual learning framework to ease the training of networks that are substantially deeper than those used previously. We explicitly reformulate the layers as learning residual functions with reference to the layer inputs, instead of learning unreferenced functions. We provide comprehensive empirical evidence showing that these residual networks are easier to optimize, and can gain accuracy from considerably increased depth. On the ImageNet dataset we evaluate residual nets with a depth of up to 152 layers—8× deeper than VGG nets [40] but still having lower complexity. An ensemble of these residual nets achieves 3.57% error on the ImageNet test set. This result won the 1st place on the ILSVRC 2015 classification task. We also present analysis on CIFAR-10 with 100 and 1000 layers. The depth of representations is of central importance for many visual recognition tasks. Solely due to our extremely deep representations, we obtain a 28% relative improvement on the COCO object detection dataset. Deep residual nets are foundations of our submissions to ILSVRC & COCO 2015 competitions, where we also won the 1st places on the tasks of ImageNet detection, ImageNet localization, COCO detection, and COCO segmentation.
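The residual reformulation the abstract describes is compact enough to sketch directly: a block computes F(x) and outputs F(x) + x, so the identity mapping is trivial to represent. PyTorch is used here for brevity as an assumed stand-in; the original implementation predates it.

```python
# Hedged sketch of a residual block: learn the residual F(x), output F(x) + x.
import torch
import torch.nn as nn

class ResidualBlock(nn.Module):
    def __init__(self, channels):
        super().__init__()
        self.conv1 = nn.Conv2d(channels, channels, 3, padding=1, bias=False)
        self.bn1 = nn.BatchNorm2d(channels)
        self.conv2 = nn.Conv2d(channels, channels, 3, padding=1, bias=False)
        self.bn2 = nn.BatchNorm2d(channels)
        self.relu = nn.ReLU(inplace=True)

    def forward(self, x):
        out = self.bn2(self.conv2(self.relu(self.bn1(self.conv1(x)))))
        return self.relu(out + x)   # the shortcut connection: F(x) + x

y = ResidualBlock(64)(torch.randn(1, 64, 32, 32))  # shape is preserved
```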

123,388 citations

Posted Content
TL;DR: This work presents a residual learning framework to ease the training of networks that are substantially deeper than those used previously, and provides comprehensive empirical evidence showing that these residual networks are easier to optimize, and can gain accuracy from considerably increased depth.
Abstract: Deeper neural networks are more difficult to train. We present a residual learning framework to ease the training of networks that are substantially deeper than those used previously. We explicitly reformulate the layers as learning residual functions with reference to the layer inputs, instead of learning unreferenced functions. We provide comprehensive empirical evidence showing that these residual networks are easier to optimize, and can gain accuracy from considerably increased depth. On the ImageNet dataset we evaluate residual nets with a depth of up to 152 layers---8x deeper than VGG nets but still having lower complexity. An ensemble of these residual nets achieves 3.57% error on the ImageNet test set. This result won the 1st place on the ILSVRC 2015 classification task. We also present analysis on CIFAR-10 with 100 and 1000 layers. The depth of representations is of central importance for many visual recognition tasks. Solely due to our extremely deep representations, we obtain a 28% relative improvement on the COCO object detection dataset. Deep residual nets are foundations of our submissions to ILSVRC & COCO 2015 competitions, where we also won the 1st places on the tasks of ImageNet detection, ImageNet localization, COCO detection, and COCO segmentation.

44,703 citations

Journal ArticleDOI
TL;DR: This work addresses the task of semantic image segmentation with Deep Learning, proposes atrous spatial pyramid pooling (ASPP) to robustly segment objects at multiple scales, and improves the localization of object boundaries by combining methods from DCNNs and probabilistic graphical models.
Abstract: In this work we address the task of semantic image segmentation with Deep Learning and make three main contributions that are experimentally shown to have substantial practical merit. First, we highlight convolution with upsampled filters, or ‘atrous convolution’, as a powerful tool in dense prediction tasks. Atrous convolution allows us to explicitly control the resolution at which feature responses are computed within Deep Convolutional Neural Networks. It also allows us to effectively enlarge the field of view of filters to incorporate larger context without increasing the number of parameters or the amount of computation. Second, we propose atrous spatial pyramid pooling (ASPP) to robustly segment objects at multiple scales. ASPP probes an incoming convolutional feature layer with filters at multiple sampling rates and effective fields-of-view, thus capturing objects as well as image context at multiple scales. Third, we improve the localization of object boundaries by combining methods from DCNNs and probabilistic graphical models. The commonly deployed combination of max-pooling and downsampling in DCNNs achieves invariance but has a toll on localization accuracy. We overcome this by combining the responses at the final DCNN layer with a fully connected Conditional Random Field (CRF), which is shown both qualitatively and quantitatively to improve localization performance. Our proposed “DeepLab” system sets the new state of the art at the PASCAL VOC-2012 semantic image segmentation task, reaching 79.7 percent mIOU in the test set, and advances the results on three other datasets: PASCAL-Context, PASCAL-Person-Part, and Cityscapes. All of our code is made publicly available online.
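Atrous convolution and ASPP are both expressible in a few lines: the same 3x3 kernel is applied at several dilation rates, enlarging the field of view without adding parameters per branch. The sketch below uses PyTorch as an assumed stand-in for the authors' framework, and sums branch responses where the full ASPP concatenates and projects them.

```python
# Hedged sketch of atrous (dilated) convolution with ASPP-style multi-rate
# branches: dilation=r spreads a 3x3 kernel over a (2r+1)x(2r+1) window.
import torch
import torch.nn as nn

class MiniASPP(nn.Module):
    def __init__(self, in_ch, out_ch, rates=(6, 12, 18)):
        super().__init__()
        # One 3x3 branch per sampling rate; padding=r keeps spatial size.
        self.branches = nn.ModuleList(
            nn.Conv2d(in_ch, out_ch, 3, padding=r, dilation=r) for r in rates
        )

    def forward(self, x):
        # Pool responses across rates to capture multi-scale context.
        return sum(branch(x) for branch in self.branches)

features = torch.randn(1, 256, 33, 33)    # a hypothetical DCNN feature map
print(MiniASPP(256, 64)(features).shape)  # spatial size preserved
```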

11,856 citations

Posted Content
TL;DR: DeepLab, as discussed by the authors, proposes atrous spatial pyramid pooling (ASPP) to segment objects at multiple scales by probing an incoming convolutional feature layer with filters at multiple sampling rates and effective fields-of-view.
Abstract: In this work we address the task of semantic image segmentation with Deep Learning and make three main contributions that are experimentally shown to have substantial practical merit. First, we highlight convolution with upsampled filters, or 'atrous convolution', as a powerful tool in dense prediction tasks. Atrous convolution allows us to explicitly control the resolution at which feature responses are computed within Deep Convolutional Neural Networks. It also allows us to effectively enlarge the field of view of filters to incorporate larger context without increasing the number of parameters or the amount of computation. Second, we propose atrous spatial pyramid pooling (ASPP) to robustly segment objects at multiple scales. ASPP probes an incoming convolutional feature layer with filters at multiple sampling rates and effective fields-of-view, thus capturing objects as well as image context at multiple scales. Third, we improve the localization of object boundaries by combining methods from DCNNs and probabilistic graphical models. The commonly deployed combination of max-pooling and downsampling in DCNNs achieves invariance but has a toll on localization accuracy. We overcome this by combining the responses at the final DCNN layer with a fully connected Conditional Random Field (CRF), which is shown both qualitatively and quantitatively to improve localization performance. Our proposed "DeepLab" system sets the new state of the art at the PASCAL VOC-2012 semantic image segmentation task, reaching 79.7% mIOU in the test set, and advances the results on three other datasets: PASCAL-Context, PASCAL-Person-Part, and Cityscapes. All of our code is made publicly available online.

10,120 citations

Journal ArticleDOI
TL;DR: A new superpixel algorithm is introduced, simple linear iterative clustering (SLIC), which adapts a k-means clustering approach to efficiently generate superpixels and is faster and more memory efficient, improves segmentation performance, and is straightforward to extend to supervoxel generation.
Abstract: Computer vision applications have come to rely increasingly on superpixels in recent years, but it is not always clear what constitutes a good superpixel algorithm. In an effort to understand the benefits and drawbacks of existing methods, we empirically compare five state-of-the-art superpixel algorithms for their ability to adhere to image boundaries, speed, memory efficiency, and their impact on segmentation performance. We then introduce a new superpixel algorithm, simple linear iterative clustering (SLIC), which adapts a k-means clustering approach to efficiently generate superpixels. Despite its simplicity, SLIC adheres to boundaries as well as or better than previous methods. At the same time, it is faster and more memory efficient, improves segmentation performance, and is straightforward to extend to supervoxel generation.
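scikit-image ships an implementation that follows the paper's k-means-in-labxy-space formulation, which makes a short usage sketch possible; the parameter names below are scikit-image's, and the compactness value is an arbitrary illustrative choice.

```python
# Hedged usage sketch of SLIC via scikit-image: k-means clustering in the
# joint color + spatial space, with compactness trading color adherence
# against spatial regularity.
from skimage import data, segmentation

img = data.astronaut()                       # a bundled sample RGB image
labels = segmentation.slic(img, n_segments=200, compactness=10.0,
                           start_label=0)
print(labels.shape, labels.max() + 1)        # one integer label per pixel
```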

7,849 citations