Topic

Object-class detection

About: Object-class detection is a research topic. Over the lifetime, 5612 publications have been published within this topic receiving 185078 citations.


Papers
Book
01 Dec 2004
TL;DR: This work spans low-level image processing through application-level topics, including the three-dimensional world, the perspective n-point problem, motion, invariants and their applications, and the need for speed - real-time electronic hardware systems.
Abstract: Introduction - vision, the challenge. Part 1 Low-level processing: images and imaging operations, basic image filtering operations, thresholding techniques, locating objects via their edges, binary shape analysis, boundary pattern analysis. Part 2 Intermediate-level processing: line detection, circle detection, the Hough transform and its nature, ellipse detection, hole detection, polygon and corner detection. Part 3 Application-level processing: abstract pattern matching techniques, the three-dimensional world, tackling the perspective n-point problem, motion, invariants and their applications, automated visual inspection, statistical pattern recognition, biologically inspired recognition schemes, texture, image acquisition, the need for speed - real-time electronic hardware systems. Part 4 Perspectives on vision: machine vision, art or science?

1,198 citations
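For readers unfamiliar with the intermediate-level techniques listed in this table of contents, the sketch below illustrates one of them, the Hough transform for line detection, in plain NumPy. It is an independent illustration, not code from the book.

```python
# Illustrative sketch (not from the book): a minimal Hough transform for
# line detection over a binary edge image.
import numpy as np

def hough_lines(edges: np.ndarray, n_theta: int = 180):
    """Accumulate votes in (rho, theta) space for a binary edge map."""
    h, w = edges.shape
    diag = int(np.ceil(np.hypot(h, w)))
    thetas = np.linspace(-np.pi / 2, np.pi / 2, n_theta, endpoint=False)
    rhos = np.arange(-diag, diag + 1)
    accumulator = np.zeros((len(rhos), n_theta), dtype=np.int64)

    ys, xs = np.nonzero(edges)                      # edge pixel coordinates
    for theta_idx, theta in enumerate(thetas):
        # rho = x*cos(theta) + y*sin(theta), computed for all edge pixels at once
        rho_vals = np.round(xs * np.cos(theta) + ys * np.sin(theta)).astype(int)
        np.add.at(accumulator, (rho_vals + diag, theta_idx), 1)
    return accumulator, rhos, thetas

if __name__ == "__main__":
    # Toy example: a diagonal line of edge pixels in a 100x100 image.
    edges = np.zeros((100, 100), dtype=bool)
    idx = np.arange(100)
    edges[idx, idx] = True
    acc, rhos, thetas = hough_lines(edges)
    r, t = np.unravel_index(acc.argmax(), acc.shape)
    print(f"strongest line: rho={rhos[r]}, theta={np.degrees(thetas[t]):.1f} deg")
```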

Book ChapterDOI
05 Nov 2012
TL;DR: A framework for automatic modeling, detection, and tracking of 3D objects with a Kinect is proposed, which shows how to build the templates automatically from 3D models and how to estimate the 6 degrees-of-freedom pose accurately and in real-time.
Abstract: We propose a framework for automatic modeling, detection, and tracking of 3D objects with a Kinect. The detection part is mainly based on the recent template-based LINEMOD approach [1] for object detection. We show how to build the templates automatically from 3D models, and how to estimate the 6 degrees-of-freedom pose accurately and in real-time. The pose estimation and the color information allow us to check the detection hypotheses and improve the correct detection rate by 13% with respect to the original LINEMOD. These improvements make our framework suitable for object manipulation in robotics applications. Moreover, we propose a new dataset made of 15 registered, 1100+ frame video sequences of 15 various objects for the evaluation of future competing methods.

1,114 citations
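To make the template-matching idea concrete, here is a much-simplified sketch of LINEMOD-style matching of quantized gradient orientations. It is an illustration under simplifying assumptions, not the authors' implementation, and it omits the orientation spreading, depth cues, pose refinement, and color-based verification described in the abstract.

```python
# Much-simplified sketch in the spirit of LINEMOD: quantized gradient
# orientations compared by the cosine of their difference.
import numpy as np

N_ORIENTATIONS = 8  # orientations quantized into 8 bins over [0, pi)

def quantize_orientations(gray: np.ndarray) -> np.ndarray:
    """Return per-pixel quantized gradient orientation (0..7)."""
    gy, gx = np.gradient(gray.astype(np.float64))
    angle = np.mod(np.arctan2(gy, gx), np.pi)          # orientation, sign ignored
    return np.floor(angle / np.pi * N_ORIENTATIONS).astype(int) % N_ORIENTATIONS

def similarity(template: np.ndarray, patch: np.ndarray) -> float:
    """Mean |cos| of the orientation difference between template and patch."""
    delta = (template - patch) * (np.pi / N_ORIENTATIONS)
    return float(np.mean(np.abs(np.cos(delta))))

def detect(image_ori: np.ndarray, template: np.ndarray, threshold: float = 0.9):
    """Slide the template over the image and return locations above threshold."""
    th, tw = template.shape
    ih, iw = image_ori.shape
    hits = []
    for y in range(ih - th + 1):
        for x in range(iw - tw + 1):
            s = similarity(template, image_ori[y:y + th, x:x + tw])
            if s >= threshold:
                hits.append((x, y, s))
    return hits
```

In this toy version a template would be built by calling quantize_orientations on a rendered view of the 3D model and the test image would be quantized the same way; the paper's pipeline builds many such templates over sampled viewpoints and verifies each hypothesis with depth and color.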

Proceedings ArticleDOI
26 Dec 2007
TL;DR: This paper describes face data as resulting from a generative model which incorporates both within-individual and between-individual variation, and calculates the likelihood that the differences between face images are entirely due to within-individual variability.
Abstract: Many current face recognition algorithms perform badly when the lighting or pose of the probe and gallery images differ. In this paper we present a novel algorithm designed for these conditions. We describe face data as resulting from a generative model which incorporates both within-individual and between-individual variation. In recognition we calculate the likelihood that the differences between face images are entirely due to within-individual variability. We extend this to the non-linear case where an arbitrary face manifold can be described and noise is position-dependent. We also develop a "tied" version of the algorithm that allows explicit comparison across quite different viewing conditions. We demonstrate that our model produces state-of-the-art results for (i) frontal face recognition and (ii) face recognition under varying pose.

1,099 citations
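The verification rule described above can be illustrated with a linear-Gaussian toy model: two face feature vectors share an identity variable under the "same person" hypothesis and are independent otherwise, and recognition compares the two likelihoods. The sketch below uses assumed toy covariances and does not reproduce the paper's non-linear or "tied" variants.

```python
# Simplified linear-Gaussian illustration of the verification idea:
# is the difference between two face feature vectors explainable by
# within-individual variation alone?
import numpy as np
from scipy.stats import multivariate_normal

def same_person_log_likelihood_ratio(x1, x2, sigma_between, sigma_within, mu):
    """Log p([x1,x2] | same person) - log p([x1,x2] | different people)."""
    d = len(mu)
    xs = np.concatenate([x1 - mu, x2 - mu])
    cross = sigma_between                     # a shared identity couples x1 and x2
    marg = sigma_between + sigma_within       # marginal covariance of each face
    cov_same = np.block([[marg, cross], [cross, marg]])
    cov_diff = np.block([[marg, np.zeros((d, d))], [np.zeros((d, d)), marg]])
    ll_same = multivariate_normal.logpdf(xs, mean=np.zeros(2 * d), cov=cov_same)
    ll_diff = multivariate_normal.logpdf(xs, mean=np.zeros(2 * d), cov=cov_diff)
    return ll_same - ll_diff

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    d = 4
    mu = np.zeros(d)
    sigma_between = 2.0 * np.eye(d)   # toy between-individual covariance
    sigma_within = 0.5 * np.eye(d)    # toy within-individual covariance
    identity = rng.normal(scale=np.sqrt(2.0), size=d)
    a = identity + rng.normal(scale=np.sqrt(0.5), size=d)   # two images of
    b = identity + rng.normal(scale=np.sqrt(0.5), size=d)   # the same person
    c = rng.normal(scale=np.sqrt(2.5), size=d)               # a different person
    print("same pair:", same_person_log_likelihood_ratio(a, b, sigma_between, sigma_within, mu))
    print("diff pair:", same_person_log_likelihood_ratio(a, c, sigma_between, sigma_within, mu))
```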

Journal ArticleDOI
TL;DR: A new saliency method is proposed by introducing short connections to the skip-layer structures within the HED architecture, which produces state-of-the-art results on 5 widely tested salient object detection benchmarks, with advantages in terms of efficiency, effectiveness, and simplicity over the existing algorithms.
Abstract: Recent progress on salient object detection is substantial, benefiting mostly from the explosive development of Convolutional Neural Networks (CNNs). Semantic segmentation and salient object detection algorithms developed lately have been mostly based on Fully Convolutional Neural Networks (FCNs). There is still large room for improvement over the generic FCN models that do not explicitly deal with the scale-space problem. The Holistically-Nested Edge Detector (HED) provides a skip-layer structure with deep supervision for edge and boundary detection, but the performance gain of HED on saliency detection is not obvious. In this paper, we propose a new salient object detection method by introducing short connections to the skip-layer structures within the HED architecture. Our framework takes full advantage of multi-level and multi-scale features extracted from FCNs, providing more advanced representations at each layer, a property that is critically needed to perform segment detection. Our method produces state-of-the-art results on 5 widely tested salient object detection benchmarks, with advantages in terms of efficiency (0.08 seconds per image), effectiveness, and simplicity over the existing algorithms. Beyond that, we conduct an exhaustive analysis of the role of training data on performance. We provide a training set for future research and fair comparisons.

1,041 citations
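The short-connection idea can be sketched in a few lines of PyTorch: each backbone stage produces a side output, and deeper side outputs are upsampled and fed into shallower ones before fusion. The toy three-stage backbone below is an assumption for illustration and is not the network used in the paper.

```python
# Minimal sketch of HED-style side outputs with short connections from
# deeper (coarser, more semantic) predictions to shallower ones.
import torch
import torch.nn as nn
import torch.nn.functional as F

class ShortConnectionSaliency(nn.Module):
    def __init__(self):
        super().__init__()
        self.stage1 = nn.Sequential(nn.Conv2d(3, 16, 3, padding=1), nn.ReLU())
        self.stage2 = nn.Sequential(nn.MaxPool2d(2), nn.Conv2d(16, 32, 3, padding=1), nn.ReLU())
        self.stage3 = nn.Sequential(nn.MaxPool2d(2), nn.Conv2d(32, 64, 3, padding=1), nn.ReLU())
        # One 1x1 "side output" per stage, predicting a single-channel saliency map.
        self.side1 = nn.Conv2d(16, 1, 1)
        self.side2 = nn.Conv2d(32, 1, 1)
        self.side3 = nn.Conv2d(64, 1, 1)
        self.fuse = nn.Conv2d(3, 1, 1)  # fuses the three resized side outputs

    def forward(self, x):
        f1 = self.stage1(x)
        f2 = self.stage2(f1)
        f3 = self.stage3(f2)
        size = x.shape[2:]
        s3 = self.side3(f3)
        # Short connections: deeper predictions are upsampled and added to
        # shallower side outputs before those make their own prediction.
        s2 = self.side2(f2) + F.interpolate(s3, size=f2.shape[2:], mode="bilinear", align_corners=False)
        s1 = self.side1(f1) + F.interpolate(s2, size=f1.shape[2:], mode="bilinear", align_corners=False)
        sides = [F.interpolate(s, size=size, mode="bilinear", align_corners=False) for s in (s1, s2, s3)]
        fused = self.fuse(torch.cat(sides, dim=1))
        # Deep supervision: a loss would normally be applied to every output.
        return [torch.sigmoid(o) for o in sides + [fused]]

if __name__ == "__main__":
    model = ShortConnectionSaliency()
    maps = model(torch.randn(1, 3, 64, 64))
    print([m.shape for m in maps])  # four single-channel 64x64 saliency maps
```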

Proceedings ArticleDOI
01 Nov 2011
TL;DR: AFLW provides a large-scale collection of images gathered from Flickr, exhibiting a large variety in face appearance as well as general imaging and environmental conditions, and is well suited to train and test algorithms for multi-view face detection, facial landmark localization and face pose estimation.
Abstract: Face alignment is a crucial step in face recognition tasks. In particular, using landmark localization for geometric face normalization has been shown to be very effective, clearly improving the recognition results. However, no adequate databases exist that provide a sufficient number of annotated facial landmarks. The databases are either limited to frontal views, provide only a small number of annotated images, or have been acquired under controlled conditions. Hence, we introduce a novel database overcoming these limitations: Annotated Facial Landmarks in the Wild (AFLW). AFLW provides a large-scale collection of images gathered from Flickr, exhibiting a large variety in face appearance (e.g., pose, expression, ethnicity, age, gender) as well as general imaging and environmental conditions. In total, 25,993 faces in 21,997 real-world images are annotated with up to 21 landmarks per image. Due to the comprehensive set of annotations, AFLW is well suited to train and test algorithms for multi-view face detection, facial landmark localization and face pose estimation. Further, we offer a rich set of tools that ease the integration of other face databases and associated annotations into our joint framework.

1,033 citations
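As a small illustration of the landmark-localization evaluation such a dataset enables, the sketch below computes a mean landmark error normalized by face size, skipping landmarks that are not annotated (AFLW annotates up to 21 per face). The array layout is a hypothetical convention for this example, not the AFLW distribution format.

```python
# Generic landmark-localization evaluation sketch: normalized mean error
# over the annotated landmarks of one face.
import numpy as np

def normalized_landmark_error(pred, gt, box_size, visible):
    """Mean Euclidean error over visible landmarks, divided by face size.

    pred, gt : (21, 2) arrays of (x, y) landmark coordinates
    box_size : scalar face size used for normalization (e.g. sqrt(w * h))
    visible  : (21,) boolean mask, True where a ground-truth landmark exists
    """
    pred = np.asarray(pred, dtype=float)
    gt = np.asarray(gt, dtype=float)
    errs = np.linalg.norm(pred[visible] - gt[visible], axis=1)
    return float(errs.mean() / box_size)

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    gt = rng.uniform(0, 100, size=(21, 2))
    visible = rng.random(21) > 0.2              # some landmarks unannotated
    pred = gt + rng.normal(scale=2.0, size=(21, 2))
    print("NME:", normalized_landmark_error(pred, gt, box_size=100.0, visible=visible))
```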


Network Information
Related Topics (5)
Feature extraction - 111.8K papers, 2.1M citations - 93% related
Feature (computer vision) - 128.2K papers, 1.7M citations - 89% related
Image segmentation - 79.6K papers, 1.8M citations - 89% related
Convolutional neural network - 74.7K papers, 2M citations - 87% related
Support vector machine - 73.6K papers, 1.7M citations - 87% related
Performance
Metrics
No. of papers in the topic in previous years
Year    Papers
2023    42
2022    137
2021    5
2020    2
2019    4
2018    23