
Orientation (computer vision)

About: Orientation (computer vision) is a research topic. Over its lifetime, 17,196 publications have been published on this topic, receiving 358,181 citations.


Papers
Proceedings ArticleDOI
20 Sep 1999
TL;DR: The paper provides an operational definition of textons, the putative elementary units of texture perception, and an algorithm for partitioning an image into disjoint regions of coherent brightness and texture, with region boundaries defined by peaks in contour orientation energy and differences in texton densities across the contour.
Abstract: The paper makes two contributions: it provides (1) an operational definition of textons, the putative elementary units of texture perception, and (2) an algorithm for partitioning the image into disjoint regions of coherent brightness and texture, where boundaries of regions are defined by peaks in contour orientation energy and differences in texton densities across the contour. B. Julesz (1981) introduced the term texton, analogous to a phoneme in speech recognition, but did not provide an operational definition for gray-level images. We re-invent textons as frequently co-occurring combinations of oriented linear filter outputs. These can be learned using a K-means approach. By mapping each pixel to its nearest texton, the image can be analyzed into texton channels, each of which is a point set where discrete techniques such as Voronoi diagrams become applicable. Local histograms of texton frequencies can be used with a χ² test for significant differences to find texture boundaries. Natural images contain both textured and untextured regions, so we combine this cue with that of the presence of peaks of contour energy derived from outputs of odd- and even-symmetric oriented Gaussian derivative filters. Each of these cues has a domain of applicability, so to facilitate cue combination we introduce a gating operator based on a statistical test for isotropy of Delaunay neighbors. Having obtained a local measure of how likely two nearby pixels are to belong to the same region, we use the spectral graph theoretic framework of normalized cuts to find partitions of the image into regions of coherent texture and brightness. Experimental results on a wide range of images are shown.

342 citations
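The texton pipeline in the abstract above is concrete enough to sketch. The following is a minimal, illustrative Python version of its core steps, not the authors' implementation: pixels are described by oriented linear filter responses, clustered into textons with K-means, and local texton histograms are compared with a χ² distance. The filter bank here uses only odd-symmetric directional Gaussian derivatives at assumed scales and orientations; the even-symmetric filters, the gating operator, and the normalized-cuts partitioning are omitted.

```python
import numpy as np
from scipy.ndimage import gaussian_filter
from sklearn.cluster import KMeans

def oriented_filter_bank(image, sigmas=(1.0, 2.0), n_orient=6):
    """Stack oriented odd-symmetric responses (directional Gaussian derivatives)."""
    responses = []
    for sigma in sigmas:
        gy, gx = np.gradient(gaussian_filter(image, sigma))
        for k in range(n_orient):
            theta = np.pi * k / n_orient
            # Directional derivative along orientation theta
            responses.append(np.cos(theta) * gx + np.sin(theta) * gy)
    return np.stack(responses, axis=-1)  # H x W x (len(sigmas) * n_orient)

def texton_map(image, k=16):
    """Label every pixel with its nearest texton (K-means cluster centre)."""
    feats = oriented_filter_bank(image)
    flat = feats.reshape(-1, feats.shape[-1])
    labels = KMeans(n_clusters=k, n_init=10).fit_predict(flat)
    return labels.reshape(image.shape)

def chi2_distance(h1, h2, eps=1e-10):
    """Chi-squared distance between two local texton histograms."""
    return 0.5 * np.sum((h1 - h2) ** 2 / (h1 + h2 + eps))
```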

Journal ArticleDOI
TL;DR: It is demonstrated that a small number of 2D linear statistical models are sufficient to capture the shape and appearance of a face from a wide range of viewpoints, and that these models can be used to predict new views of a face seen from one view and to constrain search algorithms which seek to locate a face in multiple views simultaneously.

340 citations
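Only the TL;DR survives for this entry, so the sketch below shows the generic form of a 2D linear statistical shape model of the kind it describes: PCA over flattened landmark coordinates, with new shapes synthesized as the mean plus a weighted sum of modes. The function names, data layout, and 95% variance threshold are illustrative assumptions, not details from the paper.

```python
import numpy as np

def fit_shape_model(shapes, var_kept=0.95):
    """shapes: N x 2L array, each row the flattened (x, y) landmarks of one face."""
    mean = shapes.mean(axis=0)
    U, s, Vt = np.linalg.svd(shapes - mean, full_matrices=False)
    var = s ** 2 / (len(shapes) - 1)  # PCA eigenvalues (per-mode variance)
    n_modes = int(np.searchsorted(np.cumsum(var) / var.sum(), var_kept)) + 1
    return mean, Vt[:n_modes], var[:n_modes]

def synthesize(mean, modes, b):
    """Generate a new shape as the mean plus a weighted sum of modes."""
    return mean + b @ modes
```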

Book ChapterDOI
Yi Li, Gu Wang, Xiangyang Ji, Yu Xiang, Dieter Fox
01 Mar 2020
TL;DR: A novel deep neural network for 6D pose matching named DeepIM is proposed that is able to iteratively refine the pose by matching the rendered image against the observed image.
Abstract: Estimating the 6D pose of objects from images is an important problem in various applications such as robot manipulation and virtual reality. While direct regression of images to object poses has limited accuracy, matching rendered images of an object against the input image can produce accurate results. In this work, we propose a novel deep neural network for 6D pose matching named DeepIM. Given an initial pose estimation, our network is able to iteratively refine the pose by matching the rendered image against the observed image. The network is trained to predict a relative pose transformation using an untangled representation of 3D location and 3D orientation and an iterative training process. Experiments on two commonly used benchmarks for 6D pose estimation demonstrate that DeepIM achieves large improvements over state-of-the-art methods. We furthermore show that DeepIM is able to match previously unseen objects.

340 citations
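The abstract describes an iterative render-and-compare loop, which the skeleton below illustrates. Here `render` and `pose_net` are placeholders for a renderer and the trained matching network, and the simple update rule is an assumption standing in for DeepIM's untangled transformation; this is a sketch of the control flow, not the paper's implementation.

```python
import numpy as np

def refine_pose(observed, R, t, render, pose_net, n_iters=4):
    """Iteratively refine a 6D pose estimate (rotation R, translation t):
    render the object at the current estimate, let the network compare the
    rendered and observed images, and apply the predicted relative pose."""
    for _ in range(n_iters):
        rendered = render(R, t)                # image at the current estimate
        dR, dt = pose_net(observed, rendered)  # predicted relative transform
        # "Untangled" update: orientation and location corrected separately
        R = dR @ R
        t = t + dt
    return R, t
```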

Dissertation
17 Jul 2006
TL;DR: This thesis introduces grids of locally normalised Histograms of Oriented Gradients (HOG) as descriptors for object detection in static images and proposes descriptors based on oriented histograms of differential optical flow to detect moving humans in videos.
Abstract: This thesis targets the detection of humans and other object classes in images and videos. Our focus is on developing robust feature extraction algorithms that encode image regions as high-dimensional feature vectors that support high-accuracy object/non-object decisions. To test our feature sets we adopt a relatively simple learning framework that uses linear Support Vector Machines to classify each possible image region as an object or as a non-object. The approach is data-driven and purely bottom-up, using low-level appearance and motion vectors to detect objects. As a test case we focus on person detection, as people are one of the most challenging object classes, with many applications, for example in film and video analysis, pedestrian detection for smart cars, and video surveillance. Nevertheless we do not make any strong class-specific assumptions, and the resulting object detection framework also gives state-of-the-art performance for many other classes, including cars, motorbikes, cows and sheep.

This thesis makes four main contributions. Firstly, we introduce grids of locally normalised Histograms of Oriented Gradients (HOG) as descriptors for object detection in static images. The HOG descriptors are computed over dense and overlapping grids of spatial blocks, with image gradient orientation features extracted at fixed resolution and gathered into a high-dimensional feature vector. They are designed to be robust to small changes in image contour locations and directions, and significant changes in image illumination and colour, while remaining highly discriminative for overall visual form. We show that unsmoothed gradients, fine orientation voting, moderately coarse spatial binning, strong normalisation and overlapping blocks are all needed for good performance (see the sketch after this entry).

Secondly, to detect moving humans in videos, we propose descriptors based on oriented histograms of differential optical flow. These are similar to static HOG descriptors, but instead of image gradients, they are based on local differentials of dense optical flow. They encode the noisy optical flow estimates into robust feature vectors in a manner that is robust to the overall camera motion. Several variants are proposed, some capturing motion boundaries while others encode the relative motions of adjacent image regions.

Thirdly, we propose a general method based on kernel density estimation for fusing multiple overlapping detections that takes into account the number of detections, their confidence scores and the scales of the detections.

Lastly, we present work in progress on a parts-based approach to person detection that first detects local body parts like heads, torsos and legs, and then fuses them to create a global person detector.

340 citations
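The HOG recipe in the first contribution maps directly onto scikit-image's implementation. The snippet below is a sketch using that library rather than the thesis's own code; the parameter values are the commonly cited defaults for this descriptor and are assumed here, not quoted from the thesis.

```python
import numpy as np
from skimage.feature import hog

window = np.random.rand(128, 64)   # stand-in for a 64x128 detection window
descriptor = hog(
    window,
    orientations=9,              # fine orientation voting
    pixels_per_cell=(8, 8),      # moderately coarse spatial binning
    cells_per_block=(2, 2),      # dense, overlapping blocks
    block_norm="L2-Hys",         # strong local normalisation
)
# The resulting high-dimensional vector (3780 features for this window
# size and these settings) is what a linear SVM would then classify.
```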

Journal ArticleDOI
TL;DR: A state-based technique for the representation and recognition of gesture is presented: a prototype trajectory is computed from an ensemble of training trajectories, configuration states are defined along the prototype, and gestures are recognized from an unsegmented, continuous stream of sensor data.
Abstract: A state-based technique for the representation and recognition of gesture is presented. We define a gesture to be a sequence of states in a measurement or configuration space. For a given gesture, these states are used to capture both the repeatability and variability evidenced in a training set of example trajectories. Using techniques for computing a prototype trajectory of an ensemble of trajectories, we develop methods for defining configuration states along the prototype and for recognizing gestures from an unsegmented, continuous stream of sensor data. The approach is illustrated by application to a range of gesture-related sensory data: the two-dimensional movements of a mouse input device, the movement of the hand measured by a magnetic spatial position and orientation sensor, and, lastly, the changing eigenvector projection coefficients computed from an image sequence.

339 citations
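The state-based idea in the abstract can be illustrated with a toy sketch: treat states as regions centred on points along a prototype trajectory, and report a gesture when an unsegmented stream of samples visits those regions in order. The evenly spaced state placement and fixed acceptance radius below are simplifying assumptions; the paper derives state regions from the variability of the training ensemble.

```python
import numpy as np

def make_states(prototype, n_states=5):
    """Pick evenly spaced points along the prototype as state centres."""
    idx = np.linspace(0, len(prototype) - 1, n_states).astype(int)
    return prototype[idx]

def recognize(stream, states, radius=0.5):
    """Return True once the stream has passed through every state in order."""
    next_state = 0
    for sample in stream:
        if np.linalg.norm(sample - states[next_state]) < radius:
            next_state += 1
            if next_state == len(states):
                return True
    return False
```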


Network Information
Related Topics (5)
Segmentation: 63.2K papers, 1.2M citations (82% related)
Pixel: 136.5K papers, 1.5M citations (79% related)
Image segmentation: 79.6K papers, 1.8M citations (78% related)
Image processing: 229.9K papers, 3.5M citations (77% related)
Feature (computer vision): 128.2K papers, 1.7M citations (76% related)
Performance Metrics
No. of papers in the topic in previous years

Year    Papers
2022    12
2021    535
2020    771
2019    830
2018    727
2017    691