Topic
Orientation (computer vision)
About: Orientation (computer vision) is a research topic. Over its lifetime, 17196 publications have been published within this topic, receiving 358181 citations.
Papers
••
TL;DR: This work presents a new method for synthesizing novel views of a 3D scene from two or three reference images in full correspondence through the use and manipulation of an algebraic entity, termed the "trilinear tensor", that links point correspondences across three images.
Abstract: Presents a new method for synthesizing novel views of a 3D scene from two or three reference images in full correspondence. The core of this work is the use and manipulation of an algebraic entity, termed the "trilinear tensor", that links point correspondences across three images. For a given virtual camera position and orientation, a new trilinear tensor can be computed based on the original tensor of the reference images. The desired view can then be created using this new trilinear tensor and point correspondences across two of the reference images.
181 citations
••
19 May 1992 TL;DR: A novel method to measure the differential invariants of the image velocity field robustly by computing average values from the integral of normal image velocities around image contours, equivalent to measuring the temporal changes in the area of a closed contour.
Abstract: This paper describes a novel method to measure the differential invariants of the image velocity field robustly by computing average values from the integral of normal image velocities around image contours. This is equivalent to measuring the temporal changes in the area of a closed contour. It avoids having to recover a dense image velocity field and take partial derivatives. It also does not require point or line correspondences. Moreover, integration provides some immunity to image measurement noise.
181 citations
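The equivalence stated in the abstract above (integral of normal velocities around a contour versus the rate of change of the enclosed area) follows from the divergence theorem: dA/dt equals the contour integral of the normal velocity, which equals the area integral of the divergence. The average divergence inside the contour is therefore roughly (1/A) dA/dt. The following is a minimal sketch of that one relation, not the paper's implementation; the function names, the polygon representation of the contour, and the finite-difference approximation are assumptions made for illustration.

```python
def polygon_area(points):
    """Signed area of a closed polygon (list of (x, y)) via the shoelace formula."""
    total = 0.0
    n = len(points)
    for i in range(n):
        x0, y0 = points[i]
        x1, y1 = points[(i + 1) % n]
        total += x0 * y1 - x1 * y0
    return 0.5 * total


def mean_divergence(contour_t0, contour_t1, dt):
    """Estimate the average divergence of the image velocity field inside a
    closed contour from the change in its area between two frames.

    Hypothetical helper: uses the finite difference (A1 - A0) / (A0 * dt)
    as an approximation of (1/A) dA/dt.
    """
    a0 = abs(polygon_area(contour_t0))
    a1 = abs(polygon_area(contour_t1))
    return (a1 - a0) / (a0 * dt)


# Usage: a square contour tracked over one frame interval while the image
# expands uniformly with velocity (k*x, k*y), whose divergence is 2*k.
k, dt = 0.05, 0.1
square = [(-1.0, -1.0), (1.0, -1.0), (1.0, 1.0), (-1.0, 1.0)]
moved = [(x * (1 + k * dt), y * (1 + k * dt)) for x, y in square]
estimate = mean_divergence(square, moved, dt)  # close to 2*k for small dt
```

Only the contour is tracked here: no dense velocity field and no point correspondences along the contour are needed, since the area depends only on the contour's shape.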
••
23 Mar 2017 TL;DR: An encoder-decoder convolutional neural network architecture for estimating camera pose (orientation and location) from a single RGB image, with clear improvement over the previous state of the art even when compared to methods that utilize a sequence of test frames instead of a single frame.
Abstract: In this paper, we propose an encoder-decoder convolutional neural network (CNN) architecture for estimating camera pose (orientation and location) from a single RGB image. The architecture has an hourglass shape consisting of a chain of convolution and up-convolution layers followed by a regression part. The up-convolution layers are introduced to preserve the fine-grained information of the input image. Following common practice, we train our model in an end-to-end manner utilizing transfer learning from large-scale classification data. The experiments demonstrate the performance of the approach on data exhibiting different lighting conditions, reflections, and motion blur. The results indicate a clear improvement over the previous state of the art even when compared to methods that utilize a sequence of test frames instead of a single frame.
181 citations
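When a network regresses camera orientation, as in the entry above, the prediction has to be compared against ground truth with some angular measure. The sketch below shows one common choice, the rotation angle between two unit quaternions. It assumes a quaternion representation in (w, x, y, z) order, which the abstract does not state; the function names are hypothetical and this is not the paper's evaluation code.

```python
import math


def normalize(q):
    """Scale a quaternion (w, x, y, z) to unit length."""
    norm = math.sqrt(sum(c * c for c in q))
    return tuple(c / norm for c in q)


def orientation_error_deg(q_pred, q_true):
    """Angle in degrees of the rotation taking q_pred to q_true.

    Uses theta = 2 * acos(|<q_pred, q_true>|). The absolute value accounts
    for q and -q representing the same rotation.
    """
    q_pred = normalize(q_pred)
    q_true = normalize(q_true)
    dot = abs(sum(a * b for a, b in zip(q_pred, q_true)))
    dot = min(1.0, dot)  # guard against rounding slightly above 1
    return math.degrees(2.0 * math.acos(dot))


# Usage: identity versus a 90-degree rotation about the z axis.
identity = (1.0, 0.0, 0.0, 0.0)
quarter_turn_z = (math.cos(math.pi / 4), 0.0, 0.0, math.sin(math.pi / 4))
error = orientation_error_deg(identity, quarter_turn_z)
```

Normalizing inside the function means a raw, unnormalized network output can be passed in directly.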
••
TL;DR: An active basis model, a shared sketch algorithm, and a computational architecture of sum-max maps for representing, learning, and recognizing deformable templates are proposed.
Abstract: This article proposes an active basis model, a shared sketch algorithm, and a computational architecture of sum-max maps for representing, learning, and recognizing deformable templates. In our generative model, a deformable template is in the form of an active basis, which consists of a small number of Gabor wavelet elements at selected locations and orientations. These elements are allowed to slightly perturb their locations and orientations before they are linearly combined to generate the observed image. The active basis model, in particular, the locations and the orientations of the basis elements, can be learned from training images by the shared sketch algorithm. The algorithm selects the elements of the active basis sequentially from a dictionary of Gabor wavelets. When an element is selected at each step, the element is shared by all the training images, and the element is perturbed to encode or sketch a nearby edge segment in each training image. The recognition of the deformable template from an image can be accomplished by a computational architecture that alternates the sum maps and the max maps. The computation of the max maps deforms the active basis to match the image data, and the computation of the sum maps scores the template matching by the log-likelihood of the deformed active basis.
181 citations
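The alternation of sum maps and max maps described in the abstract above can be illustrated with a small sketch. It assumes the first-level sum map is already available as filter responses indexed by orientation, row, and column. The max step takes a local maximum over small shifts in location and orientation, which is what lets each basis element deform. The template score then adds the max-map values at the element positions. The square search window, the unweighted sum, and all names are simplifying assumptions, not the article's exact formulation.

```python
def max_map(sum_map, shift, orient_shift):
    """Local maximum of sum_map[o][y][x] over |dy|, |dx| <= shift and
    |do| <= orient_shift, with orientation indices wrapping around."""
    n_o, h, w = len(sum_map), len(sum_map[0]), len(sum_map[0][0])
    out = [[[0.0] * w for _ in range(h)] for _ in range(n_o)]
    for o in range(n_o):
        for y in range(h):
            for x in range(w):
                best = float("-inf")
                for do in range(-orient_shift, orient_shift + 1):
                    oo = (o + do) % n_o
                    for yy in range(max(0, y - shift), min(h, y + shift + 1)):
                        for xx in range(max(0, x - shift), min(w, x + shift + 1)):
                            best = max(best, sum_map[oo][yy][xx])
                out[o][y][x] = best
    return out


def template_score(max1, elements):
    """Unweighted template score: sum of max-map values at the
    (orientation, row, column) of each basis element (an assumption;
    a log-likelihood score would weight each term)."""
    return sum(max1[o][y][x] for o, y, x in elements)


# Usage: two orientations on a 3x3 grid with one strong response.
responses = [[[0.0] * 3 for _ in range(3)] for _ in range(2)]
responses[0][1][1] = 5.0
max1 = max_map(responses, shift=1, orient_shift=0)
score = template_score(max1, [(0, 0, 0), (1, 2, 2)])
```

In this toy case the element placed at the corner still picks up the response one pixel away, while the element at the other orientation does not, because orientation perturbation is disabled.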
••
07 Oct 2012 TL;DR: This work targets mountainous terrain and uses digital elevation models to extract representations for fast visual database lookup, and validates the system on the scale of a whole country (Switzerland, 40 000 km²) using a new dataset of more than 200 landscape query pictures with ground truth.
Abstract: Given a picture taken somewhere in the world, automatic geo-localization of that image is a task that would be extremely useful, e.g., for historical and forensic sciences, documentation purposes, organization of the world's photo material, and also intelligence applications. While tremendous progress has been made over the last years in visual location recognition within a single city, localization in natural environments is much more difficult, since vegetation, illumination, and seasonal changes make appearance-only approaches impractical. In this work, we target mountainous terrain and use digital elevation models to extract representations for fast visual database lookup. We propose an automated approach for very large scale visual localization that can efficiently exploit visual information (contours) and geometric constraints (consistent orientation) at the same time. We validate the system on the scale of a whole country (Switzerland, 40 000 km²) using a new dataset of more than 200 landscape query pictures with ground truth.
181 citations