
Showing papers on "Orientation (computer vision)" published in 2010


Journal ArticleDOI
TL;DR: In this paper, the authors present a system that takes as input an astronomical image, and returns as output the pointing, scale, and orientation of that image (the astrometric calibration or World Coordinate System information).
Abstract: We have built a reliable and robust system that takes as input an astronomical image, and returns as output the pointing, scale, and orientation of that image (the astrometric calibration or World Coordinate System information). The system requires no first guess, and works with the information in the image pixels alone; that is, the problem is a generalization of the "lost in space" problem in which nothing—not even the image scale—is known. After robust source detection is performed in the input image, asterisms (sets of four or five stars) are geometrically hashed and compared to pre-indexed hashes to generate hypotheses about the astrometric calibration. A hypothesis is only accepted as true if it passes a Bayesian decision theory test against a null hypothesis. With indices built from the USNO-B catalog and designed for uniformity of coverage and redundancy, the success rate is >99.9% for contemporary near-ultraviolet and visual imaging survey data, with no false positives. The failure rate is consistent with the incompleteness of the USNO-B catalog; augmentation with indices built from the Two Micron All Sky Survey catalog brings the completeness to 100% with no false positives. We are using this system to generate consistent and standards-compliant meta-data for digital and digitized imaging from plate repositories, automated observatories, individual scientific investigators, and hobbyists. This is the first step in a program of making it possible to trust calibration meta-data for astronomical data of arbitrary provenance.
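
As a rough illustration of the hashing step described above, here is a minimal numpy sketch of a four-star geometric hash (the function name and interface are illustrative; the production system also breaks the A/B labelling symmetry and supports five-star asterisms):

```python
import numpy as np

def asterism_hash(stars):
    """Geometric hash of a 4-star asterism: the two most widely separated
    stars A, B define a local frame in which the remaining stars C, D get
    coordinates (xC, yC, xD, yD) -- a code invariant to translation,
    rotation, and scale of the image."""
    z = np.array([complex(x, y) for x, y in stars])
    # A, B = the most widely separated pair of stars.
    i, j = max(((a, b) for a in range(4) for b in range(a + 1, 4)),
               key=lambda p: abs(z[p[0]] - z[p[1]]))
    c, d = (z[k] for k in range(4) if k not in (i, j))
    # Similarity transform sending A -> (0, 0) and B -> (1, 1).
    w = sorted(((p - z[i]) / (z[j] - z[i]) * (1 + 1j) for p in (c, d)),
               key=lambda p: p.real)      # fix the C/D labelling ambiguity
    return np.array([w[0].real, w[0].imag, w[1].real, w[1].imag])
```

Calibration hypotheses then come from nearest-neighbour lookups of this 4-vector against the pre-indexed catalog hashes, each hypothesis being accepted or rejected by the Bayesian verification test.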

848 citations


Journal ArticleDOI
TL;DR: This paper presents a new biometric authentication system using finger-knuckle-print (FKP) imaging, which achieves a much higher recognition rate, works in real time, and has great potential for commercial applications.

345 citations


Journal ArticleDOI
TL;DR: The experimental results show that the orientation errors using the proposed method are significantly reduced compared to the orientation errors obtained from an extended Kalman filter (EKF) approach, and the improved orientation estimation leads to better position estimation accuracy.
Abstract: This paper presents a novel methodology that estimates position and orientation using one position sensor and one inertial measurement unit. The proposed method estimates orientation using a particle filter and estimates position and velocity using a Kalman filter (KF). In addition, an expert system is used to correct the angular velocity measurement errors. The experimental results show that the orientation errors using the proposed method are significantly reduced compared to the orientation errors obtained from an extended Kalman filter (EKF) approach. The improved orientation estimation using the proposed method leads to better position estimation accuracy. This paper studies the effects of the number of particles of the proposed filter and position sensor noise on the orientation accuracy. Furthermore, the experimental results show that the orientation of the proposed method converges to the correct orientation even when the initial orientation is completely unknown.
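
To make the orientation filter concrete, the sketch below shows one predict/update/resample cycle of a generic particle filter, reduced to a single orientation angle for brevity (the paper estimates full 3-D orientation and couples the filter with a Kalman filter and an expert system; `meas_fn` stands in for whatever orientation-dependent measurement is available):

```python
import numpy as np

rng = np.random.default_rng(0)

def particle_filter_step(theta, weights, gyro_rate, dt, z_meas, meas_fn,
                         gyro_noise=0.01, meas_noise=0.05):
    """One predict/update/resample cycle for a single orientation angle.
    theta, weights: arrays of particle states and normalised weights."""
    # Predict: propagate each particle with the gyro rate plus noise,
    # which models the angular-velocity measurement errors.
    theta = theta + (gyro_rate + rng.normal(0, gyro_noise, theta.size)) * dt
    # Update: weight particles by the likelihood of the measurement.
    residual = z_meas - meas_fn(theta)
    weights = weights * np.exp(-0.5 * (residual / meas_noise) ** 2)
    weights = weights / weights.sum()
    # Resample when the effective sample size collapses.
    if 1.0 / np.sum(weights ** 2) < theta.size / 2:
        idx = rng.choice(theta.size, theta.size, p=weights)
        theta, weights = theta[idx], np.full(theta.size, 1.0 / theta.size)
    return theta, weights
```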

220 citations


Journal ArticleDOI
TL;DR: In this article, a general imaging methodology, termed multi-mode total focusing method, is proposed in which any combination of modes and reflections can be used to produce an image of the test structure.
Abstract: Ultrasonic arrays allow a given scatterer to be illuminated from a wide range of angles and hence are capable of extracting significant information about the scatterer. In this paper a general imaging methodology, termed multi-mode total focusing method, is proposed in which any combination of modes and reflections can be used to produce an image of the test structure. Like the total focusing method, this approach is implemented by post-processing the full matrix of array data to achieve a synthetic focus at every pixel in the image. A hybrid model is used to predict the array data and demonstrate the performance of the multi-mode imaging concept. This hybrid model combines far field scattering coefficient matrices with a ray-based wave propagation model. This allows the inclusion of longitudinal waves, shear waves and wave mode conversions. It is shown that, with prior knowledge of likely scatterer location and orientation, the mode combination and array location can be optimised to maximise the performance of array inspections. A practically relevant weld inspection application is then described and its optimisation is discussed.
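
For reference, the single-mode total focusing method that this work generalises is a delay-and-sum over the full matrix of array data. A minimal sketch, assuming a linear array at z = 0 and a single wave speed (the multi-mode method replaces the straight-ray time-of-flight below with mode- and reflection-specific ray paths):

```python
import numpy as np

def tfm_image(fmc, tx_pos, rx_pos, xs, zs, c, fs):
    """Delay-and-sum TFM: focus the full matrix of array data
    fmc[tx, rx, t] at every pixel of the (xs, zs) image grid.
    c: wave speed, fs: sampling frequency."""
    img = np.zeros((len(zs), len(xs)))
    X, Z = np.meshgrid(xs, zs)               # pixel grid
    for t, xt in enumerate(tx_pos):
        d_tx = np.hypot(X - xt, Z)           # transmit path length
        for r, xr in enumerate(rx_pos):
            d_rx = np.hypot(X - xr, Z)       # receive path length
            idx = np.rint((d_tx + d_rx) / c * fs).astype(int)
            idx = np.clip(idx, 0, fmc.shape[2] - 1)
            img += fmc[t, r, idx]            # synthetic focus at each pixel
    return np.abs(img)
```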

208 citations


Journal ArticleDOI
TL;DR: For most cortical locations there was a source orientation to which MEG was insensitive, and the difference in the sensitivity is expected to contribute to systematic differences in the signal-to-noise ratio between MEG and EEG.
Abstract: An important difference between magnetoencephalography (MEG) and electroencephalography (EEG) is that MEG is insensitive to radially oriented sources. We quantified computationally the dependency of MEG and EEG on the source orientation using a forward model with realistic tissue boundaries. Similar to the simpler case of a spherical head model, in which MEG cannot see radial sources at all, for most cortical locations there was a source orientation to which MEG was insensitive. The median value for the ratio of the signal magnitude for the source orientation of the lowest and the highest sensitivity was 0.06 for MEG and 0.63 for EEG. The difference in the sensitivity to the source orientation is expected to contribute to systematic differences in the signal-to-noise ratio between MEG and EEG.
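
The reported sensitivity ratio can be computed directly from a forward-model lead field; a short sketch, assuming `leadfield` is the (n_sensors, 3) gain matrix for one source location:

```python
import numpy as np

def orientation_sensitivity_ratio(leadfield):
    """Ratio of the signal magnitude for the least- and most-sensitive
    source orientations at one cortical location (the quantity whose
    median the paper reports as 0.06 for MEG and 0.63 for EEG). Over
    unit dipole orientations q, ||leadfield @ q|| ranges between the
    smallest and largest singular values of the lead field."""
    s = np.linalg.svd(leadfield, compute_uv=False)
    return s[-1] / s[0]
```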

204 citations


Journal ArticleDOI
TL;DR: High-resolution imaging results demonstrate a reliable millimeters-scale orientation signal, likely emerging from irregular spatial arrangements of orientation columns and their supporting vasculature, and fMRI pattern analysis methods are thus likely to be sensitive to signals originating from other irregular columnar structures elsewhere in the brain.
Abstract: Although orientation columns are less than a millimeter in width, recent neuroimaging studies indicate that viewed orientations can be decoded from cortical activity patterns sampled at relatively coarse resolutions of several millimeters. One proposal is that these differential signals arise from random spatial irregularities in the columnar map. However, direct support for this hypothesis has yet to be obtained. Here, we used high-field, high-resolution functional magnetic resonance imaging (fMRI) and multivariate pattern analysis to determine the spatial scales at which orientation-selective information can be found in the primary visual cortex (V1) of cats and humans. We applied a multiscale pattern analysis approach in which fine- and coarse-scale signals were first removed by ideal spatial lowpass and highpass filters, and the residual activity patterns then analyzed by linear classifiers. Cat visual cortex, imaged at 0.3125 mm resolution, showed a strong orientation signal at the scale of individual columns. Nonetheless, reliable orientation bias could still be found at spatial scales of several millimeters. In the human visual cortex, imaged at 1 mm resolution, a majority of orientation information was found on scales of millimeters, with small contributions from global spatial biases exceeding ∼1 cm. Our high-resolution imaging results demonstrate a reliable millimeters-scale orientation signal, likely emerging from irregular spatial arrangements of orientation columns and their supporting vasculature. fMRI pattern analysis methods are thus likely to be sensitive to signals originating from other irregular columnar structures elsewhere in the brain.
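
A minimal sketch of the ideal spatial filtering step (an assumed FFT-based implementation; the actual preprocessing and classifier-training pipeline is more involved):

```python
import numpy as np

def ideal_bandpass(pattern, px_mm, cut_coarse_mm=None, cut_fine_mm=None):
    """Ideal spatial filtering of a 2-D activity pattern: remove structure
    coarser than cut_coarse_mm (highpass) and/or finer than cut_fine_mm
    (lowpass) before feeding the residual pattern to a linear classifier.
    px_mm is the pixel size in mm; cutoffs are spatial periods in mm."""
    f = np.fft.fftshift(np.fft.fft2(pattern))
    fy = np.fft.fftshift(np.fft.fftfreq(pattern.shape[0], d=px_mm))
    fx = np.fft.fftshift(np.fft.fftfreq(pattern.shape[1], d=px_mm))
    r = np.hypot(*np.meshgrid(fy, fx, indexing="ij"))  # cycles/mm
    keep = np.ones_like(r, dtype=bool)
    if cut_coarse_mm is not None:    # highpass: drop scales coarser than this
        keep &= r >= 1.0 / cut_coarse_mm
    if cut_fine_mm is not None:      # lowpass: drop scales finer than this
        keep &= r <= 1.0 / cut_fine_mm
    return np.real(np.fft.ifft2(np.fft.ifftshift(f * keep)))
```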

200 citations


Journal ArticleDOI
TL;DR: An active basis model, a shared sketch algorithm, and a computational architecture of sum-max maps for representing, learning, and recognizing deformable templates are proposed.
Abstract: This article proposes an active basis model, a shared sketch algorithm, and a computational architecture of sum-max maps for representing, learning, and recognizing deformable templates. In our generative model, a deformable template is in the form of an active basis, which consists of a small number of Gabor wavelet elements at selected locations and orientations. These elements are allowed to slightly perturb their locations and orientations before they are linearly combined to generate the observed image. The active basis model, in particular, the locations and the orientations of the basis elements, can be learned from training images by the shared sketch algorithm. The algorithm selects the elements of the active basis sequentially from a dictionary of Gabor wavelets. When an element is selected at each step, the element is shared by all the training images, and the element is perturbed to encode or sketch a nearby edge segment in each training image. The recognition of the deformable template from an image can be accomplished by a computational architecture that alternates the sum maps and the max maps. The computation of the max maps deforms the active basis to match the image data, and the computation of the sum maps scores the template matching by the log-likelihood of the deformed active basis.
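
The sum-max recognition architecture is compact enough to sketch. Below, the SUM1 maps (Gabor responses per orientation bin) are assumed precomputed, and the log-likelihood weighting of elements in the true SUM2 score is simplified to unit weights:

```python
import numpy as np
from scipy.ndimage import maximum_filter

def sum2_score(sum1, template, dx=3, dorient=1):
    """SUM2 template score in the sum-max architecture. sum1 is an array
    of shape (n_orient, H, W) of Gabor response maps; template lists the
    active-basis elements as (row, col, orient_bin). MAX1 lets each
    element perturb its location by +/- dx pixels and its orientation by
    +/- dorient bins; SUM2 adds the perturbed responses."""
    # MAX1: local maximum over orientation (cyclic axis) and spatial shifts.
    max1 = maximum_filter(sum1,
                          size=(2 * dorient + 1, 2 * dx + 1, 2 * dx + 1),
                          mode=("wrap", "nearest", "nearest"))
    # SUM2: sum of MAX1 responses at the template's element positions.
    return sum(max1[o, r, c] for r, c, o in template)
```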

181 citations


Proceedings ArticleDOI
13 Jun 2010
TL;DR: This work presents a passive computer vision method that exploits existing mapping and navigation databases in order to automatically create 3D building models and defines a grammar for representing changes in building geometry that approximately follow the Manhattan-world assumption.
Abstract: We present a passive computer vision method that exploits existing mapping and navigation databases in order to automatically create 3D building models. Our method defines a grammar for representing changes in building geometry that approximately follow the Manhattan-world assumption which states there is a predominance of three mutually orthogonal directions in the scene. By using multiple calibrated aerial images, we extend previous Manhattan-world methods to robustly produce a single, coherent, complete geometric model of a building with partial textures. Our method uses an optimization to discover a 3D building geometry that produces the same set of facade orientation changes observed in the captured images. We have applied our method to several real-world buildings and have analyzed our approach using synthetic buildings.

145 citations


Journal ArticleDOI
TL;DR: This work has derived methods for selecting features that emphasize the most significant spectral/spatial differences between the various classes in a scene, and demonstrates the performance of the 3-D Gabor features for the classification of regions in Airborne Visible/Infrared Imaging Spectrometer hyperspectral data.
Abstract: A 3-D spectral/spatial discrete Fourier transform can be used to represent a hyperspectral image region using a dense sampling in the frequency domain. In many cases, a more compact frequency-domain representation that preserves the 3-D structure of the data can be exploited. For this purpose, we have developed a new model for spectral/spatial information based on 3-D Gabor filters. These filters capture specific orientation, scale, and wavelength-dependent properties of hyperspectral image data and provide an efficient means of sampling a 3-D frequency-domain representation. Since 3-D Gabor filters allow for a large number of spectral/spatial features to be used to represent an image region, the performance and efficiency of algorithms that use this representation can be further improved if methods are available to reduce the size of the model. Thus, we have derived methods for selecting features that emphasize the most significant spectral/spatial differences between the various classes in a scene. We demonstrate the performance of the 3-D Gabor features for the classification of regions in Airborne Visible/Infrared Imaging Spectrometer hyperspectral data. The new features are compared against pure spectral features and multiband generalizations of gray-level co-occurrence matrix features.
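
A minimal sketch of one real-valued 3-D Gabor kernel of the kind described (the axis naming and spherical parameterisation here are assumptions, not the paper's exact formulation):

```python
import numpy as np

def gabor_3d(size, freq, theta, phi, sigma):
    """Real part of a 3-D spectral/spatial Gabor filter: a Gaussian
    envelope times a plane wave whose centre frequency freq points in
    the direction given by spherical angles (theta, phi). The angles set
    the spatial orientation / spectral tilt, and freq sets the scale and
    wavelength sensitivity the abstract refers to."""
    ax = np.arange(size) - size // 2
    x, y, b = np.meshgrid(ax, ax, ax, indexing="ij")  # b: spectral (band) axis
    u = np.sin(theta) * np.cos(phi), np.sin(theta) * np.sin(phi), np.cos(theta)
    phase = 2 * np.pi * freq * (u[0] * x + u[1] * y + u[2] * b)
    envelope = np.exp(-(x**2 + y**2 + b**2) / (2 * sigma**2))
    return envelope * np.cos(phase)
```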

141 citations


Journal ArticleDOI
TL;DR: This work presents a novel vision-based system for automatic detection and extraction of complex road networks from various sensor resources such as aerial photographs, satellite images, and LiDAR that merges the power of perceptual grouping theory and optimization techniques into a unified framework to address the challenging problems of geospatial feature detection and classification.
Abstract: In this work we present a novel vision-based system for automatic detection and extraction of complex road networks from various sensor resources such as aerial photographs, satellite images, and LiDAR. Uniquely, the proposed system is an integrated solution that merges the power of perceptual grouping theory (Gabor filtering, tensor voting) and optimized segmentation techniques (global optimization using graph-cuts) into a unified framework to address the challenging problems of geospatial feature detection and classification. Firstly, the local precision of the Gabor filters is combined with the global context of the tensor voting to produce accurate classification of the geospatial features. In addition, the tensorial representation used for the encoding of the data eliminates the need for any thresholds, therefore removing any data dependencies. Secondly, a novel orientation-based segmentation is presented which incorporates the classification of the perceptual grouping, and results in segmentations with better defined boundaries and continuous linear segments. Finally, a set of Gaussian-based filters are applied to automatically extract centerline information (magnitude, width and orientation). This information is then used for creating road segments and transforming them to their polygonal representations.

139 citations


Book ChapterDOI
05 Sep 2010
TL;DR: A novel minimal case solution to the calibrated relative pose problem using 3 point correspondences for the case of two known orientation angles is presented and it is shown that the new 3-point algorithm can cope with planes and even collinear points.
Abstract: In this paper we present a novel minimal case solution to the calibrated relative pose problem using 3 point correspondences for the case of two known orientation angles. This case is relevant when a camera is coupled with an inertial measurement unit (IMU), and it recently gained importance with the omnipresence of Smartphones (iPhone, Nokia N900) that are equipped with accelerometers to measure the gravity normal. Similar to the 5-point (6-point), 7-point, and 8-point algorithm for computing the essential matrix in the unconstrained case, we derive a 3-point, 4-point, and 5-point algorithm for the special case of two known orientation angles. We investigate degenerate conditions and show that the new 3-point algorithm can cope with planes and even collinear points. We will show a detailed analysis and comparison on synthetic data and present results on cell phone images. As an additional application we demonstrate the algorithm on relative pose estimation for a micro aerial vehicle's (MAV) camera-IMU system.

Patent
12 Feb 2010
TL;DR: In this paper, a method for determining the pose of a camera with respect to at least one object of a real environment for use in authoring/augmented reality applications that includes generating a first image by the camera capturing a real object of the real environment, and generating first orientation data from at least one orientation sensor associated with the camera or from an algorithm which analyses the first image for finding and determining features which are indicative of an orientation of the camera.

Abstract: Method for determining the pose of a camera with respect to at least one object of a real environment for use in authoring/augmented reality applications that includes generating a first image by the camera capturing a real object of a real environment, generating first orientation data from at least one orientation sensor associated with the camera or from an algorithm which analyses the first image for finding and determining features which are indicative of an orientation of the camera, allocating a distance of the camera to the real object, generating distance data indicative of the allocated distance, and determining the pose of the camera with respect to a coordinate system related to the real object of the real environment using the distance data and the first orientation data. The method may be performed with reduced processing requirements and/or higher processing speed in mobile devices, such as mobile phones, having a display, camera, and orientation sensor.

Proceedings ArticleDOI
12 Apr 2010
TL;DR: This paper studies the relationship between the region geometry and reachable set accuracy and proposes a method for constructing hybridization regions using tighter interpolation error bounds and presents some experimental results on a high-dimensional biological system to demonstrate the performance improvement.
Abstract: This paper is concerned with reachable set computation for non-linear systems using hybridization. The essence of hybridization is to approximate a non-linear vector field by a simpler (such as affine) vector field. This is done by partitioning the state space into small regions within each of which a simpler vector field is defined. This approach relies on the availability of methods for function approximation and for handling the resulting dynamical systems. Concerning function approximation using interpolation, the accuracy depends on the shapes and sizes of the regions, which can also compromise the speed of reachability computation, since it may generate spurious classes of trajectories. In this paper we study the relationship between the region geometry and reachable set accuracy and propose a method for constructing hybridization regions using tighter interpolation error bounds. In addition, our construction exploits the dynamics of the system to adapt the orientation of the regions, in order to achieve better time-efficiency. We also present some experimental results on a high-dimensional biological system, to demonstrate the performance improvement.

Patent
21 Jun 2010
TL;DR: In this article, a system and method provides maps identifying the 3D location of traffic lights, which can then be used to assist robotic vehicles or human drivers to identify the location and status of a traffic signal.
Abstract: A system and method provides maps identifying the 3D location of traffic lights. The position, location, and orientation of a traffic light may be automatically extrapolated from two or more images. The maps may then be used to assist robotic vehicles or human drivers to identify the location and status of a traffic signal.
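
Extrapolating a 3-D light position from two or more views is, at its core, a ray triangulation; a generic least-squares sketch (the interface below is illustrative, not from the patent):

```python
import numpy as np

def triangulate(origins, directions):
    """Least-squares 3-D point from two or more viewing rays. Each ray is
    a camera centre plus a unit direction toward the detected light; the
    returned point minimises the summed squared distance to all rays
    (requires at least two non-parallel rays)."""
    A = np.zeros((3, 3))
    b = np.zeros(3)
    for o, d in zip(origins, directions):
        d = np.asarray(d, float) / np.linalg.norm(d)
        P = np.eye(3) - np.outer(d, d)   # projector orthogonal to the ray
        A += P
        b += P @ np.asarray(o, float)
    return np.linalg.solve(A, b)
```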

Journal ArticleDOI
TL;DR: A novel x-ray imaging approach that yields information about the local texture of structures smaller than the image pixel resolution inside an object, using scattering from sub-micron structures in the sample is introduced.
Abstract: We introduce a novel x-ray imaging approach that yields information about the local texture of structures smaller than the image pixel resolution inside an object. The approach is based on a recently developed x-ray dark-field imaging technique, using scattering from sub-micron structures in the sample. We show that the method can be used to determine the local angle and degree of orientation of bone, and fibers in a leaf. As the method is based on the use of a conventional x-ray tube we believe that it can have a great impact on medical diagnostics and non-destructive testing applications.

Proceedings ArticleDOI
08 Mar 2010
TL;DR: An active vision system for the automatic detection of falls and the recognition of several postures for elderly homecare applications, using a wall-mounted Time-Of-Flight camera, with high performance in terms of efficiency and reliability on a large real dataset.
Abstract: The paper presents an active vision system for the automatic detection of falls and the recognition of several postures for elderly homecare applications. A wall-mounted Time-Of-Flight camera provides accurate measurements of the acquired scene in all illumination conditions, allowing the reliable detection of critical events. Preliminarily, an off-line calibration procedure estimates the external camera parameters automatically without landmarks, calibration patterns or user intervention. The calibration procedure searches for different planes in the scene selecting the one that accomplishes the floor plane constraints. Subsequently, the moving regions are detected in real-time by applying a Bayesian segmentation to the whole 3D points cloud. The distance of the 3D human centroid from the floor plane is evaluated by using the previously defined calibration parameters and the corresponding trend is used as feature in a thresholding-based clustering for fall detection. The fall detection shows high performances in terms of efficiency and reliability on a large real dataset in which almost one half of events are falls acquired in different conditions. The posture recognition is carried out by using both the 3D human centroid distance from the floor plane and the orientation of the body spine estimated by applying a topological approach to the range images. Experimental results on synthetic data validate the correctness of the proposed posture recognition approach.
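
The fall-detection feature reduces to a point-to-plane distance; a minimal sketch (the 0.4 m threshold in the comment is an illustrative value, not the paper's):

```python
import numpy as np

def centroid_floor_distance(points, plane):
    """Signed distance of the 3-D human centroid from the calibrated floor
    plane -- the feature whose trend the fall detector thresholds.
    plane: (a, b, c, d) with ax + by + cz + d = 0 from the automatic
    calibration; points: the segmented person's 3-D point cloud."""
    n, d = np.asarray(plane[:3], float), float(plane[3])
    centroid = np.asarray(points, float).mean(axis=0)
    return (n @ centroid + d) / np.linalg.norm(n)

# A fall is flagged when the distance drops below a threshold and stays
# there, e.g.: fall = all(dist < 0.4 for dist in last_second_of_distances)
```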

Patent
25 May 2010
TL;DR: In this paper, a system, method, and computer program product for controlling stereo glasses shutters is described, in which a right eye shutter of stereo glasses is controlled to switch between a closed orientation and an open orientation.
Abstract: A system, method, and computer program product are provided for controlling stereo glasses shutters. In use, a right eye shutter of stereo glasses is controlled to switch between a closed orientation and an open orientation. Further, a left eye shutter of the stereo glasses is controlled to switch between the closed orientation and the open orientation. To this end, the right eye shutter and the left eye shutter of the stereo glasses may be controlled such that the right eye shutter and the left eye shutter simultaneously remain in the closed orientation for a predetermined amount of time.

Journal ArticleDOI
TL;DR: This work presents a fast technique that requires less than a second to localize the Optic Disc, based upon obtaining two projections of certain image features that encode the x- and y-coordinates of the OD.

Abstract: Optic Disc (OD) localization is an important pre-processing step that significantly simplifies subsequent segmentation of the OD and other retinal structures. Current OD localization techniques suffer from impractically-high computation times (few minutes per image). In this work, we present a fast technique that requires less than a second to localize the OD. The technique is based upon obtaining two projections of certain image features that encode the x- and y-coordinates of the OD. The resulting 1-D projections are then searched to determine the location of the OD. This avoids searching the 2-D image space and, thus, enhances the speed of the OD localization process. Image features such as retinal vessels orientation and the OD brightness are used in the current method. Four publicly-available databases, including STARE and DRIVE, are used to evaluate the proposed technique. The OD was successfully located in 330 images out of 340 images (97%) with an average computation time of 0.65 s.
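
The projection-based search is simple to sketch (here on a generic 2-D feature map; the paper's features combine vessel orientation and OD brightness cues):

```python
import numpy as np

def locate_od(feature_map):
    """1-D projection search: collapse a 2-D feature image onto the x- and
    y-axes and take the peak of each projection as the optic disc
    coordinate, avoiding a search over the 2-D image space."""
    col_proj = feature_map.sum(axis=0)   # encodes the x-coordinate
    row_proj = feature_map.sum(axis=1)   # encodes the y-coordinate
    return int(np.argmax(col_proj)), int(np.argmax(row_proj))
```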

Book ChapterDOI
06 Jan 2010
TL;DR: The proposed bottom-up SVA model is based on multiple perceptual stimuli including depth information, luminance, color, orientation and motion contrast, and is able to efficiently simulate SVA of human eyes.
Abstract: Compared with traditional mono-view video, three-dimensional video (3DV) provides user interactive functionalities and stereoscopic perception, which makes people more interested in pop-out regions or the regions with small depth value. Thus, traditional visual attention model for mono-view video can hardly be directly applied to stereoscopic visual attention (SVA) analysis for 3DV. In this paper, we propose a bottom-up SVA model to simulate human visual system with stereoscopic vision more accurately. The proposed model is based on multiple perceptual stimuli including depth information, luminance, color, orientation and motion contrast. Then, a depth based dynamic fusion is proposed to integrate these features. The experimental results on multi-view video test sequences show that the proposed model maintains high robustness and is able to efficiently simulate SVA of human eyes.

Patent
31 Mar 2010
TL;DR: In this paper, an image reader can capture an image, identify a bar code or IBI form within the captured image, and display the image responsive to the an orientation of the bar code.
Abstract: Embodiments of an image reader and/or methods of operating an image reader can capture an image, identify a bar code or IBI form within the captured image, and, store or display the captured image responsive to the an orientation of the bar code.

Journal ArticleDOI
TL;DR: In this paper, the authors describe image processing methods to overcome these challenges and methods for the computation of size, location, contact points, and orientation of the aggregates in HMA.
Abstract: X-ray computed tomography (CT) is a novel tool to quantify the aggregate characteristics in asphalt pavements. This tool can potentially be used in QA, acceptance, design and forensic applications in pavement engineering. However, there have been challenges associated with the processing of the 3D X-ray CT images, including: (1) segmentation of aggregates that are in close proximity and (2) processing noisy or poor contrast images. This paper describes image processing methods to overcome these challenges and describes methods for computation of size, location, contact points and orientation of the aggregates in HMA. Validations of the algorithms as well as example computations of contact points and orientation have been presented. A significant increase in the number of contact points with increasing compaction level and preferred orientation perpendicular to the direction of compaction in the gyratory compactor were some of the findings presented in this paper.
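
One standard way to compute a segmented aggregate's orientation from its voxels is a principal-axes analysis (a sketch of a common approach, not necessarily the paper's exact formulation; the (z, y, x) coordinate order and vertical compaction axis are assumptions):

```python
import numpy as np

def aggregate_orientation(voxels):
    """Principal-axis orientation of a segmented aggregate: the eigenvector
    of the voxel-coordinate covariance with the largest eigenvalue is the
    particle's long axis. voxels: (n, 3) array of (z, y, x) coordinates,
    with z along the compaction direction."""
    X = np.asarray(voxels, float)
    X = X - X.mean(axis=0)
    w, V = np.linalg.eigh(np.cov(X.T))
    axis = V[:, np.argmax(w)]                  # unit vector, long axis
    theta = np.degrees(np.arccos(abs(axis[0])))  # angle from the vertical
    return axis, theta
```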

Journal ArticleDOI
TL;DR: This work adapted RANSAC, a generic robust estimation method, to fit a parametric model of a pair of lane lines to the image features, based on both ridgeness and ridge orientation, which it claims addresses detection reliability better.
Abstract: Detection of lane markings based on a camera sensor can be a low-cost solution to lane departure and curve-over-speed warnings. A number of methods and implementations have been reported in the literature. However, reliable detection is still an issue because of cast shadows, worn and occluded markings, and variable ambient lighting conditions, for example. We focus on increasing detection reliability in two ways. First, we employed an image feature other than the commonly used edges: ridges, which we claim addresses this problem better. Second, we adapted RANSAC, a generic robust estimation method, to fit a parametric model of a pair of lane lines to the image features, based on both ridgeness and ridge orientation. In addition, the model was fitted for the left and right lane lines simultaneously to enforce a consistent result. Four measures of interest for driver assistance applications were directly computed from the fitted parametric model at each frame: lane width, lane curvature, and vehicle yaw angle and lateral offset with regard to the lane medial axis. We qualitatively assessed our method in video sequences captured on several road types and under very different lighting conditions. We also quantitatively assessed it on synthetic but realistic video sequences for which road geometry and vehicle trajectory ground truth are known.
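
A stripped-down version of the RANSAC fitting loop (the paper fits the left and right lane lines simultaneously and scores candidates with ridge orientation as well as position; this sketch fits a single parabolic line from ridge-pixel positions only):

```python
import numpy as np

rng = np.random.default_rng(1)

def ransac_parabola(pts, n_iter=500, tol=2.0):
    """Robustly fit x = a*y^2 + b*y + c to (x, y) ridge pixels.
    pts: (n, 2) array of (x, y) image coordinates; tol: inlier threshold
    on the lateral error in pixels."""
    x, y = pts[:, 0], pts[:, 1]
    best_inl = np.zeros(len(pts), dtype=bool)
    for _ in range(n_iter):
        s = rng.choice(len(pts), 3, replace=False)    # minimal sample
        if len(np.unique(y[s])) < 3:                  # degenerate sample
            continue
        model = np.polyfit(y[s], x[s], 2)             # exact fit: a, b, c
        inl = np.abs(np.polyval(model, y) - x) < tol  # lateral error test
        if inl.sum() > best_inl.sum():
            best_inl = inl
    # Refit on the consensus set for the final estimate.
    return np.polyfit(y[best_inl], x[best_inl], 2)
```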

Journal ArticleDOI
TL;DR: The aim of the paper is to show the advantages of using an efficient model of the processing occurring at the retina level and in the V1 visual cortex in order to develop efficient and fast bio-inspired modules for low-level image processing.

Journal ArticleDOI
TL;DR: In this paper, a methodology for correction of measurement bias due to the orientation of a discrete discontinuity surface with respect to the line-of-sight of the Lidar scanner and for occlusion is presented.
Abstract: Lidar is a remote sensing technology that uses time-of-flight and line-of-sight to calculate the accurate locations of physical objects in a known space (the known space is in relation to the scanner). The resultant point-cloud data can be used to virtually identify and measure geomechanical data such as joint set orientations, spacing and roughness. The line-of-sight property of static Lidar scanners results in occluded (hidden) zones in the point-cloud and significant quantifiable bias when analyzing the data generated from a single scanning location. While the use of multiple scanning locations and orientations, with merging of aligned (registered) scans, is recommended, practical limitations often limit setup to a single location or a consistent orientation with respect to the slope and rock structure. Such setups require correction for measurement bias. Recent advancements in Lidar scanning and processing technology have facilitated the routine use of Lidar data for geotechnical investigation. Current developments in static scanning have led to large datasets and generated the need for automated bias correction methods. In addition to the traditional bias correction due to outcrop or scanline orientation, this paper presents a methodology for correction of measurement bias due to the orientation of a discrete discontinuity surface with respect to the line-of-sight of the Lidar scanner and for occlusion. Bias can be mathematically minimized from the analyzed discontinuity orientation data.
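
A minimal sketch of a line-of-sight orientation-bias weight of the kind being corrected for (a Terzaghi-style weighting; the small-angle cap is a common practical guard, not a value from the paper):

```python
import numpy as np

def los_bias_weight(normal, los, min_angle_deg=10.0):
    """Orientation-bias weight for a discontinuity plane in a static Lidar
    scan. Planes nearly parallel to the line of sight are under-sampled,
    so their orientation measurements are up-weighted by 1/cos of the
    angle between the plane normal and the scanner's line of sight."""
    n, v = np.asarray(normal, float), np.asarray(los, float)
    cosang = abs(n @ v) / (np.linalg.norm(n) * np.linalg.norm(v))
    floor = np.cos(np.radians(90.0 - min_angle_deg))  # cap near-90-deg blow-up
    return 1.0 / max(cosang, floor)
```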

Patent
03 Feb 2010
TL;DR: In this paper, a system, apparatus, method, and computer-readable media are provided for the capture of stereoscopic three dimensional (3D) images using multiple cameras or a single camera manipulated to deduce stereoscopic data.
Abstract: A system, apparatus, method, and computer-readable media are provided for the capture of stereoscopic three dimensional (3D) images using multiple cameras or a single camera manipulated to deduce stereoscopic data. According to one method, a dongle or cradle is added to a mobile phone or other device to capture stereoscopic images. According to another method, the images are captured from cameras with oblique orientation such that the images may need to be rotated, cropped, or both to determine the appropriate stereoscopic 3D regions of interest. According to another method, a single camera is manipulated such that stereoscopic 3D information is deduced.

Proceedings ArticleDOI
13 Jun 2010
TL;DR: This paper describes a photometric stereo method that works with a wide range of surface reflectances and shows that the monotonicity and isotropy properties hold for specular lobes with respect to the cosine of the surface orientation and the bisector between the light direction and view direction.

Abstract: This paper describes a photometric stereo method that works with a wide range of surface reflectances. Unlike previous approaches that assume simple parametric models such as Lambertian reflectance, the only assumption that we make is that the reflectance has three properties: monotonicity, visibility, and isotropy with respect to the cosine of light direction and surface orientation. In fact, these properties are observed in many non-Lambertian diffuse reflectances. We also show that the monotonicity and isotropy properties hold for specular lobes with respect to the cosine of the surface orientation and the bisector between the light direction and view direction. Each of these three properties independently gives a possible solution space of the surface orientation. By taking the intersection of the solution spaces, our method determines the surface orientation in a consensus manner. Our method naturally avoids the need for radiometrically calibrating cameras because the radiometric response function preserves these three properties. The effectiveness of the proposed method is demonstrated using various simulated and real-world scenes that contain a variety of diffuse and specular surfaces.

Proceedings ArticleDOI
13 Jun 2010
TL;DR: This paper presents a novel approach to single-frame pedestrian classification and orientation estimation in terms of a set of view-related models which couple discriminative expert classifiers with sample-dependent priors, facilitating easy integration of other cues in a Bayesian fashion.
Abstract: This paper presents a novel approach to single-frame pedestrian classification and orientation estimation. Unlike previous work which addressed classification and orientation separately with different models, our method involves a probabilistic framework to approach both in a unified fashion. We address both problems in terms of a set of view-related models which couple discriminative expert classifiers with sample-dependent priors, facilitating easy integration of other cues (e.g. motion, shape) in a Bayesian fashion. This mixture-of-experts formulation approximates the probability density of pedestrian orientation and scales-up to the use of multiple cameras. Experiments on large real-world data show a significant performance improvement in both pedestrian classification and orientation estimation of up to 50%, compared to state-of-the-art, using identical data and evaluation techniques.
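
The mixture-of-experts fusion can be sketched in a few lines (the array shapes and the uniform prior in the usage comment are assumptions for illustration):

```python
import numpy as np

def orientation_posterior(expert_scores, priors):
    """Mixture-of-experts fusion: each view-related expert k scores its
    orientation range, a sample-dependent prior weights the experts, and
    normalising the products approximates the probability density over
    pedestrian orientation. Both arguments: arrays of length n_views."""
    post = np.asarray(expert_scores, float) * np.asarray(priors, float)
    return post / post.sum()

# e.g. four view-related experts (front, left, back, right) with a
# uniform sample-dependent prior:
# orientation_posterior([0.9, 0.3, 0.05, 0.2], [0.25, 0.25, 0.25, 0.25])
```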

Patent
11 Oct 2010
TL;DR: In this article, a method for representing virtual information in a view of a real environment comprises providing a virtual object having a global position and orientation with respect to a geographic global coordinate system.
Abstract: A method for representing virtual information in a view of a real environment comprises providing a virtual object having a global position and orientation with respect to a geographic global coordinate system, with first pose data on the global position and orientation of the virtual object, in a database of a server, taking an image of a real environment by a mobile device and providing second pose data as to at which position and with which orientation with respect to the geographic global coordinate system the image was taken. The method further includes displaying the image on a display of the mobile device, accessing the virtual object in the database and positioning the virtual object in the image on the basis of the first and second pose data, manipulating the virtual object or adding a further virtual object, and providing the manipulated virtual object with modified first pose data or the further virtual object with third pose data in the database.

Journal ArticleDOI
01 Jun 2010
TL;DR: A new region-based unified tensor level set model for image segmentation that possesses the capacity to cope with data varying from scalar to vector, then to high-order tensor and is robust against noise.
Abstract: This paper presents a new region-based unified tensor level set model for image segmentation. This model introduces a third-order tensor to comprehensively depict features of pixels, e.g., gray value and local geometrical features such as orientation and gradient; then, by defining a weighted distance, we generalize the representative region-based level set method from scalar to tensor. The proposed model has four main advantages compared with the traditional representative method, as follows. First, involving the Gaussian filter bank, the model is robust against noise, particularly salt-and-pepper noise. Second, considering the local geometrical features, e.g., orientation and gradient, the model pays more attention to boundaries and makes the evolving curve stop more easily at the boundary location. Third, due to the unified tensor representation of the pixels, the model segments images more accurately and naturally. Fourth, based on a weighted distance definition, the model possesses the capacity to cope with data varying from scalar to vector, then to high-order tensor. We apply the proposed method to synthetic, medical, and natural images, and the results suggest that the proposed method is superior to the available representative region-based level set method.

Journal ArticleDOI
TL;DR: A new algorithm for estimating the relative translation and orientation of an inertial measurement unit and a camera that requires no additional hardware except a piece of paper with a checkerboard pattern on it, and works well in practice for both perspective and spherical cameras.
Abstract: This paper is concerned with the problem of estimating the relative translation and orientation of an inertial measurement unit and a camera, which are rigidly connected. The key is to realize that this problem is in fact an instance of a standard problem within the area of system identification, referred to as a gray-box problem. We propose a new algorithm for estimating the relative translation and orientation, which does not require any additional hardware, except a piece of paper with a checkerboard pattern on it. The method is based on a physical model which can also be used in solving, for example, sensor fusion problems. The experimental results show that the method works well in practice, both for perspective and spherical cameras.