
Showing papers on "Orientation (computer vision) published in 2011"


Proceedings ArticleDOI
20 Jun 2011
TL;DR: This paper proposes a novel closed-form solution to the P3P problem, which computes the aligning transformation directly in a single stage, without the intermediate derivation of the points in the camera frame, at much lower computational cost.
Abstract: The Perspective-Three-Point (P3P) problem aims at determining the position and orientation of the camera in the world reference frame from three 2D-3D point correspondences. This problem is known to provide up to four solutions that can then be disambiguated using a fourth point. All existing solutions attempt to first solve for the position of the points in the camera reference frame, and then compute the position and orientation of the camera in the world frame, which aligns the two point sets. In contrast, in this paper we propose a novel closed-form solution to the P3P problem, which computes the aligning transformation directly in a single stage, without the intermediate derivation of the points in the camera frame. This is made possible by introducing intermediate camera and world reference frames, and expressing their relative position and orientation using only two parameters. The projection of a world point into the parametrized camera pose then leads to two conditions and finally a quartic equation for finding up to four solutions for the parameter pair. A subsequent back-substitution directly leads to the corresponding camera poses with respect to the world reference frame. We show that the proposed algorithm offers accuracy and precision comparable to a popular, standard, state-of-the-art approach, but at much lower computational cost (15 times faster). Furthermore, it provides improved numerical stability and is less affected by degenerate configurations of the selected world points. The superior computational efficiency is particularly suitable for any RANSAC outlier-rejection step, which is always recommended before applying PnP or non-linear optimization of the final solution.
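The final quartic step can be sketched numerically: given the four-degree polynomial in the parameter pair, each real root back-substitutes to one candidate camera pose. The coefficients below are placeholders for illustration, not values derived from real correspondences.

```python
import numpy as np

# Placeholder quartic coefficients a4..a0; in P3P these would come from
# the two projection conditions on the parameter pair.
coeffs = [1.0, -2.0, -1.0, 2.0, 0.0]   # t^4 - 2t^3 - t^2 + 2t

roots = np.roots(coeffs)
# Keep only (near-)real roots; each one corresponds to a candidate pose.
real_roots = [r.real for r in roots if abs(r.imag) < 1e-6]
print(sorted(real_roots))
```

With these placeholder coefficients the quartic factors as t(t - 1)(t + 1)(t - 2), so the sketch produces the maximum of four real solutions, matching the up-to-four-solution structure of P3P.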

563 citations


Proceedings ArticleDOI
09 May 2011
TL;DR: This work proposes a new ‘grasping rectangle’ representation: an oriented rectangle in the image plane that takes into account the location, the orientation as well as the gripper opening width and shows that this algorithm is successfully used to pick up a variety of novel objects.
Abstract: Given an image and an aligned depth map of an object, our goal is to estimate the full 7-dimensional gripper configuration—its 3D location, 3D orientation and the gripper opening width. Recently, learning algorithms have been successfully applied to grasp novel objects—ones not seen by the robot before. While these approaches use low-dimensional representations such as a ‘grasping point’ or a ‘pair of points’ that are perhaps easier to learn, they only partly represent the gripper configuration and hence are sub-optimal. We propose to learn a new ‘grasping rectangle’ representation: an oriented rectangle in the image plane. It takes into account the location, the orientation as well as the gripper opening width. However, inference with such a representation is computationally expensive. In this work, we present a two step process in which the first step prunes the search space efficiently using certain features that are fast to compute. For the remaining few cases, the second step uses advanced features to accurately select a good grasp. In our extensive experiments, we show that our robot successfully uses our algorithm to pick up a variety of novel objects.
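The two-step inference can be illustrated with stand-in scoring functions (the paper's actual features are learned from data; the scores below are hypothetical): a cheap score prunes the search space, an expensive score ranks the survivors.

```python
# Candidate grasp hypotheses: (x, y, angle) stand-ins for oriented rectangles.
candidates = [(x, y, angle) for x in range(0, 100, 10)
              for y in range(0, 100, 10) for angle in (0, 45, 90, 135)]

def cheap_score(c):      # fast-to-compute stand-in feature (step 1)
    x, y, _ = c
    return -abs(x - 50) - abs(y - 50)

def expensive_score(c):  # slower, more accurate stand-in feature (step 2)
    x, y, angle = c
    return cheap_score(c) - 0.1 * abs(angle - 45)

# Step 1: keep the top 5% of candidates by the cheap score.
pruned = sorted(candidates, key=cheap_score, reverse=True)[:len(candidates) // 20]
# Step 2: rank only the survivors with the expensive score.
best = max(pruned, key=expensive_score)
print(best)
```

The expensive score is evaluated on 20 of 400 candidates rather than all of them, which is the computational point of the two-step design.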

487 citations


Journal ArticleDOI
TL;DR: A novel, semiautomated image-analysis software to streamline the quantitative analysis of root growth and architecture of complex root systems, which combines a vectorial representation of root objects with a powerful tracing algorithm that accommodates a wide range of image sources and quality.
Abstract: We present in this paper a novel, semi-automated image analysis software to streamline the quantitative analysis of root growth and architecture of complex root systems. The software combines a vectorial representation of root objects with a powerful tracing algorithm which accommodates a wide range of image sources and quality. The root system is treated as a collection of roots (possibly connected) that are individually represented as parsimonious sets of connected segments. Pixel coordinates and grey level are therefore turned into intuitive biological attributes such as segment diameter and orientation, distance to any other segment, or topological position. As a consequence, user interaction and data analysis directly operate on biological entities (roots) and are not hampered by the spatially discrete, pixel-based nature of the original image. The software supports a sampling-based analysis of root system images, in which detailed information is collected on a limited number of roots selected by the user according to specific research requirements. The use of the software is illustrated with a time-lapse analysis of cluster root formation in lupin (Lupinus albus) and with an architectural analysis of a maize root system (Zea mays). The software, SmartRoot, is an operating-system-independent freeware based on ImageJ and relies on cross-platform standards for communication with data analysis software.

397 citations


Journal ArticleDOI
TL;DR: A hybrid approach to robustly detect and localize texts in natural scene images using a text region detector, a conditional random field model, and a learning-based energy minimization method are presented.
Abstract: Text detection and localization in natural scene images is important for content-based image analysis. This problem is challenging due to the complex background, non-uniform illumination, and variations in text font, size and line orientation. In this paper, we present a hybrid approach to robustly detect and localize texts in natural scene images. A text region detector is designed to estimate the text existence confidence and scale information in an image pyramid, which helps segment candidate text components by local binarization. To efficiently filter out the non-text components, a conditional random field (CRF) model considering unary component properties and binary contextual component relationships with supervised parameter learning is proposed. Finally, text components are grouped into text lines/words with a learning-based energy minimization method. Since all three stages are learning-based, very few parameters require manual tuning. Experimental results evaluated on the ICDAR 2005 competition dataset show that our approach yields higher precision and recall performance compared with state-of-the-art methods. We also evaluated our approach on a multilingual image dataset with promising results.

394 citations


Journal ArticleDOI
TL;DR: Experimental results show that the proposed method is able to handle graphics text and scene text of both horizontal and nonhorizontal orientation.
Abstract: In this paper, we propose a method based on the Laplacian in the frequency domain for video text detection. Unlike many other approaches which assume that text is horizontally-oriented, our method is able to handle text of arbitrary orientation. The input image is first filtered with Fourier-Laplacian. K-means clustering is then used to identify candidate text regions based on the maximum difference. The skeleton of each connected component helps to separate the different text strings from each other. Finally, text string straightness and edge density are used for false positive elimination. Experimental results show that the proposed method is able to handle graphics text and scene text of both horizontal and nonhorizontal orientation.

278 citations


01 Dec 2011
TL;DR: In this article, the authors compare and evaluate different image matching methods for glacier flow determination over large scales, and they consider CCF-O and COSI-Corr to be the two most robust matching methods.
Abstract: Automatic matching of images from two different times is a method that is often used to derive glacier surface velocity. Nearly global repeat coverage of the Earth's surface by optical satellite sensors now opens the possibility for global-scale mapping and monitoring of glacier flow with a number of applications in, for example, glacier physics, glacier-related climate change and impact assessment, and glacier hazard management. The purpose of this study is to compare and evaluate different existing image matching methods for glacier flow determination over large scales. The study compares six different matching methods: normalized cross-correlation (NCC), the phase correlation algorithm used in the COSI-Corr software, and four other Fourier methods with different normalizations. We compare the methods over five regions of the world with different representative glacier characteristics: Karakoram, the European Alps, Alaska, Pine Island (Antarctica) and southwest Greenland. Landsat images are chosen for matching because they extend back to 1972, they cover large areas, and at the same time their spatial resolution is as good as 15 m for images after 1999 (ETM+ pan). Cross-correlation on orientation images (CCF-O) outperforms the three similar Fourier methods, in areas with both high and low visual contrast. NCC experiences problems in areas with low visual contrast, areas with thin clouds or changing snow conditions between the images. CCF-O has problems on narrow outlet glaciers where small window sizes (about 16 pixels by 16 pixels or smaller) are needed, and it also obtains fewer correct matches than COSI-Corr in areas with low visual contrast. COSI-Corr has problems on narrow outlet glaciers and it obtains fewer correct matches compared to CCF-O when thin clouds cover the surface, or if one of the images contains snow dunes.
In total, we consider CCF-O and COSI-Corr to be the two most robust matching methods for global-scale mapping and monitoring of glacier velocities. By combining CCF-O with locally adaptive template sizes and automatically filtering the matching results through comparison of the displacement matrix with its low-pass-filtered version, the matching process can be automated to a large degree. This allows the derivation of glacier velocities with minimal (though not zero) user interaction and hence also opens up the possibility of global-scale mapping and monitoring of glacier flow.
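The simplest of the compared matchers, normalized cross-correlation (NCC), can be sketched on a toy image pair with a known synthetic shift (all values here are illustrative, not glacier data):

```python
import numpy as np

def ncc(a, b):
    """Normalized cross-correlation of two equal-size patches."""
    a = a - a.mean()
    b = b - b.mean()
    denom = np.sqrt((a * a).sum() * (b * b).sum())
    return float((a * b).sum() / denom) if denom > 0 else 0.0

# Toy surface displaced by (2, 3) pixels between the two acquisitions.
rng = np.random.default_rng(0)
img1 = rng.random((64, 64))
img2 = np.roll(img1, shift=(2, 3), axis=(0, 1))

tpl = img1[16:32, 16:32]          # template from the first image
# Exhaustive search over candidate displacements; the NCC peak is the match.
best = max(((dy, dx) for dy in range(-4, 5) for dx in range(-4, 5)),
           key=lambda d: ncc(tpl, img2[16 + d[0]:32 + d[0], 16 + d[1]:32 + d[1]]))
print(best)
```

The recovered displacement equals the synthetic shift; on real imagery the same search is run per template position to build the displacement matrix discussed above.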

272 citations


Proceedings ArticleDOI
01 Nov 2011
TL;DR: A natural-scene-statistic-based Blind/Referenceless Image Spatial QUality Evaluator (BRISQUE) which extracts the pointwise statistics of local normalized luminance signals and measures image naturalness (or lack thereof) based on measured deviations from a natural image model.
Abstract: We propose a natural-scene-statistic-based Blind/Referenceless Image Spatial QUality Evaluator (BRISQUE) which extracts the pointwise statistics of local normalized luminance signals and measures image naturalness (or lack thereof) based on measured deviations from a natural image model. We also model the distribution of pairwise statistics of adjacent normalized luminance signals, which provides distortion orientation information. Although multiscale, the model uses easy-to-compute features, making it computationally fast and time efficient. The framework is shown to perform statistically better than other proposed no-reference algorithms and the full-reference structural similarity index (SSIM).
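Such locally normalized luminance signals are commonly computed as mean-subtracted contrast-normalized (MSCN) coefficients; a sketch assuming a Gaussian weighting window (parameter values illustrative):

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def mscn(image, sigma=7 / 6, c=1.0):
    """Mean-subtracted contrast-normalized coefficients: subtract a local
    Gaussian-weighted mean and divide by the local standard deviation."""
    mu = gaussian_filter(image, sigma)
    var = gaussian_filter(image * image, sigma) - mu * mu
    std = np.sqrt(np.maximum(var, 0.0))   # clamp tiny negative variances
    return (image - mu) / (std + c)

rng = np.random.default_rng(1)
img = rng.random((32, 32)) * 255.0        # toy luminance image
coeffs = mscn(img)
print(coeffs.mean(), coeffs.std())
```

For natural images these coefficients tend toward a unit-variance Gaussian-like distribution; BRISQUE-style models measure quality as deviation from that behavior.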

263 citations


Journal ArticleDOI
TL;DR: In this paper, a Matlab tool called DiAna (Discontinuity Analysis) is used for the 2D and 3D geo-structural analysis of rock mass discontinuities on high-resolution laser scanning data.

256 citations


Journal ArticleDOI
TL;DR: A novel fingerprint reconstruction algorithm is proposed to reconstruct the phase image, which is then converted into the grayscale image, and it is shown that both types of attacks can be successfully launched against a fingerprint recognition system.
Abstract: Fingerprint matching systems generally use four types of representation schemes: grayscale image, phase image, skeleton image, and minutiae, among which minutiae-based representation is the most widely adopted one. The compactness of minutiae representation has created an impression that the minutiae template does not contain sufficient information to allow the reconstruction of the original grayscale fingerprint image. This belief has now been shown to be false; several algorithms have been proposed that can reconstruct fingerprint images from minutiae templates. These techniques try to either reconstruct the skeleton image, which is then converted into the grayscale image, or reconstruct the grayscale image directly from the minutiae template. However, they have a common drawback: Many spurious minutiae not included in the original minutiae template are generated in the reconstructed image. Moreover, some of these reconstruction techniques can only generate a partial fingerprint. In this paper, a novel fingerprint reconstruction algorithm is proposed to reconstruct the phase image, which is then converted into the grayscale image. The proposed reconstruction algorithm not only gives the whole fingerprint, but the reconstructed fingerprint contains very few spurious minutiae. Specifically, a fingerprint image is represented as a phase image which consists of the continuous phase and the spiral phase (which corresponds to minutiae). An algorithm is proposed to reconstruct the continuous phase from minutiae. The proposed reconstruction algorithm has been evaluated with respect to the success rates of type-I attack (match the reconstructed fingerprint against the original fingerprint) and type-II attack (match the reconstructed fingerprint against different impressions of the original fingerprint) using a commercial fingerprint recognition system. 
Given the reconstructed image from our algorithm, we show that both types of attacks can be successfully launched against a fingerprint recognition system.
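The spiral-phase component described above can be illustrated with a toy model in which each minutia contributes a ±atan2 phase spiral centered at its location (an assumed simplification of the paper's phase decomposition; minutia directions are ignored here):

```python
import numpy as np

def spiral_phase(shape, minutiae):
    """Toy spiral phase: each minutia (y, x, polarity) adds a phase
    spiral ±atan2 centered at its location."""
    yy, xx = np.mgrid[0:shape[0], 0:shape[1]]
    phase = np.zeros(shape)
    for (y, x, p) in minutiae:
        phase += p * np.arctan2(yy - y, xx - x)
    return phase

# Two hypothetical minutiae with opposite polarities.
ph = spiral_phase((64, 64), [(20, 20, +1), (40, 50, -1)])
ridges = np.cos(ph)   # rendering the phase yields a ridge-like pattern
print(ridges.shape)
```

In the paper's scheme this spiral part is combined with a reconstructed continuous phase before conversion to a grayscale fingerprint image.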

253 citations


Journal ArticleDOI
TL;DR: An enhanced approach for registering and tracking such anchor points, which is suitable for current generation mobile phones and can also successfully deal with the wide variety of viewing conditions encountered in real life outdoor use is presented.

237 citations


Journal ArticleDOI
TL;DR: Two algorithms for finding the unknown imaging directions of all projections by minimizing global self-consistency errors are described and are optimal in the sense that they reach the information theoretic Shannon bound up to a constant for an idealized probabilistic model.
Abstract: The cryo-electron microscopy reconstruction problem is to find the three-dimensional (3D) structure of a macromolecule given noisy samples of its two-dimensional projection images at unknown random directions. Present algorithms for finding an initial 3D structure model are based on the “angular reconstitution” method in which a coordinate system is established from three projections, and the orientation of the particle giving rise to each image is deduced from common lines among the images. However, a reliable detection of common lines is difficult due to the low signal-to-noise ratio of the images. In this paper we describe two algorithms for finding the unknown imaging directions of all projections by minimizing global self-consistency errors. In the first algorithm, the minimizer is obtained by computing the three largest eigenvectors of a specially designed symmetric matrix derived from the common lines, while the second algorithm is based on semidefinite programming (SDP). Compared with existing algorithms, the advantages of our algorithms are five-fold: first, they accurately estimate all orientations at very low common-line detection rates; second, they are extremely fast, as they involve only the computation of a few top eigenvectors or a sparse SDP; third, they are nonsequential and use the information in all common lines at once; fourth, they are amenable to a rigorous mathematical analysis using spectral analysis and random matrix theory; and finally, the algorithms are optimal in the sense that they reach the information theoretic Shannon bound up to a constant for an idealized probabilistic model.
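The eigenvector computation at the heart of the first algorithm can be sketched with a stand-in symmetric matrix (the actual matrix is built from the detected common lines):

```python
import numpy as np

# Stand-in for the common-lines matrix: a random symmetric matrix whose
# three largest eigenvectors play the role of the orientation estimate.
rng = np.random.default_rng(2)
A = rng.random((50, 50))
S = (A + A.T) / 2.0               # symmetrize

# eigh returns eigenvalues in ascending order; take the three largest.
vals, vecs = np.linalg.eigh(S)
top3 = vecs[:, -3:]
print(top3.shape)
```

Only a few top eigenvectors are needed, which is why this step stays fast even for large numbers of projection images.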

Patent
26 Apr 2011
TL;DR: In this article, a digital camera system has integrated accelerometers for determining static and dynamic accelerations of the digital cameral system, which can be used for correcting image data for roll, pitch and vibrations.
Abstract: A digital camera system has integrated accelerometers for determining static and dynamic accelerations of the digital camera system. Data relating to static and dynamic accelerations are stored with recorded image data for further processing, such as correcting image data for roll, pitch and vibrations, and displaying recorded images with a predetermined orientation using information about, e.g., roll. Data may also be used on-the-fly for suppression of smear caused by vibrations.

Journal ArticleDOI
TL;DR: The experimental results demonstrate that the proposed local-global information combination scheme could significantly improve the recognition accuracy obtained by either local or global information and lead to promising performance of an FKP-based personal authentication system.

Journal ArticleDOI
TL;DR: A novel face representation and recognition approach by exploring information jointly in image space, scale and orientation domains by convolving multiscale and multi-orientation Gabor filters is proposed.
Abstract: Information jointly contained in image space, scale and orientation domains can provide rich important clues not seen in either individual of these domains. The position, spatial frequency and orientation selectivity properties are believed to have an important role in visual perception. This paper proposes a novel face representation and recognition approach by exploring information jointly in image space, scale and orientation domains. Specifically, the face image is first decomposed into different scale and orientation responses by convolving multiscale and multi-orientation Gabor filters. Second, local binary pattern analysis is used to describe the neighboring relationship not only in image space, but also in different scale and orientation responses. This way, information from different domains is explored to give a good face representation for recognition. Discriminant classification is then performed based upon weighted histogram intersection or conditional mutual information with linear discriminant analysis techniques. Extensive experimental results on FERET, AR, and FRGC ver 2.0 databases show the significant advantages of the proposed method over the existing ones.
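The local binary pattern step can be sketched as follows (shown on a raw patch; the paper applies the same analysis to the Gabor scale and orientation responses):

```python
import numpy as np

def lbp8(image):
    """Basic 8-neighbor local binary pattern: each neighbor that is at
    least as bright as the center sets one bit of the code."""
    c = image[1:-1, 1:-1]
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    code = np.zeros_like(c, dtype=np.uint8)
    for bit, (dy, dx) in enumerate(offsets):
        nb = image[1 + dy:image.shape[0] - 1 + dy,
                   1 + dx:image.shape[1] - 1 + dx]
        code |= ((nb >= c).astype(np.uint8) << bit)
    return code

img = np.array([[1, 2, 3],
                [4, 5, 6],
                [7, 8, 9]], dtype=float)
print(lbp8(img))
```

Histograms of these codes, computed per region and per Gabor response, form the joint space-scale-orientation description used for recognition.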

Journal ArticleDOI
TL;DR: The randomness and mean orientation angle maps generated using the adaptive decomposition significantly improve the physical interpretation of the scattering observed at the three different frequencies.
Abstract: Previous model-based decomposition techniques are applicable to a limited range of vegetation types because of their specific assumptions about the volume scattering component. Furthermore, most of these techniques use the same model, or just a few models, to characterize the volume scattering component in the decomposition for all pixels in an image. In this paper, we extend the model-based decomposition idea by creating an adaptive model-based decomposition technique, allowing us to estimate both the mean orientation angle and a degree of randomness for the canopy scattering for each pixel in an image. No scattering reflection symmetry assumption is required to determine the volume contribution. We examined the usefulness of the proposed decomposition technique by decomposing the covariance matrix using the National Aeronautics and Space Administration/Jet Propulsion Laboratory Airborne Synthetic Aperture Radar data at the C-, L-, and P-bands. The randomness and mean orientation angle maps generated using our adaptive decomposition significantly improve the physical interpretation of the scattering observed at the three different frequencies.

Patent
21 Mar 2011
TL;DR: In this article, the authors present a system for mapping user interactions to the input parameters of image processing routines, e.g., image filters, in a way that provides a seamless, dynamic, and intuitive experience for both the user and the software developer.
Abstract: This disclosure pertains to systems, methods, and computer readable medium for mapping particular user interactions, e.g., gestures, to the input parameters of various image processing routines, e.g., image filters, in a way that provides a seamless, dynamic, and intuitive experience for both the user and the software developer. Such techniques may handle the processing of both “relative” gestures, i.e., those gestures having values dependent on how much an input to the device has changed relative to a previous value of the input, and “absolute” gestures, i.e., those gestures having values dependent only on the instant value of the input to the device. Additionally, inputs to the device beyond user-input gestures may be utilized as input parameters to one or more image processing routines. For example, the device's orientation, acceleration, and/or position in three-dimensional space may be used as inputs to particular image processing routines.
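The relative/absolute distinction can be sketched as follows (the parameter name `blur_radius` is hypothetical, not from the patent):

```python
# A filter parameter updated by gestures. A "relative" gesture applies a
# delta to the current value; an "absolute" gesture overwrites it with
# the instant input value.
state = {"blur_radius": 5.0}

def apply_relative(param, delta):
    state[param] += delta          # depends on the previous value

def apply_absolute(param, value):
    state[param] = value           # depends only on the instant value

apply_relative("blur_radius", +2.0)   # e.g. a pinch changed by +2
apply_absolute("blur_radius", 3.0)    # e.g. a slider tapped at 3
print(state["blur_radius"])
```

Device inputs such as orientation or acceleration would be routed through the same mapping as additional absolute inputs.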

Patent
22 Nov 2011
TL;DR: In this paper, a method and apparatus for outputting audio based on an orientation of an electronic device or video shown by the electronic device is presented, where the audio is mapped to a set of speakers using either or both of the device and video orientation to determine which speakers receive certain audio channels.
Abstract: A method and apparatus for outputting audio based on an orientation of an electronic device, or video shown by the electronic device. The audio may be mapped to a set of speakers using either or both of the device and video orientation to determine which speakers receive certain audio channels.

Proceedings ArticleDOI
20 Jun 2011
TL;DR: It is demonstrated that many natural lighting environments already have sufficient variability to constrain local shape, and a novel optimization scheme is described that exploits this variability to estimate surface normals from a single image of a diffuse object in natural illumination.
Abstract: The traditional shape-from-shading problem, with a single light source and Lambertian reflectance, is challenging since the constraints implied by the illumination are not sufficient to specify local orientation. Photometric stereo algorithms, a variant of shape-from-shading, simplify the problem by controlling the illumination to obtain additional constraints. In this paper, we demonstrate that many natural lighting environments already have sufficient variability to constrain local shape. We describe a novel optimization scheme that exploits this variability to estimate surface normals from a single image of a diffuse object in natural illumination. We demonstrate the effectiveness of our method on both simulated and real images.

Journal ArticleDOI
TL;DR: A method for pose estimation and shape reconstruction of 3D bone surfaces from two (or more) calibrated X-ray images using a statistical shape model (SSM) and automatic edge selection on a Canny edge map is proposed.

Proceedings Article
12 Dec 2011
TL;DR: A novel generative model is proposed that is able to reason jointly about the 3D scene layout as well as the3D location and orientation of objects in the scene and significantly increase the performance of state-of-the-art object detectors in their ability to estimate object orientation.
Abstract: We propose a novel generative model that is able to reason jointly about the 3D scene layout as well as the 3D location and orientation of objects in the scene. In particular, we infer the scene topology, geometry as well as traffic activities from a short video sequence acquired with a single camera mounted on a moving car. Our generative model takes advantage of dynamic information in the form of vehicle tracklets as well as static information coming from semantic labels and geometry (i.e., vanishing points). Experiments show that our approach outperforms a discriminative baseline based on multiple kernel learning (MKL) which has access to the same image information. Furthermore, as we reason about objects in 3D, we are able to significantly increase the performance of state-of-the-art object detectors in their ability to estimate object orientation.

Journal ArticleDOI
TL;DR: This paper presents a vision-based real-time gaze zone estimator based on a driver's head orientation composed of yaw and pitch that can work under both day and night conditions and is robust to facial image variation caused by eyeglasses.
Abstract: This paper presents a vision-based real-time gaze zone estimator based on a driver's head orientation composed of yaw and pitch. Generally, vision-based methods are vulnerable to the wearing of eyeglasses and image variations between day and night. The proposed method is novel in the following four ways: First, the proposed method can work under both day and night conditions and is robust to facial image variation caused by eyeglasses because it only requires simple facial features and not specific features such as eyes, lip corners, and facial contours. Second, an ellipsoidal face model is proposed instead of a cylindrical face model to exactly determine a driver's yaw. Third, we propose new features, the normalized mean and the standard deviation of the horizontal edge projection histogram, to reliably and rapidly estimate a driver's pitch. Fourth, the proposed method obtains an accurate gaze zone by using a support vector machine. Experimental results from 200,000 images showed that the root mean square errors of the estimated yaw and pitch angles are below 7° under both daylight and nighttime conditions. Equivalent results were obtained for drivers with glasses or sunglasses, and 18 gaze zones were accurately estimated using the proposed gaze estimation method.

Journal ArticleDOI
TL;DR: The proposed technique makes use of the unique circular brightness structure associated with the OD: the OD usually has a circular shape and is brighter than the surrounding pixels, whose intensity gradually decreases with distance from the OD center.
Abstract: Under the framework of computer-aided eye disease diagnosis, this paper presents an automatic optic disc (OD) detection technique. The proposed technique makes use of the unique circular brightness structure associated with the OD, i.e., the OD usually has a circular shape and is brighter than the surrounding pixels, whose intensity gradually decreases with distance from the OD center. A line operator is designed to capture this circular brightness structure; it evaluates the image brightness variation along multiple line segments of specific orientations that pass through each retinal image pixel. The orientation of the line segment with the minimum/maximum variation has a specific pattern that can be used to locate the OD accurately. The proposed technique has been tested over four public datasets that include 130, 89, 40, and 81 images of healthy and pathological retinas, respectively. Experiments show that the designed line operator is tolerant to different types of retinal lesion and imaging artifacts, and an average OD detection accuracy of 97.4% is obtained.
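The line-operator idea can be sketched as follows (illustrative parameters, not the paper's exact design): sample the image brightness along short line segments of several orientations through a pixel and compare the per-orientation means.

```python
import numpy as np

def line_operator(image, y, x, length=9, n_orient=8):
    """Mean brightness along line segments of n_orient orientations
    passing through pixel (y, x)."""
    h, w = image.shape
    half = length // 2
    t = np.arange(-half, half + 1)
    means = []
    for k in range(n_orient):
        theta = np.pi * k / n_orient
        ys = np.clip(np.round(y + t * np.sin(theta)).astype(int), 0, h - 1)
        xs = np.clip(np.round(x + t * np.cos(theta)).astype(int), 0, w - 1)
        means.append(image[ys, xs].mean())
    return np.array(means)

# A bright vertical stripe: the vertical orientation (k = 4, theta = pi/2)
# has the highest mean brightness through the center pixel.
img = np.zeros((21, 21))
img[:, 10] = 1.0
m = line_operator(img, 10, 10)
print(m.argmax())
```

The variation across orientations (rather than a single stripe response) is what the paper exploits to localize the OD center.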

Patent
02 Jun 2011
TL;DR: In this paper, the information is encoded in one variant as a pattern of pulse latencies relative to an occurrence of a temporal event; e.g., the appearance of a new visual frame or movement of the image.
Abstract: Object recognition apparatus and methods useful for extracting information from sensory input. In one embodiment, the input signal is representative of an element of an image, and the extracted information is encoded in a pulsed output signal. The information is encoded in one variant as a pattern of pulse latencies relative to an occurrence of a temporal event; e.g., the appearance of a new visual frame or movement of the image. The pattern of pulses advantageously is substantially insensitive to such image parameters as size, position, and orientation, so the image identity can be readily decoded. The size, position, and rotation affect the timing of occurrence of the pattern relative to the event; hence, changing the image size or position will not change the pattern of relative pulse latencies but will shift it in time, e.g., will advance or delay its occurrence.
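The shift-invariance of the latency pattern can be illustrated with a toy encoder (an assumed linear strength-to-latency mapping, not the patent's exact scheme): stronger features fire earlier, and moving the reference event shifts every latency equally, leaving the relative pattern intact.

```python
# Feature strengths in [0, 1]; chosen so the latencies are exact floats.
features = [0.75, 0.5, 0.25]

def latencies(feats, t_event):
    """Toy latency code: latency grows as strength falls."""
    return [t_event + (1.0 - f) * 10.0 for f in feats]

a = latencies(features, t_event=0.0)
b = latencies(features, t_event=5.0)  # same image, later frame event

rel_a = [t - a[0] for t in a]          # pattern relative to first pulse
rel_b = [t - b[0] for t in b]
print(rel_a == rel_b)
```

The relative pattern is the decodable identity; only its absolute timing moves with the event, mirroring the invariance property described above.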

Journal ArticleDOI
TL;DR: In this article, curved Gabor filters are applied to the curved ridge and valley structure of low-quality fingerprint images for the purpose of enhancing curved structures in noisy images, which locally adapt their shape to the direction of flow.
Abstract: Gabor filters play an important role in many application areas for the enhancement of various types of images and the extraction of Gabor features. For the purpose of enhancing curved structures in noisy images, we introduce curved Gabor filters which locally adapt their shape to the direction of flow. These curved Gabor filters enable the choice of filter parameters which increase the smoothing power without creating artifacts in the enhanced image. In this paper, curved Gabor filters are applied to the curved ridge and valley structure of low-quality fingerprint images. First, we combine two orientation field estimation methods in order to obtain a more robust estimation for very noisy images. Next, curved regions are constructed by following the respective local orientation and they are used for estimating the local ridge frequency. Lastly, curved Gabor filters are defined based on curved regions and they are applied for the enhancement of low-quality fingerprint images. Experimental results on the FVC2004 databases show improvements of this approach in comparison to state-of-the-art enhancement methods.
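A standard (straight) Gabor kernel can be sketched as follows; the paper's curved variant additionally bends the kernel support along the local ridge flow (parameter values illustrative):

```python
import numpy as np

def gabor_kernel(ksize=15, sigma=3.0, theta=0.0, freq=0.1):
    """Even (cosine) Gabor kernel: a Gaussian envelope modulated by a
    sinusoid of spatial frequency `freq` along orientation `theta`."""
    half = ksize // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1]
    xr = x * np.cos(theta) + y * np.sin(theta)    # rotated coordinates
    yr = -x * np.sin(theta) + y * np.cos(theta)
    envelope = np.exp(-(xr ** 2 + yr ** 2) / (2 * sigma ** 2))
    carrier = np.cos(2 * np.pi * freq * xr)
    return envelope * carrier

k = gabor_kernel()
print(k.shape, k[7, 7])   # center of the kernel
```

Convolving a ridge image with such kernels, tuned to the local orientation and ridge frequency, is the enhancement step that the curved filters generalize.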

Proceedings ArticleDOI
28 Nov 2011
TL;DR: The feasibility of solving the challenging problem of geolocalizing ground level images in urban areas with respect to a database of images captured from the air such as satellite and oblique aerial images is studied.
Abstract: We study the feasibility of solving the challenging problem of geolocalizing ground level images in urban areas with respect to a database of images captured from the air such as satellite and oblique aerial images. We observe that comprehensive aerial image databases are widely available while complete coverage of urban areas from the ground is at best spotty. As a result, localization of ground level imagery with respect to aerial collections is a technically important and practically significant problem. We exploit two key insights: (1) satellite image to oblique aerial image correspondences are used to extract building facades, and (2) building facades are matched between oblique aerial and ground images for geo-localization. Key contributions include: (1) A novel method for extracting building facades using building outlines; (2) Correspondence of building facades between oblique aerial and ground images without direct matching; and (3) Position and orientation estimation of ground images. We show results of ground image localization in a dense urban area.

Patent
17 Jun 2011
TL;DR: In this paper, a machine vision image acquisition apparatus determines the position and the rotational orientation of vehicles in a predefined coordinate space by acquiring an image of one or more position markers and processing the acquired image to calculate the vehicle's position and rotation based on processed image data.
Abstract: A method and apparatus for managing manned and automated utility vehicles, and for picking up and delivering objects by automated vehicles. A machine vision image acquisition apparatus determines the position and the rotational orientation of vehicles in a predefined coordinate space by acquiring an image of one or more position markers and processing the acquired image to calculate the vehicle's position and rotational orientation based on processed image data. The position of the vehicle is determined in two dimensions. Rotational orientation (heading) is determined in the plane of motion. An improved method of position and rotational orientation is presented. Based upon the determined position and rotational orientation of the vehicles stored in a map of the coordinate space, a vehicle controller, implemented as part of a computer, controls the automated vehicles through motion and steering commands, and communicates with the manned vehicle operators by transmitting control messages to each operator.

Patent
17 Oct 2011
TL;DR: In this paper, a method and system for document image capture and processing using mobile devices is presented, where the image is optimized and enhanced for data extraction from the document as depicted.
Abstract: The present invention relates to automated document processing and more particularly, to methods and systems for document image capture and processing using mobile devices. In accordance with various embodiments, methods and systems for document image capture on a mobile communication device are provided such that the image is optimized and enhanced for data extraction from the document as depicted. These methods and systems may comprise capturing an image of a document using a mobile communication device; transmitting the image to a server; and processing the image to create a bi-tonal image of the document for data extraction. Additionally, these methods and systems may comprise capturing a first image of a document using the mobile communication device; automatically detecting the document within the image; geometrically correcting the image; binarizing the image; correcting the orientation of the image; correcting the size of the image; and outputting the resulting image of the document.

Proceedings ArticleDOI
29 Dec 2011
TL;DR: A new measure that simultaneously takes into account brightness and connectivity in the segmentation step for crack detection on road pavement images; the method does not need a learning stage on defect-free texture to perform defect detection.
Abstract: This paper presents a new measure which simultaneously takes into account brightness and connectivity in the segmentation step for crack detection on road pavement images. Features computed along free-form paths allow the detection of cracks of any form and any orientation. The proposed method does not need a learning stage on defect-free texture to perform defect detection. Experiments were conducted on samples of different kinds of pavement. Results of the method are also given on other kinds of images, suggesting perspectives in other domains such as road extraction from satellite images or blood vessel segmentation in retinal images.

Journal ArticleDOI
TL;DR: This work aims at describing an approach to deal with the classification of digital mammograms using Principal Component Analysis to reduce the dimension of filtered and unfiltered high-dimensional data and Gabor features are extracted instead of using original mammogram images.

Journal ArticleDOI
TL;DR: Two brain extraction methods (BEM) that solely depend on the brain anatomy and its intensity characteristics are proposed; they give better results than the popular methods FSL's Brain Extraction Tool (BET) and BrainSuite's Brain Surface Extractor (BSE), and work well even where MLS failed.