
Showing papers on "Orientation (computer vision) published in 2011"


Proceedings ArticleDOI
20 Jun 2011
TL;DR: This paper proposes a novel closed-form solution to the P3P problem, which computes the aligning transformation directly in a single stage, without the intermediate derivation of the points in the camera frame, at much lower computational cost.
Abstract: The Perspective-Three-Point (P3P) problem aims at determining the position and orientation of the camera in the world reference frame from three 2D-3D point correspondences. This problem is known to provide up to four solutions that can then be disambiguated using a fourth point. All existing solutions attempt to first solve for the position of the points in the camera reference frame, and then compute the position and orientation of the camera in the world frame, which aligns the two point sets. In contrast, in this paper we propose a novel closed-form solution to the P3P problem, which computes the aligning transformation directly in a single stage, without the intermediate derivation of the points in the camera frame. This is made possible by introducing intermediate camera and world reference frames, and expressing their relative position and orientation using only two parameters. The projection of a world point into the parametrized camera pose then leads to two conditions and finally a quartic equation for finding up to four solutions for the parameter pair. A subsequent back-substitution directly leads to the corresponding camera poses with respect to the world reference frame. We show that the proposed algorithm offers accuracy and precision comparable to a popular, standard, state-of-the-art approach, but at much lower computational cost (15 times faster). Furthermore, it provides improved numerical stability and is less affected by degenerate configurations of the selected world points. The superior computational efficiency is particularly suitable for any RANSAC outlier-rejection step, which is always recommended before applying PnP or non-linear optimization of the final solution.
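The final quartic step can be sketched numerically: given the four-degree polynomial in the parameter pair, each real root back-substitutes to one candidate camera pose. The coefficients below are placeholders for illustration, not values derived from real correspondences.

```python
import numpy as np

# Placeholder quartic coefficients a4..a0; in P3P these would come from
# the two projection conditions on the parameter pair.
coeffs = [1.0, -2.0, -1.0, 2.0, 0.0]   # t^4 - 2t^3 - t^2 + 2t

roots = np.roots(coeffs)
# Keep only (near-)real roots; each one corresponds to a candidate pose.
real_roots = [r.real for r in roots if abs(r.imag) < 1e-6]
print(sorted(real_roots))
```

With these placeholder coefficients the quartic factors as t(t - 1)(t + 1)(t - 2), so the sketch produces the maximum of four real solutions, matching the up-to-four-solution structure of P3P.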

563 citations


Proceedings ArticleDOI
09 May 2011
TL;DR: This work proposes a new ‘grasping rectangle’ representation: an oriented rectangle in the image plane that takes into account the location, the orientation as well as the gripper opening width and shows that this algorithm is successfully used to pick up a variety of novel objects.
Abstract: Given an image and an aligned depth map of an object, our goal is to estimate the full 7-dimensional gripper configuration—its 3D location, 3D orientation and the gripper opening width. Recently, learning algorithms have been successfully applied to grasp novel objects—ones not seen by the robot before. While these approaches use low-dimensional representations such as a ‘grasping point’ or a ‘pair of points’ that are perhaps easier to learn, they only partly represent the gripper configuration and hence are sub-optimal. We propose to learn a new ‘grasping rectangle’ representation: an oriented rectangle in the image plane. It takes into account the location, the orientation as well as the gripper opening width. However, inference with such a representation is computationally expensive. In this work, we present a two step process in which the first step prunes the search space efficiently using certain features that are fast to compute. For the remaining few cases, the second step uses advanced features to accurately select a good grasp. In our extensive experiments, we show that our robot successfully uses our algorithm to pick up a variety of novel objects.
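The two-step inference can be illustrated with stand-in scoring functions (the paper's actual features are learned from data; the scores below are hypothetical): a cheap score prunes the search space, an expensive score ranks the survivors.

```python
# Candidate grasp hypotheses: (x, y, angle) stand-ins for oriented rectangles.
candidates = [(x, y, angle) for x in range(0, 100, 10)
              for y in range(0, 100, 10) for angle in (0, 45, 90, 135)]

def cheap_score(c):      # fast-to-compute stand-in feature (step 1)
    x, y, _ = c
    return -abs(x - 50) - abs(y - 50)

def expensive_score(c):  # slower, more accurate stand-in feature (step 2)
    x, y, angle = c
    return cheap_score(c) - 0.1 * abs(angle - 45)

# Step 1: keep the top 5% of candidates by the cheap score.
pruned = sorted(candidates, key=cheap_score, reverse=True)[:len(candidates) // 20]
# Step 2: rank only the survivors with the expensive score.
best = max(pruned, key=expensive_score)
print(best)
```

The expensive score is evaluated on 20 of 400 candidates rather than all of them, which is the computational point of the two-step design.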

487 citations


Journal ArticleDOI
TL;DR: A novel, semiautomated image-analysis software to streamline the quantitative analysis of root growth and architecture of complex root systems, which combines a vectorial representation of root objects with a powerful tracing algorithm that accommodates a wide range of image sources and quality.
Abstract: We present in this paper a novel, semi-automated image analysis software to streamline the quantitative analysis of root growth and architecture of complex root systems. The software combines a vectorial representation of root objects with a powerful tracing algorithm which accommodates a wide range of image sources and quality. The root system is treated as a collection of roots (possibly connected) that are individually represented as parsimonious sets of connected segments. Pixel coordinates and grey level are therefore turned into intuitive biological attributes such as segment diameter and orientation, distance to any other segment, or topological position. As a consequence, user interaction and data analysis directly operate on biological entities (roots) and are not hampered by the spatially discrete, pixel-based nature of the original image. The software supports a sampling-based analysis of root system images, in which detailed information is collected on a limited number of roots selected by the user according to specific research requirements. The use of the software is illustrated with a time-lapse analysis of cluster root formation in lupin (Lupinus albus) and with an architectural analysis of a maize root system (Zea mays). The software, SmartRoot, is an operating-system-independent freeware based on ImageJ and relies on cross-platform standards for communication with data analysis software.

397 citations


Journal ArticleDOI
TL;DR: A hybrid approach to robustly detect and localize texts in natural scene images using a text region detector, a conditional random field model, and a learning-based energy minimization method are presented.
Abstract: Text detection and localization in natural scene images is important for content-based image analysis. This problem is challenging due to the complex background, non-uniform illumination, and variations in text font, size and line orientation. In this paper, we present a hybrid approach to robustly detect and localize texts in natural scene images. A text region detector is designed to estimate the text existence confidence and scale information in an image pyramid, which helps segment candidate text components by local binarization. To efficiently filter out the non-text components, a conditional random field (CRF) model considering unary component properties and binary contextual component relationships with supervised parameter learning is proposed. Finally, text components are grouped into text lines/words with a learning-based energy minimization method. Since all three stages are learning-based, very few parameters require manual tuning. Experimental results evaluated on the ICDAR 2005 competition dataset show that our approach yields higher precision and recall performance compared with state-of-the-art methods. We also evaluated our approach on a multilingual image dataset with promising results.

394 citations


Journal ArticleDOI
TL;DR: Experimental results show that the proposed method is able to handle graphics text and scene text of both horizontal and nonhorizontal orientation.
Abstract: In this paper, we propose a method based on the Laplacian in the frequency domain for video text detection. Unlike many other approaches which assume that text is horizontally-oriented, our method is able to handle text of arbitrary orientation. The input image is first filtered with Fourier-Laplacian. K-means clustering is then used to identify candidate text regions based on the maximum difference. The skeleton of each connected component helps to separate the different text strings from each other. Finally, text string straightness and edge density are used for false positive elimination. Experimental results show that the proposed method is able to handle graphics text and scene text of both horizontal and nonhorizontal orientation.

278 citations


01 Dec 2011
TL;DR: In this article, the authors compare and evaluate different image matching methods for glacier flow determination over large scales, and they consider CCF-O and COSI-Corr to be the two most robust matching methods.
Abstract: Automatic matching of images from two different times is a method that is often used to derive glacier surface velocity. Nearly global repeat coverage of the Earth's surface by optical satellite sensors now opens the possibility for global-scale mapping and monitoring of glacier flow with a number of applications in, for example, glacier physics, glacier-related climate change and impact assessment, and glacier hazard management. The purpose of this study is to compare and evaluate different existing image matching methods for glacier flow determination over large scales. The study compares six different matching methods: normalized cross-correlation (NCC), the phase correlation algorithm used in the COSI-Corr software, and four other Fourier methods with different normalizations. We compare the methods over five regions of the world with different representative glacier characteristics: Karakoram, the European Alps, Alaska, Pine Island (Antarctica) and southwest Greenland. Landsat images are chosen for matching because they extend back to 1972, they cover large areas, and at the same time their spatial resolution is as good as 15 m for images after 1999 (ETM+ pan). Cross-correlation on orientation images (CCF-O) outperforms the three similar Fourier methods, in areas with both high and low visual contrast. NCC experiences problems in areas with low visual contrast, areas with thin clouds or changing snow conditions between the images. CCF-O has problems on narrow outlet glaciers where small window sizes (about 16 pixels by 16 pixels or smaller) are needed, and it also obtains fewer correct matches than COSI-Corr in areas with low visual contrast. COSI-Corr has problems on narrow outlet glaciers and it obtains fewer correct matches compared to CCF-O when thin clouds cover the surface, or if one of the images contains snow dunes.
In total, we consider CCF-O and COSI-Corr to be the two most robust matching methods for global-scale mapping and monitoring of glacier velocities. By combining CCF-O with locally adaptive template sizes and automatically filtering the matching results through comparison of the displacement matrix with its low-pass-filtered version, the matching process can be automated to a large degree. This allows the derivation of glacier velocities with minimal (though not zero) user interaction and hence also opens up the possibility of global-scale mapping and monitoring of glacier flow.
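The simplest of the compared matchers, normalized cross-correlation (NCC), can be sketched on a toy image pair with a known synthetic shift (all values here are illustrative, not glacier data):

```python
import numpy as np

def ncc(a, b):
    """Normalized cross-correlation of two equal-size patches."""
    a = a - a.mean()
    b = b - b.mean()
    denom = np.sqrt((a * a).sum() * (b * b).sum())
    return float((a * b).sum() / denom) if denom > 0 else 0.0

# Toy surface displaced by (2, 3) pixels between the two acquisitions.
rng = np.random.default_rng(0)
img1 = rng.random((64, 64))
img2 = np.roll(img1, shift=(2, 3), axis=(0, 1))

tpl = img1[16:32, 16:32]          # template from the first image
# Exhaustive search over candidate displacements; the NCC peak is the match.
best = max(((dy, dx) for dy in range(-4, 5) for dx in range(-4, 5)),
           key=lambda d: ncc(tpl, img2[16 + d[0]:32 + d[0], 16 + d[1]:32 + d[1]]))
print(best)
```

The recovered displacement equals the synthetic shift; on real imagery the same search is run per template position to build the displacement matrix discussed above.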

272 citations


Proceedings ArticleDOI
01 Nov 2011
TL;DR: A natural-scene-statistic-based Blind/Referenceless Image Spatial QUality Evaluator (BRISQUE) which extracts the pointwise statistics of local normalized luminance signals and measures image naturalness (or lack thereof) based on measured deviations from a natural image model.
Abstract: We propose a natural-scene-statistic-based Blind/Referenceless Image Spatial QUality Evaluator (BRISQUE) which extracts the pointwise statistics of local normalized luminance signals and measures image naturalness (or lack thereof) based on measured deviations from a natural image model. We also model the distribution of pairwise statistics of adjacent normalized luminance signals, which provides distortion orientation information. Although multiscale, the model uses easy-to-compute features, making it computationally fast and time efficient. The framework is shown to perform statistically better than other proposed no-reference algorithms and the full-reference structural similarity index (SSIM).
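Such locally normalized luminance signals are commonly computed as mean-subtracted contrast-normalized (MSCN) coefficients; a sketch assuming a Gaussian weighting window (parameter values illustrative):

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def mscn(image, sigma=7 / 6, c=1.0):
    """Mean-subtracted contrast-normalized coefficients: subtract a local
    Gaussian-weighted mean and divide by the local standard deviation."""
    mu = gaussian_filter(image, sigma)
    var = gaussian_filter(image * image, sigma) - mu * mu
    std = np.sqrt(np.maximum(var, 0.0))   # clamp tiny negative variances
    return (image - mu) / (std + c)

rng = np.random.default_rng(1)
img = rng.random((32, 32)) * 255.0        # toy luminance image
coeffs = mscn(img)
print(coeffs.mean(), coeffs.std())
```

For natural images these coefficients tend toward a unit-variance Gaussian-like distribution; BRISQUE-style models measure quality as deviation from that behavior.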

263 citations


Journal ArticleDOI
TL;DR: In this paper, a Matlab tool called DiAna (Discontinuity Analysis) is used for the 2D and 3D geo-structural analysis of rock mass discontinuities on high-resolution laser scanning data.

256 citations


Journal ArticleDOI
TL;DR: A novel fingerprint reconstruction algorithm is proposed to reconstruct the phase image, which is then converted into the grayscale image, and it is shown that both types of attacks can be successfully launched against a fingerprint recognition system.
Abstract: Fingerprint matching systems generally use four types of representation schemes: grayscale image, phase image, skeleton image, and minutiae, among which minutiae-based representation is the most widely adopted one. The compactness of minutiae representation has created an impression that the minutiae template does not contain sufficient information to allow the reconstruction of the original grayscale fingerprint image. This belief has now been shown to be false; several algorithms have been proposed that can reconstruct fingerprint images from minutiae templates. These techniques try to either reconstruct the skeleton image, which is then converted into the grayscale image, or reconstruct the grayscale image directly from the minutiae template. However, they have a common drawback: Many spurious minutiae not included in the original minutiae template are generated in the reconstructed image. Moreover, some of these reconstruction techniques can only generate a partial fingerprint. In this paper, a novel fingerprint reconstruction algorithm is proposed to reconstruct the phase image, which is then converted into the grayscale image. The proposed reconstruction algorithm not only gives the whole fingerprint, but the reconstructed fingerprint contains very few spurious minutiae. Specifically, a fingerprint image is represented as a phase image which consists of the continuous phase and the spiral phase (which corresponds to minutiae). An algorithm is proposed to reconstruct the continuous phase from minutiae. The proposed reconstruction algorithm has been evaluated with respect to the success rates of type-I attack (match the reconstructed fingerprint against the original fingerprint) and type-II attack (match the reconstructed fingerprint against different impressions of the original fingerprint) using a commercial fingerprint recognition system. 
Given the reconstructed image from our algorithm, we show that both types of attacks can be successfully launched against a fingerprint recognition system.
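The spiral-phase component described above can be illustrated with a toy model in which each minutia contributes a ±atan2 phase spiral centered at its location (an assumed simplification of the paper's phase decomposition; minutia directions are ignored here):

```python
import numpy as np

def spiral_phase(shape, minutiae):
    """Toy spiral phase: each minutia (y, x, polarity) adds a phase
    spiral ±atan2 centered at its location."""
    yy, xx = np.mgrid[0:shape[0], 0:shape[1]]
    phase = np.zeros(shape)
    for (y, x, p) in minutiae:
        phase += p * np.arctan2(yy - y, xx - x)
    return phase

# Two hypothetical minutiae with opposite polarities.
ph = spiral_phase((64, 64), [(20, 20, +1), (40, 50, -1)])
ridges = np.cos(ph)   # rendering the phase yields a ridge-like pattern
print(ridges.shape)
```

In the paper's scheme this spiral part is combined with a reconstructed continuous phase before conversion to a grayscale fingerprint image.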

253 citations


Journal ArticleDOI
TL;DR: An enhanced approach for registering and tracking such anchor points, which is suitable for current generation mobile phones and can also successfully deal with the wide variety of viewing conditions encountered in real life outdoor use is presented.

237 citations


Journal ArticleDOI
TL;DR: Two algorithms for finding the unknown imaging directions of all projections by minimizing global self-consistency errors are described and are optimal in the sense that they reach the information theoretic Shannon bound up to a constant for an idealized probabilistic model.
Abstract: The cryo-electron microscopy reconstruction problem is to find the three-dimensional (3D) structure of a macromolecule given noisy samples of its two-dimensional projection images at unknown random directions. Present algorithms for finding an initial 3D structure model are based on the “angular reconstitution” method in which a coordinate system is established from three projections, and the orientation of the particle giving rise to each image is deduced from common lines among the images. However, a reliable detection of common lines is difficult due to the low signal-to-noise ratio of the images. In this paper we describe two algorithms for finding the unknown imaging directions of all projections by minimizing global self-consistency errors. In the first algorithm, the minimizer is obtained by computing the three largest eigenvectors of a specially designed symmetric matrix derived from the common lines, while the second algorithm is based on semidefinite programming (SDP). Compared with existing algorithms, the advantages of our algorithms are five-fold: first, they accurately estimate all orientations at very low common-line detection rates; second, they are extremely fast, as they involve only the computation of a few top eigenvectors or a sparse SDP; third, they are nonsequential and use the information in all common lines at once; fourth, they are amenable to a rigorous mathematical analysis using spectral analysis and random matrix theory; and finally, the algorithms are optimal in the sense that they reach the information theoretic Shannon bound up to a constant for an idealized probabilistic model.
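The eigenvector computation at the heart of the first algorithm can be sketched with a stand-in symmetric matrix (the actual matrix is built from the detected common lines):

```python
import numpy as np

# Stand-in for the common-lines matrix: a random symmetric matrix whose
# three largest eigenvectors play the role of the orientation estimate.
rng = np.random.default_rng(2)
A = rng.random((50, 50))
S = (A + A.T) / 2.0               # symmetrize

# eigh returns eigenvalues in ascending order; take the three largest.
vals, vecs = np.linalg.eigh(S)
top3 = vecs[:, -3:]
print(top3.shape)
```

Only a few top eigenvectors are needed, which is why this step stays fast even for large numbers of projection images.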

Patent
26 Apr 2011
TL;DR: In this article, a digital camera system has integrated accelerometers for determining static and dynamic accelerations of the digital cameral system, which can be used for correcting image data for roll, pitch and vibrations.
Abstract: A digital camera system has integrated accelerometers for determining static and dynamic accelerations of the digital camera system. Data relating to static and dynamic accelerations are stored with recorded image data for further processing, such as correcting image data for roll, pitch and vibrations, and displaying recorded images with a predetermined orientation using information about, e.g., roll. Data may also be used on-the-fly for suppression of smear caused by vibrations.

Journal ArticleDOI
TL;DR: The experimental results demonstrate that the proposed local-global information combination scheme could significantly improve the recognition accuracy obtained by either local or global information and lead to promising performance of an FKP-based personal authentication system.

Journal ArticleDOI
TL;DR: A novel face representation and recognition approach by exploring information jointly in image space, scale and orientation domains by convolving multiscale and multi-orientation Gabor filters is proposed.
Abstract: Information jointly contained in image space, scale and orientation domains can provide rich important clues not seen in either individual of these domains. The position, spatial frequency and orientation selectivity properties are believed to have an important role in visual perception. This paper proposes a novel face representation and recognition approach by exploring information jointly in image space, scale and orientation domains. Specifically, the face image is first decomposed into different scale and orientation responses by convolving multiscale and multi-orientation Gabor filters. Second, local binary pattern analysis is used to describe the neighboring relationship not only in image space, but also in different scale and orientation responses. This way, information from different domains is explored to give a good face representation for recognition. Discriminant classification is then performed based upon weighted histogram intersection or conditional mutual information with linear discriminant analysis techniques. Extensive experimental results on FERET, AR, and FRGC ver 2.0 databases show the significant advantages of the proposed method over the existing ones.
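The local binary pattern step can be sketched as follows (shown on a raw patch; the paper applies the same analysis to the Gabor scale and orientation responses):

```python
import numpy as np

def lbp8(image):
    """Basic 8-neighbor local binary pattern: each neighbor that is at
    least as bright as the center sets one bit of the code."""
    c = image[1:-1, 1:-1]
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    code = np.zeros_like(c, dtype=np.uint8)
    for bit, (dy, dx) in enumerate(offsets):
        nb = image[1 + dy:image.shape[0] - 1 + dy,
                   1 + dx:image.shape[1] - 1 + dx]
        code |= ((nb >= c).astype(np.uint8) << bit)
    return code

img = np.array([[1, 2, 3],
                [4, 5, 6],
                [7, 8, 9]], dtype=float)
print(lbp8(img))
```

Histograms of these codes, computed per region and per Gabor response, form the joint space-scale-orientation description used for recognition.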

Journal ArticleDOI
TL;DR: The randomness and mean orientation angle maps generated using the adaptive decomposition significantly improve the physical interpretation of the scattering observed at the three different frequencies.
Abstract: Previous model-based decomposition techniques are applicable to a limited range of vegetation types because of their specific assumptions about the volume scattering component. Furthermore, most of these techniques use the same model, or just a few models, to characterize the volume scattering component in the decomposition for all pixels in an image. In this paper, we extend the model-based decomposition idea by creating an adaptive model-based decomposition technique, allowing us to estimate both the mean orientation angle and a degree of randomness for the canopy scattering for each pixel in an image. No scattering reflection symmetry assumption is required to determine the volume contribution. We examined the usefulness of the proposed decomposition technique by decomposing the covariance matrix using the National Aeronautics and Space Administration/Jet Propulsion Laboratory Airborne Synthetic Aperture Radar data at the C-, L-, and P-bands. The randomness and mean orientation angle maps generated using our adaptive decomposition significantly improve the physical interpretation of the scattering observed at the three different frequencies.

Patent
21 Mar 2011
TL;DR: In this article, the authors present a system for mapping user interactions to the input parameters of image processing routines, e.g., image filters, in a way that provides a seamless, dynamic, and intuitive experience for both the user and the software developer.
Abstract: This disclosure pertains to systems, methods, and computer readable medium for mapping particular user interactions, e.g., gestures, to the input parameters of various image processing routines, e.g., image filters, in a way that provides a seamless, dynamic, and intuitive experience for both the user and the software developer. Such techniques may handle the processing of both “relative” gestures, i.e., those gestures having values dependent on how much an input to the device has changed relative to a previous value of the input, and “absolute” gestures, i.e., those gestures having values dependent only on the instant value of the input to the device. Additionally, inputs to the device beyond user-input gestures may be utilized as input parameters to one or more image processing routines. For example, the device's orientation, acceleration, and/or position in three-dimensional space may be used as inputs to particular image processing routines.
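The relative/absolute distinction can be sketched as follows (the parameter name `blur_radius` is hypothetical, not from the patent):

```python
# A filter parameter updated by gestures. A "relative" gesture applies a
# delta to the current value; an "absolute" gesture overwrites it with
# the instant input value.
state = {"blur_radius": 5.0}

def apply_relative(param, delta):
    state[param] += delta          # depends on the previous value

def apply_absolute(param, value):
    state[param] = value           # depends only on the instant value

apply_relative("blur_radius", +2.0)   # e.g. a pinch changed by +2
apply_absolute("blur_radius", 3.0)    # e.g. a slider tapped at 3
print(state["blur_radius"])
```

Device inputs such as orientation or acceleration would be routed through the same mapping as additional absolute inputs.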

Patent
22 Nov 2011
TL;DR: In this paper, a method and apparatus for outputting audio based on an orientation of an electronic device or video shown by the electronic device is presented, where the audio is mapped to a set of speakers using either or both of the device and video orientation to determine which speakers receive certain audio channels.
Abstract: A method and apparatus for outputting audio based on an orientation of an electronic device, or video shown by the electronic device. The audio may be mapped to a set of speakers using either or both of the device and video orientation to determine which speakers receive certain audio channels.

Proceedings ArticleDOI
20 Jun 2011
TL;DR: It is demonstrated that many natural lighting environments already have sufficient variability to constrain local shape, and a novel optimization scheme is described that exploits this variability to estimate surface normals from a single image of a diffuse object in natural illumination.
Abstract: The traditional shape-from-shading problem, with a single light source and Lambertian reflectance, is challenging since the constraints implied by the illumination are not sufficient to specify local orientation. Photometric stereo algorithms, a variant of shape-from-shading, simplify the problem by controlling the illumination to obtain additional constraints. In this paper, we demonstrate that many natural lighting environments already have sufficient variability to constrain local shape. We describe a novel optimization scheme that exploits this variability to estimate surface normals from a single image of a diffuse object in natural illumination. We demonstrate the effectiveness of our method on both simulated and real images.

Journal ArticleDOI
TL;DR: A method for pose estimation and shape reconstruction of 3D bone surfaces from two (or more) calibrated X-ray images using a statistical shape model (SSM) and automatic edge selection on a Canny edge map is proposed.

Proceedings Article
12 Dec 2011
TL;DR: A novel generative model is proposed that is able to reason jointly about the 3D scene layout as well as the3D location and orientation of objects in the scene and significantly increase the performance of state-of-the-art object detectors in their ability to estimate object orientation.
Abstract: We propose a novel generative model that is able to reason jointly about the 3D scene layout as well as the 3D location and orientation of objects in the scene. In particular, we infer the scene topology, geometry as well as traffic activities from a short video sequence acquired with a single camera mounted on a moving car. Our generative model takes advantage of dynamic information in the form of vehicle tracklets as well as static information coming from semantic labels and geometry (i.e., vanishing points). Experiments show that our approach outperforms a discriminative baseline based on multiple kernel learning (MKL) which has access to the same image information. Furthermore, as we reason about objects in 3D, we are able to significantly increase the performance of state-of-the-art object detectors in their ability to estimate object orientation.

Journal ArticleDOI
TL;DR: This paper presents a vision-based real-time gaze zone estimator based on a driver's head orientation composed of yaw and pitch that can work under both day and night conditions and is robust to facial image variation caused by eyeglasses.
Abstract: This paper presents a vision-based real-time gaze zone estimator based on a driver's head orientation composed of yaw and pitch. Generally, vision-based methods are vulnerable to the wearing of eyeglasses and image variations between day and night. The proposed method is novel in the following four ways: First, the proposed method can work under both day and night conditions and is robust to facial image variation caused by eyeglasses because it only requires simple facial features and not specific features such as eyes, lip corners, and facial contours. Second, an ellipsoidal face model is proposed instead of a cylindrical face model to exactly determine a driver's yaw. Third, we propose new features, the normalized mean and the standard deviation of the horizontal edge projection histogram, to reliably and rapidly estimate a driver's pitch. Fourth, the proposed method obtains an accurate gaze zone by using a support vector machine. Experimental results from 200,000 images showed that the root mean square errors of the estimated yaw and pitch angles are below 7° under both daylight and nighttime conditions. Equivalent results were obtained for drivers with glasses or sunglasses, and 18 gaze zones were accurately estimated using the proposed gaze estimation method.

Journal ArticleDOI
TL;DR: The proposed technique makes use of the unique circular brightness structure associated with the OD: the OD usually has a circular shape and is brighter than the surrounding pixels, whose intensity gradually decreases with distance from the OD center.
Abstract: Under the framework of computer-aided eye disease diagnosis, this paper presents an automatic optic disc (OD) detection technique. The proposed technique makes use of the unique circular brightness structure associated with the OD, i.e., the OD usually has a circular shape and is brighter than the surrounding pixels, whose intensity gradually decreases with distance from the OD center. A line operator is designed to capture this circular brightness structure; it evaluates the image brightness variation along multiple line segments of specific orientations that pass through each retinal image pixel. The orientation of the line segment with the minimum/maximum variation has a specific pattern that can be used to locate the OD accurately. The proposed technique has been tested over four public datasets that include 130, 89, 40, and 81 images of healthy and pathological retinas, respectively. Experiments show that the designed line operator is tolerant to different types of retinal lesion and imaging artifacts, and an average OD detection accuracy of 97.4% is obtained.
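The line-operator idea can be sketched as follows (illustrative parameters, not the paper's exact design): sample the image brightness along short line segments of several orientations through a pixel and compare the per-orientation means.

```python
import numpy as np

def line_operator(image, y, x, length=9, n_orient=8):
    """Mean brightness along line segments of n_orient orientations
    passing through pixel (y, x)."""
    h, w = image.shape
    half = length // 2
    t = np.arange(-half, half + 1)
    means = []
    for k in range(n_orient):
        theta = np.pi * k / n_orient
        ys = np.clip(np.round(y + t * np.sin(theta)).astype(int), 0, h - 1)
        xs = np.clip(np.round(x + t * np.cos(theta)).astype(int), 0, w - 1)
        means.append(image[ys, xs].mean())
    return np.array(means)

# A bright vertical stripe: the vertical orientation (k = 4, theta = pi/2)
# has the highest mean brightness through the center pixel.
img = np.zeros((21, 21))
img[:, 10] = 1.0
m = line_operator(img, 10, 10)
print(m.argmax())
```

The variation across orientations (rather than a single stripe response) is what the paper exploits to localize the OD center.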

Patent
02 Jun 2011
TL;DR: In this paper, the information is encoded in one variant as a pattern of pulse latencies relative to an occurrence of a temporal event; e.g., the appearance of a new visual frame or movement of the image.
Abstract: Object recognition apparatus and methods useful for extracting information from sensory input. In one embodiment, the input signal is representative of an element of an image, and the extracted information is encoded in a pulsed output signal. The information is encoded in one variant as a pattern of pulse latencies relative to an occurrence of a temporal event; e.g., the appearance of a new visual frame or movement of the image. The pattern of pulses advantageously is substantially insensitive to such image parameters as size, position, and orientation, so the image identity can be readily decoded. The size, position, and rotation affect the timing of occurrence of the pattern relative to the event; hence, changing the image size or position will not change the pattern of relative pulse latencies but will shift it in time, e.g., will advance or delay its occurrence.
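The shift-invariance of the latency pattern can be illustrated with a toy encoder (an assumed linear strength-to-latency mapping, not the patent's exact scheme): stronger features fire earlier, and moving the reference event shifts every latency equally, leaving the relative pattern intact.

```python
# Feature strengths in [0, 1]; chosen so the latencies are exact floats.
features = [0.75, 0.5, 0.25]

def latencies(feats, t_event):
    """Toy latency code: latency grows as strength falls."""
    return [t_event + (1.0 - f) * 10.0 for f in feats]

a = latencies(features, t_event=0.0)
b = latencies(features, t_event=5.0)  # same image, later frame event

rel_a = [t - a[0] for t in a]          # pattern relative to first pulse
rel_b = [t - b[0] for t in b]
print(rel_a == rel_b)
```

The relative pattern is the decodable identity; only its absolute timing moves with the event, mirroring the invariance property described above.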

Journal ArticleDOI
TL;DR: In this article, curved Gabor filters are applied to the curved ridge and valley structure of low-quality fingerprint images for the purpose of enhancing curved structures in noisy images, which locally adapt their shape to the direction of flow.
Abstract: Gabor filters play an important role in many application areas for the enhancement of various types of images and the extraction of Gabor features. For the purpose of enhancing curved structures in noisy images, we introduce curved Gabor filters which locally adapt their shape to the direction of flow. These curved Gabor filters enable the choice of filter parameters which increase the smoothing power without creating artifacts in the enhanced image. In this paper, curved Gabor filters are applied to the curved ridge and valley structure of low-quality fingerprint images. First, we combine two orientation field estimation methods in order to obtain a more robust estimation for very noisy images. Next, curved regions are constructed by following the respective local orientation and they are used for estimating the local ridge frequency. Lastly, curved Gabor filters are defined based on curved regions and they are applied for the enhancement of low-quality fingerprint images. Experimental results on the FVC2004 databases show improvements of this approach in comparison to state-of-the-art enhancement methods.
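A standard (straight) Gabor kernel can be sketched as follows; the paper's curved variant additionally bends the kernel support along the local ridge flow (parameter values illustrative):

```python
import numpy as np

def gabor_kernel(ksize=15, sigma=3.0, theta=0.0, freq=0.1):
    """Even (cosine) Gabor kernel: a Gaussian envelope modulated by a
    sinusoid of spatial frequency `freq` along orientation `theta`."""
    half = ksize // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1]
    xr = x * np.cos(theta) + y * np.sin(theta)    # rotated coordinates
    yr = -x * np.sin(theta) + y * np.cos(theta)
    envelope = np.exp(-(xr ** 2 + yr ** 2) / (2 * sigma ** 2))
    carrier = np.cos(2 * np.pi * freq * xr)
    return envelope * carrier

k = gabor_kernel()
print(k.shape, k[7, 7])   # center of the kernel
```

Convolving a ridge image with such kernels, tuned to the local orientation and ridge frequency, is the enhancement step that the curved filters generalize.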

Proceedings ArticleDOI
28 Nov 2011
TL;DR: The feasibility of solving the challenging problem of geolocalizing ground level images in urban areas with respect to a database of images captured from the air such as satellite and oblique aerial images is studied.
Abstract: We study the feasibility of solving the challenging problem of geolocalizing ground level images in urban areas with respect to a database of images captured from the air such as satellite and oblique aerial images. We observe that comprehensive aerial image databases are widely available while complete coverage of urban areas from the ground is at best spotty. As a result, localization of ground level imagery with respect to aerial collections is a technically important and practically significant problem. We exploit two key insights: (1) satellite image to oblique aerial image correspondences are used to extract building facades, and (2) building facades are matched between oblique aerial and ground images for geo-localization. Key contributions include: (1) A novel method for extracting building facades using building outlines; (2) Correspondence of building facades between oblique aerial and ground images without direct matching; and (3) Position and orientation estimation of ground images. We show results of ground image localization in a dense urban area.

Patent
17 Jun 2011
TL;DR: In this paper, a machine vision image acquisition apparatus determines the position and the rotational orientation of vehicles in a predefined coordinate space by acquiring an image of one or more position markers and processing the acquired image to calculate the vehicle's position and rotation based on processed image data.
Abstract: A method and apparatus for managing manned and automated utility vehicles, and for picking up and delivering objects by automated vehicles. A machine vision image acquisition apparatus determines the position and the rotational orientation of vehicles in a predefined coordinate space by acquiring an image of one or more position markers and processing the acquired image to calculate the vehicle's position and rotational orientation based on processed image data. The position of the vehicle is determined in two dimensions. Rotational orientation (heading) is determined in the plane of motion. An improved method of position and rotational orientation is presented. Based upon the determined position and rotational orientation of the vehicles stored in a map of the coordinate space, a vehicle controller, implemented as part of a computer, controls the automated vehicles through motion and steering commands, and communicates with the manned vehicle operators by transmitting control messages to each operator.

Patent
17 Oct 2011
TL;DR: In this paper, a method and system for document image capture and processing using mobile devices is presented, where the image is optimized and enhanced for data extraction from the document as depicted.
Abstract: The present invention relates to automated document processing and more particularly, to methods and systems for document image capture and processing using mobile devices. In accordance with various embodiments, methods and systems for document image capture on a mobile communication device are provided such that the image is optimized and enhanced for data extraction from the document as depicted. These methods and systems may comprise capturing an image of a document using a mobile communication device; transmitting the image to a server; and processing the image to create a bi-tonal image of the document for data extraction. Additionally, these methods and systems may comprise capturing a first image of a document using the mobile communication device; automatically detecting the document within the image; geometrically correcting the image; binarizing the image; correcting the orientation of the image; correcting the size of the image; and outputting the resulting image of the document.

Proceedings ArticleDOI
29 Dec 2011
TL;DR: A new measure that simultaneously takes into account brightness and connectivity in the segmentation step for crack detection on road pavement images; the method does not need a learning stage on defect-free texture to perform defect detection.
Abstract: This paper presents a new measure which simultaneously takes into account brightness and connectivity in the segmentation step for crack detection on road pavement images. Features computed along free-form paths allow the detection of cracks of any form and any orientation. The proposed method does not need a learning stage on defect-free texture to perform defect detection. Experiments were conducted on samples of different kinds of pavement. Results of the method are also given on other kinds of images, suggesting perspectives in other domains such as road extraction from satellite images or blood vessel segmentation in retinal images.

Journal ArticleDOI
TL;DR: This work aims at describing an approach to deal with the classification of digital mammograms using Principal Component Analysis to reduce the dimension of filtered and unfiltered high-dimensional data and Gabor features are extracted instead of using original mammogram images.

Journal ArticleDOI
TL;DR: Two brain extraction methods (BEM) that solely depend on the brain anatomy and its intensity characteristics are proposed; they give better results than the popular methods FSL's Brain Extraction Tool (BET) and BrainSuite's Brain Surface Extractor (BSE), and work well even where MLS failed.