
Showing papers on "Homography (computer vision) published in 2007"


Journal ArticleDOI
TL;DR: Theoretical analysis and comparative experiments show that the proposed visual tracking algorithm has a higher convergence rate than standard first-order minimization techniques, and is well adapted to real-time robotic applications.
Abstract: The objective of this paper is to propose a new homography-based approach to image-based visual tracking and servoing. The visual tracking algorithm proposed in the paper is based on a new efficient second-order minimization method. Theoretical analysis and comparative experiments with other tracking approaches show that the proposed method has a higher convergence rate than standard first-order minimization techniques. Therefore, it is well adapted to real-time robotic applications. The output of the visual tracking is a homography linking the current and the reference image of a planar target. Using the homography, a task function isomorphic to the camera pose has been designed. A new image-based control law is proposed which does not need any measure of the 3D structure of the observed target (e.g. the normal to the plane). The theoretical proof of the existence of the isomorphism between the task function and the camera pose and the theoretical proof of the stability of the control law are provided. The experimental results, obtained with a 6 d.o.f. robot, show the advantages of the proposed method with respect to the existing approaches.
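The tracker above outputs a homography linking the current and reference images of a planar target; the basic point-transfer operation such a homography performs can be sketched in a few lines of numpy (illustrative only — this is not the paper's second-order minimization):

```python
import numpy as np

def warp_points(H, pts):
    """Map 2D points through a 3x3 homography, with homogeneous normalization."""
    pts = np.asarray(pts, dtype=float)
    ph = np.hstack([pts, np.ones((len(pts), 1))]) @ H.T  # lift to homogeneous, apply H
    return ph[:, :2] / ph[:, 2:3]                        # divide out the scale

# a pure translation homography shifts every point by (5, -3)
H = np.array([[1.0, 0.0, 5.0],
              [0.0, 1.0, -3.0],
              [0.0, 0.0, 1.0]])
print(warp_points(H, [[10.0, 20.0]]))  # → [[15. 17.]]
```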

332 citations


Patent
13 Feb 2007
TL;DR: In this paper, the homography information and/or corner metadata is calculated for each frame from each video stream, and the data is used to mosaic the separate frames into a single video frame.
Abstract: Methods and systems for combining multiple video streams are provided. Video feeds are received from multiple optical sensors, and homography information and/or corner metadata is calculated for each frame from each video stream. This data is used to mosaic the separate frames into a single video frame. Local translation of each image may also be used to synchronize the video frames. The optical sensors can be provided by an airborne platform, such as a manned or unmanned surveillance vehicle. Image data can be requested from a ground operator, and transmitted from the airborne platform to the user in real time or at a later time. Various data arrangements may be used by an aggregation system to serialize and/or multiplex image data received from multiple sensor modules. Fixed-size record arrangement and variable-size record arrangement systems are provided.

133 citations


Journal ArticleDOI
TL;DR: Experimental results show that the proposed method is effective in performing image registration and SR for simulated as well as real-life images, and an iterative scheme is developed to solve the arising nonlinear least squares problem.
Abstract: This paper proposes a new algorithm to integrate image registration into image super-resolution (SR). Image SR is a process to reconstruct a high-resolution (HR) image by fusing multiple low-resolution (LR) images. A critical step in image SR is accurate registration of the LR images or, in other words, effective estimation of motion parameters. Conventional SR algorithms assume either that the motion parameters estimated by existing registration methods are error-free or that the motion parameters are known a priori. This assumption, however, is impractical in many applications, as most existing registration algorithms still experience various degrees of errors, and the motion parameters among the LR images are generally unknown a priori. In view of this, this paper presents a new framework that performs simultaneous image registration and HR image reconstruction. As opposed to other current methods that treat image registration and HR reconstruction as disjoint processes, the new framework enables image registration and HR reconstruction to be estimated simultaneously and improved progressively. Further, unlike most algorithms that focus on the translational motion model, the proposed method adopts a more generic motion model that includes both translation and rotation. An iterative scheme is developed to solve the arising nonlinear least squares problem. Experimental results show that the proposed method is effective in performing image registration and SR for simulated as well as real-life images.
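The translation-plus-rotation motion model that the paper adopts can be written as a 3x3 matrix acting on homogeneous pixel coordinates; a minimal sketch (the parameter names are ours, and this is not the paper's joint estimator):

```python
import numpy as np

def rigid_motion(theta, tx, ty):
    """3x3 matrix for the rotation-plus-translation motion model,
    acting on homogeneous pixel coordinates."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s, tx],
                     [s,  c, ty],
                     [0,  0, 1.0]])

# a 90-degree rotation followed by a shift of (1, 0)
M = rigid_motion(np.pi / 2, 1.0, 0.0)
p = M @ np.array([1.0, 0.0, 1.0])      # the point (1, 0)
print(p[:2] / p[2])                    # → [1. 1.]
```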

125 citations


Proceedings ArticleDOI
29 Sep 2007
TL;DR: A distributed network of smart cameras whose function is to detect and localize falls, an important application in elderly living environments, and a joint routing and homography transformation scheme for multi-hop localization that yields localization errors of less than 2 feet using very low resolution images are presented.
Abstract: This paper presents the design, implementation and evaluation of a distributed network of smart cameras whose function is to detect and localize falls, an important application in elderly living environments. A network of overlapping smart cameras uses a decentralized procedure for computing inter-image homographies that allows the location of a fall to be reported in 2D world coordinates by calibrating only one camera. Also, we propose a joint routing and homography transformation scheme for multi-hop localization that yields localization errors of less than 2 feet using very low resolution images. Our goal is to demonstrate that such a distributed low-power system can perform adequately in this and related applications. A prototype implementation is given for low-power Agilent/UCLA Cyclops cameras running on the Crossbow MICAz platform. We demonstrate the effectiveness of the fall detection as well as the precision of the localization using a simulation of our sample implementation.

118 citations


Journal ArticleDOI
TL;DR: A new method for text detection and recognition in natural scene images is presented, based on plane-to-plane homography, and results tested from a large dataset have demonstrated that the proposed method is effective and practical.

75 citations


Proceedings ArticleDOI
10 Apr 2007
TL;DR: A probabilistic framework where uncertainties can be considered in the mosaic building process, and how, when a loop is present in the sequence of images, the accumulated drift can be compensated and propagated to the rest of the mosaic.
Abstract: This paper presents a probabilistic framework where uncertainties can be considered in the mosaic building process. It is shown how the mosaic can be used as an environment representation for an aerial robot. The inter-image relations are modeled by homographies. The paper shows a robust method to compute them in the case of quasi-planar scenes, and also how to estimate the uncertainties in these local image relations. Moreover, the paper describes how, when a loop is present in the sequence of images, the accumulated drift can be compensated and propagated to the rest of the mosaic. In addition, the relations among images in the mosaic can be used, under certain assumptions, to localize the robot.

67 citations


Journal ArticleDOI
TL;DR: This paper presents a new method for the self-recalibration of a structured light system by a single image in the presence of a planar surface in the scene that can be determined automatically from four projection correspondences between an image and a projection plane.

60 citations


Journal ArticleDOI
TL;DR: A real-time, robust and effective tracking framework for visual servoing applications based on the fusion of visual cues and on the estimation of a transformation (either a homography or a 3D pose).
Abstract: This paper proposes a real-time, robust and effective tracking framework for visual servoing applications. The algorithm is based on the fusion of visual cues and on the estimation of a transformation (either a homography or a 3D pose). The parameters of this transformation are estimated using a non-linear minimization of a unique criterion that integrates information both on the texture and the edges of the tracked object. The proposed tracker is more robust and performs well in conditions where methods based on a single cue fail. The framework has been tested for 2D object motion estimation and pose computation. The method presented in this paper has been validated on several video sequences as well as in visual servoing experiments considering various objects. Results show the method to be robust to occlusions or textured backgrounds and suitable for visual servoing applications.

56 citations


Journal ArticleDOI
01 Sep 2007-Robotica
TL;DR: This work captures the motion of a set of fictitious planes, each formed by four feature points, defined at various strategic locations along the body of the robot, and obtains three-dimensional shape information continuously to demonstrate the development of a kinematic controller to regulate the end-effector of the robot to a constant desired position and orientation.
Abstract: In this paper, we investigate the problem of measuring the shape of a continuum robot manipulator using visual information from a fixed camera. Specifically, we capture the motion of a set of fictitious planes, each formed by four or more feature points, defined at various strategic locations along the body of the robot. Then, utilizing expressions for the robot forward kinematics as well as the decomposition of a homography relating a reference image of the robot to the actual robot image, we obtain the three-dimensional shape information continuously. We then use this information to demonstrate the development of a kinematic controller to regulate the manipulator end-effector to a constant desired position and orientation.

55 citations


Proceedings ArticleDOI
23 Jul 2007
TL;DR: A wavelet-based feature extraction technique, normalized cross-correlation matching and relaxation-based image matching techniques are employed in this new method, which shows that the proposed algorithm can select enough control points to reduce the local distortions caused by terrain relief.
Abstract: Image registration is the process of geometrically aligning one image to another image of the same scene taken from different viewpoints or by different sensors. High resolution remote sensing images have made it more convenient for people to study the earth; however, they also create challenges for traditional research methods. In terms of image registration, there are a number of problems with using current image registration techniques for high resolution images. This study proposes a new image registration technique, which is based on the combination of feature-based matching (FBM) and area-based matching (ABM). A wavelet-based feature extraction technique, normalized cross-correlation matching and relaxation-based image matching techniques are employed in this new method. Two pairs of data sets, panchromatic images of IKONOS and a panchromatic image of IKONOS with a multispectral image of Quickbird, are used to evaluate the proposed image registration algorithm. The experiment results show that the proposed algorithm can select enough control points to reduce the local distortions caused by terrain relief.

53 citations


Patent
08 May 2007
TL;DR: In this paper, a first image is taken using a digital camera associated with a communication terminal, and query data related to the first image are transmitted via a communication network to a remote recognition server.
Abstract: For retrieving information based on images, a first image is taken (S1) using a digital camera associated with a communication terminal (1). Query data related to the first image is transmitted (S3) via a communication network (2) to a remote recognition server (3). In the remote recognition server (3), a reference image is identified (S4) based on the query data. Subsequently, in the remote recognition server (3), a homography is computed (S5) based on the reference image and the query data, the homography mapping the reference image to the first image. Moreover, in the remote recognition server (3), a second image is selected (S6) and a projection image of the second image is computed (S7) using the homography. By replacing a part of the first image with at least a part of the projection image, an augmented image is generated (S8, S10) and displayed (S11) at the communication terminal (1). Efficient augmentation of the first image taken with the camera is made possible by remaining in the planar space and dealing with two-dimensional images and objects only.

01 Jan 2007
TL;DR: It is demonstrated that the viewer's image without the occluding object can be synthesized for every camera on-line, even though all the cameras are freely moving.
Abstract: In this paper, we present a system for Diminished Reality with multiple handheld cameras. We assume a situation in which the same scene is captured with multiple handheld cameras, but some objects occlude the scene. In such a case, we propose a method for synthesizing an image for each camera in which the occluding objects are diminished with the support of the other cameras that capture the same scene from different viewpoints. In the proposed method, we use the AR-Tag marker to calibrate the multiple cameras, so that online processing is possible. By the use of AR tags, we compute homographies between each camera's imaging plane and the objective scene, which is approximated as planar in this research. The homography is used for warping the planar area to synthesize the viewer's image that includes only the objective scene, without the occluding object that cannot be approximated as a planar area. We demonstrate that the viewer's image without the occluding object can be synthesized for every camera on-line, even though all the cameras are freely moving.

Proceedings ArticleDOI
26 Dec 2007
TL;DR: This paper shows that visual hull intersection can be performed in the image plane without going into 3D space, and shows the application of the method on complicated object shapes as well as cluttered environments containing multiple objects.
Abstract: This paper presents a purely image-based approach to fusing foreground silhouette information from multiple arbitrary views. Our approach does not require 3D constructs like camera calibration to carve out 3D voxels or project visual cones in 3D space. Using planar homographies and foreground likelihood information from a set of arbitrary views, we show that visual hull intersection can be performed in the image plane without going into 3D space. This process delivers a 2D grid of object occupancy likelihoods representing a cross-sectional slice of the object. Subsequent slices of the object are obtained by extending the process to planes parallel to a reference plane in a direction along the body of the object. We show that homographies of these new planes between views can be computed in the framework of plane to plane homologies using the homography induced by a reference plane and the vanishing point of the reference direction. Occupancy grids are stacked on top of each other, creating a three dimensional data structure that encapsulates the object shape and location. Object structure is finally segmented out by minimizing an energy functional over the surface of the object in a level sets formulation. We show the application of our method on complicated object shapes as well as cluttered environments containing multiple objects.
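The slice-fusion idea above — warping per-view foreground likelihoods onto a common plane via homographies and combining them — can be sketched with a nearest-neighbour inverse warp (a toy illustration, not the authors' implementation; they additionally use homologies and level sets):

```python
import numpy as np

def warp_to_plane(img, H, out_shape):
    """Inverse-warp img through homography H (source pixels -> plane pixels),
    nearest neighbour; pixels falling outside the source are set to 0."""
    h, w = out_shape
    ys, xs = np.mgrid[0:h, 0:w]
    pts = np.stack([xs.ravel(), ys.ravel(), np.ones(xs.size)])
    src = np.linalg.inv(H) @ pts                      # plane pixel -> source pixel
    sx = np.round(src[0] / src[2]).astype(int)
    sy = np.round(src[1] / src[2]).astype(int)
    out = np.zeros(h * w)
    ok = (sx >= 0) & (sx < img.shape[1]) & (sy >= 0) & (sy < img.shape[0])
    out[ok] = img[sy[ok], sx[ok]]
    return out.reshape(h, w)

def occupancy_slice(likelihoods, homographies, shape):
    """Fuse per-view foreground likelihoods on one plane by multiplication."""
    acc = np.ones(shape)
    for L, H in zip(likelihoods, homographies):
        acc *= warp_to_plane(L, H, shape)
    return acc
```

Stacking such slices for successive parallel planes yields the 3D occupancy structure the abstract describes.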

Patent
Ross Cutler1
21 Feb 2007
TL;DR: In this article, a panorama warping calibration model and manufacturing calibration data are characterized in a parametric model that is stored on the camera and utilized for camera calibration, which can employ combinations of intra camera homography, inter-camera homography and polynomial warps.
Abstract: Architecture for spatially calibrating a multi-sensor panoramic camera. A panorama warping calibration model and manufacturing calibration data is characterized in a parametric model that is stored on the camera and utilized for camera calibration. Calibration techniques can employ combinations of intra-camera homography, inter-camera homography, and polynomial warps, which correct the error-free spatial panorama warping calibration model. Calibration system configuration can include a stationary camera system for spatial pattern testing for each camera and a rotational camera system for rotating a multi-sensor panoramic camera through a single spatial pattern.

Journal ArticleDOI
TL;DR: An adaptive, homography-based visual servo tracking controller is developed to navigate the position and orientation of a camera held by the end-effector of a robot manipulator to a goal position and Orientation along the desired image-space trajectory while ensuring the target points remain visible under certain technical restrictions.

Patent
30 May 2007
TL;DR: In this paper, an image guided surgical system and method for correction of automated image registration via user interaction is presented, consisting of at least one imaging apparatus adapted to acquire a first image and a second image of a region of interest of a subject.
Abstract: An image guided surgical system and method for correction of automated image registration via user interaction. The system and method comprising at least one imaging apparatus adapted to acquire a first image and a second image of a region of interest of a subject, a registration component adapted to perform a registration of the second image to a dataset of the first image, at least one display for displaying a visualization of the registration of the second image to a dataset of the first image as it is occurring, and a user interface for manipulating the visualization of the registration to correct any misalignments between the first image and the second image in the registration.

Book ChapterDOI
20 Oct 2007
TL;DR: A vision-based projected table-top interface for finger interaction that provides more comfortable and direct viewing for users, and more natural, intuitive yet flexible interaction than classical or tangible interfaces.
Abstract: We designed and implemented a vision-based projected table-top interface for finger interaction. The system offers a simple and quick setup and economic design. The projection onto the tabletop provides more comfortable and direct viewing for users, and more natural, intuitive yet flexible interaction than classical or tangible interfaces. Homography calibration techniques are used to provide geometrically compensated projections on the tabletop. A robust finger tracking algorithm is proposed to enable accurate and efficient interactions using this interface. Two applications have been implemented based on this interface.

Proceedings Article
24 Aug 2007
TL;DR: A new feature-based image mosaic algorithm that improves the stability of the standard RANSAC homography algorithm by detecting wrong matches with a modified median flow filter.
Abstract: In this paper, we propose a new feature-based image mosaic algorithm. The RANSAC homography algorithm is improved with a modified median flow filter that detects wrong matches, improving the stability of the standard RANSAC homography estimation. The method improves the local registration between neighboring images. Experiments and statistical analysis show that our mosaic method is robust.
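The homography at the core of such mosaicking is typically estimated from point matches with the direct linear transform (DLT), which RANSAC then wraps; a minimal numpy sketch of the DLT step (the paper's filtering scheme itself is not reproduced here):

```python
import numpy as np

def dlt_homography(src, dst):
    """Estimate H with dst ~ H @ src from >= 4 point correspondences (DLT + SVD)."""
    rows = []
    for (x, y), (u, v) in zip(src, dst):
        # each correspondence contributes two rows of the homogeneous system A h = 0
        rows.append([-x, -y, -1, 0, 0, 0, u * x, u * y, u])
        rows.append([0, 0, 0, -x, -y, -1, v * x, v * y, v])
    _, _, Vt = np.linalg.svd(np.asarray(rows, dtype=float))
    H = Vt[-1].reshape(3, 3)        # null-space vector = last right singular vector
    return H / H[2, 2]              # fix the scale ambiguity
```

In a RANSAC loop, four matches at a time would be fed to this estimator and the hypothesis with the largest inlier set kept.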

Proceedings ArticleDOI
29 Oct 2007
TL;DR: A simple method to estimate an IPM view from an embedded camera based on the tracking of the road markers assuming that the road is locally planar to develop a free-space estimator which can be implemented in an Autonomous Guided Vehicle to allow a safe path planning.
Abstract: We present in this article a simple method to estimate an IPM view from an embedded camera. The method is based on the tracking of the road markers, assuming that the road is locally planar. Our aim is the development of a free-space estimator which can be implemented in an Autonomous Guided Vehicle to allow safe path planning. Unlike most obstacle detection methods, which make assumptions on the shape or height of the obstacles, all the scene elements above the road plane (particularly kerbs and poles) have to be detected as obstacles. Combined with the IPM transformation, this obstacle detection stage can be viewed as the first stage of a free-space estimator dedicated to AGVs in complex urban environments.

Proceedings ArticleDOI
10 Apr 2007
TL;DR: This paper presents a new visual control approach based on homography intended for nonholonomic vehicles with a fixed monocular system on board needing neither decomposition of the homography nor depth estimation to the target.
Abstract: This paper presents a new visual control approach based on homography. The method is intended for nonholonomic vehicles with a fixed monocular system on board. The idea of visual control used here is the usual approach where the desired position of the robot is given by a target image taken at that position. This target image is the only previous information needed by the control law to perform the navigation from the initial position to the target. The control law is designed by the input-output linearization of the system using elements of the homography as output. The contribution is a controller that deals with the nonholonomic constraints of the mobile platform needing neither decomposition of the homography nor depth estimation to the target.

Book ChapterDOI
06 Jun 2007
TL;DR: In this paper, the authors address the problem of estimating 3D motion from acoustic images acquired by high-frequency 2D imaging sonars deployed in underwater using a planar approximation to scene surfaces.
Abstract: We address the problem of estimating 3-D motion from acoustic images acquired by high-frequency 2-D imaging sonars deployed underwater. Utilizing a planar approximation to scene surfaces, two-view homography is the basis of a nonlinear optimization method for estimating the motion parameters. There is no scale-factor ambiguity, unlike the case of monocular motion vision for optical images. Experiments with real images demonstrate the potential in a range of applications, including target-based positioning in search and inspection operations.

Patent
08 Nov 2007
TL;DR: In this paper, the authors proposed a 3D surface generation method that directly and efficiently generates a three-dimensional surface of the object surface from multiple images capturing a target object by using a homography determined by an all-vertices depth parameter of meshes and camera parameters.
Abstract: The present invention provides a three-dimensional surface generation method that directly and efficiently generates a three-dimensional surface of the object surface from multiple images capturing a target object. The three-dimensional surface generation method of the present invention sets one image as a basis image from multiple images obtained by capturing the target object from different viewpoint positions and sets other images as reference images, and then generates two-dimensional triangle meshes on the basis image. Next, the method of the present invention sets a distance between a vector whose elements are pixel values of an image obtained by deforming the reference image by a homography determined by an all-vertices depth parameter of meshes and camera parameters and a vector whose elements are pixel values of the basis image, as a term of a cost function, and computes the all-vertices depth parameter that a value of the cost function becomes smallest by iteratively performing the computation of the small variation of the all-vertices depth parameter and the update of the current value of the all-vertices depth parameter by using an optimization method that sets the multiple images, the camera parameters and the initial value of the all-vertices depth parameter as inputs till a predetermined condition is satisfied.

Patent
30 Mar 2007
TL;DR: In this article, a non-overlap region based homography calculation is proposed for ring camera image mosaics, which is based on feature points of a planar target appearing in a non overlap region among images captured by a multi-camera based video capture device.
Abstract: Various embodiments are directed to non-overlap region based automatic global alignment for ring camera image mosaic. The non-overlap region based homography calculation may be based on feature points of a planar target appearing in a non-overlap region among images captured by a multi-camera based video capture device. Other embodiments are described and claimed.

Book ChapterDOI
26 Nov 2007
TL;DR: An efficient algorithm to detect, correlate, and track features in a scene was implemented on an FPGA in order to obtain real-time performance and was designed specifically for use as an onboard vision solution in determining movement of small unmanned air vehicles that have size, weight, and power limitations.
Abstract: An efficient algorithm to detect, correlate, and track features in a scene was implemented on an FPGA in order to obtain real-time performance. The algorithm implemented was a Harris Feature Detector combined with a correlator based on a priority queue of feature strengths that considered minimum distances between features. The remaining processing of frame to frame movement is completed in software to determine an affine homography including translation, rotation, and scaling. A RANSAC method is used to remove mismatched features and increase accuracy. This implementation was designed specifically for use as an onboard vision solution in determining movement of small unmanned air vehicles that have size, weight, and power limitations.
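The RANSAC mismatch-removal step described above can be illustrated with a minimal hypothesize-and-verify loop; a toy translation model is used here to keep the sketch short (the paper fits a full affine homography, and the FPGA pipeline is of course not reproduced):

```python
import random
import numpy as np

def ransac_translation(src, dst, iters=200, tol=1.0, seed=0):
    """Find the translation supported by the most matches; return it and the inlier mask."""
    rng = random.Random(seed)
    src, dst = np.asarray(src, float), np.asarray(dst, float)
    best_t, best_in = None, np.zeros(len(src), dtype=bool)
    for _ in range(iters):
        i = rng.randrange(len(src))        # one match suffices to hypothesize a translation
        t = dst[i] - src[i]
        inliers = np.linalg.norm(dst - (src + t), axis=1) < tol
        if inliers.sum() > best_in.sum():
            best_t, best_in = t, inliers
    best_t = (dst[best_in] - src[best_in]).mean(axis=0)  # refine on the inlier set
    return best_t, best_in
```

Mismatched features land outside the tolerance of the winning hypothesis and are simply dropped from the final fit.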

Proceedings ArticleDOI
26 Dec 2007
TL;DR: An algorithm for plane-based self-calibration of cameras with radially symmetric distortions given a set of sparse feature matches in at least two views is presented and it is shown that solving the approximate problem is a convex quadratic program, sufficient for accurately estimating the distortion parameters.
Abstract: We present an algorithm for plane-based self-calibration of cameras with radially symmetric distortions given a set of sparse feature matches in at least two views. The projection function of such cameras can be seen as a projection with a pinhole camera, followed by a non-parametric displacement of the image points in the direction of the distortion center. The displacement is a function of the points' distance to the center. Thus, the generated distortion is radially symmetric. Regular cameras, fish-eyes as well as the most popular central catadioptric devices can be described by such a model. Our approach recovers a distortion function consistent with all the views, or estimates one for each view if they are taken by different cameras. We consider a least squares algebraic solution for computing the homography between two views that is valid for rectified (undistorted) point correspondences. We observe that the terms of the function are bilinear in the unknowns of the homography and the distortion coefficient associated to each point. Our contribution is to approximate this non-convex problem by a convex one. To do so, we replace the bilinear terms by a set of new variables and obtain a linear least squares problem. We show that like the distortion coefficients, these variables are subject to monotonicity constraints. Thus, the approximate problem is a convex quadratic program. We show that solving it is sufficient for accurately estimating the distortion parameters. We validate our approach on simulated data as well as on fish-eye and catadioptric cameras. We also compare our solution to three state-of-the-art algorithms and show similar performance.

Proceedings ArticleDOI
21 Feb 2007
TL;DR: A new framework for homography-based analysis of pedestrian-vehicle activity in crowded scenes that can be used to enhance situational awareness for disaster prevention, human interactions in structured environments, and crowd movement analysis over wide regions is presented.
Abstract: This paper presents a new framework for homography-based analysis of pedestrian-vehicle activity in crowded scenes. The planar homography constraint is exploited to extract view-invariant object features, including footage area and velocity of objects on the ground plane. Spatio-temporal relationships between people and vehicle tracks are represented by a semantic event. Context awareness of the situation is achieved by the estimated density distribution of objects and the anticipation of possible directions of near-future tracks using piecewise velocity history. Single-view and multi-view based homography mapping options are compared. Our framework can be used to enhance situational awareness for disaster prevention, human interactions in structured environments, and crowd movement analysis over wide regions.

Proceedings ArticleDOI
26 Dec 2007
TL;DR: A method for radial lens distortion calibration based on a single image of a planar chessboard pattern and the extracted distorted grid of points is presented. Due to the homographic approach, no special alignment of the camera with regard to the calibration object is required.
Abstract: We present a novel method for radial lens distortion calibration which results in high accuracy of compensation. It is based on a single image of a planar chessboard pattern and uses the extracted distorted grid of points. Due to the homographic approach, no special alignment of the camera with regard to the calibration object is required. The undistorted grid is determined from the central points of the image and used to fit the radial distortion model with the linear least squares method (LSM). The model is used for dense compensation by bilinear interpolation or for sparse compensation by a Newton iterative scheme.
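Sparse compensation by a Newton iterative scheme, as mentioned above, amounts to inverting the radial model per point; a sketch under an assumed polynomial distortion model (the coefficients k1, k2 and the exact model form are illustrative, not taken from the paper):

```python
def distort_radius(r, k1, k2=0.0):
    """Forward radial distortion model (assumed polynomial form)."""
    return r * (1.0 + k1 * r**2 + k2 * r**4)

def undistort_radius(rd, k1, k2=0.0, iters=20):
    """Invert the model with Newton's method, starting from the distorted radius."""
    r = float(rd)
    for _ in range(iters):
        f = r * (1.0 + k1 * r**2 + k2 * r**4) - rd   # residual of the forward model
        df = 1.0 + 3.0 * k1 * r**2 + 5.0 * k2 * r**4  # its derivative in r
        r -= f / df
    return r

rd = distort_radius(0.5, k1=0.2)
print(undistort_radius(rd, k1=0.2))  # recovers ~0.5
```

Dense compensation would instead resample the whole image, e.g. by bilinear interpolation over the corrected coordinates.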

Patent
08 Feb 2007
TL;DR: In this article, a grid projection pattern of known size and shape is projected and displayed on a projection plane to photograph it, and an equation representing the projection plane in a three-dimensional space is calculated from the coordinate.
Abstract: PROBLEM TO BE SOLVED: To provide an information projection display capable of projecting, with no distortion, on a projection plane positioned in an arbitrary direction from an arbitrary place, and also capable of intuitively designating a projection region with ease. SOLUTION: A grid projection pattern of known size and shape is projected and displayed on a projection plane and photographed. The coordinates of at least three points on the projection plane are acquired, and an equation representing the projection plane in three-dimensional space is calculated from these coordinates. A homography H1 is calculated using the equation. A homography H2 is calculated from the relationship of at least four feature points between the grid projection pattern and the photographed image. A homography H3 is calculated from H1 and H2. Then, based on two markers set on the projection plane and a base angle specified in advance, a square region of which the two markers are opposing corners is decided as the projection region. A projection image is converted using H3 so that the image is projected onto the projection region.
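Deriving H3 from H1 and H2, as in the abstract, corresponds in general to composing homographies by matrix multiplication; the patent's exact relation is not given, so this shows only the standard operation:

```python
import numpy as np

# applying H1 and then H2 to a point is the same as applying H3 = H2 @ H1
H1 = np.array([[1.0, 0.0, 5.0], [0.0, 1.0, 2.0], [0.0, 0.0, 1.0]])  # translate by (5, 2)
H2 = np.array([[2.0, 0.0, 0.0], [0.0, 2.0, 0.0], [0.0, 0.0, 1.0]])  # scale by 2
H3 = H2 @ H1

p = np.array([1.0, 1.0, 1.0])   # the point (1, 1) in homogeneous form
step = H2 @ (H1 @ p)            # two applications in sequence
once = H3 @ p                   # one application of the composite
print(once[:2] / once[2])       # → [12. 6.]
```

Note the order: matrix products compose right-to-left, so the first transform applied sits rightmost.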

Proceedings ArticleDOI
17 Jun 2007
TL;DR: A new closed-form model that relates projectors to cameras in planar multi-projector displays, using rational Bezier patches is presented, overcomes the shortcomings of the previous methods by allowing for projectors with significant lens distortion.
Abstract: In order to achieve seamless imagery in a planar multi-projector display, geometric distortions and misalignment of images within and across projectors have to be removed. Camera-based calibration methods are popular for achieving this in an automated fashion. Previous methods for geometric calibration fall into two categories: (a) methods that model the geometric function relating the projectors to cameras using simple linear models, like homography, to calibrate the display. These models assume perfect linear devices and cannot address projector nonlinearities, like lens distortions, which are common in most commodity projectors, (b) methods that use piecewise linear approximations to model the relationship between projectors and cameras. These require a dense sampling of the function space to achieve good calibration. In this paper, we present a new closed-form model that relates projectors to cameras in planar multi-projector displays, using rational Bezier patches. This model overcomes the shortcomings of the previous methods by allowing for projectors with significant lens distortion. It can be further used to develop an efficient and accurate geometric calibration method with a sparse sampling of the function.

Proceedings ArticleDOI
11 Apr 2007
TL;DR: A data fusion of given 3d models provided by a GIS and recorded IR image sequences is performed to project the 2d images into the 3d object space and these textures can be used for feature extraction and object recognition for analyzing buildings.
Abstract: Focus of the paper lies on automated texturing of 3d building models with images recorded with infrared (IR) cameras. Therefore, a data fusion of given 3d models provided by a GIS and recorded IR image sequences is performed. Two concepts are presented to project the 2d images into the 3d object space. In the first concept, the polygons of the faces of the 3d model are projected into the IR image and matched to the edges for assigning the correct parts of the image to the corresponding faces of the model. These image parts are then projected onto the faces and stored as surface textures. The second concept uses homography to detect surface planes in two subsequent images and matches these planes to the 3d model. Because one image does not show complete building facades, the extracted textures of the images are combined to create complete textures for the model surfaces. These textures can be used for feature extraction and object recognition for analyzing buildings.