
Showing papers on "Monocular vision published in 2005"


Proceedings ArticleDOI
07 Aug 2005
TL;DR: An approach is presented in which supervised learning is first used to estimate depths from single monocular images, yielding an algorithm that learns monocular vision cues to accurately estimate the relative depths of obstacles in a scene.
Abstract: We consider the task of driving a remote control car at high speeds through unstructured outdoor environments. We present an approach in which supervised learning is first used to estimate depths from single monocular images. The learning algorithm can be trained either on real camera images labeled with ground-truth distances to the closest obstacles, or on a training set consisting of synthetic graphics images. The resulting algorithm is able to learn monocular vision cues that accurately estimate the relative depths of obstacles in a scene. Reinforcement learning/policy search is then applied within a simulator that renders synthetic scenes, yielding a control policy that selects a steering direction as a function of the vision system's output. We present results evaluating the predictive ability of the algorithm both on held-out test data and in actual autonomous driving experiments.
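The first stage described above (a learned mapping from monocular image cues to depth) can be sketched with a toy regression. The texture-gradient feature, the assumed inverse-depth relationship, and the synthetic labels below are illustrative stand-ins, not the authors' actual design:

```python
# Toy sketch of supervised monocular depth estimation: regress depth on a
# single texture-like feature with ordinary least squares. The relation
# feature = 100 / depth is a fabricated stand-in for real monocular cues.

def fit_linear(xs, ys):
    """Ordinary least squares for y = w*x + b."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    w = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / \
        sum((x - mx) ** 2 for x in xs)
    return w, my - w * mx

# Synthetic training pairs: a texture-energy feature that shrinks with
# distance (feature = 100 / depth), regressed on its inverse.
train = [(100.0 / d, d) for d in (2.0, 3.0, 5.0, 8.0, 12.0)]
w, b = fit_linear([1.0 / f for f, _ in train], [d for _, d in train])

def predict_depth(feature):
    """Estimated distance (m) to the obstacle producing `feature`."""
    return w * (1.0 / feature) + b

print(round(predict_depth(100.0 / 6.0), 2))  # held-out 6 m obstacle -> 6.0
```

In the paper the predicted depths are then consumed by a policy-search controller; only the regression stage is illustrated here.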

435 citations


Journal ArticleDOI
TL;DR: The results have implications for the information needed to scale egocentric distance in the real world and weaken support for the hypothesis that a limited field of view or imperfections in binocular image presentation cause the distance underestimation seen with HMDs.
Abstract: We carried out three experiments to examine the influence of field of view and binocular viewing restrictions on absolute distance perception in real-world indoor environments. Few of the classical visual cues provide direct information for accurate absolute distance judgments to points in the environment beyond about 2 m from the viewer. Nevertheless, previous work has found that visually directed walking tasks reveal accurate distance estimation in full-cue real-world environments at distances up to 20 m. In contrast, the same tasks in virtual environments produced with head-mounted displays (HMDs) show large compression of distance. Field of view and binocular viewing are common limitations in research with HMDs, yet have rarely been studied under full pictorial-cue conditions in the context of real-world distance perception. Experiment 1 showed that a view of one's body and feet on the floor was not necessary for accurate distance perception. In Experiment 2 we manipulated the horizontal and vertical field of view along with head rotation and found that a restricted field of view did not affect the accuracy of distance estimation when head movement was allowed. Experiment 3 showed that performance with monocular viewing was equal to that with binocular viewing. These results have implications for the information needed to scale egocentric distance in the real world and weaken support for the hypothesis that a limited field of view or imperfections in binocular image presentation are the cause of the underestimation seen with HMDs.

298 citations


Journal ArticleDOI
TL;DR: A real-time vision system is presented that computes traffic parameters by analyzing monocular image sequences from pole-mounted video cameras at urban crossroads, using robust background updating and a feature-based tracking method.
Abstract: The paper presents a real-time vision system to compute traffic parameters by analyzing monocular image sequences coming from pole-mounted video cameras at urban crossroads. The system uses a combination of segmentation and motion information to localize and track moving objects on the road plane, using robust background updating and a feature-based tracking method. It is able to describe the path of each detected vehicle, to estimate its speed, and to classify it into seven categories. The classification task relies on a model-based matching technique refined by a feature-based one for distinguishing between classes with similar models, such as bicycles and motorcycles. The system is flexible with respect to the intersection geometry and the camera position. Experimental results demonstrate robust, real-time vehicle detection, tracking, and classification over several hours of video taken under different illumination conditions. The system is presently under trial in Trento, a town of 100,000 people in northern Italy.
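The "robust background updating" step the abstract relies on can be illustrated with the standard exponential running-average scheme; the per-pixel values, the learning rate `alpha`, and the threshold below are illustrative assumptions rather than the system's actual parameters:

```python
# Sketch of running-average background subtraction: the background model
# drifts toward each new frame, and pixels far from the model are flagged
# as moving objects. Values are a fabricated 6-pixel scanline.

def update_background(bg, frame, alpha=0.1):
    """Exponential running average of the per-pixel background."""
    return [b + alpha * (f - b) for b, f in zip(bg, frame)]

def foreground_mask(bg, frame, thresh=30):
    """True where the current frame departs from the background model."""
    return [abs(f - b) > thresh for b, f in zip(bg, frame)]

bg = [100.0] * 6                       # learned empty-road background
frame = [100, 102, 180, 185, 99, 101]  # a vehicle covers pixels 2-3
print(foreground_mask(bg, frame))      # [False, False, True, True, False, False]
bg = update_background(bg, frame)      # background slowly absorbs the change
```

Slow adaptation lets the model track illumination changes over hours of video without absorbing briefly stopped vehicles.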

170 citations


Proceedings ArticleDOI
18 Apr 2005
TL;DR: A monocular robot vision system which accomplishes accurate 3-DOF dead-reckoning, closed loop motion control, and precipice and obstacle detection, all in dynamic environments, using a single, consumer-grade web cam and typical laptop computer hardware.
Abstract: We describe a monocular robot vision system which accomplishes accurate 3-DOF dead-reckoning, closed loop motion control, and precipice and obstacle detection, all in dynamic environments, using a single, consumer-grade web cam and typical laptop computer hardware. Simultaneous translation and rotation are accurately measured, and the camera need not be placed at the robot’s center of rotation. The algorithm is straightforward to implement and robust to noisy measurements. The software is based on open source computer vision libraries and is itself open source. It has been tested in a wide variety of real-world environments and on several different mobile robot platforms.

158 citations


Proceedings ArticleDOI
05 Dec 2005
TL;DR: A complete system for outdoor robot navigation that uses only monocular vision is presented; a three-dimensional map of the trajectory and the environment is built from a human-guided learning sequence.
Abstract: In this paper, a complete system for outdoor robot navigation is presented. It uses only monocular vision. The robot is first guided on a path by a human. During this learning step, the robot records a video sequence. From this sequence, a three dimensional map of the trajectory and the environment is built. When this map has been computed, the robot is able to follow the same trajectory by itself. Experimental results carried out with an urban electric vehicle are shown and compared to the ground truth.

118 citations


Journal ArticleDOI
TL;DR: The stabilization ratio (SR) improved the reliability of the measurement of the visual contribution to posture within individuals, across subjects, and even across different studies in the literature.

106 citations


Journal ArticleDOI
TL;DR: The enhanced model explains all the psychophysical data previously simulated by Grossberg and Howe (2003), such as contrast variations of dichoptic masking and the correspondence problem, the effect of interocular contrast differences on stereoacuity, Panum's limiting case, the Venetian blind illusion, stereopsis with polarity-reversed stereograms, and da Vinci stereopsis.
Abstract: A laminar cortical model of stereopsis and 3D surface perception is developed and simulated. The model describes how monocular and binocular oriented filtering interact with later stages of 3D boundary formation and surface filling-in in the LGN and cortical areas V1, V2, and V4. It proposes how interactions between layers 4, 3B, and 2/3 in V1 and V2 contribute to stereopsis, and how binocular and monocular information combine to form 3D boundary and surface representations. The model includes two main new developments: (1) It clarifies how surface-to-boundary feedback from V2 thin stripes to pale stripes helps to explain data about stereopsis. This feedback has previously been used to explain data about 3D figure-ground perception. (2) It proposes that the binocular false match problem is subsumed under the Gestalt grouping problem. In particular, the disparity filter, which helps to solve the correspondence problem by eliminating false matches, is realized using inhibitory interneurons as part of the perceptual grouping process by horizontal connections in layer 2/3 of cortical area V2. The enhanced model explains all the psychophysical data previously simulated by Grossberg and Howe (2003), such as contrast variations of dichoptic masking and the correspondence problem, the effect of interocular contrast differences on stereoacuity, Panum's limiting case, the Venetian blind illusion, stereopsis with polarity-reversed stereograms, and da Vinci stereopsis. It also explains psychophysical data about perceptual closure and variations of da Vinci stereopsis that previous models cannot yet explain.

93 citations


Proceedings ArticleDOI
20 Jun 2005
TL;DR: A method for computing the localization of a mobile robot with reference to a learning video sequence, where the robot is first guided on a path by a human, while the camera records a monocular learning sequence.
Abstract: In this paper we present a method for computing the localization of a mobile robot with reference to a learning video sequence. The robot is first guided on a path by a human, while the camera records a monocular learning sequence. Then a 3D reconstruction of the path and the environment is computed off line from the learning sequence. The 3D reconstruction is then used for computing the pose of the robot in real time (30 Hz) in autonomous navigation. Results from our localization method are compared to the ground truth measured with a differential GPS.

82 citations


Journal ArticleDOI
TL;DR: A motor control model that optimally integrates cues with different delays accounts for the findings and shows that cue integration for motor control depends in part on the time course of cue processing.

58 citations


Journal ArticleDOI
TL;DR: An efficient and simple method for searching image and model feature correspondences, designed for indoor mobile robot self-location, is highlighted: a three-stage method based on the interpretation-tree search approach.

55 citations


Journal ArticleDOI
TL;DR: This work compares the use of the homography and the fundamental matrix and it is shown that the correction of motion directly from the parameters of the 2D homography, which only needs one calibration parameter, is robust, sufficiently accurate and simple.

Journal ArticleDOI
TL;DR: In this paper, the authors compared sensitivity to first-order versus second-order local motion in patients treated for dense central congenital cataracts in one or both eyes, and found that early visual deprivation, whether monocular or binocular, caused losses in sensitivity to both first- and second-order motion.

Journal ArticleDOI
TL;DR: The FD2 stereotest is a useful measure of distance stereoacuity, provided the presentation protocol accounts for monocular cues, and can be modified to include a monocular test phase.

Journal ArticleDOI
TL;DR: The present data provide further evidence for a general right visual field advantage in bottlenose dolphins for visual information processing and suggest a specialization of the left hemisphere for analysing relational features between stimuli as required in tests for numerical abilities.

Journal ArticleDOI
TL;DR: The results suggest that light exposure of the embryo makes neural mechanisms that do not receive direct visual input more available to be used in assessment of novelty.

Proceedings ArticleDOI
18 Apr 2005
TL;DR: Two real-time pose tracking algorithms for rigid objects are compared, both of which are 3D-model based and are capable of calculating the pose between the camera and an object with a monocular vision system.
Abstract: In this paper, two real-time pose tracking algorithms for rigid objects are compared. Both methods are 3D-model based and are capable of calculating the pose between the camera and an object with a monocular vision system. Special consideration has been given to defining and evaluating different performance criteria such as computational efficiency, accuracy, and robustness. Both methods are described and a unifying framework is derived. The main advantage of both algorithms lies in their real-time capability (on standard hardware) combined with robustness to mis-tracking, occlusion, and changes in illumination.

Book ChapterDOI
01 Jan 2005
TL;DR: The system enables the robot to detect unknown obstacles and reliably avoid them while advancing toward a target; it proved highly successful, winning the obstacle avoidance challenge, and was also used in the RoboCup championship games.
Abstract: We present a complete system for obstacle avoidance for a mobile robot. It was used in the RoboCup 2003 obstacle avoidance challenge in the Sony Four Legged League. The system enables the robot to detect unknown obstacles and reliably avoid them while advancing toward a target. It uses monocular vision data with a limited field of view. Obstacles are detected on a level surface of known color(s). A radial model is constructed from the detected obstacles, giving the robot a representation of its surroundings that integrates both current and recent vision information. Sectors of the model outside the current field of view of the robot are updated using odometry. Ways of using this model to achieve accurate and fast obstacle avoidance in a dynamic environment are presented and evaluated. The system proved highly successful, winning the obstacle avoidance challenge, and was also used in the RoboCup championship games.
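The radial model described above can be sketched as a ring of angular sectors, each storing the nearest observed obstacle distance, with rotation odometry handled by shifting the ring; the sector count and distances below are illustrative choices, not the paper's parameters:

```python
import math

# Toy radial obstacle model: 8 sectors of 45 degrees around the robot,
# each holding the nearest obstacle distance seen in that direction.
# A pure-rotation odometry update is just a cyclic shift of the sectors.

N = 8  # number of angular sectors (illustrative)

def sector(angle_rad):
    """Index of the sector containing a given bearing."""
    return int((angle_rad % (2 * math.pi)) / (2 * math.pi) * N)

def observe(model, angle, dist):
    """Fuse a new obstacle observation into the model."""
    s = sector(angle)
    model[s] = min(model[s], dist)

def rotate(model, sectors):
    """Odometry update for a robot rotation of `sectors` * 45 degrees."""
    return model[-sectors:] + model[:-sectors]

model = [float("inf")] * N
observe(model, 0.0, 1.2)   # obstacle dead ahead at 1.2 m
model = rotate(model, 2)   # robot turns 90 degrees
print(model.index(1.2))    # obstacle now lies in sector 2
```

Keeping obstacles in robot-centric sectors is what lets recently seen obstacles behind the narrow camera field of view still influence avoidance.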

Patent
24 Jun 2005
TL;DR: Binocular visual-field rivalry is mitigated and observer fatigue is lessened by exploiting the fusion point at which a point (b) of the left visual-field image corresponds to a point (a) of the right visual-field image.
Abstract: Binocular visual-field rivalry is mitigated, and the observer's sense of fatigue is lessened. The fusion point corresponding to a point (b) of the left visual-field image (320) of binocular vision is a point (a) of the right visual-field image (321). Since the objects viewed by the left and right eyes (200, 201) differ, visual-field rivalry occurs when both eyes view the point (b). The image of the point (b) in the left visual-field image (320) is sharp, but that of the point (a) in the right visual-field image (321) is blurred. As a result, the sharp image of the point (b) is dominantly perceived, and an object (B) at the same distance is also dominantly perceived, while the blurred image of the point (a) and an object (A) near the point (a) are excluded.

Journal ArticleDOI
TL;DR: The proposed method for autonomous robot navigation based on homographies computed between the current image and images taken in a previous teaching phase with a monocular vision system has turned out to be especially useful to correct heading and lateral displacement, which are critical in systems based on odometry.
Abstract: We introduce a method for autonomous robot navigation based on homographies computed between the current image and images taken in a previous teaching phase with a monocular vision system. The features used to estimate the homography are vertical lines, automatically extracted and matched. From the homography, the motion correction between the reference path and the current robot location is computed. The proposed method, which uses a single calibration parameter, has turned out to be especially useful for correcting heading and lateral displacement, which are critical in systems based on odometry. We have tested the proposal in simulation and with real images. In addition, the visual system has been integrated into an autonomous wheelchair for the handicapped, where it runs robustly in real time. © 2005 Wiley Periodicals, Inc.
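The heading part of such a correction can be sketched under a pure-rotation assumption for matched vertical lines: with a calibrated focal length (playing the role of the method's single calibration parameter), each match gives the yaw angle directly. The focal length, the line positions, and the reduction to a single angle (instead of a full homography decomposition) are illustrative simplifications:

```python
import math

# Hedged sketch: recover a heading correction from the image x-coordinates
# of vertical lines matched between the reference and current views,
# assuming pure rotation about the vertical axis and a known focal length.

F = 500.0  # focal length in pixels (assumed)

def heading_error(matches, f=F):
    """Average yaw (rad) mapping current line positions onto reference
    ones, using x = f * tan(bearing) for each vertical line."""
    thetas = [math.atan2(xr, f) - math.atan2(xc, f) for xr, xc in matches]
    return sum(thetas) / len(thetas)

# Simulate a 2-degree heading offset and recover it.
true_theta = math.radians(2.0)
ref_xs = [-120.0, -30.0, 60.0, 150.0]
cur_xs = [F * math.tan(math.atan2(x, F) - true_theta) for x in ref_xs]
theta = heading_error(list(zip(ref_xs, cur_xs)))
print(round(math.degrees(theta), 3))  # -> 2.0
```

The recovered angle would then drive the steering correction toward the taught path; lateral displacement needs the translational part of the homography, which this sketch omits.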

Journal ArticleDOI
TL;DR: It is concluded that the problems that WBS patients have with tasks such as descending stairs are not due to an inability to judge distance, but to problems in using depth information for guiding their movements when deprived of visual feedback.
Abstract: Patients with Williams-Beuren Syndrome (WBS, also known as Williams Syndrome) show many problems in motor activities requiring visuo-motor integration, such as walking stairs. We tested to what extent these problems might be related to a deficit in the perception of visual depth or to problems in using this information in guiding movements. Monocular and binocular visual depth perception was tested in 33 patients with WBS. Furthermore, hand movements to a target were recorded in conditions with and without visual feedback of the position of the hand. The WBS group was compared to a group of control subjects. The WBS patients were able to perceive monocular depth cues that require global processing, but about 49% failed to show stereopsis. On average, patients with WBS moved their hand too far when no visual feedback on hand position was given. This was not so when they could see their hand. Patients with WBS are able to derive depth from complex spatial relationships between objects. However, they seem to be impaired in using depth information for guiding their movements when deprived of visual feedback. We conclude that the problems that WBS patients have with tasks such as descending stairs are not due to an inability to judge distance.

Journal ArticleDOI
TL;DR: In this paper, the authors investigated the coupling of distance and size perception as well as the coupling between distance and shape perception using a targeted reaching task that simultaneously yielded measures of distance, size, and shape.
Abstract: This study investigated the coupling of distance and size perception as well as the coupling of distance and shape perception. Each was tested in 2 ways using a targeted reaching task that simultaneously yielded measures of distance, size, and shape perception. First, feed-forward reaches were tested without feedback. Errors in size did not covary with errors in distance, but errors in shape did. Second, reaches were tested with visual feedback. Estimated distance and size became more accurate, but shape did not. The evidence indicated that distance and size perception and distance and shape perception are not coupled. These results were replicated 3 times, as we also compared performance using dynamic monocular, static binocular, and dynamic binocular vision. Performance was better with binocular than monocular vision both without and with feedback. The presence of a size gradient did not improve monocular distance perception, yielding additional evidence that distance and size perception are not coupled.

Journal ArticleDOI
TL;DR: The results suggest that colour and orientation processing interact at monocular stages of visual processing, whereas binocular visual mechanisms code for form in a manner that is largely insensitive to chromatic signature.

Proceedings ArticleDOI
20 Jun 2005
TL;DR: A monocular vision-based Vehicle Recognition System in which the basic components of road vehicles are first located in the image and then combined with an SVM-based classifier, using a distributed learning approach in order to better deal with vehicle variability, illumination conditions, partial occlusions and rotations.
Abstract: This paper describes a monocular vision-based Vehicle Recognition System in which the basic components of road vehicles are first located in the image and then combined with an SVM-based classifier. The challenge is to use a single camera as input, which poses the problem of vehicle detection and recognition in real, cluttered road images. A distributed learning approach is proposed in order to better deal with vehicle variability, illumination conditions, partial occlusions, and rotations. The vehicle search area in the image is constrained to the limits of the lanes, which are determined by the road lane markings; this largely decreases the rate of false positive detections. A large database containing thousands of vehicle examples extracted from real road images has been created for learning purposes. We present and discuss the results achieved to date.

Journal ArticleDOI
TL;DR: The spatial localization abilities (alignment accuracy) of visually deprived kittens were measured using spatially bandpass stimuli (Gaussian blobs) similar to those employed for the assessment of human amblyopes; the deficits could not be explained in terms of the contrast sensitivity loss in the deprived eye.

Proceedings ArticleDOI
06 Jun 2005
TL;DR: In this paper, a monocular vision-based occupant classification approach was proposed to classify occupants into five categories including empty seats, adults in normal position, adults out of position, front-facing child/infant seats, and rear-facing infant seats.
Abstract: Occupant classification is essential to a smart airbag system that can either turn off or deploy in a less harmful way according to the type of the occupants in the front seat. This paper presents a monocular vision-based occupant classification approach to classify the occupants into five categories including empty seats, adults in normal position, adults out of position, front-facing child/infant seats, and rear-facing infant seats. The proposed approach consists of image representation and pattern classification. The image representation step computes Haar wavelets and edge features from the monochrome video frames. A support vector machine (SVM) classifier next determines the occupant category based on the representative features. We have tested our approach on a large variety of indoor and outdoor images acquired under various illumination conditions for occupants with different appearances, sizes and shapes. With a strict occupant exclusive training/testing split, our approach has achieved an average correct classification rate of 97.18% among the five occupant categories.
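As a toy illustration of the pipeline in this abstract (feature extraction followed by classification), the sketch below computes one Haar-like feature and uses a nearest-centroid rule as a simple stand-in for the paper's SVM; the intensity rows, labels, and single feature are all fabricated for illustration:

```python
# Toy occupant-classification sketch: a Haar-like feature (difference of
# the mean intensity of the left and right halves of a window) feeds a
# nearest-centroid classifier standing in for the paper's SVM.

def haar_feature(row):
    """Left-half mean minus right-half mean of a 1-D intensity window."""
    h = len(row) // 2
    return sum(row[:h]) / h - sum(row[h:]) / h

def nearest_centroid(train, x):
    """Label whose mean training feature is closest to feature x."""
    cents = {}
    for label, rows in train.items():
        feats = [haar_feature(r) for r in rows]
        cents[label] = sum(feats) / len(feats)
    return min(cents, key=lambda lab: abs(cents[lab] - x))

train = {
    "empty_seat": [[10] * 4 + [10] * 4, [12] * 4 + [11] * 4],    # flat
    "adult":      [[200] * 4 + [40] * 4, [180] * 4 + [50] * 4],  # strong edge
}
query = [190] * 4 + [45] * 4
print(nearest_centroid(train, haar_feature(query)))  # -> adult
```

A real system would pool many Haar and edge features over the whole frame and train a margin-based SVM, but the two-stage structure is the same.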

Journal ArticleDOI
TL;DR: For the first time, there is empirical support for the working assumption of the visual perception approach that size perception based on monocular distance cues is computed automatically.
Abstract: The study reported here examined whether size perception based on monocular distance cues is computed automatically. Participants were presented with a picture containing distance cues, which was superimposed with a pair of digits differing in numerical value. One digit was presented so as to be perceived as closer than the other. The digits were of similar physical size but differed in their perceptual size. The participants’ task was to decide which digit was numerically larger. It was found that the decision took longer and resulted in more errors when the perceptual size of the numerically larger digit was smaller than the perceptual size of the numerically smaller digit. These results show that perceived size affects performance in a task that does not require size or distance computation. Hence, for the first time, there is empirical support for the working assumption of the visual perception approach that size perception based on monocular distance cues is computed automatically.

Journal ArticleDOI
TL;DR: This study concludes that judgment of the orientation of the plane of regard, a plane that contains the line of sight, is veridical, indicating accurate compensation for actual eye torsion.

Proceedings ArticleDOI
29 Jul 2005
TL;DR: A novel method is presented that enhances external sensing with a multi-sensor system composed of sonar and a CCD camera, utilizing the Hough transform to extract features from raw sonar data and the vision image.
Abstract: The ability to simultaneously localize a robot and accurately map its surroundings is considered a key prerequisite of truly autonomous robots. This paper presents a novel method that enhances external sensing with a multi-sensor system composed of sonar and a CCD camera. Monocular vision provides redundant information about the location of the geometric entities detected by the sonar sensor. The Hough transform is utilized to extract features from raw sonar data and the vision image, and the information is fused at the level of features. This technique significantly improves the reliability and precision of the environment observations used for the simultaneous localization and map building problem for mobile robots. Experimental results validate the favorable performance of this approach.
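The feature-extraction step can be illustrated with a minimal Hough transform over (rho, theta) bins: each point votes for every line it could lie on, and the bin with the most votes is the dominant line. The bin resolution and the point set below are illustrative, and real sonar returns would be far noisier:

```python
import math

# Minimal Hough transform for line extraction, of the kind used to pull
# wall-like features out of raw sonar point returns. 18 theta bins of
# 10 degrees and a 1-unit rho step are illustrative choices.

def hough_lines(points, n_theta=18, rho_step=1.0):
    """Return the (rho, theta_index) bin with the most votes."""
    acc = {}
    for x, y in points:
        for t in range(n_theta):
            theta = math.pi * t / n_theta
            rho = round((x * math.cos(theta) + y * math.sin(theta)) / rho_step)
            acc[(rho, t)] = acc.get((rho, t), 0) + 1
    return max(acc, key=acc.get)

# Ten returns on the wall x = 5, plus one outlier.
pts = [(5, y) for y in range(10)] + [(2, 3)]
rho, t = hough_lines(pts)
print(rho, t)  # -> 5 0  (rho = 5 at theta = 0, i.e. the line x = 5)
```

Running the same voting over sonar returns and image edges yields features in a common parameter space, which is what makes feature-level fusion straightforward.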

Journal ArticleDOI
TL;DR: The visual-field deficit seen with monocular viewing is greatest with nasal fixation, and head and eye movements cannot totally compensate for this deficit when viewing time is limited.
Abstract: Background The horizontal binocular visual field can extend to more than 200°, while a monocular field is limited to 160°. Additionally, the nose and other facial structures may block the monocular field further during certain eye movements. The purpose of this study was to compare the monocular against the binocular visual field and determine if head and eye movements can functionally overcome any measured deficit. Methods In Experiment 1, visual fields were measured monocularly with a bowl perimeter using 5 fixation positions. Binocular visual fields were calculated by combining the monocular visual field with its mirror image. In Experiment 2, subjects were allowed to make head, eye, and body movements to search for flashing lights 360° around them, spaced every 45°. The numbers of lights identified were compared for the subjects performing monocularly versus binocularly. Results The size of the overall monocular visual field was found to vary between 48% and 76% of the binocular visual field, depending on eye position. For the flashing light experiment, head and eye movements could not overcome the entire visual-field deficit with monocular viewing. Monocular performance remained 11.4% less than binocular performance. Conclusions The visual-field deficit seen with monocular viewing is greatest with nasal fixation, and head and eye movements cannot totally compensate for this deficit when viewing time is limited. Vision standards that require full visual fields in each eye are more appropriate for occupations in which peripheral visual targets must be identified and visual search time is limited.

Journal ArticleDOI
TL;DR: In this paper, a closed-form solution of the relative pose determination problem based on monocular vision during the final approach phase of spacecraft rendezvous and docking is given, in which the model of perspective projection is simplified under the assumption of scaled orthographic projection and the relative attitude is represented using a unit quaternion.
Abstract: Purpose – To give a closed-form solution of the relative pose determination problem based on monocular vision during the final approach phase of spacecraft rendezvous and docking. Design/methodology/approach – Based on the assumption of scaled orthographic projection, the model of perspective projection is simplified by representing the relative attitude using a unit quaternion. A closed-form solution is then derived, and the approximate solution is corrected to compensate for the error caused by the assumption of scaled orthographic projection. Findings – Extensive simulation studies were conducted to validate the proposed algorithm using Matlab™. With no relative attitude between the RVD spacecraft and a camera-to-target distance of 2-20 m, the simulation results show that the largest relative error of the corrected relative position parameters is about 0.12 percent. When the distance between the RVD spacecraft exceeds 5 m, the largest error of the corrected relative attitude parameters is less than 0.3...
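The flavor of the closed-form result can be conveyed by the simplest scaled orthographic (weak-perspective) relation: every model point projects as s*X + t with a common scale s = f / Zc, so the ratio of an image separation to the corresponding model separation yields the target range directly. The focal length and marker geometry below are illustrative assumptions, and the actual algorithm additionally recovers the full quaternion attitude and corrects the weak-perspective error:

```python
# Weak-perspective range recovery: under scaled orthographic projection
# the image scale s = f / Zc is shared by all target points, so range
# follows in closed form from one pair of matched points.

F = 800.0  # focal length in pixels (assumed)

def range_from_scale(p_img, q_img, p_mod, q_mod, f=F):
    """Target distance Zc = f / s, with s the ratio of the image
    separation to the model separation of two target markers."""
    d_img = ((p_img[0] - q_img[0]) ** 2 + (p_img[1] - q_img[1]) ** 2) ** 0.5
    d_mod = ((p_mod[0] - q_mod[0]) ** 2 + (p_mod[1] - q_mod[1]) ** 2) ** 0.5
    return f * d_mod / d_img

# Two docking-target markers 0.4 m apart, imaged 40 px apart.
z = range_from_scale((100, 0), (140, 0), (0.0, 0.0), (0.4, 0.0))
print(round(z, 6))  # -> 8.0  (meters)
```

The weak-perspective assumption is what makes the solution closed-form; the paper's correction step then compensates for the residual perspective error, which grows as the target gets close.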