
Showing papers on "Monocular vision" published in 2009


Journal ArticleDOI
TL;DR: It is proposed that, with the possible exception of owls, binocularity in birds does not have a higher-order function resulting in the perception of solidity and relative depth; rather, it is a consequence of the requirement for a portion of the visual field that looks in the direction of travel.
Abstract: It is proposed that with the possible exception of owls, binocularity in birds does not have a higher order function that results in the perception of solidity and relative depth. Rather, binocularity is a consequence of the requirement of having a portion of the visual field that looks in the direction of travel; hence, each eye must have a contralateral projection that gives rise to binocularity. This contralateral projection is necessary to gain a symmetrically expanding optic flow-field about the bill. This specifies direction of travel and time to contact a target during feeding or when provisioning chicks. In birds that do not need such control of their bill, binocular field widths are very small, suggesting that binocular vision plays only a minor role in the control of locomotion. In the majority of birds, the function of binocularity would seem to lie in what each eye does independently rather than in what the two eyes might be able to do together. The wider binocular fields of owls may be the product of an interaction between enlarged eyes and enlarged outer ears, which may simply prevent more lateral placement of the eyes.

129 citations


Journal ArticleDOI
TL;DR: Experimental results on several realistic indoor video sequences show that the proposed three-stage approach with multi-level state representation that enables a hierarchical estimation of 3D body poses is able to track multiple persons during complex movement including sitting and turning movements with self and inter-occlusion.
Abstract: Tracking human body poses in monocular video has many important applications. The problem is challenging in realistic scenes due to background clutter, variation in human appearance and self-occlusion. The complexity of pose tracking is further increased when there are multiple people whose bodies may inter-occlude. We propose a three-stage approach with multi-level state representation that enables a hierarchical estimation of 3D body poses. Our method addresses various issues including automatic initialization, data association, self and inter-occlusion. At the first stage, humans are tracked as foreground blobs and their positions and sizes are coarsely estimated. In the second stage, parts such as face, shoulders and limbs are detected using various cues and the results are combined by a grid-based belief propagation algorithm to infer 2D joint positions. The derived belief maps are used as proposal functions in the third stage to infer the 3D pose using data-driven Markov chain Monte Carlo. Experimental results on several realistic indoor video sequences show that the method is able to track multiple persons during complex movement including sitting and turning movements with self and inter-occlusion.

116 citations


Journal ArticleDOI
02 Apr 2009-Nature
TL;DR: It is shown that local circuits for ocular dominance always have smooth and graded transitions from one apparently monocular functional domain to an adjacent binocular region, and a new map in the cat visual cortex is discovered that has a precise functional micro-architecture for binocular disparity selectivity.
Abstract: In invertebrate predators such as the praying mantis and vertebrate predators such as wild cats the ability to detect small differences in inter-ocular retinal disparities is a critical means for accurately determining the depth of moving objects such as prey. In mammals, the first neurons along the visual pathway that encode binocular disparities are found in the visual cortex. However, a precise functional architecture for binocular disparity has never been demonstrated in any species, and coarse maps for disparity have been found in only one primate species. Moreover, the dominant approach for assaying the developmental plasticity of binocular cortical neurons used monocular tests of ocular dominance to infer binocular function. The few studies that examined the relationship between ocular dominance and binocular disparity of individual cells used single-unit recordings and have provided conflicting results regarding whether ocular dominance can predict the selectivity or sensitivity to binocular disparity. We used two-photon calcium imaging to sample the response to monocular and binocular visual stimuli from nearly every adjacent neuron in a small region of the cat visual cortex, area 18. Here we show that local circuits for ocular dominance always have smooth and graded transitions from one apparently monocular functional domain to an adjacent binocular region. Most unexpectedly, we discovered a new map in the cat visual cortex that had a precise functional micro-architecture for binocular disparity selectivity. At the level of single cells, ocular dominance was unrelated to binocular disparity selectivity or sensitivity. When the local maps for ocular dominance and binocular disparity both had measurable gradients at a given cortical site, the two gradient directions were orthogonal to each other. Together, these results indicate that, from the perspective of the spiking activity of individual neurons, ocular dominance cannot predict binocular disparity tuning. However, the precise local arrangement of ocular dominance and binocular disparity maps provide new clues regarding how monocular and binocular depth cues may be combined and decoded.

110 citations


Journal ArticleDOI
15 Dec 2009
TL;DR: A novel indoor navigation and ranging strategy using a monocular camera is presented; it is inspired by the key adaptive mechanisms for depth perception and pattern recognition found in humans and intelligent animals, with a focus on indoor aerial vehicle applications.
Abstract: This paper presents a novel indoor navigation and ranging strategy by using a monocular camera. The proposed algorithms are integrated with simultaneous localization and mapping (SLAM) with a focus on indoor aerial vehicle applications. We experimentally validate the proposed algorithms by using a fully self-contained micro aerial vehicle (MAV) with on-board image processing and SLAM capabilities. The range measurement strategy is inspired by the key adaptive mechanisms for depth perception and pattern recognition found in humans and intelligent animals. The navigation strategy assumes an unknown, GPS-denied environment, which is representable via corner-like feature points and straight architectural lines. Experimental results show that the system is only limited by the capabilities of the camera and the availability of good corners.
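The environment model described — corner-like feature points plus straight architectural lines — maps naturally onto standard detectors. As a minimal sketch of such a front end (not the authors' implementation; the detector choices and thresholds are illustrative assumptions):

```python
import cv2
import numpy as np

def extract_indoor_features(frame_bgr):
    """Corner-like points and straight architectural lines from one frame."""
    gray = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2GRAY)

    # Corner-like feature points (Shi-Tomasi "good features to track").
    corners = cv2.goodFeaturesToTrack(gray, maxCorners=200,
                                      qualityLevel=0.01, minDistance=10)

    # Straight architectural lines: Canny edges + probabilistic Hough.
    edges = cv2.Canny(gray, 50, 150)
    lines = cv2.HoughLinesP(edges, rho=1, theta=np.pi / 180, threshold=80,
                            minLineLength=60, maxLineGap=5)
    return corners, lines
```

The paper's remark that the system is limited by "the availability of good corners" corresponds to the qualityLevel/minDistance trade-off above.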

98 citations


Journal ArticleDOI
TL;DR: The field as it currently stands is reviewed, with a focus on understanding the extent to which the role of monocular regions in depth perception can be understood using extant theories of binocular vision.

75 citations


Journal ArticleDOI
TL;DR: It is concluded that both monocular and binocular static depth information contribute to flow parsing, and in line with previous studies, results consistent with flow parsing were found in the condition in which motion parallax and stereoscopic disparity were present.

63 citations


Journal ArticleDOI
TL;DR: It is found that binocular vision clearly facilitates walking performance, and the data are consistent with greater uncertainty in monocular vision, leading to a greater reliance on feedback in the control of the movements.
Abstract: Despite the extensive investigation of binocular and stereoscopic vision, relatively little is known about its importance in natural visually guided behavior. In this paper, we explored the role of binocular vision when walking over and around obstacles. We monitored eye position during the task as an indicator of the difference between monocular and binocular performances. We found that binocular vision clearly facilitates walking performance. Walkers were slowed by about 10% in monocular vision and raised their foot higher when stepping over obstacles. Although the location and sequence of the fixations did not change in monocular vision, the timing of the fixations relative to the actions was different. Subjects spent proportionately more time fixating the obstacles and fixated longer while guiding foot placement near an obstacle. The data are consistent with greater uncertainty in monocular vision, leading to a greater reliance on feedback in the control of the movements.

57 citations


Journal ArticleDOI
TL;DR: It is shown that monocular rivalry can occur with complex images, as with binocular rivalry, and that the two phenomena are affected similarly by the size and colour of the images.

49 citations


Journal ArticleDOI
TL;DR: This brief describes a single-vehicle tracking algorithm relying on active contours for target extraction and an extended Kalman filter for relative position estimation, and seeks to study the achievable closed-loop performance under this model.
Abstract: This brief describes a single-vehicle tracking algorithm relying on active contours for target extraction and an extended Kalman filter for relative position estimation. The primary difficulty lies in the estimation and regulation of range using monocular vision. The work represents a first step towards treating the problem of the control of several unmanned vehicles flying in formation using only local visual information. In particular, allowing only onboard passive sensing of the external environment, we seek to study the achievable closed-loop performance under this model.
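The central difficulty named above, range estimation from monocular measurements, is commonly handled with an extended Kalman filter. The sketch below is a generic bearing-only EKF under assumed noise parameters, not the brief's actual filter (which also exploits the target contour extracted by the active contours):

```python
import numpy as np

class BearingOnlyEKF:
    def __init__(self, x0, P0, q=0.5, r=np.deg2rad(1.0)):
        self.x = np.asarray(x0, float)  # [px, py, vx, vy] relative target state
        self.P = np.asarray(P0, float)
        self.q, self.r = q, r           # assumed process / measurement noise

    def predict(self, dt):
        F = np.eye(4)
        F[0, 2] = F[1, 3] = dt          # constant-velocity motion model
        self.x = F @ self.x
        Q = self.q * np.diag([dt**3 / 3, dt**3 / 3, dt, dt])  # simplified noise
        self.P = F @ self.P @ F.T + Q

    def update(self, bearing):
        px, py = self.x[0], self.x[1]
        r2 = px * px + py * py
        H = np.array([[-py / r2, px / r2, 0.0, 0.0]])  # Jacobian of atan2(py, px)
        z_pred = np.arctan2(py, px)
        # Wrap the innovation to (-pi, pi].
        innov = np.arctan2(np.sin(bearing - z_pred), np.cos(bearing - z_pred))
        S = H @ self.P @ H.T + self.r**2
        K = self.P @ H.T / S
        self.x = self.x + (K * innov).ravel()
        self.P = (np.eye(4) - K @ H) @ self.P
```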

45 citations


Proceedings ArticleDOI
10 Feb 2009
TL;DR: A real-time monocular vision based rear vehicle and motorcycle detection and tracking approach for Lane Change Assistant (LCA) is presented that detects and tracks multiple vehicles and motorcycles on the road by combining multiple cues.
Abstract: A real-time monocular vision based rear vehicle and motorcycle detection and tracking approach is presented for Lane Change Assistant (LCA). To achieve robustness and accuracy, this work detects and tracks multiple vehicles and motorcycles on the road by combining multiple cues. To achieve real-time performance, multi-resolution processing is used to reduce computational complexity, and all algorithms have been implemented on an IMAP (Integrated Memory Array Processor) parallel vision board. Test results under various traffic scenes demonstrate the accuracy, robustness and real-time performance of this work.
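As an illustration of the multi-resolution idea (the IMAP-specific implementation is not described here, so this is a generic sketch), a detector can be run on the coarsest level of an image pyramid and its candidate boxes rescaled for verification at full resolution; detect_fn is a hypothetical placeholder:

```python
import cv2

def build_pyramid(frame, levels=3):
    pyramid = [frame]
    for _ in range(levels - 1):
        pyramid.append(cv2.pyrDown(pyramid[-1]))  # halve resolution per level
    return pyramid

def coarse_to_fine_detect(frame, detect_fn, levels=3):
    pyramid = build_pyramid(frame, levels)
    scale = 2 ** (levels - 1)
    # Detect on the coarsest level and map candidate boxes back to full
    # resolution, where a more expensive verification step could then run.
    return [(x * scale, y * scale, w * scale, h * scale)
            for (x, y, w, h) in detect_fn(pyramid[-1])]
```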

32 citations


Journal ArticleDOI
TL;DR: The results demonstrate that the patient’s lesions impaired both her distance and size perception, but not uniformly, and the patient showed partial preservation in size processing of novel objects even when familiar size cues were removed.
Abstract: Accurate distance perception depends on the processing and integration of a variety of monocular and binocular cues. Dorsal stream lesions can impair this process, but details of this neurocognitive relationship remain unclear. Here, we tested a patient with bilateral occipitoparietal damage and severely impaired stereopsis. We addressed four related questions: (1) Can distance and size perception survive limitations in perceiving monocular and binocular cues? (2) Are egocentric (self-referential) and allocentric (object-referential) distance judgments similarly impaired? (3) Are distance measurements equally impaired in peripersonal and extrapersonal space? (4) Are size judgments possible when distance processing is impaired? The results demonstrate that the patient’s lesions impaired both her distance and size perception, but not uniformly. Her performance when using an egocentric reference frame was more impaired than her performance when using an allocentric reference frame. Likewise, her distance judgments in peripersonal space were more impaired than those in extrapersonal space. The patient showed partial preservation in size processing of novel objects even when familiar size cues were removed.

Journal ArticleDOI
TL;DR: A statistically significant difference in temporal behavior between monocular and binocular viewing conditions was found, but on average it was too small to be of practical importance, although some subjects showed a notably higher variability for the monocular case during near vision.

Journal ArticleDOI
TL;DR: The latency to alter a movement when monocular and binocular cues indicated that the surface slant had changed was compared, and it was found that subjects adjusted their movement in response to three types of information: information about the new slant from the monocular image, information about the new slant from binocular disparity, and information about a change in slant from the change in the monocular image.
Abstract: For the online control of movement, it is important to respond fast. The extent to which cues are effective in guiding our actions might therefore depend on how quickly they provide new information. We compared the latency to alter a movement when monocular and binocular cues indicated that the surface slant had changed. We found that subjects adjusted their movement in response to three types of information: information about the new slant from the monocular image, information about the new slant from binocular disparity, and information about the change in slant from the change in the monocular image. Responses to changes in the monocular image were approximately 40 ms faster than responses to a new slant estimate from binocular disparity and about 90 ms faster than responses to a new slant estimate from the monocular image. Considering these delays, adjustments of ongoing movements to changes in slant will usually be initiated by changes in the monocular image. The response will later be refined on the basis of combined binocular and monocular estimates of slant.

Journal ArticleDOI
TL;DR: Valdes-Sosa et al. as discussed by the authors used ERP recordings to investigate the mechanisms of endogenous attentional selection of such competing dot surfaces under conditions of dichoptic viewing (one surface to each eye) and monocular viewing (both surfaces to one eye).

Journal ArticleDOI
TL;DR: A wheelchair-based mobile robot using monocular vision is developed; the accuracy of its boundary-line detection is evaluated, achieving fast processing speed and high detection accuracy.
Abstract: This paper develops indoor mobile robot navigation by center following based on monocular vision. In our method, two boundary lines between the wall and the baseboard are detected in the frontal image. Then appearance-based obstacle detection is applied. When an obstacle exists, an avoidance or stopping movement is performed according to the size and position of the obstacle; when no obstacle exists, the robot moves along the center of the corridor. We developed a wheelchair-based mobile robot, evaluated the accuracy of the boundary-line detection, and obtained fast processing speed and high detection accuracy. We demonstrate the effectiveness of our mobile robot through stopping experiments with various obstacles and moving experiments.
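A minimal sketch of the boundary-line step follows: find candidate wall/baseboard lines with a probabilistic Hough transform, intersect the strongest left and right lines to get the corridor's vanishing point, and steer on its offset from the image center. The thresholds and the slope-based left/right split are illustrative assumptions, not the paper's exact procedure:

```python
import cv2
import numpy as np

def _intersect_x(l1, l2):
    # x-coordinate of the intersection of two (extended) line segments.
    x1, y1, x2, y2 = l1
    x3, y3, x4, y4 = l2
    d = (x1 - x2) * (y3 - y4) - (y1 - y2) * (x3 - x4)
    if abs(d) < 1e-9:
        return None
    return ((x1 * y2 - y1 * x2) * (x3 - x4)
            - (x1 - x2) * (x3 * y4 - y3 * x4)) / d

def corridor_center_offset(frame_bgr):
    """Signed pixel offset of the corridor vanishing point from image center."""
    gray = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2GRAY)
    edges = cv2.Canny(gray, 60, 180)
    lines = cv2.HoughLinesP(edges, 1, np.pi / 180, threshold=70,
                            minLineLength=80, maxLineGap=10)
    if lines is None:
        return None
    # Sign of dy*dx is invariant to endpoint order: negative slopes (image
    # coordinates) are left-wall boundaries, positive slopes right-wall ones.
    left = [l[0] for l in lines if (l[0][3] - l[0][1]) * (l[0][2] - l[0][0]) < 0]
    right = [l[0] for l in lines if (l[0][3] - l[0][1]) * (l[0][2] - l[0][0]) > 0]
    if not left or not right:
        return None
    longest = lambda ls: max(ls, key=lambda l: np.hypot(l[2] - l[0], l[3] - l[1]))
    x = _intersect_x(longest(left), longest(right))
    return None if x is None else x - frame_bgr.shape[1] / 2.0
```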

Journal ArticleDOI
TL;DR: The present findings suggest that the observed ipsilateral bias may be primarily induced by visual deprivation, consistent with compensatory "where" resource re-allocation.

Proceedings ArticleDOI
06 Apr 2009
TL;DR: A monocular vision-based pose estimation and stabilization system for a quadrotor helicopter is proposed, in which a single onboard camera is used to estimate the attitude and the position of the vehicle, except for the altitude, which is controlled manually from a joystick.
Abstract: A monocular vision-based pose estimation and stabilization system for a quadrotor helicopter is proposed. The goal of this project is to enable the helicopter to hover in place using a vision-based autopilot. The method consists of a single camera onboard the helicopter that is used to estimate the attitude and the position of the vehicle, except for the altitude, which is controlled manually from a joystick. These parameters are calculated using three dark colored targets mounted on a white wall. An algorithm processes the video frames to extract the necessary information, which is evaluated for errors and then passed to the control algorithm. The control signal for the helicopter is output via the audio port and driven through a custom-made circuit to eliminate noise and convert it to the necessary format. The results from flight tests are presented with the system’s advantages, limitations, and drawbacks discussed.
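A rough sketch of the marker-based measurement step is given below: segment the three dark targets and convert their centroids to a distance and lateral offset with the pinhole model. FX and L_METERS are assumed calibration values, and the real system additionally recovers attitude:

```python
import cv2
import numpy as np

FX = 600.0        # focal length in pixels (assumed)
L_METERS = 0.5    # known horizontal spacing between the outer targets (assumed)

def marker_measurements(frame_bgr):
    gray = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2GRAY)
    # Dark targets on a white wall: invert-threshold, keep 3 largest blobs.
    _, mask = cv2.threshold(gray, 60, 255, cv2.THRESH_BINARY_INV)
    contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL,
                                   cv2.CHAIN_APPROX_SIMPLE)
    blobs = sorted(contours, key=cv2.contourArea, reverse=True)[:3]
    if len(blobs) < 3:
        return None
    centers = np.array([cv2.minEnclosingCircle(c)[0] for c in blobs])
    centers = centers[np.argsort(centers[:, 0])]       # order left-to-right
    pixel_span = centers[-1, 0] - centers[0, 0]
    distance = FX * L_METERS / pixel_span              # Z = f * L / l_px
    cx = frame_bgr.shape[1] / 2.0
    lateral = (centers[1, 0] - cx) * distance / FX     # X from center target
    return distance, lateral
```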

Journal ArticleDOI
TL;DR: It is suggested that binocular vision near the front wall provides visual information of a better quality than monocular vision far from the front wall, which contributes to the body sway response to the driving stimulus provided by the moving room.

Book ChapterDOI
04 May 2009
TL;DR: A previous color-based registration algorithm is extended with a more precise edge-based registration step, and an experimental analysis of the residual error vs. the computation time is presented.
Abstract: 3D human motion capture by real-time monocular vision without using markers can be achieved by registering a 3D articulated model on a video. Registration consists in iteratively optimizing the match between primitives extracted from the model and the images with respect to the model position and joint angles. We extend a previous color-based registration algorithm with a more precise edge-based registration step. We present an experimental analysis of the residual error vs. the computation time and we discuss the balance between both approaches.
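One common way to realize such an edge-based registration step is a chamfer-style cost: project model edge points under a candidate pose and score them against a distance transform of the image edges. This is a hedged sketch of that idea, not the chapter's implementation; project_fn is a hypothetical placeholder for the articulated-model projection:

```python
import cv2
import numpy as np
from scipy.optimize import minimize

def make_chamfer_cost(edge_map, model_edge_pts, project_fn):
    """edge_map: uint8 Canny output (edges = 255)."""
    # Distance from every pixel to the nearest image edge.
    dist = cv2.distanceTransform(255 - edge_map, cv2.DIST_L2, 3)
    h, w = edge_map.shape

    def cost(pose):
        uv = np.round(project_fn(pose, model_edge_pts)).astype(int)  # (N, 2)
        uv[:, 0] = np.clip(uv[:, 0], 0, w - 1)
        uv[:, 1] = np.clip(uv[:, 1], 0, h - 1)
        return float(dist[uv[:, 1], uv[:, 0]].mean())

    return cost

# Usage: refine an initial pose from the color-based step.
# cost = make_chamfer_cost(cv2.Canny(gray, 50, 150), model_pts, project_fn)
# refined_pose = minimize(cost, pose0, method="Nelder-Mead").x
```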

Proceedings ArticleDOI
Samantha Ng, Adel Fakih, Adam Fourney, Pascal Poupart, John Zelek
13 Nov 2009
TL;DR: Preliminary experiments indicate that texture and colour cues conditioned on the appearance of a rollator user outperform more general cues, at the cost of manually initializing the appearance offline.
Abstract: Cognitive assistance of a rollator (wheeled walker) user tends to reduce the attentional capacity of the user and may impact her stability. Hence, it is important to understand and track the pose of rollator users before augmenting a rollator with some form of cognitive assistance. While the majority of current markerless vision systems focus on estimating 2D and 3D walking motion in the sagittal plane, we wish to estimate the 3D pose of rollator users' lower limbs from observing image sequences in the coronal (frontal) plane. Our apparatus poses a unique set of challenges: a single monocular view of only the lower limbs and a frontal perspective of the rollator user. Since motion in the coronal plane is relatively subtle, we explore multiple cues within a Bayesian probabilistic framework to formulate a posterior estimate for a given subject's leg limbs. In this work, our focus is on evaluating the appearance model (the cues). Preliminary experiments indicate that texture and colour cues conditioned on the appearance of a rollator user outperform more general cues, at the cost of manually initializing the appearance offline.

Journal Article
TL;DR: The improved algorithm is more convenient and quicker because it does not require moving the camera to take two pictures, and it can be applied to robot localization and space rendezvous and docking.
Abstract: This paper improves a position and orientation algorithm of monocular vision based on circular features. Information from a laser rangefinder is used to select the correct one of the two solutions of the equations. When the radius of a spatial circle is unknown, the improved algorithm can measure all parameters of the circle's position and orientation from a single image. Compared with previous algorithms, the improved algorithm is more convenient and quicker because it does not need to move the camera to take two pictures. A computational model is built and simulations are carried out. The simulation results show that the improved algorithm is effective and can be applied to robot localization, space rendezvous and docking.
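For intuition, the two ingredients described — range from a known circle radius and laser-based disambiguation — can be sketched as below. The conic-pose math that produces the two candidate orientations is assumed to happen upstream, and FX is an assumed calibration value:

```python
import cv2

FX = 800.0  # focal length in pixels (assumed calibration value)

def circle_range(contour, radius_m):
    """Range to a circle of known radius from its fitted image ellipse
    (contour must have at least 5 points for cv2.fitEllipse)."""
    (cx, cy), axes, angle = cv2.fitEllipse(contour)
    semi_major_px = max(axes) / 2.0
    return FX * radius_m / semi_major_px, (cx, cy)

def pick_orientation(solutions, laser_range_m):
    """solutions: two (normal_vector, predicted_laser_range_m) candidates
    from the mirror ambiguity of a single ellipse observation; keep the
    one whose predicted range best agrees with the laser reading."""
    return min(solutions, key=lambda s: abs(s[1] - laser_range_m))[0]
```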

Journal ArticleDOI
TL;DR: This article discusses and explains two modules: a self-position recognition system and an obstacle recognition system that integrates two modules for autonomy for a robot that uses an indoor navigation system based on visual methods to provide the required autonomy.
Abstract: We have developed a technology for a robot that uses an indoor navigation system based on visual methods to provide the required autonomy For robots to run autonomously, it is extremely important that they are able to recognize the surrounding environment and their current location Because it was not necessary to use plural external world sensors, we built a navigation system in our test environment that reduced the burden of information processing mainly by using sight information from a monocular camera In addition, we used only natural landmarks such as walls, because we assumed that the environment was a human one In this article we discuss and explain two modules: a self-position recognition system and an obstacle recognition system In both systems, the recognition is based on image processing of the sight information provided by the robot’s camera In addition, in order to provide autonomy for the robot, we use an encoder and information from a two-dimensional space map given beforehand Here, we explain the navigation system that integrates these two modules We applied this system to a robot in an indoor environment and evaluated its performance, and in a discussion of our experimental results we consider the resulting problems

Proceedings ArticleDOI
03 May 2009
TL;DR: A single camera is mounted on the front of a mobile robot and a one-class SVM is trained on the appearance of the floor, so that anything not recognized as floor is classified as an obstacle; FFT preprocessing improves robustness in recognizing floor features.
Abstract: This paper describes a monocular vision-based obstacle detection method for a mobile robot using a support vector machine (SVM). A single camera is mounted on the front of a mobile robot and an SVM is trained to classify obstacles as they are encountered by the robot. Since it is not possible to train on all obstacle types a priori, a one-class SVM is used to learn the appearance of the floor in the absence of obstacles. Anything that is not recognized as floor is classified as an obstacle. To improve robustness in recognizing floor features, images are preprocessed using a Fast Fourier Transform (FFT) to provide translation invariance. Experimental results indicate high accuracy and specificity for the four different floor surfaces that were tested.
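The pipeline lends itself to a compact sketch: FFT-magnitude features (the magnitude spectrum is invariant to cyclic translation of a patch), a one-class SVM trained only on obstacle-free floor, and outliers flagged as obstacles. Patch size and the nu/gamma settings are illustrative assumptions:

```python
import numpy as np
from sklearn.svm import OneClassSVM

PATCH = 32  # assumed patch size in pixels

def fft_feature(patch_gray):
    # FFT magnitude is invariant to (cyclic) translation of the patch.
    return np.log1p(np.abs(np.fft.fft2(patch_gray))).ravel()

def patches(img_gray):
    h, w = img_gray.shape
    for y in range(0, h - PATCH + 1, PATCH):
        for x in range(0, w - PATCH + 1, PATCH):
            yield (y, x), img_gray[y:y + PATCH, x:x + PATCH]

def train_floor_model(floor_images):
    # Train on obstacle-free floor imagery only.
    X = [fft_feature(p) for img in floor_images for _, p in patches(img)]
    return OneClassSVM(nu=0.05, gamma="scale").fit(np.asarray(X))

def obstacle_mask(model, img_gray):
    # predict() returns -1 for outliers, i.e. "not floor" -> obstacle.
    return {pos: model.predict([fft_feature(p)])[0] == -1
            for pos, p in patches(img_gray)}
```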

Journal ArticleDOI
TL;DR: The framework here does not permit any discussion of its electrophysiology, histochemistry, or histology, although great advances in these fields have been made during recent years.
Abstract: Vision in itself is an extremely complex function, and single binocular vision even more so. Its clinical features constitute such an enormous group of subjects that the framework here does not permit any discussion of its electrophysiology, histochemistry, or histology, although great advances in these fields have been made during recent years. By way of introduction let me emphasize that the physiology and pathophysiology of single binocular vision still remain unelucidated in many ways. Monocular vision is the natural requisite for binocular vision. Fusion of the monocular sensory impressions from the right and left eye into a joint, unified, general sensory impression is what is known as single binocular vision.

01 Jan 2009
TL;DR: A novel algorithm based on the structure-from-motion principle is presented that ensures real-time obstacle detection and avoids missed detections by performing a systematic scan of the image while controlling the processing time.
Abstract: Currently, the automotive industry is actively seeking generic obstacle sensors based on monocular vision and able to run on low-frequency central processing units (CPUs). The authors have tackled the challenge of designing a vision-based obstacle detection system using a common in-vehicle microcontroller: an 80 MHz 32-bit CPU. This system uses a single commercially available wide-angle rear camera. The main contribution of this paper is a novel algorithm based on the structure-from-motion principle that ensures real-time obstacle detection. This algorithm avoids missed detections by performing a systematic scan of the image while controlling the processing time. The authors demonstrate that the system is able to detect various types of obstacles, from cars to poles, at distances of up to 6 meters.
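The systematic-scan idea can be illustrated with sparse optical flow on a fixed grid, flagging points whose motion departs from what a flat ground plane would induce, under a hard processing-time cap. The ground-flow model (expected_flow_fn) is a hypothetical stand-in, not the paper's structure-from-motion formulation:

```python
import time
import cv2
import numpy as np

def scan_for_obstacles(prev_gray, gray, expected_flow_fn,
                       step=16, budget_s=0.030, thresh_px=2.0):
    h, w = gray.shape
    # Systematic grid covering the whole image, so nothing is skipped.
    ys, xs = np.mgrid[step // 2:h:step, step // 2:w:step]
    pts = np.stack([xs.ravel(), ys.ravel()], axis=1).astype(np.float32)
    new_pts, status, _ = cv2.calcOpticalFlowPyrLK(
        prev_gray, gray, pts.reshape(-1, 1, 2), None)
    t0 = time.monotonic()
    obstacles = []
    for p, q, ok in zip(pts, new_pts.reshape(-1, 2), status.ravel()):
        if time.monotonic() - t0 > budget_s:   # hard cap on the checking loop
            break
        # Flag points whose flow disagrees with the assumed ground-plane flow.
        if ok and np.linalg.norm((q - p) - expected_flow_fn(p)) > thresh_px:
            obstacles.append(tuple(p))
    return obstacles
```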

01 Jan 2009
TL;DR: A target detection system for road environments based on Support Vector Machine (SVM) and monocular vision is described, with an intelligent learning approach proposed in order to better deal with object variability, illumination conditions, partial occlusions and rotations.
Abstract: This paper describes a target detection system for road environments based on Support Vector Machine (SVM) and monocular vision. The final goal is to provide pedestrian-to-car and car-to-car time gaps. The challenge is to use a single camera as input, in order to achieve a low-cost final system that meets the requirements needed to undertake serial production in the automotive industry. The basic features of the detected objects are first located in the image using vision and then combined with an SVM-based classifier. An intelligent learning approach is proposed in order to better deal with object variability, illumination conditions, partial occlusions and rotations. A large database containing thousands of object examples has been created for learning purposes. The classifier is trained using SVM in order to be able to classify pedestrians, cars and trucks. In the paper, we present and discuss the results achieved to date in real traffic conditions.
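A minimal sketch of the classification stage, under stated assumptions: HOG features and an RBF-kernel SVM (the window size and SVM parameters are illustrative, and candidate-region extraction is assumed to happen upstream):

```python
import cv2
import numpy as np
from sklearn.svm import SVC

hog = cv2.HOGDescriptor()  # default 64x128 detection window

def roi_feature(roi_bgr):
    gray = cv2.cvtColor(roi_bgr, cv2.COLOR_BGR2GRAY)
    return hog.compute(cv2.resize(gray, (64, 128))).ravel()

def train_classifier(rois, labels):
    """labels: 0 = pedestrian, 1 = car, 2 = truck."""
    X = np.array([roi_feature(r) for r in rois])
    return SVC(kernel="rbf", C=10.0, gamma="scale").fit(X, np.asarray(labels))
```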

Proceedings ArticleDOI
30 Oct 2009
TL;DR: A method of obstacle recognition and location based on monocular vision is proposed, and tests indicate that the method can recognize and locate the obstacles effectively.
Abstract: Obstacle recognition and location is one of the key techniques for the autonomous transmission line inspection robot. Considering the structure of a 220 kV double-split transmission line, a method of obstacle recognition and location based on monocular vision is proposed. First, an image of the scene ahead of the inspection robot is captured with the camera, and obstacles such as spacers and counterweights are recognized in the image. Then a geometric ranging model is built using the positional relation between the obstacle’s location center and the camera, and the camera’s intrinsic parameters are calibrated. Finally, the parameters obtained by calibration are put into the ranging formula to perform ranging tests. The tests indicate that the method can recognize and locate the obstacles effectively.
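Such a ranging formula is typically the flat-plane pinhole relation: with calibrated intrinsics and a known camera mounting height and tilt, the image row of an obstacle's contact point determines its distance. A generic sketch with assumed calibration values (not the paper's exact model of the double-split line geometry):

```python
import numpy as np

FY, CY = 700.0, 240.0              # focal length (px) and principal-point row
CAM_HEIGHT_M = 0.35                # camera height above support plane (assumed)
CAM_TILT_RAD = np.deg2rad(10.0)    # downward tilt of the camera (assumed)

def ground_distance(v_pixel_row):
    """Distance to the point where the obstacle meets the support plane.
    Valid only for points below the horizon (positive ray angle)."""
    ray_angle = CAM_TILT_RAD + np.arctan((v_pixel_row - CY) / FY)
    return CAM_HEIGHT_M / np.tan(ray_angle)
```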

Journal ArticleDOI
TL;DR: This paper presents a novel method that enhances the use of external mechanisms by considering a multisensor system composed of sonars and a CCD camera, and uses the Hough transform to extract features from raw sonar data and vision images.
Abstract: This paper presents a novel method that enhances the use of external mechanisms by considering a multisensor system composed of sonars and a CCD camera. Monocular vision provides redundant information about the location of the geometric entities detected by the sonar sensors. To reduce ambiguity significantly, an improved and more detailed sonar model is utilized. Moreover, the Hough transform is used to extract features from raw sonar data and vision images. Information is fused at the level of features. This technique significantly improves the reliability and precision of the environment observations used for the simultaneous localization and map-building problem for mobile robots. Experimental results validate the favorable performance of this approach.
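Feature-level fusion of this kind reduces, in its simplest form, to matching (rho, theta) line parameters from the two sensors and merging matched pairs by inverse-variance weighting. The gates and variances below are assumed values for illustration, not the paper's tuning:

```python
import numpy as np

def fuse_lines(vision_lines, sonar_lines, var_v=4.0, var_s=1.0,
               rho_gate=10.0, theta_gate=np.deg2rad(5.0)):
    """Each line is (rho, theta); var_v/var_s are assumed sensor variances."""
    fused = []
    for rv, tv in vision_lines:
        for rs, ts in sonar_lines:
            # Gate: only merge lines that plausibly describe the same feature.
            if abs(rv - rs) < rho_gate and abs(tv - ts) < theta_gate:
                w = var_s / (var_v + var_s)   # inverse-variance weight (vision)
                fused.append((w * rv + (1 - w) * rs,
                              w * tv + (1 - w) * ts))
    return fused
```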

Proceedings ArticleDOI
10 Oct 2009
TL;DR: A monocular vision-based odometry system is developed that utilizes the vertical edges in the scene to estimate the robot's ego-motion; the resulting closed-form error model can assist in choosing an appropriate pair of vertical lines to reduce the estimation error.
Abstract: When a robot travels in an urban area, Global Positioning System (GPS) signals might be obstructed by buildings, so visual odometry is an alternative. We notice that the vertical edges from high buildings and poles of street lights are a very stable set of features that can be easily extracted. Thus, we develop a monocular vision-based odometry system that utilizes the vertical edges from the scene to estimate the robot ego-motion. Since it takes only a single vertical line pair to estimate the robot ego-motion on the road plane, we model the ego-motion estimation process and analyze how the choice of vertical line pair impacts the accuracy of the estimation. The resulting closed-form error model can assist in choosing an appropriate pair of vertical lines to reduce the error in computation. We have implemented the proposed method and validated the error analysis results in physical experiments.

Dissertation
11 Nov 2009
TL;DR: A combined relative pose and target object model estimation framework using a monocular camera as the primary feedback sensor has been designed and validated in a simulated robotic environment, with a virtual robot manipulator tracking a target workpiece through a relative trajectory.
Abstract: A combined relative pose and target object model estimation framework using a monocular camera as the primary feedback sensor has been designed and validated in a simulated robotic environment. The monocular camera is mounted on the end-effector of a robot manipulator and measures the image plane coordinates of a set of point features on a target workpiece object. Using this information, the relative position and orientation, as well as the geometry, of the target object are recovered recursively by a Kalman filter process. The Kalman filter facilitates the fusion of supplemental measurements from range sensors with those gathered by the camera. This process allows the estimated system state to be accurate and to recover the proper environment scale. Current approaches in the research areas of visual servoing control and mobile robotics are studied for the case where the target object feature point geometry is well known prior to the beginning of the estimation. In this case, only the relative pose of target object frames is estimated over a sequence of frames from a single monocular camera. An observability analysis was carried out to identify the physical configurations of camera and target object for which the relative pose cannot be recovered by measuring only the camera image plane coordinates of the object point features. A popular extension is to estimate the target object model concurrently with the relative pose of the camera frame, a process known as Simultaneous Localization and Mapping (SLAM). The recursive framework was augmented to facilitate this larger estimation problem. The scale of the recovered solution is ambiguous using measurements from a single camera. A second observability analysis highlights more configurations for which the relative pose and target object model are unrecoverable from camera measurements alone. Instead, measurements which contain the global scale are required to obtain an accurate solution. A set of additional sensors is detailed, including range finders and additional cameras. Measurement models for each are given, which facilitate the fusion of this supplemental data with the original monocular camera image measurements. A complete framework is then derived to combine a set of such sensor measurements to recover an accurate relative pose and target object model estimate. This proposed framework is tested in a simulation environment with a virtual robot manipulator tracking a target object workpiece through a relative trajectory. All of the detailed estimation schemes are executed: the single monocular camera
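The fusion step at the heart of this framework can be sketched as a stacked Kalman measurement update: camera pixel measurements and a range reading enter one measurement vector, so a single update constrains both pose and the otherwise ambiguous scale. The measurement and Jacobian routines below are hypothetical placeholders for the framework's models:

```python
import numpy as np

def fused_update(x, P, z_cam, z_range, h_cam, H_cam, h_range, H_range,
                 R_cam, R_range):
    """One EKF update fusing camera pixels (z_cam, 1-D array) with a scalar
    range reading (z_range). h_* predict measurements; H_* return Jacobians
    (H_cam: m x n matrix, H_range: length-n vector); R_cam is m x m,
    R_range a scalar variance."""
    z = np.concatenate([z_cam, [z_range]])         # stacked measurement
    h = np.concatenate([h_cam(x), [h_range(x)]])   # stacked prediction
    H = np.vstack([H_cam(x), H_range(x)[None, :]])
    R = np.zeros((len(z), len(z)))
    R[:-1, :-1] = R_cam
    R[-1, -1] = R_range                            # range sensor fixes scale
    S = H @ P @ H.T + R
    K = P @ H.T @ np.linalg.inv(S)
    x_new = x + K @ (z - h)
    P_new = (np.eye(len(x)) - K @ H) @ P
    return x_new, P_new
```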