Topic

Depth perception

About: Depth perception is a research topic. Over its lifetime, 3,101 publications have appeared within this topic, receiving 80,071 citations.


Papers
Journal Article
TL;DR: Presents a deep convolutional neural field model for estimating depth from single monocular images that jointly exploits the capacity of deep CNNs and continuous CRFs, and proposes a deep structured learning scheme that learns the unary and pairwise potentials of the continuous CRF in a unified deep CNN framework.
Abstract: In this article, we tackle the problem of depth estimation from single monocular images. Compared with depth estimation using multiple images such as stereo depth perception, depth from monocular images is much more challenging. Prior work typically focuses on exploiting geometric priors or additional sources of information, most using hand-crafted features. Recently, there is mounting evidence that features from deep convolutional neural networks (CNN) set new records for various vision applications. On the other hand, considering the continuous characteristic of the depth values, depth estimation can be naturally formulated as a continuous conditional random field (CRF) learning problem. Therefore, here we present a deep convolutional neural field model for estimating depths from single monocular images, aiming to jointly explore the capacity of deep CNN and continuous CRF. In particular, we propose a deep structured learning scheme which learns the unary and pairwise potentials of continuous CRF in a unified deep CNN framework. We then further propose an equally effective model based on fully convolutional networks and a novel superpixel pooling method, which is about 10 times faster, to speed up the patch-wise convolutions in the deep model. With this more efficient model, we are able to design deeper networks to pursue better performance. Our proposed method can be used for depth estimation of general scenes with neither geometric priors nor any extra information injected. In our case, the integral of the partition function can be calculated in a closed form such that we can exactly solve the log-likelihood maximization. Moreover, solving the inference problem for predicting depths of a test image is highly efficient as closed-form solutions exist. Experiments on both indoor and outdoor scene datasets demonstrate that the proposed method outperforms state-of-the-art depth estimation approaches.
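The closed-form claims in this abstract can be illustrated with a short derivation. This is a sketch under an assumption the abstract does not spell out: that the CRF energy is quadratic in the depths, with a unary term pulling each depth $y_p$ toward a network-regressed value $z_p$ and a pairwise term weighted by a symmetric, non-negative similarity $R_{pq}$ (with $R_{pp} = 0$). The symbols $z$, $R$, $D$, $A$, and $n$ are introduced here for illustration only.

```latex
\begin{align}
E(\mathbf{y}, \mathbf{x})
  &= \sum_{p=1}^{n} (y_p - z_p)^2
   + \sum_{p=1}^{n} \sum_{q=1}^{n} \tfrac{1}{2} R_{pq} (y_p - y_q)^2 \\
  &= \mathbf{y}^\top A \mathbf{y} - 2 \mathbf{z}^\top \mathbf{y} + \mathbf{z}^\top \mathbf{z},
  \qquad A = I + D - R, \quad D_{pp} = \sum_{q=1}^{n} R_{pq}, \\
Z(\mathbf{x})
  &= \int \exp\{-E(\mathbf{y}, \mathbf{x})\} \, d\mathbf{y}
   = \frac{\pi^{n/2}}{|A|^{1/2}}
     \exp\{\mathbf{z}^\top A^{-1} \mathbf{z} - \mathbf{z}^\top \mathbf{z}\}, \\
-\log \Pr(\mathbf{y} \mid \mathbf{x})
  &= \mathbf{y}^\top A \mathbf{y} - 2 \mathbf{z}^\top \mathbf{y}
   + \mathbf{z}^\top A^{-1} \mathbf{z}
   - \tfrac{1}{2} \log |A| + \tfrac{n}{2} \log \pi, \\
\mathbf{y}^\star
  &= \arg\max_{\mathbf{y}} \Pr(\mathbf{y} \mid \mathbf{x}) = A^{-1} \mathbf{z}.
\end{align}
```

Under this assumption $A$ is positive definite, because $D - R$ is a graph Laplacian and the identity is added to it. The Gaussian integral for $Z(\mathbf{x})$ therefore converges, and prediction reduces to solving one linear system.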

1,229 citations

Journal Article
TL;DR: Evidence from psychology, neuropsychology, and neurophysiology supports the idea that there are multiple systems for recognition of objects, and indicates that one system may represent objects by combinations of multiple views, or aspects, and another may represent objects by structural primitives and their spatial interrelationships.
Abstract: Visual object recognition is of fundamental importance to most animals. The diversity of tasks that any biological recognition system must solve suggests that object recognition is not a single, general purpose process. In this review, we consider evidence from the fields of psychology, neuropsychology, and neurophysiology, all of which supports the idea that there are multiple systems for recognition. Data from normal adults, infants, animals, and brain-damaged patients reveal a major distinction between the classification of objects at a basic category level and the identification of individual objects from a homogeneous object class. An additional distinction between object representations used for visual perception and those used for visually guided movements provides further support for a multiplicity of visual recognition systems. Recent evidence from psychophysical and neurophysiological studies indicates that one system may represent objects by combinations of multiple views, or aspects, and another may represent objects by structural primitives and their spatial interrelationships.

1,205 citations

Journal Article
TL;DR: The importance of various causes and aspects of visual discomfort is clarified; three-dimensional artifacts resulting from insufficient depth information in the incoming data signal, which yield spatial and temporal inconsistencies, are believed to be among the most pertinent factors.
Abstract: Visual discomfort has been the subject of considerable research in relation to stereoscopic and autostereoscopic displays. In this paper, the importance of various causes and aspects of visual discomfort is clarified. When disparity values do not surpass a limit of 1°, which still provides sufficient range to allow satisfactory depth perception in stereoscopic television, classical determinants such as excessive binocular parallax and accommodation-vergence conflict appear to be of minor importance. Visual discomfort, however, may still occur within this limit and we believe the following factors to be the most pertinent in contributing to this: (1) temporally changing demand of accommodation-vergence linkage, e.g., by fast motion in depth; (2) three-dimensional artifacts resulting from insufficient depth information in the incoming data signal yielding spatial and temporal inconsistencies; and (3) unnatural blur. In order to adequately characterize and understand visual discomfort, multiple types of measurements, both objective and subjective, are required. © 2009 Society for Imaging Science and Technology. DOI: 10.2352/J.ImagingSci.Technol.2009.53.3.030201

990 citations

Journal Article
TL;DR: The problem of how three-dimensional form is perceived in spite of the fact that pertinent stimulation consists only in two-dimensional retinal images has been only partly solved.
Abstract: The problem of how three-dimensional form is perceived in spite of the fact that pertinent stimulation consists only in two-dimensional retinal images has been only partly solved. Much is known about the impressive effectiveness of binocular disparity. However, the excellent perception of three-dimensional form in monocular vision has remained essentially unexplained. It has been proposed that some patterns of stimulation on the retina give rise to three-dimensional experiences, because visual processes differ in the spontaneous organization that results from certain properties of the retinal pattern. Rules of organization are supposed to exist according to which most retinal projections of three-dimensional forms happen to produce three-dimensional percepts and most retinal images of flat forms lead to flat forms in experience also. This view has been held mainly by gestalt psychologists. Another approach to this problem maintains that the projected stimulus patterns are interpreted on the basis of previous experience, either visual

963 citations

Journal Article
TL;DR: The problem of finding binocular parallax, which involves matching patterns of the left and right visual fields, is investigated using stereo image pairs generated on a digital computer; the results show that pattern-matching can be achieved by first combining the two fields and then searching for patterns in the fused field.
Abstract: The perception of depth involves monocular and binocular depth cues. The latter seem simpler and more suitable for investigation. Particularly important is the problem of finding binocular parallax, which involves matching patterns of the left and right visual fields. Stereo pictures of familiar objects or line drawings preclude the separation of interacting cues, and thus this pattern-matching process is difficult to investigate. More insight into the process can be gained by using unfamiliar picture material devoid of all cues except binocular parallax. To this end, artificial stereo picture pairs were generated on a digital computer. When viewed monocularly, they appear completely random, but if viewed binocularly, certain correlated point domains are seen in depth. By introducing distortions in this material and testing for perception of depth, it is possible to show that pattern-matching of corresponding points of the left and right visual fields can be achieved by first combining the two fields and then searching for patterns in the fused field. By this technique, some interesting properties of this fused binocular field are revealed, and a simple analog model is derived. The interaction between the monocular and binocular fields is also described. A number of stereo images that demonstrate these and other findings are presented.
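The artificial stereo pairs described in this abstract can be sketched in a few lines. This is a minimal illustration, not the paper's original procedure: it assumes a single central square displaced by a fixed number of pixels, and the function name `random_dot_stereogram` and all of its parameters are hypothetical.

```python
import random


def random_dot_stereogram(width=64, height=64, square=24, shift=4, seed=0):
    """Return (left, right) binary dot images as lists of rows.

    The right image copies the left one, except that a central square is
    displaced `shift` pixels to the left. The strip uncovered by the shift
    is refilled with fresh random dots, so each image alone looks random.
    """
    rng = random.Random(seed)
    left = [[rng.randint(0, 1) for _ in range(width)] for _ in range(height)]
    right = [row[:] for row in left]

    top = (height - square) // 2   # first row of the central square
    x0 = (width - square) // 2     # first column of the central square

    for y in range(top, top + square):
        # Shift the square's dots horizontally to create binocular parallax.
        for x in range(x0, x0 + square):
            right[y][x - shift] = left[y][x]
        # Refill the strip uncovered by the shift with new random dots.
        for x in range(x0 + square - shift, x0 + square):
            right[y][x] = rng.randint(0, 1)
    return left, right


left, right = random_dot_stereogram()
```

The only relationship between the two images is the horizontal offset of the square's dots, so binocular parallax is the sole depth cue, as the abstract requires.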

726 citations


Network Information
Related Topics (5)
Visual perception: 20.8K papers, 997.2K citations, 87% related
Visual cortex: 18.8K papers, 1.2M citations, 81% related
Facial expression: 17K papers, 639.9K citations, 79% related
Stimulus (physiology): 16K papers, 674.4K citations, 78% related
Cognitive neuroscience of visual object recognition: 13.6K papers, 622.2K citations, 78% related
Performance
Metrics
No. of papers in the topic in previous years
Year    Papers
2023    46
2022    104
2021    90
2020    93
2019    115
2018    98