Journal ArticleDOI

Computing spatiotemporal relations for dynamic perceptual organization

01 Nov 1993-CVGIP: Image Understanding (Academic Press, Inc.)-Vol. 58, Iss: 3, pp 338-351
TL;DR: In this article, the authors define dynamic perceptual organization as an extension of the traditional (static) perceptual organization approach, and propose a new paradigm for motion understanding and show why it can be done independently of the recovery of scene structure and scene motion.
Abstract: To date, the overwhelming use of motion in computational vision has been to recover the three-dimensional structure of the scene. We propose that there are other, more powerful, uses for motion. Toward this end, we define dynamic perceptual organization as an extension of the traditional (static) perceptual organization approach. Just as static perceptual organization groups coherent features in an image, dynamic perceptual organization groups coherent motions through an image sequence. Using dynamic perceptual organization, we propose a new paradigm for motion understanding and show why it can be done independently of the recovery of scene structure and scene motion. The paradigm starts with a spatiotemporal cube of image data and organizes the paths of points so that interactions between the paths, and perceptual motions such as common, relative, and cyclic are made explicit. The results of this can then be used for high-level motion recognition tasks.
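
To make the idea of separating common from relative motion concrete, here is a minimal sketch in Python. It is not the authors' algorithm: it assumes that point paths have already been extracted from the spatiotemporal cube, and it approximates common motion as the per-frame mean velocity of a group of points. The names `velocities` and `common_and_relative` and the toy paths are hypothetical.

```python
from typing import List, Tuple

Point = Tuple[float, float]
Path = List[Point]  # one (x, y) image position per frame


def velocities(path: Path) -> List[Point]:
    """Frame-to-frame displacement of a single tracked point."""
    return [(x1 - x0, y1 - y0) for (x0, y0), (x1, y1) in zip(path, path[1:])]


def common_and_relative(paths: List[Path]):
    """Split each path's velocity into a shared part and a residual.

    Assumption: common motion is taken to be the per-frame mean velocity
    of the group. The paper may define it differently.
    """
    vels = [velocities(p) for p in paths]
    n_steps = len(vels[0])
    common = [
        (
            sum(v[t][0] for v in vels) / len(vels),
            sum(v[t][1] for v in vels) / len(vels),
        )
        for t in range(n_steps)
    ]
    relative = [
        [(v[t][0] - common[t][0], v[t][1] - common[t][1]) for t in range(n_steps)]
        for v in vels
    ]
    return common, relative


# Two points drifting right together while approaching and then
# separating vertically (toy data, three frames each).
paths = [
    [(0.0, 0.0), (1.0, 1.0), (2.0, 0.0)],
    [(0.0, 2.0), (1.0, 1.0), (2.0, 2.0)],
]
common, relative = common_and_relative(paths)
print(common)    # shared rightward drift
print(relative)  # opposing vertical components
```

In this toy case the shared drift ends up in `common`, and the opposing vertical components remain in `relative`, which is the kind of structure the abstract describes as being made explicit.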

Citations
Patent
14 May 2009
TL;DR: In this paper, a surveillance system with at least one primary video camera for translating real images of a zone into electronic video signals at a first level of resolution is presented, which includes means for sampling movements of an individual or individuals located within the zone from the video signal output from at least one video camera.
Abstract: A surveillance system having at least one primary video camera for translating real images of a zone into electronic video signals at a first level of resolution. The system includes means for sampling movements of an individual or individuals located within the zone from the video signal output from at least one video camera. Video signals of sampled movements of the individual are electronically compared with known characteristics of movements which are indicative of individuals having a criminal intent. The level of criminal intent of the individual or individuals is then determined and an appropriate alarm signal is produced.

599 citations

Journal ArticleDOI
TL;DR: A review of recent developments in the computer vision aspect of motion-based recognition and several methods for the recognition of objects and motions, including cyclic motion detection and recognition, lipreading, hand gestures interpretation, motion verb recognition and temporal textures classification are reported.

489 citations

Journal ArticleDOI
TL;DR: A structured synopsis of the problems in image motion computation and analysis, and of the methods proposed, exposing the underlying models and supporting assumptions is offered.
Abstract: The goal of this paper is to offer a structured synopsis of the problems in image motion computation and analysis, and of the methods proposed, exposing the underlying models and supporting assumptions. A sufficient number of pointers to the literature will be given, concentrating mostly on recent contributions. Emphasis will be on the detection, measurement and segmentation of image motion. Tracking, and deformable motion issues will be also addressed. Finally, a number of related questions which could require more investigations will be presented.

275 citations

Journal ArticleDOI
TL;DR: Experimental results on different types of scenes demonstrate the ability of the proposed technique for spatio-temporal segmentation to automatically partition the scene into its constituent objects.
Abstract: This paper proposes a technique for spatio-temporal segmentation to identify the objects present in the scene represented in a video sequence. This technique processes two consecutive frames at a time. A region-merging approach is used to identify the objects in the scene. Starting from an oversegmentation of the current frame, the objects are formed by iteratively merging regions together. Regions are merged based on their mutual spatio-temporal similarity. We propose a modified Kolmogorov-Smirnov test for estimating the temporal similarity. The region-merging process is based on a weighted, directed graph. Two complementary graph-based clustering rules are proposed, namely, the strong rule and the weak rule. These rules take advantage of the natural structures present in the graph. Experimental results on different types of scenes demonstrate the ability of the proposed technique to automatically partition the scene into its constituent objects.

210 citations
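
The abstract above says that regions are merged according to a similarity estimated with a modified Kolmogorov-Smirnov test. The modification is not described here, so the following Python sketch shows only the standard two-sample KS statistic, the maximum gap between two empirical distribution functions. The function name `ks_statistic` and the sample values are hypothetical.

```python
from bisect import bisect_right
from typing import Sequence


def ks_statistic(a: Sequence[float], b: Sequence[float]) -> float:
    """Standard two-sample KS statistic (not the paper's modified test)."""
    sa, sb = sorted(a), sorted(b)
    gap = 0.0
    # The largest gap between the two step functions occurs at a sample value.
    for x in sa + sb:
        cdf_a = bisect_right(sa, x) / len(sa)  # fraction of a that is <= x
        cdf_b = bisect_right(sb, x) / len(sb)  # fraction of b that is <= x
        gap = max(gap, abs(cdf_a - cdf_b))
    return gap


# Toy samples standing in for measurements taken from two regions.
region_1 = [1.0, 2.0, 3.0, 4.0]
region_2 = [3.0, 4.0, 5.0, 6.0]
print(ks_statistic(region_1, region_2))
```

A small statistic suggests the two samples come from similar distributions. How such a value would feed into the strong and weak merging rules is specific to the paper and is not reproduced here.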

Proceedings ArticleDOI
02 Nov 2003
TL;DR: A novel methodology for implementing video search functions such as retrieval of near-duplicate videos and recognition of actions in surveillance video, with the distinct advantage that there is no need for detection or recognition of body parts.
Abstract: This paper describes a novel methodology for implementing video search functions such as retrieval of near-duplicate videos and recognition of actions in surveillance video. Videos are divided into half-second clips whose stacked frames produce 3D space-time volumes of pixels. Pixel regions with consistent color and motion properties are extracted from these 3D volumes by a threshold-free hierarchical space-time segmentation technique. Each region is then described by a high-dimensional point whose components represent the position, motion and, when possible, color of the region. In the indexing phase for a video database, these points are assigned labels that specify their video clip of origin. All the labeled points for all the clips are stored into a single binary tree for efficient $k$-nearest neighbor retrieval. The retrieval phase uses video segments as queries. Half-second clips of these queries are again segmented to produce sets of points, and for each point the labels of its nearest neighbors are retrieved. The labels that receive the largest numbers of votes correspond to the database clips that are the most similar to the query video segment. We illustrate this approach for video indexing and retrieval and for action recognition. First, we describe retrieval experiments for dynamic logos, and for video queries that differ from the indexed broadcasts by the addition of large overlays. Then we describe experiments in which office actions (such as pulling and closing drawers, taking and storing items, picking up and putting down a phone) are recognized. Color information is ignored to ensure independence from people's appearance. One of the distinct advantages of using this approach for action recognition is that there is no need for detection or recognition of body parts.

83 citations
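
The retrieval step described in the abstract above (look up the nearest labeled points for each query point, then count label votes) can be sketched as follows. This is an illustration under stated assumptions: a brute-force search replaces the binary tree mentioned in the abstract, the descriptors are two-dimensional toy values, and the names `vote_for_clips`, `clip_a`, and `clip_b` are hypothetical.

```python
from collections import Counter
from math import dist
from typing import List, Sequence, Tuple

LabeledPoint = Tuple[Sequence[float], str]  # (region descriptor, clip label)


def vote_for_clips(
    index: List[LabeledPoint],
    query_points: List[Sequence[float]],
    k: int = 2,
) -> List[Tuple[str, int]]:
    """Return clip labels ranked by how many nearest-neighbor votes they get."""
    votes: Counter = Counter()
    for q in query_points:
        # Brute-force k-nearest neighbors; a tree index would replace this.
        nearest = sorted(index, key=lambda item: dist(item[0], q))[:k]
        votes.update(label for _, label in nearest)
    return votes.most_common()


# Toy index: each point carries the label of the clip it came from.
index = [
    ((0.0, 0.0), "clip_a"),
    ((0.0, 1.0), "clip_a"),
    ((5.0, 5.0), "clip_b"),
    ((5.0, 6.0), "clip_b"),
    ((6.0, 5.0), "clip_b"),
]
query = [(0.1, 0.2), (5.2, 5.1), (4.9, 5.5)]
ranking = vote_for_clips(index, query, k=2)
print(ranking)
```

The label at the head of `ranking` is the database clip judged most similar to the query. With a large index, the linear scan here is the part a tree structure is meant to speed up.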

References
Journal ArticleDOI
TL;DR: The kinetic-geometric model for visual vector analysis originally developed in the study of perception of motion combinations of the mechanical type was applied to biological motion patterns and the results turned out to be highly positive.
Abstract: This paper reports the first phase of a research program on visual perception of motion patterns characteristic of living organisms in locomotion. Such motion patterns in animals and men are termed here as biological motion. They are characterized by a far higher degree of complexity than the patterns of simple mechanical motions usually studied in our laboratories. In everyday perceptions, the visual information from biological motion and from the corresponding figurative contour patterns (the shape of the body) are intermingled. A method for studying information from the motion pattern per se without interference with the form aspect was devised. In short, the motion of the living body was represented by a few bright spots describing the motions of the main joints. It is found that 10–12 such elements in adequate motion combinations in proximal stimulus evoke a compelling impression of human walking, running, dancing, etc. The kinetic-geometric model for visual vector analysis originally developed in the study of perception of motion combinations of the mechanical type was applied to these biological motion patterns. The validity of this model in the present context was experimentally tested and the results turned out to be highly positive.

4,175 citations

Book
01 Jan 1979
TL;DR: In this paper, the authors used the methodology of artificial intelligence to investigate the phenomena of visual motion perception: how the visual system constructs descriptions of the environment in terms of objects, their three-dimensional shape, and their motion through space, on the basis of the changing image that reaches the eye.
Abstract: This book uses the methodology of artificial intelligence to investigate the phenomena of visual motion perception: how the visual system constructs descriptions of the environment in terms of objects, their three-dimensional shape, and their motion through space, on the basis of the changing image that reaches the eye. The author has analyzed the computations performed in the course of visual motion analysis. Workable schemes able to perform certain tasks performed by the visual system have been constructed and used as vehicles for investigating the problems faced by the visual system and its methods for solving them. Two major problems are treated: first, the correspondence problem, which concerns the identification of image elements that represent the same object at different times, thereby maintaining the perceptual identity of the object in motion or in change. The second problem is the three-dimensional interpretation of the changing image once a correspondence has been established. The author's computational approach to visual theory makes the work unique, and it should be of interest to psychologists working in visual perception and readers interested in cognitive studies in general, as well as computer scientists interested in machine vision, theoretical neurophysiologists, and philosophers of science.

2,070 citations

Proceedings ArticleDOI
15 Jun 1992
TL;DR: The recognition rate is improved by increasing the number of people used to generate the training data, indicating the possibility of establishing a person-independent action recognizer.
Abstract: A human action recognition method based on a hidden Markov model (HMM) is proposed. It is a feature-based bottom-up approach that is characterized by its learning capability and time-scale invariability. To apply HMMs, one set of time-sequential images is transformed into an image feature vector sequence, and the sequence is converted into a symbol sequence by vector quantization. In learning human action categories, the parameters of the HMMs, one per category, are optimized so as to best describe the training sequences from the category. To recognize an observed sequence, the HMM which best matches the sequence is chosen. Experimental results for real time-sequential images of sports scenes show recognition rates higher than 90%. The recognition rate is improved by increasing the number of people used to generate the training data, indicating the possibility of establishing a person-independent action recognizer.

1,477 citations
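
The recognition step in the abstract above (keep one HMM per action category and choose the model that best matches the observed symbol sequence) can be illustrated with the forward algorithm. The sketch below assumes that the images have already been converted to discrete symbols and that the models are already trained. The class `DiscreteHMM`, the category names, and all probabilities are hypothetical values chosen for illustration.

```python
from typing import Dict, List, Sequence


class DiscreteHMM:
    """Hidden Markov model over a finite symbol alphabet."""

    def __init__(
        self,
        start: List[float],
        trans: List[List[float]],
        emit: List[List[float]],
    ) -> None:
        self.start = start  # start[s]: probability of starting in state s
        self.trans = trans  # trans[p][s]: probability of moving from p to s
        self.emit = emit    # emit[s][o]: probability of symbol o in state s

    def likelihood(self, symbols: Sequence[int]) -> float:
        """Forward algorithm: probability of the symbol sequence under the model."""
        n = len(self.start)
        alpha = [self.start[s] * self.emit[s][symbols[0]] for s in range(n)]
        for sym in symbols[1:]:
            alpha = [
                sum(alpha[p] * self.trans[p][s] for p in range(n)) * self.emit[s][sym]
                for s in range(n)
            ]
        return sum(alpha)


def classify(models: Dict[str, DiscreteHMM], symbols: Sequence[int]) -> str:
    """Pick the category whose model gives the sequence the highest likelihood."""
    return max(models, key=lambda name: models[name].likelihood(symbols))


# Hypothetical two-state, left-to-right models over symbols {0, 1}.
trans = [[0.5, 0.5], [0.0, 1.0]]
models = {
    "serve": DiscreteHMM([1.0, 0.0], trans, [[0.9, 0.1], [0.1, 0.9]]),
    "volley": DiscreteHMM([1.0, 0.0], trans, [[0.1, 0.9], [0.9, 0.1]]),
}
print(classify(models, [0, 0, 1, 1]))
```

For long sequences the raw product of probabilities underflows, so a practical version would work with scaled or log-domain forward variables. That refinement is omitted to keep the sketch short.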

Journal ArticleDOI
TL;DR: The ecological approach to perception is applied to the social domain, documenting its applicability to social perception and its specific implications for research on emotion perception, impression formation, and causal attribution.
Abstract: The ecological approach to perception (J. Gibson, 1979; Shaw, Turvey, & Mace, 1982) is applied to the social domain. The general advantages of this approach are enumerated, its applicability to social perception is documented, and its specific implications for research on emotion perception, impression formation, and causal attribution are discussed. The implications of the ecological approach for our understanding of errors in social perception are also considered. Finally, the major tenets of the ecological approach are contrasted with current cognitive approaches, and a plea is made for greater attention to the role of perception in social knowing.

947 citations