Author

Arun Hampapur

Bio: Arun Hampapur is an academic researcher from IBM. The author has contributed to research in topics: Video tracking & Event (computing). The author has an h-index of 39, co-authored 171 publications receiving 6757 citations.


Papers
Proceedings Article•DOI•
TL;DR: The Virage engine provides an open framework for developers to 'plug-in' primitives to solve specific image management problems and can be utilized to address high-level problems as well, such as automatic, unsupervised keyword assignment, or image classification.
Abstract: Until recently, the management of large image databases has relied exclusively on manually entered alphanumeric annotations. Systems are beginning to emerge in both the research and commercial sectors based on 'content-based' image retrieval, a technique which explicitly manages image assets by directly representing their visual attributes. The Virage image search engine provides an open framework for building such systems. The Virage engine expresses visual features as image 'primitives.' Primitives can be very general (such as color, shape, or texture) or quite domain specific (face recognition, cancer cell detection, etc.). The basic philosophy underlying this architecture is a transformation from the data-rich representation of explicit image pixels to a compact, semantic-rich representation of visually salient characteristics. In practice, the design of such primitives is non-trivial, and is driven by a number of conflicting real-world constraints (e.g. computation time vs. accuracy). The Virage engine provides an open framework for developers to 'plug-in' primitives to solve specific image management problems. The architecture has been designed to support both static images and video in a unified paradigm. The infrastructure provided by the Virage engine can be utilized to address high-level problems as well, such as automatic, unsupervised keyword assignment, or image classification.

921 citations

Journal Article•DOI•
TL;DR: The concepts of multiscale spatiotemporal tracking through the use of real-time video analysis, active cameras, multiple object models, and long-term pattern analysis to provide comprehensive situation awareness are explored.
Abstract: Situation awareness is the key to security. Awareness requires information that spans multiple scales of space and time. Smart video surveillance systems are capable of enhancing situational awareness across multiple scales of space and time. However, at the present time, the component technologies are evolving in isolation. To provide comprehensive, nonintrusive situation awareness, it is imperative to address the challenge of multiscale, spatiotemporal tracking. This article explores the concepts of multiscale spatiotemporal tracking through the use of real-time video analysis, active cameras, multiple object models, and long-term pattern analysis to provide comprehensive situation awareness.

335 citations

Proceedings Article•DOI•
TL;DR: This paper proposes two new sequence-matching techniques for copy detection and compares the performance with one of the existing techniques.
Abstract: Video copy detection is a complementary approach to watermarking. As opposed to watermarking, which relies on inserting a distinct pattern into the video stream, video copy detection techniques match content-based signatures to detect copies of video. Existing typical content-based copy detection schemes have relied on image matching. This paper proposes two new sequence-matching techniques for copy detection and compares the performance with one of the existing techniques. Motion, intensity and color-based signatures are compared in the context of copy detection. Results are reported on detecting copies of movie clips.

281 citations

Journal Article•DOI•
01 May 2005
TL;DR: The authors' privacy console manages operator access to different versions of video-derived data according to access-control lists and their PrivacyCam is a smart camera that produces a video stream with privacy-intrusive information already removed.
Abstract: Closed-circuit television cameras used today for surveillance sometimes enable privacy intrusion. The authors' privacy console manages operator access to different versions of video-derived data according to access-control lists. Additionally, their PrivacyCam is a smart camera that produces a video stream with privacy-intrusive information already removed.

227 citations

Journal Article•DOI•
TL;DR: The approach proposed here is inspired and influenced by well-established video production processes and is used to classify the transition effects used in video and to design automatic edit effect detection algorithms.
Abstract: Effective and efficient tools for segmenting and content-based indexing of digital video are essential to allow easy access to video-based information. Most existing segmentation techniques do not use explicit models of video. The approach proposed here is inspired and influenced by well-established video production processes. Computational models of these processes are developed. The video models are used to classify the transition effects used in video and to design automatic edit effect detection algorithms. Video segmentation has been formulated as a production model based classification problem. The video models are also used to define segmentation error measures. Experimental results from applying the proposed technique to commercial cable television programming are presented.

223 citations


Cited by
01 Jan 2002

9,314 citations

Journal Article•DOI•

6,278 citations

Journal Article•DOI•
TL;DR: The survey includes 100+ papers covering the research aspects of image feature representation and extraction, multidimensional indexing, and system design, three of the fundamental bases of content-based image retrieval.

2,197 citations

Journal Article•DOI•
TL;DR: This paper empirically evaluates facial representation based on statistical local features, Local Binary Patterns, for person-independent facial expression recognition, and observes that LBP features perform stably and robustly over a useful range of low resolutions of face images, and yield promising performance in compressed low-resolution video sequences captured in real-world environments.

2,098 citations

Proceedings Article•DOI•
01 Feb 1997
TL;DR: The VisualSEEk system is novel in that the user forms the queries by diagramming spatial arrangements of color regions by utilizing color information, region sizes and absolute and relative spatial locations.
Abstract: We describe a highly functional prototype system for searching by visual features in an image database. The VisualSEEk system is novel in that the user forms the queries by diagramming spatial arrangements of color regions. The system finds the images that contain the most similar arrangements of similar regions. Prior to the queries, the system automatically extracts and indexes salient color regions from the images. By utilizing efficient indexing techniques for color information, region sizes and absolute and relative spatial locations, a wide variety of complex joint color/spatial queries may be computed.

2,084 citations