
Showing papers on "Video tracking published in 1996"


Journal ArticleDOI
TL;DR: A novel observation model based on motion compensated subsampling is proposed for a video sequence and Bayesian restoration with a discontinuity-preserving prior image model is used to extract a high-resolution video still given a short low-resolution sequence.
Abstract: The human visual system appears to be capable of temporally integrating information in a video sequence in such a way that the perceived spatial resolution of a sequence appears much higher than the spatial resolution of an individual frame. While the mechanisms in the human visual system that do this are unknown, the effect is not too surprising given that temporally adjacent frames in a video sequence contain slightly different, but unique, information. This paper addresses the use of both the spatial and temporal information present in a short image sequence to create a single high-resolution video frame. A novel observation model based on motion compensated subsampling is proposed for a video sequence. Since the reconstruction problem is ill-posed, Bayesian restoration with a discontinuity-preserving prior image model is used to extract a high-resolution video still given a short low-resolution sequence. Estimates computed from a low-resolution image sequence containing a subpixel camera pan show dramatic visual and quantitative improvements over bilinear, cubic B-spline, and Bayesian single frame interpolations. Visual and quantitative improvements are also shown for an image sequence containing objects moving with independent trajectories. Finally, the video frame extraction algorithm is used for the motion-compensated scan conversion of interlaced video data, with a visual comparison to the resolution enhancement obtained from progressively scanned frames.

1,058 citations
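
The observation model described above can be illustrated with a minimal sketch: each low-resolution frame is modeled as a motion-compensated, subsampled view of an underlying high-resolution still. Integer-pixel shifts and a fixed subsampling factor are simplifying assumptions here; the paper handles subpixel motion and solves the ill-posed inverse problem with a Bayesian prior.

```python
import numpy as np

def observe_lowres(hi, dy, dx, factor=2):
    """One low-resolution frame = shift (motion) then subsample.
    Integer shifts via np.roll are a simplification of the paper's
    subpixel motion-compensated subsampling model."""
    shifted = np.roll(np.roll(hi, dy, axis=0), dx, axis=1)
    return shifted[::factor, ::factor]

hi = np.arange(64, dtype=float).reshape(8, 8)      # toy high-resolution still
shifts = [(0, 0), (0, 1), (1, 0), (1, 1)]
frames = [observe_lowres(hi, dy, dx) for dy, dx in shifts]
# The four 4x4 frames sample four distinct phases of the 8x8 grid -- exactly
# the extra information a multi-frame restoration method can exploit.
```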


Proceedings ArticleDOI
01 Aug 1996
TL;DR: This work presents a hybrid tracking method that combines the accuracy of vision-based tracking with the robustness of magnetic tracking without compromising real-time performance or usability.
Abstract: Accurate registration between real and virtual objects is crucial for augmented reality applications. Existing tracking methods are individually inadequate: magnetic trackers are inaccurate, mechanical trackers are cumbersome, and vision-based trackers are computationally problematic. We present a hybrid tracking method that combines the accuracy of vision-based tracking with the robustness of magnetic tracking without compromising real-time performance or usability. We demonstrate excellent registration in three sample applications.

457 citations


Patent
12 Jul 1996
TL;DR: In this article, a method and apparatus for use in a digital video delivery system is provided, where a digital representation of an audio-visual work, such as an MPEG file, is parsed to produce a tag file.
Abstract: A method and apparatus for use in a digital video delivery system is provided. A digital representation of an audio-visual work, such as an MPEG file, is parsed to produce a tag file. The tag file includes information about each of the frames in the audio-visual work. During the performance of the audio-visual work, data from the digital representation is sent from a video pump to a decoder. Seek operations are performed by causing the video pump to stop transmitting data from the current position in the digital representation, and to start transmitting data from a new position in the digital representation. The information in the tag file is inspected to determine the new position from which to start transmitting data. To ensure that the data stream transmitted by the video pump maintains compliance with the applicable video format, prefix data that includes appropriate header information is transmitted by said video pump prior to transmitting data from the new position. Fast and slow forward and rewind operations are performed by selecting video frames based on the information contained in the tag file and the desired presentation rate, and generating a data stream containing data that represents the selected video frames. A video editor is provided for generating a new video file from pre-existing video files. The video editor selects frames from the pre-existing video files based on editing commands and the information contained in the tag files of the pre-existing video files. A presentation rate, start position, end position, and source file may be separately specified for each sequence to be created by the video editor.

388 citations


Journal ArticleDOI
Wei Ding1, Bede Liu1
TL;DR: A feedback re-encoding method with a rate-quantization model, which can be adapted to changes in picture activities, is developed and used for quantization parameter selection at the frame and slice level.
Abstract: For MPEG video coding and recording applications, it is important to select the quantization parameters at slice and macroblock levels to produce consistent quality image for a given bit budget. A well-designed rate control strategy can improve the overall image quality for video transmission over a constant-bit-rate channel and fulfil the editing requirement of video recording, where a certain number of new pictures are encoded to replace consecutive frames on the storage media using, at most, the same number of bits. We developed a feedback re-encoding method with a rate-quantization model, which can be adapted to changes in picture activities. The model is used for quantization parameter selection at the frame and slice level. The extra computations needed are modest. Experiments show the accuracy of the model and the effectiveness of the proposed rate control method. A new bit allocation algorithm is then proposed for MPEG video coding.

377 citations
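
A feedback rate-quantization loop of the kind described can be sketched as follows. The hyperbolic model R(Q) ≈ X/Q and the exponential-smoothing update are illustrative stand-ins for the paper's model, not its actual formulation.

```python
def select_quantizer(target_bits, activity_X, q_min=1, q_max=31):
    """Pick a quantization parameter from a simple rate model R(Q) = X / Q,
    clamped to the legal MPEG range. Illustrative only."""
    q = activity_X / target_bits
    return max(q_min, min(q_max, round(q)))

def update_activity(activity_X, actual_bits, q_used, alpha=0.5):
    """Feedback re-encoding idea: refine the model constant X from the bits
    actually produced at the quantizer that was used, so the model adapts
    to changes in picture activity."""
    return (1 - alpha) * activity_X + alpha * actual_bits * q_used

X = 120_000.0
q = select_quantizer(target_bits=10_000, activity_X=X)   # model suggests q = 12
X = update_activity(X, actual_bits=9_000, q_used=q)      # fewer bits -> X shrinks
```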


Proceedings ArticleDOI
22 Mar 1996
TL;DR: A metric for the assessment of video coding quality is presented based on a multi-channel model of human spatio-temporal vision that has been parameterized for video coding applications by psychophysical experiments.
Abstract: This paper addresses the problem of quality estimation of digitally coded video sequences. The topic is of great interest since many digital video products are about to be released, and it is thus important to have robust methodologies for testing and performance evaluation of such devices. The inherent problem is that human vision has to be taken into account in order to assess the quality of a sequence with good correlation to human judgment. It is well known that the commonly used metric, the signal-to-noise ratio, is not correlated with human vision. A metric for the assessment of video coding quality is presented. It is based on a multi-channel model of human spatio-temporal vision that has been parameterized for video coding applications by psychophysical experiments. The visual mechanisms are simulated by a spatio-temporal filter bank. The decomposition is then used to account for phenomena such as contrast sensitivity and masking. Once the amount of distortion actually perceived is known, quality can be assessed at various levels. The described metric is able to rate the overall quality of the decoded video sequence as well as the rendition of important features of the sequence such as contours or textures.

372 citations
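
A crude one-dimensional sketch of the multi-channel idea: compare reference and distorted signals band by band rather than with a single global error. The box-blur bands and the uniform band weights below are hypothetical simplifications of the paper's psychophysically parameterized spatio-temporal filter bank.

```python
import numpy as np

def box_blur(x, k):
    """1-D moving average with edge padding (a toy low-pass filter)."""
    pad = np.pad(x, k, mode='edge')
    kernel = np.ones(2 * k + 1) / (2 * k + 1)
    return np.convolve(pad, kernel, mode='valid')

def perceptual_distance(ref, dist, scales=(1, 2, 4)):
    """Accumulate per-band differences across scales, very loosely mimicking
    a multi-channel vision model. Unlike a global SNR, a localized artifact
    shows up in the band where it lives."""
    d = 0.0
    prev_r, prev_d = ref.astype(float), dist.astype(float)
    for k in scales:
        br, bd = box_blur(prev_r, k), box_blur(prev_d, k)
        d += np.abs((prev_r - br) - (prev_d - bd)).mean()   # band difference
        prev_r, prev_d = br, bd                             # coarser scale next
    return d

sig = np.sin(np.linspace(0.0, 6.28, 64))
perceptual_distance(sig, sig)            # 0.0 for identical signals
```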


Patent
18 Dec 1996
TL;DR: In this article, a symbol indicating the position of an image generator is dragged and dropped to a specific position on a map, thereby establishing a logical network connection with the video transmission terminal to which the image generator is connected.
Abstract: In order to let an observer freely select, locate, and display an image from a remote place on a monitor, apparatus and methods are disclosed in which a symbol, shown on a map and indicating the position where an image generator is set, is dragged and dropped to a specific position. This establishes a logical network connection with the video transmission terminal to which the image generator is connected and displays its video in an arbitrary display area. Dragging and dropping the displayed video to another video display area changes the video display position, and dragging and dropping it onto a display stop symbol disconnects the logical network connection and stops the video display of the video camera.

353 citations


Proceedings ArticleDOI
16 Sep 1996
TL;DR: In this article, a scheme for robust interoperable watermarking of MPEG-2 encoded video is presented. The watermark is embedded either into the uncoded video or into the MPEG-2 bitstream, and can be retrieved from the decoded video.
Abstract: Embedding information into multimedia data is a topic that has gained increasing attention recently. For video broadcast applications, watermarking of video, and especially of already encoded video, is interesting. We present a scheme for robust interoperable watermarking of MPEG-2 encoded video. The watermark is embedded either into the uncoded video or into the MPEG-2 bitstream, and can be retrieved from the decoded video. The scheme working on encoded video is of much lower complexity than a complete decoding process followed by watermarking in the pixel domain and re-encoding. Although an existing MPEG-2 bitstream is partly altered, the scheme avoids drift problems. The scheme has been implemented and practical results show that a robust watermark can be embedded into MPEG encoded video which can be used to transmit arbitrary binary information at a data rate of several bytes/second.

332 citations
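
One common way to realize low-complexity embedding of this general kind, shown here purely as an illustration, is to force the parity of a quantized mid-frequency transform coefficient. The coefficient index and quantizer step below are hypothetical, and the paper's scheme operates directly on MPEG-2 bitstream syntax elements rather than on raw coefficient arrays.

```python
import numpy as np

def embed_bit(coeffs, bit, idx=5, step=8):
    """Embed one watermark bit in a block of transform coefficients by
    forcing the parity of one mid-frequency coefficient's quantizer level.
    Hypothetical scheme, not the paper's exact method."""
    c = coeffs.copy()
    level = int(round(c[idx] / step))
    if level % 2 != bit:
        level += 1                  # nudge to the nearest level of right parity
    c[idx] = level * step
    return c

def extract_bit(coeffs, idx=5, step=8):
    """Recover the bit from the coefficient's quantizer-level parity."""
    return int(round(coeffs[idx] / step)) % 2

block = np.array([40., 16., -8., 0., 0., 24., 0., 0.])
extract_bit(embed_bit(block, 1))    # the embedded bit survives re-quantization
```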


Proceedings ArticleDOI
22 Feb 1996
TL;DR: This study showed that previously proposed selective encryption schemes for MPEG video security are inadequate for sensitive applications and discusses the tradeoffs between levels of security and computational and compression efficiency.
Abstract: MPEG (Moving Pictures Expert Group) is an industrial standard for video processing and is widely used in multimedia applications on the Internet. However, no security provision is specified in the standard. We conducted an experimental study of previously proposed selective encryption schemes for MPEG video security. This study showed that these methods are inadequate for sensitive applications. We discuss the tradeoffs between levels of security and computational and compression efficiency.

294 citations


Patent
28 Jun 1996
TL;DR: In this article, a system for enhancing the television presentation of an object at a sporting event includes a sensor (210, 212, 214, 216), which determines the location of the object.
Abstract: A system (200) for enhancing the television presentation of an object at a sporting event includes a sensor (210, 212, 214, 216), which determines the location of the object. Based on the location of the object and the field of view of a broadcast camera (201, 202, 203, 204), a processor (302) determines the position of the object in the video frame of the broadcast camera. Once knowing where the object is positioned within the video frame, the television signal can be edited or augmented to enhance the presentation of the object.

286 citations
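
The geometric core of the patent, locating a tracked object inside a camera's video frame, can be sketched for one axis: given the object's bearing from the sensor and the camera's pointing angle and field of view, compute a horizontal pixel coordinate. The function and parameter names are hypothetical.

```python
def object_pixel(obj_angle_deg, cam_center_deg, fov_deg, width_px):
    """Map an object's bearing into a horizontal pixel coordinate of a
    broadcast camera, given the camera's pointing angle and field of view.
    One-axis sketch; a real system also handles tilt, zoom, and distortion."""
    offset = obj_angle_deg - cam_center_deg
    if abs(offset) > fov_deg / 2:
        return None                              # object outside the frame
    return round(width_px / 2 + offset / fov_deg * width_px)

object_pixel(12.0, 10.0, 40.0, 720)              # slightly right of center
```

Once the pixel position is known, the broadcast signal can be edited or augmented at that location, as the abstract describes.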


Patent
16 Feb 1996
TL;DR: In this article, a video terminal device capable of controlling video playback by a controller stores a position of the video program at which it was interrupted by the user, and the interrupted position is stored in a video library of a video server.
Abstract: In order to provide information that helps a viewing user remember the contents of past viewing of a video program, in an easy-to-comprehend form and in as small an information amount as possible, a video terminal device capable of controlling video playback by a controller stores the position of the video program at which playback was interrupted by the user. The interrupted position is stored in a video library of a video server. Images representative of the portion from the start (or another position) of the interrupted video program up to the interrupted position are extracted by a video digest making program. The extracted representative images are presented as a list display based on reduced icons or as a digest image. The list or the digest image is displayed before resuming the interrupted video program.

266 citations


Patent
18 Sep 1996
TL;DR: In this paper, a graphic image system comprising a video camera producing a first video signal defining a first image including a foreground object and a background, the foreground object preferably including an image of a human subject having a head with a face, an image position estimating system for identifying a position with respect to said foreground object, and a computer, responsive to the position estimation system, for defining a mask region separating the foreground objects from said background.
Abstract: A graphic image system comprising a video camera producing a first video signal defining a first image including a foreground object and a background, the foreground object preferably including an image of a human subject having a head with a face; an image position estimating system for identifying a position with respect to said foreground object, e.g., the head, the foreground object having features in constant physical relation to the position; and a computer, responsive to the position estimating system, for defining a mask region separating the foreground object from said background. The computer generates a second video signal including a portion corresponding to the mask region, responsive to said position estimating system, which preferably includes a character having a mask outline. In one embodiment, the mask region of the second video signal is keyed so that the foreground object of the first video signal shows through, with the second video signal having portions which interact with the foreground object. In another embodiment, means are provided, responsive to the position estimating system, for dynamically defining an estimated boundary of the face and for merging the face, as limited by the estimated boundary, within the mask outline of the character. Video and still imaging devices may be flexibly placed in uncontrolled environments, such as in a kiosk in a retail store, with an actual facial image within the uncontrolled environment placed within a computer-generated virtual world replacing the existing background and any non-participants.

Journal ArticleDOI
01 Feb 1996
TL;DR: In this article, the authors present a technique for calibrating the head-eye geometry and the camera intrinsic parameters: if the intrinsic parameters are known, three pure translational motions suffice to determine the camera orientation, while two motion sequences, each consisting of three orthogonal translations, determine both the camera orientation and the intrinsic parameters.
Abstract: A manipulator wrist-mounted camera considerably facilitates motion stereo, object tracking, and active perception. An important issue in active vision is to determine the camera position and orientation relative to the camera platform (head-eye calibration or hand-eye calibration). We present a technique for calibrating the head-eye geometry and the camera intrinsic parameters. The technique allows camera self-calibration because it requires no reference object and directly uses the images of the environment. Camera self-calibration is important especially where the underlying visual tasks do not permit the use of reference objects. Our method exploits the flexibility of the active vision system, and bases camera calibration on a sequence of specially designed motions. It is shown that if the camera intrinsic parameters are known a priori, the orientation of the camera relative to the platform can be solved using three pure translational motions. If the intrinsic parameters are unknown, then two sequences of motion, each consisting of three orthogonal translations, are necessary to determine the camera orientation and intrinsic parameters. Once the camera orientation and intrinsic parameters are determined, the position of the camera relative to the platform can be computed from an arbitrary nontranslational motion of the platform. All the computations in our method are linear. Experimental results with real images are presented.

Proceedings ArticleDOI
25 Aug 1996
TL;DR: Time-constrained clustering of video shots is proposed to collapse visually similar and temporally local shots into a compact structure that allows the automatic segmentation of scenes and story units that cannot be achieved by existing shot boundary detection schemes.
Abstract: Many video programs have story structures that can be recognized through the clustering of video contents based on low-level visual primitives, and the analysis of high level structures imposed by temporal arrangement of composing elements. In this paper time-constrained clustering of video shots is proposed to collapse visually similar and temporally local shots into a compact structure. We show that the proposed clustering formulations, when incorporated into the scene transition graph framework, allows the automatic segmentation of scenes and story units that cannot be achieved by existing shot boundary detection schemes. The proposed method is able to decompose video into meaningful hierarchies and provide compact representations that reflect the flow of story, thus offering efficient browsing and organization of video.
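
A greedy sketch of the time-constrained clustering idea: a shot joins an existing cluster only if it is visually similar to a member of that cluster and that member is temporally close. The similarity callback, threshold, and window below are hypothetical parameters; the paper's formulation and its scene transition graph are more general.

```python
def time_constrained_clusters(shots, sim, threshold=0.7, window=5):
    """Group shots so that visually similar AND temporally local shots
    collapse into one cluster. `sim(a, b)` is a caller-supplied visual
    similarity in [0, 1]; `window` is the maximum shot-index distance."""
    clusters = []
    for i, shot in enumerate(shots):
        placed = False
        for cluster in clusters:
            if any(i - j <= window and sim(shot, shots[j]) >= threshold
                   for j in cluster):
                cluster.append(i)
                placed = True
                break
        if not placed:
            clusters.append([i])           # start a new cluster
    return clusters

# Toy run: shots are labels, two interleaved "scenes" plus one outlier.
shots = [0, 1, 0, 1, 9]
same = lambda a, b: 1.0 if a == b else 0.0
time_constrained_clusters(shots, same, threshold=0.5, window=2)
```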

Proceedings ArticleDOI
TL;DR: A generalized top-down hierarchical clustering process, which adopts partition clustering recursively at each level of the hierarchy, is studied and used to build hierarchical views of video shots.
Abstract: The large amount of video data makes it a tedious and hard job to browse and annotate video by just fast-forwarding and rewinding. Recent work in video parsing provides a foundation for building interactive and content-based video browsing systems. In this paper, a generalized top-down hierarchical clustering process, which adopts partition clustering recursively at each level of the hierarchy, is studied and used to build hierarchical views of video shots. With the clustering processes, when a list of video programs or clips is provided, a browsing system can use key-frame and/or shot features to cluster shots into classes, each of which consists of shots of similar content. After such clustering, each class of shots can be represented by an icon, which can then be displayed at the higher levels of a hierarchical browser. As a result, users can know roughly the content of video shots even without moving down to a lower level of the hierarchy.

Patent
02 Aug 1996
TL;DR: In this article, the authors proposed a hybrid tracking system that combines the registration accuracy of vision-based tracking and the robustness of magnetic tracking systems, which is applicable to see-through and video augmented reality systems.
Abstract: Systems, methods and computer program products which have the registration accuracy of vision-based tracking systems and the robustness of magnetic tracking systems. Video tracking of landmarks is utilized as the primary method for determining camera position and orientation, but is enhanced by magnetic or other forms of physical tracking of camera movement and orientation. A physical tracker narrows the landmark search area on images, speeding up the landmark search process. Information from the physical tracker may also be used to select one of several solutions of a non-linear equation resulting from the vision-based tracker. The physical tracker may also act as a primary tracker if the image analyzer cannot locate enough landmarks to provide proper registration, thus avoiding complete loss of registration. Furthermore, if one or two landmarks (not enough for a unique solution) are detected, heuristic methods are used to minimize registration loss. Catastrophic failure may be avoided by monitoring the difference between results from the physical tracker and the vision-based tracker and discarding corrections that exceed a certain magnitude. The hybrid tracking system is equally applicable to see-through and video augmented reality systems.

Proceedings ArticleDOI
Jakub Segen1
25 Aug 1996
TL;DR: The system has numerous applications since various statistics and indicators of human activity can be derived from the motion trajectories, including people counts, presence and time spent in a region, traffic density maps and directional traffic statistics.
Abstract: This paper describes a system for real-time tracking of people in video sequences. The input to the system is live or recorded video data acquired by a stationary camera in an environment where the primary moving objects are people. The output consists of trajectories which give the spatio-temporal coordinates of individual persons as they move in the environment. The system uses a new model-based approach to object tracking. It identifies feature points in each video frame, matches feature points across frames to produce feature "paths", then groups short-lived and partially overlapping feature paths into longer living trajectories representing motion of individual persons. The path grouping is based on a novel model-based algorithm for motion clustering. The system runs on an SGI Indy workstation at an average rate of 14 frames a second. The system has numerous applications since various statistics and indicators of human activity can be derived from the motion trajectories. Examples of these indicators described in the paper include people counts, presence and time spent in a region, traffic density maps and directional traffic statistics.

Journal ArticleDOI
TL;DR: In this article, the authors define a video abstract as a sequence of still or moving images presenting the content of a video in such a way that the target group is rapidly provided with concise information about the content while the essential message of the original is preserved.

Book
31 Dec 1996
TL;DR: Rate-Distortion Based Video Compression establishes a general theory for the optimal bit allocation among dependent quantizers, which is used to design efficient motion estimation schemes, video compression schemes and object boundary encoding schemes.
Abstract: From the Publisher: The book contains a review chapter on video compression, a background chapter on optimal bit allocation and the necessary mathematical tools, such as the Lagrangian multiplier method and Dynamic Programming. These two introductory chapters make the book self-contained and a fast way of entering this exciting field. Rate-Distortion Based Video Compression establishes a general theory for the optimal bit allocation among dependent quantizers. The minimum total (average) distortion and the minimum maximum distortion cases are discussed. This theory is then used to design efficient motion estimation schemes, video compression schemes and object boundary encoding schemes. For the motion estimation schemes, the theory is used to optimally trade the reduction of energy in the displaced frame difference (DFD) for the increase in the rate required to encode the displacement vector field (DVF). These optimal motion estimators are then used to formulate video compression schemes which achieve an optimal distribution of the available bit rate among DVF, DFD and segmentation. This optimal bit allocation results in very efficient video coders. In the last part of the book, the proposed theory is applied to the optimal encoding of object boundaries, where the bit rate needed to encode a given boundary is traded for the resulting geometrical distortion. Again, the resulting boundary encoding schemes are very efficient. Rate-Distortion Based Video Compression is ideally suited for anyone interested in this booming field of research and development, especially engineers who are concerned with the implementation and design of efficient video compression schemes. It also represents a foundation for future research, since all the key elements needed are collected and presented uniformly. Therefore, it is ideally suited for graduate students and researchers working in this field.

Journal ArticleDOI
TL;DR: A formal model for video data is developed and it is shown how spatial data structures, suitably modified, provide an elegant way of storing such data.
Abstract: We describe how video data can be organized and structured so as to facilitate efficient querying. We develop a formal model for video data and show how spatial data structures, suitably modified, provide an elegant way of storing such data. We develop algorithms to process various kinds of video queries and show that, in most cases, the complexity of these algorithms is linear. A prototype system, called the Advanced Video Information System (AVIS), based on these concepts, has been designed at the University of Maryland.
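
A toy sketch of interval-based video indexing in the spirit described: each object is stored with the frame intervals in which it appears, and a query asks which objects are visible at a given frame. The class and method names are illustrative, not the AVIS data structures, and a real system would use the modified spatial data structures the paper describes rather than a linear scan.

```python
class VideoIndex:
    """Toy frame-interval index: object name -> list of (start, end) frame
    intervals (inclusive). Hypothetical API for illustration only."""
    def __init__(self):
        self.intervals = {}

    def add(self, obj, start, end):
        self.intervals.setdefault(obj, []).append((start, end))

    def objects_at(self, frame):
        """All objects whose intervals cover `frame`, in sorted order."""
        return sorted(obj for obj, ivs in self.intervals.items()
                      if any(s <= frame <= e for s, e in ivs))

idx = VideoIndex()
idx.add("car", 0, 100)
idx.add("person", 50, 200)
idx.objects_at(75)          # both objects are on screen at frame 75
```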

Proceedings Article
30 Mar 1996
TL;DR: The concept of immersive video, which employs computer vision and computer graphics technologies to provide viewers of live events a sense of total immersion by providing the viewer with a "virtual camera", is introduced.
Abstract: Interactive video and television viewers should have the power to control their viewing position. To realize this, we introduce the concept of immersive video, which employs computer vision and computer graphics technologies to provide viewers of live events a sense of total immersion by providing the viewer with a "virtual camera". Immersive video uses multiple videos of an event, captured from different perspectives, to generate a full 3D digital video of that event. While replaying this 3D digital movie, interactive viewers are able to explore the scene continuously from any perspective. This is accomplished by combining an a priori static model with a dynamic model that is created by assimilating dynamic information from each video stream into a comprehensive three dimensional environment model. We formalize the concept of immersive video, describe the architecture of our current implementation, and illustrate immersive video in staged karate demonstrations and basketball games. In its full realization, immersive video will be a paradigm shift in visual communication which will revolutionize television and video media, and will become an integral part of future telepresence and virtual reality systems.

Proceedings ArticleDOI
30 Mar 1996
TL;DR: WYSIWYF display as discussed by the authors provides correct visual/haptic registration using a vision based object tracking technique and a video keying technique so that what the user can see via a visual interface is consistent with what he/she can feel through a haptic interface using Chroma Keying, a live video image of the user's hand is extracted and blended with the graphic scene of the virtual environment.
Abstract: We propose a new concept of visual/haptic interfaces called the WYSIWYF display. The proposed concept provides correct visual/haptic registration using a vision-based object tracking technique and a video keying technique, so that what the user can see via a visual interface is consistent with what he/she can feel through a haptic interface. Using chroma keying, a live video image of the user's hand is extracted and blended with the graphic scene of the virtual environment. The user's hand "encounters" the haptic device exactly when his/her hand touches a virtual object in the blended scene. The first prototype has been built and the proposed concept was demonstrated.

Patent
08 Apr 1996
TL;DR: In this article, a video data stream analyzer is proposed to eliminate redundancy in the input video signal, and reorganize the video signal so that the spatial and temporal redundancy is increased.
Abstract: A video data stream analyzer modifies an input digital video signal so that the resulting output digital signal can be optimally compressed by a digital video encoder. The video data stream analyzer eliminates redundancy in the input video signal, and reorganizes the input video signal so that the spatial and temporal redundancy is increased. In addition, the video data stream analyzer generates side channel information that is supplied to the video encoder. The side channel information tells the video encoder whether vertical frame-based filtering or vertical field-based filtering is preferable. Additional side channel information specifies the order and duration of the display of the fields after decoding and this information preferably is encoded with the video signal. The video data stream analyzer provides scan detection of the incoming video digital data, and automatically and reliably detects scene cuts, repeated fields, and mixed-field frames in the incoming digital video data in real time independent of the video source. The video data stream analyzer modifies the input video data stream by dropping repeated fields and replacing a frame with a scene cut with a frame having identical fields for video, cartoon, telecine video sources as well as arbitrary combinations of these video sources.

Patent
06 Dec 1996
TL;DR: A sports event video manipulating system for manipulating a representation of a sports event is described in this article, where a sports editor including a video field grabber and an object tracker is used to track an object through a plurality of successive video fields, an object highlighter receiving input from the object tracker and an operative to highlight the tracked object on each of the plurality of video fields.
Abstract: A sports event video manipulating system for manipulating a representation of a sports event. The sports editor includes a video field grabber operative to grab at least one video field, a video image A/D converter operative to digitize a grabbed video field, an object tracker operative to track an object through a plurality of successive video fields, an object highlighter receiving input from the object tracker and operative to highlight the tracked object on each of the plurality of successive video fields, a D/A image converter operative to convert output of the object highlighter into a video standard format, and a video display monitor.

Journal ArticleDOI
TL;DR: The focus of this research is the use of a society of low-level models for performing relatively high-level tasks, such as retrieval and annotation of image and video libraries.
Abstract: The average person with a computer will soon have access to the world's collections of digital video and images. However, unlike text that can be alphabetized or numbers that can be ordered, image and video has no general language to aid in its organization. Tools that can "see" and "understand" the content of imagery are still in their infancy, but they are now at the point where they can provide substantial assistance to users in navigating through visual media. This paper describes new tools based on "vision texture" for modeling image and video. The focus of this research is the use of a society of low-level models for performing relatively high-level tasks, such as retrieval and annotation of image and video libraries. This paper surveys recent and present research in this fast-growing area.

Patent
Robert J. Gove1
30 Aug 1996
TL;DR: In this paper, a system for stabilizing a video recording of a scene (20, 22, and 24) made with a video camera (34) is provided. The video recording may include video data (36) and audio (38) data.
Abstract: A system (26) for stabilizing a video recording of a scene (20, 22, & 24) made with a video camera (34) is provided. The video recording may include video data (36) and audio (38) data. The system (26) may include source frame storage (64) for storing source video data (36) as a plurality of sequential frames. The system (26) may also include a processor (50) for detecting camera movement occurring during recording and for modifying the video data (36) to compensate for the camera movement. Additionally the system (26) may include destination frame storage (70) for storing the modified video data as plurality of sequential frames.

Patent
19 Nov 1996
TL;DR: In this paper, an approach for detecting a cut in a video comprises arrangements for acquiring video images from a source, for deriving from the video images a pixel-based difference metric, and for measuring video content of video images to provide up-to-date test criteria.
Abstract: Apparatus for detecting a cut in a video comprises arrangements for acquiring video images from a source, for deriving from the video images a pixel-based difference metric, for deriving from the video images a distribution-based difference metric, and for measuring video content of the video images to provide up-to-date test criteria. Arrangements are included for combining the pixel-based difference metric and the distribution-based difference metric, taking into account the up-to-date test criteria provided so as to derive a scene change candidate signal and for filtering the scene change candidate signal so as to generate a scene change frame list.
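The combination of a pixel-based and a distribution-based difference metric can be sketched as follows. Here "pixel-based" is taken to mean mean absolute frame difference and "distribution-based" to mean a grey-level histogram distance; the fixed thresholds stand in for the patent's adaptive, content-derived test criteria, and all names and bin counts are illustrative assumptions.

```python
# Sketch of cut detection that requires both a pixel-based and a
# distribution-based difference metric to agree before flagging a cut.
# Frames are flat lists of grey-level values in [0, 256).

def pixel_diff(a, b):
    """Mean absolute per-pixel difference between two frames."""
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)

def histogram(frame, bins=8, maxval=256):
    """Grey-level histogram with equal-width bins."""
    h = [0] * bins
    for v in frame:
        h[v * bins // maxval] += 1
    return h

def hist_diff(a, b, bins=8):
    """L1 distance between grey-level histograms, normalized by frame size."""
    ha, hb = histogram(a, bins), histogram(b, bins)
    return sum(abs(x - y) for x, y in zip(ha, hb)) / len(a)

def detect_cuts(frames, pix_thresh=30.0, hist_thresh=0.5):
    """Flag frame indices where both metrics agree a scene change occurred."""
    cuts = []
    for i, (a, b) in enumerate(zip(frames, frames[1:]), start=1):
        if pixel_diff(a, b) > pix_thresh and hist_diff(a, b) > hist_thresh:
            cuts.append(i)
    return cuts
```

Requiring both metrics suppresses false positives each one suffers alone: pixel differences spike on fast motion, while histogram differences miss cuts between scenes with similar overall brightness.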

Patent
31 Oct 1996
TL;DR: In this paper, a flexible video information analysis apparatus stores a video information data base and a plurality of moving image content analysis algorithms for analyzing the video information in the data base, and a user can manipulate a mouse to select one of the analysis algorithms.
Abstract: A flexible video information analysis apparatus stores a video information data base and a plurality of moving image content analysis algorithms for analyzing the video information in the data base. A user can manipulate a mouse to select one of the analysis algorithms. The selected algorithm is used to analyze video information in the data base.

Proceedings ArticleDOI
17 Jun 1996
TL;DR: The novelty of this work is that it proposes to integrate speech understanding and image analysis algorithms for extracting information in news or sports video indexing, where usually speech analysis is more efficient in detecting events than image analysis.
Abstract: We study an important problem in multimedia databases, namely the automatic extraction of indexing information from raw data based on video content. The goal of our research project is to develop a prototype system for automatic indexing of sports videos. The novelty of our work is that we propose to integrate speech understanding and image analysis algorithms for extracting information. The main thrust of this work comes from the observation that in news or sports video indexing, speech analysis is usually more efficient at detecting events than image analysis. Therefore, in our system, the audio processing modules are first applied to locate candidates in the whole data. This information is passed to the video processing modules, which further analyze the video. The final products of video analysis are pointers to the locations of interesting events in a video. Our algorithms have been tested extensively with real TV programs, and results are presented and discussed.
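The audio-first pipeline described above can be sketched in a few lines: a cheap audio detector proposes candidate times, and a video analyzer is run only at those candidates to confirm events. Both detectors here are hypothetical stand-ins (thresholded audio energy and motion scores), not the paper's actual speech-understanding or image-analysis modules.

```python
# Sketch of a two-stage, audio-first indexing pipeline: audio proposes
# candidate event times, video analysis confirms or rejects each one.

def audio_candidates(energy, thresh=0.8):
    """Propose times where audio energy spikes (e.g. crowd cheering)."""
    return [t for t, e in enumerate(energy) if e > thresh]

def video_confirms(motion, t, thresh=0.5):
    """Confirm a candidate only if the video shows high motion there."""
    return motion[t] > thresh

def index_events(energy, motion):
    """Run the expensive video check only at audio-proposed candidates."""
    return [t for t in audio_candidates(energy) if video_confirms(motion, t)]
```

The payoff is cost: the expensive video modules run only on the short list of audio candidates rather than on every frame of the broadcast.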

Proceedings ArticleDOI
08 Feb 1996
TL;DR: These cameras will be a standard peripheral on all PCs bundled for multimedia applications; given that in excess of 60M PCs will be sold this year, a sizable new market for electronic cameras is being created.
Abstract: Recent advances in video compression and digital networking technology, combined with the ever increasing power of PCs and workstations, are creating enormous opportunities to develop new multimedia products and services built upon sophisticated voice, data, image and video processing. This will create a significant demand for compact, low-cost, low-power electronic cameras for video and still image capture. These cameras will be a standard peripheral on all PCs bundled for multimedia applications. Given that in excess of 60M PCs will be sold this year, a sizable new market for electronic cameras is being created.

Patent
11 Dec 1996
TL;DR: In this article, the position of the detection measurement frame having a feature pattern with the largest similarity to the standard feature pattern obtained from the standard measurement frame is determined, and an imaging condition of a television camera is controlled on the basis of the position information of the detection measurement frame, in order to attain a video camera system able to suitably track the object's motion.
Abstract: A video camera system can suitably track a moving object without being influenced by other objects outside the desired image. Detection feature patterns are formed after brightness and hue frequency feature data are obtained on the basis of image information of the detection measurement frame. The position of the detection measurement frame having a feature pattern with the largest similarity to the standard feature pattern obtained from the standard measurement frame is determined. An imaging condition of a television camera is controlled on the basis of the position information of the detection measurement frame, in order to attain a video camera system able to suitably track the object's motion. Further, a video camera system can obtain a face image of a constant size with a simple construction. The area of the face image on the display plane is detected as the detected face area, and by comparing this with a standard face area, zoom processing is performed so that the difference becomes zero. Thus, it is unnecessary to use a distance sensor or similar device, and a video camera system with a simple construction can be obtained.
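The feature-pattern matching step can be sketched as histogram matching over sliding windows: the standard measurement frame yields a brightness histogram, and the candidate detection frame whose histogram is most similar marks the object's new position. The window size, bin count, and histogram-intersection similarity are illustrative choices; the patent also uses hue features, which this toy omits.

```python
# Toy sketch of tracking by feature-pattern similarity: slide a detection
# window over the frame and return the position whose brightness histogram
# best matches the standard (reference) pattern.

def brightness_hist(pixels, bins=4, maxval=256):
    """Brightness frequency pattern over a flat list of pixel values."""
    h = [0] * bins
    for v in pixels:
        h[v * bins // maxval] += 1
    return h

def window(frame, x, y, w, h):
    """Flatten the w x h window of a 2-D frame whose top-left is (x, y)."""
    return [frame[j][i] for j in range(y, y + h) for i in range(x, x + w)]

def similarity(ha, hb):
    """Histogram intersection: larger means more similar patterns."""
    return sum(min(a, b) for a, b in zip(ha, hb))

def track(frame, standard_hist, w, h):
    """Return the (x, y) of the detection frame most similar to the standard."""
    H, W = len(frame), len(frame[0])
    best, best_sim = (0, 0), -1
    for y in range(H - h + 1):
        for x in range(W - w + 1):
            s = similarity(brightness_hist(window(frame, x, y, w, h)),
                           standard_hist)
            if s > best_sim:
                best_sim, best = s, (x, y)
    return best
```

The returned position would then drive the camera's pan/tilt control; the face-size zoom loop in the second half of the abstract is the analogous comparison applied to a detected area rather than a histogram.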