
Showing papers on "Smacker video" published in 1998


Patent
25 Jun 1998
TL;DR: In this article, a system is proposed for adaptively transporting video over networks in which the available bandwidth varies with time. The system comprises a video/audio codec that compresses, codes, decodes, and decompresses video streams transmitted over networks whose available bandwidth varies with time and location.
Abstract: A system for adaptively transporting video over networks wherein the available bandwidth varies with time. The system comprises a video/audio codec that functions to compress, code, decode and decompress video streams that are transmitted over networks having available bandwidths that vary with time and location. Depending on the channel bandwidth, the system adjusts the compression ratio to accommodate a plurality of bandwidths ranging from 20 Kbps for POTS to several Mbps for switched LAN and ATM environments. Bandwidth adjustability is provided by offering a trade-off between video resolution, frame rate and individual frame quality. The system generates a video data stream comprised of Key, P and B frames from a raw source of video. Each frame type is further comprised of multiple levels of data representing varying degrees of quality. In addition, several video server platforms can be utilized in tandem to transmit video/audio information, with each video server platform transmitting information for a single compression/resolution level.
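The bandwidth-adaptive trade-off the abstract describes can be sketched as a simple level-selection step. The level table and the `select_level` helper below are illustrative assumptions, not details from the patent:

```python
# Hypothetical sketch of bandwidth-adaptive level selection: given a channel
# estimate, pick the richest (resolution, frame-rate) level whose bit rate fits.
# The level table is invented for illustration; the patent describes many more
# intermediate quality levels per frame type.

LEVELS = [
    # (name, resolution, frames_per_sec, bit_rate_bps)
    ("POTS", (160, 120),  7.5,    20_000),
    ("ISDN", (320, 240), 15.0,   128_000),
    ("LAN",  (640, 480), 30.0, 1_500_000),
]

def select_level(available_bps):
    """Return the richest level that fits in the available bandwidth,
    falling back to the lowest level when nothing fits."""
    best = LEVELS[0]
    for level in LEVELS:
        if level[3] <= available_bps:
            best = level
    return best

print(select_level(150_000)[0])  # a level at or below 150 kbps
```

In practice the server would re-run this selection as the channel estimate changes, which is the "adaptive" part of the scheme.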

575 citations


Journal ArticleDOI
01 May 1998
TL;DR: A new set of methods for indexing into the video sequence, based on the geometric and dynamic information contained in a scene-based representation, complements the more traditional content-based indexing methods.
Abstract: Video is a rich source of information. It provides visual information about scenes. This information is implicitly buried inside the raw video data, however, and is provided at the cost of very high temporal redundancy. While the standard sequential form of video storage is adequate for viewing in a movie mode, it fails to support the rapid access to information of interest that is required in many of the emerging applications of video. This paper presents an approach for efficient access, use and manipulation of video data. The video data are first transformed from their sequential and redundant frame-based representation, in which the information about the scene is distributed over many frames, to an explicit and compact scene-based representation, to which each frame can be directly related. This compact reorganization of the video data supports nonlinear browsing and efficient indexing to provide rapid access directly to information of interest. This paper describes a new set of methods for indexing into the video sequence based on the scene-based representation. These indexing methods are based on geometric and dynamic information contained in the video. These methods complement the more traditional content-based indexing methods, which utilize image appearance information (namely, color and texture properties), but are considerably simpler to achieve and are highly computationally efficient.

334 citations


Journal ArticleDOI
TL;DR: This paper discusses the problem of transcoding H.263-based video streams and shows that the computational complexity of the basic transcoding model can be reduced for each model by, on average, 39% and 23% without significant loss in quality.
Abstract: This paper discusses the problem of transcoding H.263-based video streams. Two different models for transcoding are examined, rate reduction and resolution reduction. Results show that the computational complexity of the basic transcoding model can be reduced for each model by, on average, 39% and 23%, respectively, without significant loss in quality. Comparisons with the scalable coding model are also shown.

222 citations


Proceedings ArticleDOI
03 Jan 1998
TL;DR: The goal of this work is to show the utility of integrating language and image understanding techniques for video skimming by extraction of significant information, such as specific objects, audio keywords and relevant video structure.
Abstract: Digital video is rapidly becoming important for education, entertainment and a host of multimedia applications. With the size of video collections growing to thousands of hours, technology is needed to browse segments effectively in a short time without losing the content of the video. We propose a method to extract the significant audio and video information and create a skim video which represents a very short synopsis of the original. The goal of this work is to show the utility of integrating language and image understanding techniques for video skimming by extraction of significant information, such as specific objects, audio keywords and relevant video structure. The resulting skim video is much shorter, with compaction as high as 20:1, yet retains the essential content of the original segment. We have conducted a user study to test the content summarization and the effectiveness of the skim as a browsing tool.
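The 20:1 compaction the abstract reports suggests a budgeted selection step. The sketch below is a hypothetical simplification that assumes per-segment importance scores are already available; in the actual system those scores come from the language and image understanding analysis:

```python
def build_skim(segments, compaction=20):
    """Greedily pick the highest-scoring segments until the skim reaches
    1/compaction of the original duration. `segments` is a list of
    (start_sec, duration_sec, score) tuples; this helper is illustrative,
    not the paper's algorithm."""
    total = sum(d for _, d, _ in segments)
    budget = total / compaction
    skim, used = [], 0.0
    for seg in sorted(segments, key=lambda s: s[2], reverse=True):
        if used + seg[1] <= budget:
            skim.append(seg)
            used += seg[1]
    return sorted(skim)  # restore temporal order for playback

segs = [(0, 5, 0.9), (5, 5, 0.1), (10, 5, 0.8), (15, 85, 0.2)]
print(build_skim(segs))
```

A looser compaction target simply admits more of the ranked segments before the budget is exhausted.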

220 citations


Proceedings ArticleDOI
28 Jun 1998
TL;DR: An effective approach for video scene structure construction is presented, in which shots are grouped into semantically related scenes; the output of the proposed algorithm provides a structured video that greatly facilitates the user's access.
Abstract: While existing shot-based video analysis approaches provide users with better access to the video than the raw data stream does, they are still not sufficient for meaningful video browsing and retrieval, since: (1) the shots in a long video are still too many to be presented to the user; and (2) shots do not capture the underlying semantic structure of the video, based on which the user may wish to browse/retrieve the video. To explore video structure at the semantic level, this paper presents an effective approach for video scene structure construction, in which shots are grouped into semantically related scenes. The output of the proposed algorithm provides a structured video that greatly facilitates the user's access. Experiments based on real-world movie videos validate the effectiveness of the proposed approach.
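The shot-grouping idea can be illustrated with a toy greedy pass. The scalar shot features and the `threshold` value below are assumptions for illustration only; the paper's actual method uses richer visual similarity between shots:

```python
def group_shots(shot_features, threshold=0.5):
    """Toy sketch: start a new scene whenever the feature distance between
    consecutive shots exceeds `threshold`. Returns scenes as lists of shot
    indices. Real scene construction compares shots non-adjacently too."""
    scenes, current = [], [0]
    for i in range(1, len(shot_features)):
        if abs(shot_features[i] - shot_features[i - 1]) > threshold:
            scenes.append(current)
            current = []
        current.append(i)
    scenes.append(current)
    return scenes

# Five shots; the feature jumps at shots 2 and 4 start new scenes.
print(group_shots([0.1, 0.2, 0.9, 1.0, 0.15]))
```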

177 citations


Patent
14 Aug 1998
TL;DR: In this article, a system and method for video cataloging is described, in which video feature extractors produce metadata tracks from the video information, and each metadata track indexes the stored video information.
Abstract: One aspect of the invention is directed to a system and method for video cataloging. The video is cataloged according to predefined or user definable metadata. The metadata is used to index and then retrieve encoded video. Video feature extractors produce metadata tracks from the video information, and each metadata track indexes the stored video information. A feature extractor registration interface registers the video feature extractors, providing for registration with the video engine of new video feature extractors with new metadata tracks.

162 citations


Patent
30 Jun 1998
TL;DR: In this paper, an apparatus and method for combining digital information with a video stream and for using the digital information to modify or augment video frames in the video stream are disclosed; the video processor uses the received auxiliary data to identify a portion of the at least one video frame, the portion being modified in the act of modifying the video frame.
Abstract: An apparatus and method for combining digital information with a video stream and for using the digital information to modify or augment video frames in the video stream are disclosed. The apparatus for decoding a video stream comprises a video receiver configured to receive a video stream, the video stream including a plurality of video frames. A video processor is configured to receive auxiliary data corresponding to the video stream, the auxiliary data including information indicative of at least one video frame of the plurality of video frames. The video processor is further configured to modify the video frame in accordance with the auxiliary data. The video processor uses the received auxiliary data to identify a portion of the at least one video frame, the portion being modified in the act of modifying the video frame, other portions of the at least one video frame not being so modified. The video processor applies a filter to the portion of the at least one video frame.

153 citations


Proceedings ArticleDOI
16 Aug 1998
TL;DR: This work has developed a scheme for automatically extracting text from digital images and videos for content annotation and retrieval that results in segmented characters that can be directly processed by an OCR system to produce ASCII text.
Abstract: Efficient content-based retrieval of image and video databases is an important application due to rapid proliferation of digital video data on the Internet and corporate intranets. Text either embedded or superimposed within video frames is very useful for describing the contents of the frames, as it enables both keyword and free-text based search, automatic video logging, and video cataloging. We have developed a scheme for automatically extracting text from digital images and videos for content annotation and retrieval. We present our approach to robust text extraction from video frames, which can handle complex image backgrounds, deal with different font sizes, font styles, and font appearances such as normal and inverse video. Our algorithm results in segmented characters that can be directly processed by an OCR system to produce ASCII text. Results from our experiments with over 5000 frames obtained from twelve MPEG video streams demonstrate the good performance of our system in terms of text identification accuracy and computational efficiency.
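One step of such a text-extraction pipeline can be sketched as locating candidate text rows in a binarized frame. This toy `find_text_rows` helper is an invented simplification; the authors' system additionally handles complex backgrounds, multiple font sizes and styles, and inverse video:

```python
def find_text_rows(frame, threshold=128, min_pixels=3):
    """Binarize a grayscale frame (a list of pixel rows, values 0-255) and
    report row indices with enough bright pixels to be candidate text lines.
    `threshold` and `min_pixels` are illustrative parameters, not values
    from the paper."""
    rows = []
    for y, row in enumerate(frame):
        bright = sum(1 for p in row if p >= threshold)
        if bright >= min_pixels:
            rows.append(y)
    return rows

# A tiny frame with a bright caption-like band on rows 1-2.
frame = [
    [0, 0, 0, 0, 0, 0],
    [0, 255, 255, 255, 0, 0],
    [0, 255, 0, 255, 255, 0],
    [0, 0, 0, 0, 0, 0],
]
print(find_text_rows(frame))
```

After localization, the segmented character regions would be handed to an OCR engine, as the abstract describes.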

151 citations


Patent
08 Sep 1998
TL;DR: A scalable real-time modular video processing system is described, which includes a processing module (PM 10) containing at least one general purpose microprocessor (12) which controls hardware and software operations using control data. One or more video processing modules (20) are also provided, each containing parallel pipelined video hardware which is programmable by the control data.
Abstract: A scalable real-time modular video processing system. The modular video processing system includes a processing module (PM 10) containing at least one general purpose microprocessor (12) which controls hardware and software operations using control data. One or more video processing modules (20) are also provided, each containing parallel pipelined video hardware which is programmable by the control data to provide different video processing operations on an input video data stream. Each video processing module also contains one or more connections for accepting one or more daughterboards which each perform a particular image processing task. A global video bus (30) routes video data between the processing module and each video processing module, while a global control bus (40) provides control data to/from the processing module from/to the video processing modules, separate from the video data on the global video bus. A hardware control library on the processing modules provides an application programming interface including high-level C callable functions which allow programming of the video hardware as the video processing system is modified for different applications.

147 citations


Patent
Shih-Fu Chang1, William Chen1, Horace J. Meng1, Hari Sundaram1, Di Zhong1 
05 May 1998
TL;DR: In this article, object-oriented methods and systems for permitting a user to locate one or more video objects from one or more video clips over an interactive network are disclosed, including a system that allows the user to browse through stored video object attributes within the server computers and an interactive video player.
Abstract: Object-oriented methods and systems for permitting a user to locate one or more video objects from one or more video clips over an interactive network are disclosed. The system includes one or more server computers (110) comprising storage (111) for video clips and databases of video object attributes, a communications network (120), and a client computer (130). The client computer contains a query interface to specify video object attribute information, including motion trajectory information (134), a browser interface to browse through stored video object attributes within the server computers, and an interactive video player.

131 citations


Patent
23 Feb 1998
TL;DR: In this article, a system for interactively organizing and browsing video automatically processes video, creating a video table of contents (VTOC), while providing easy-to-use interfaces for verification, correction, and augmentation of the automatically extracted video structure.
Abstract: A system for interactively organizing and browsing video automatically processes video, creating a video table of contents (VTOC), while providing easy-to-use interfaces for verification, correction, and augmentation of the automatically extracted video structure. Shot detection, shot grouping and VTOC generation are automatically determined without making restrictive assumptions about the structure or content of the video. A nonstationary time series model of difference metrics is used for shot boundary detection. Color and edge similarities are used for shot grouping. Observations about the structure of a wide class of videos are used for generating the table of contents. The use of automatic processing in conjunction with input from the user provides a meaningful video organization.
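The nonstationary difference-metric model for shot boundary detection can be approximated, very roughly, by an adaptive threshold over a sliding window. The `window` and `k` parameters below are illustrative assumptions, not values from the patent:

```python
import statistics

def detect_shot_boundaries(diffs, window=5, k=3.0):
    """Flag frame i as a cut when its inter-frame difference metric exceeds
    the mean of the previous `window` values by k standard deviations.
    A stand-in for the patent's nonstationary time-series model."""
    cuts = []
    for i in range(window, len(diffs)):
        recent = diffs[i - window:i]
        mu = statistics.mean(recent)
        sigma = statistics.pstdev(recent) or 1e-6  # avoid zero threshold
        if diffs[i] > mu + k * sigma:
            cuts.append(i)
    return cuts

# A steady low-difference signal with one large spike (a hard cut) at frame 5.
print(detect_shot_boundaries([1, 2, 1, 2, 1, 50, 2, 1, 2, 1]))
```

Because the threshold tracks the local statistics, slowly varying difference levels (e.g. in high-motion footage) do not trigger false cuts the way a fixed global threshold would.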

Journal ArticleDOI
TL;DR: The audio-based approach to video indexing described by the authors detects music and speech independently even when they occur simultaneously, and provides different video condensation levels based on video structuring that can link the video segments and the director's intentions.
Abstract: The audio-based approach to video indexing described by the authors detects music and speech independently even when they occur simultaneously. The indexed video segments, when presented on the Video Sound Browser, let users randomly access the video. The Video in Time system provides different video condensation levels based on video structuring that can link the video segments and the director's intentions.

Patent
K. Metin Uz1
06 Oct 1998
TL;DR: In this article, a method and apparatus for restricting access to a digital video signal are presented. The method is based on motion compensated encoding: at the decoder, motion compensated decoding is performed on one or more second video picture portions of the video signal using predictions formed from the descrambled first video picture portion, thereby providing a decoded accessed signal.
Abstract: A method and apparatus are provided for restricting access to a digital video signal. According to the method, the digital video signal is encoded to produce an encoded video signal. In encoding the digital video signal, motion compensated encoding is performed on one or more first video picture portions of the digital video signal using a second video picture portion of the video signal as a reference for forming predictions. Only the second video picture portion of the encoded video signal is scrambled thereby producing a restricted access signal that is subsequently stored on a storage medium. Also provided is a method and apparatus for enabling access to a video signal. According to the method, the encoded video signal is received and only a first video picture portion of the video signal is descrambled. The encoded video signal is then decoded. In decoding the encoded video signal, motion compensated decoding is performed on one or more second video picture portions of the video signal using predictions formed from the descrambled first video picture portion thereby providing a decoded accessed signal.

Journal ArticleDOI
TL;DR: This work introduces a highly scalable video compression system for very low bit-rate videoconferencing and telephony applications around 10-30 kbits/s and incorporates a high degree of video scalability into the codec by combining the layered/progressive coding strategy with the concept of embedded resolution block coding.
Abstract: We introduce a highly scalable video compression system for very low bit-rate videoconferencing and telephony applications around 10-30 kbits/s. The video codec first performs a motion-compensated three-dimensional (3-D) wavelet (packet) decomposition of a group of video frames, and then encodes the important wavelet coefficients using a new data structure called tri-zerotrees (TRI-ZTR). Together, the proposed video coding framework forms an extension of the original zero tree idea of Shapiro (1992) for still image compression. In addition, we also incorporate a high degree of video scalability into the codec by combining the layered/progressive coding strategy with the concept of embedded resolution block coding. With scalable algorithms, only one original compressed video bit stream is generated. Different subsets of the bit stream can then be selected at the decoder to support a multitude of display specifications such as bit rate, quality level, spatial resolution, frame rate, decoding hardware complexity, and end-to-end coding delay. The proposed video codec also allows precise bit rate control at both the encoder and decoder, and this can be achieved independently of the other video scaling parameters. Such a scheme is very useful for both constant and variable bit rate transmission over mobile communication channels, as well as video distribution over heterogeneous multicast networks. Finally, our simulations demonstrated comparable objective and subjective performance when compared to the ITU-T H.263 video coding standard, while providing both multirate and multiresolution video scalability.

Proceedings ArticleDOI
01 Sep 1998
TL;DR: CueVideo integrates voice and manual annotation, attachment of related data, visual content search technologies (QBIC), and novel multiview storyboard generation to provide a system where the user can incorporate the type of semantic information that automatic techniques would fail to obtain.
Abstract: Multimedia data is an increasingly important information medium today. Providing intelligent access for effective use of this information continues to offer challenges in digital library research. As computer vision, image processing and speech recognition research continue to progress, we examine the effectiveness of these fully automated techniques in architecting effective video retrieval systems. We present semi-automated techniques that combine manual input, and video and speech technology for automatic content characterization, integrated into a single system we call CueVideo. CueVideo integrates voice and manual annotation, attachment of related data, visual content search technologies (QBIC), and novel multiview storyboard generation to provide a system where the user can incorporate the type of semantic information that automatic techniques would fail to obtain.

Proceedings ArticleDOI
04 Oct 1998
TL;DR: A novel compressed-domain approach to embedding visible watermarks in MPEG-1 and MPEG-2 video streams that adapt to the local video features such as brightness and complexity to achieve consistent perceptual visibility.
Abstract: Digital visible or invisible watermarks are increasingly in demand for protecting or verifying the original image or video ownership. We propose a novel compressed-domain approach to embedding visible watermarks in MPEG-1 and MPEG-2 video streams. Our algorithms operate on the DCT coefficients which are obtained with minimal parsing of input video. The embedded watermarks adapt to the local video features such as brightness and complexity to achieve consistent perceptual visibility. The embedded watermarks are robust against attempts of removal since clear artifacts remain after the possible attacks.
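The brightness-adaptive embedding can be sketched on the DC terms of DCT blocks. The `alpha` strength and the simple scaling rule below are illustrative assumptions, not the paper's actual adaptation formula:

```python
def embed_visible_mark(dc_coeffs, mark, alpha=0.25):
    """Sketch of brightness-adaptive visible watermarking on DCT DC terms:
    each watermark value is scaled by the local block brightness (its DC
    coefficient) so the mark stays consistently visible across dark and
    bright regions. `alpha` is an illustrative strength."""
    return [dc + alpha * dc * m for dc, m in zip(dc_coeffs, mark)]

# Two blocks of differing brightness receive a proportionally scaled mark.
print(embed_visible_mark([100.0, 200.0], [1, 1]))
```

Working directly on DCT coefficients is what lets the scheme operate in the compressed domain with only minimal parsing of the input video, as the abstract notes.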

Patent
09 Nov 1998
TL;DR: In this paper, a method and system are presented for personalizing images inserted into a video stream, provided by a video service, for each of its clients according to a priori individual knowledge of those clients.
Abstract: A method and system for personalizing images inserted into a video stream, provided by a video service (11) to each of its clients (12) according to a priori individual knowledge of those clients. The present invention generates a user profile (10) from video sequences selected by the user from a video server (11) having a plurality of video sequences stored therein. The selected video sequence is personalized according to the user profile for transmission to the user (12).


Patent
25 Aug 1998
TL;DR: In this article, the authors propose a method and apparatus for providing a video e-mail kiosk for creating and sending video E-mail messages such as full motion videos or still snapshots.
Abstract: A method and apparatus for providing a video e-mail kiosk for creating and sending video e-mail messages such as full motion videos or still snapshots. The method comprises recording a video message, requesting an e-mail address of an intended recipient, and sending the video message to the intended recipient. The apparatus comprises a display device capable of displaying video and computer graphics, an input device capable of accepting input from a user, a digital video camera, a microphone, a digital network communications link, and a processor connected to the display device, the input device, the digital video camera, the microphone, and the digital network communications link, and capable of accepting an input from a user and generating display output, and further capable of converting a video input from the digital video camera and an audio input from the microphone into a digital video e-mail message and transmitting the digital video e-mail message over the digital network communications link.

Patent
17 Dec 1998
TL;DR: In this article, a method and apparatus for converting audio-video data from a full motion video format to a slide show presentation with synchronised sound is described, where the video sequence is divided into a number of shorter video segments, key frames are extracted for each segment and a significance measure is calculated for each frame.
Abstract: A method and apparatus is disclosed for converting audio-video data from a full motion video format to a slide show presentation with synchronised sound. The full motion video is received from a source, separated into an audio stream and a video sequence, the video sequence is divided into a number of shorter video segments, key frames are extracted for each segment and a significance measure is calculated for each frame. A database is created wherein the extracted data is stored for subsequent (off-line) processing and reproduction. The system may have more than one system retrieving data from the database, selecting slide frames and subsequently displaying a slide presentation.

Patent
02 Jan 1998
TL;DR: In this article, a method and apparatus for synchronizing audio and video streams in a video conferencing system is provided, where the video stream has a variable frame rate during transmission, extra frames are inserted into the recorded video stream in order to maintain a constant, predetermined frame rate.
Abstract: A method and apparatus for synchronizing audio and video streams in a video conferencing system is provided. During a video conferencing session, audio and video streams are transmitted from one processing system to a remote processing system, where they are recorded. Because the video stream has a variable frame rate during transmission, extra frames are inserted into the recorded video stream in order to maintain a constant, predetermined frame rate. During playback, synchronization information from the audio stream is provided by an audio playback process to a video playback process in order to synchronize the start of playing the audio and video streams, as well as to repeatedly synchronize the audio and video streams during playback.
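The frame-insertion step can be sketched as computing, from the variable arrival times, how many copies of each frame restore a constant rate. The `pad_to_constant_rate` helper below is a hypothetical simplification of the patent's mechanism:

```python
def pad_to_constant_rate(timestamps, fps=15):
    """Given the capture times (in seconds) of received frames, return how
    many times each frame should appear in the recording so playback can
    assume a constant `fps`. Illustrative sketch only; gaps are filled by
    duplicating the preceding frame."""
    period = 1.0 / fps
    counts = []
    for t0, t1 in zip(timestamps, timestamps[1:]):
        counts.append(max(1, round((t1 - t0) / period)))
    counts.append(1)  # the final frame is shown once
    return counts

# Frames arrived at 0 s, 1/15 s, then after a 3-frame gap:
print(pad_to_constant_rate([0.0, 1/15, 4/15]))
```

With the recording at a fixed frame rate, playback can then lean on the audio stream alone for the periodic resynchronization the abstract describes.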

Patent
09 Jan 1998
TL;DR: In this paper, an analog-to-digital converter provides digital videoconferencing data from a camera source, and a second analog to digital converter is arranged to provide digital video from a supplemental analog video signal source, such as a broadcast television signal.
Abstract: A videoconferencing arrangement that selectively creates a composite arrangement of videoconferencing data along with video data from a supplemental video source. In one embodiment, a first analog-to-digital converter provides digital videoconferencing data from a camera source, and a second analog-to-digital converter is arranged to provide digital video data from a supplemental analog video signal source, such as a broadcast television signal. Digital video signals from a remote videoconferencing arrangement are decompressed and provided as input, along with the video data from the camera and supplemental video data, to a video processor. Responsive to selection signals, the video processor scales selected video data and overlays the scaled video data on selected other video data. For example, a live television broadcast can be overlaid with remote videoconferencing data.

Journal ArticleDOI
Ruud M. Bolle1, B.-L. Yeo2, M. M. Yeung2
TL;DR: A novel framework of structural video analysis that focuses on the processing of high-level features as well as low-level visual cues is presented that augments the semantic interpretation of a wide variety of long video segments and assists in the search, navigation, and retrieval of video.
Abstract: As digital video databases become more and more pervasive, finding video in large databases becomes a major problem. Because of the nature of video (streamed objects), accessing the content of such databases is inherently a time-consuming operation. Enabling intelligent means of video retrieval and rapid video viewing through the processing, analysis, and interpretation of visual content are, therefore, important topics of research. In this paper, we survey the art of video query and retrieval and propose a framework for video-query formulation and video retrieval based on an iterated sequence of navigating, searching, browsing, and viewing. We describe how the rich information media of video in the forms of image, audio, and text can be appropriately used in each stage of the search process to retrieve relevant segments. Also, we address the problem of automatic video annotation: attaching meanings to video segments to aid the query steps. Subsequently, we present a novel framework of structural video analysis that focuses on the processing of high-level features as well as low-level visual cues. This processing augments the semantic interpretation of a wide variety of long video segments and assists in the search, navigation, and retrieval of video. We describe several such techniques.

Journal ArticleDOI
01 Nov 1998
TL;DR: The multimedia abstractions used to represent this video in prior systems are summarized and the visualization techniques employed to browse and navigate multiple video documents at once are introduced.
Abstract: The Informedia Digital Video Library contains over a thousand hours of video, consuming over a terabyte of disk space. This paper summarizes the multimedia abstractions used to represent this video in prior systems and introduces the visualization techniques employed to browse and navigate multiple video documents at once.

Patent
31 Dec 1998
TL;DR: In this article, a method and apparatus for video conferencing provides for automatically managing user views to provide ready, video only, full screen video, video with document not shared, and video with shared document views.
Abstract: Method and apparatus for video conferencing provides for automatically managing user views to provide ready, video only, full screen video, video with document not shared, and video with shared document views. Several different automatically managed user-interfaces for a dual monitor system provide, among other things, that the second monitor can be used for local video image only, used for all functions except shared applications, or to display shared applications.

Patent
22 Jan 1998
TL;DR: In this article, a keyframe-based displaying of a video presentation enables a user to select among keyframes, and based on the selecting displays a substantially continuous video stream relating to the presentation.
Abstract: Keyframe-based displaying of a video presentation enables a user to select among keyframes, and based on the selecting displays a substantially continuous video stream relating to the presentation. In particular, various keyframes are displayed in parallel in a reduced and static video format, and the displaying is controlled as starting from a particular active keyframe which subsequently acts as a dynamic cursor frame within the video format. The cursor may be dynamic video plus dynamic audio, dynamic video alone, or static video per interval plus dynamic audio.

Patent
28 Sep 1998
TL;DR: In this article, a method and apparatus are presented for allocating processing resources (120) in an information stream decoder (100) in response to format indicia included within a compressed information stream (IN).
Abstract: The invention comprises a method and apparatus for allocating processing resources (120) in an information stream decoder (100) in response to format indicia included within a compressed information stream (IN). Specifically, the invention comprises an apparatus and method for decoding a video stream having an associated source video format to produce a decoded video stream having the same format or a different format, in which a computational resource (120) is allocated in response to a change in format between the input video format and the resultant video format.

Proceedings ArticleDOI
04 Oct 1998
TL;DR: It is shown that the watermark, when limited to the 4 lowest bitplanes of an 8-bit video, is unnoticeable.
Abstract: In this work we apply a direct sequence spread spectrum model to the watermarking of digital video. First, the video signal is modeled as a sequence of bit planes arranged along the time axis. Watermarking of this sequence is a two-layer operation. A controlling m-sequence first establishes a pseudorandom order in the bitplane stream for later watermarking. Watermarks, defined as m-frames, supplant the tagged bitplanes. We have shown that the watermark, when limited to the 4 lowest bitplanes of an 8-bit video, is unnoticeable. Moreover, attempts at corrupting the image to destroy the watermark render the video useless before damaging the seal itself. The watermarked video is also robust to video editing attempts such as subsampling, frame reordering, etc. The watermark is also identifiable from very short segments of video. Individual frames extracted from the video will also contain copyright information.
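The bitplane idea can be illustrated with a single-pixel sketch. The `embed_in_low_bitplanes` helper below is an invented toy: it overwrites the low bitplanes of one pixel directly, whereas the paper supplants whole pseudorandomly ordered bitplanes with m-frames:

```python
def embed_in_low_bitplanes(pixel, mark_bits, planes=4):
    """Overwrite the `planes` lowest bitplanes of an 8-bit pixel value with
    watermark bits (least significant bit first). The paper found marks
    confined to the 4 lowest planes to be unnoticeable."""
    mask = ~((1 << planes) - 1) & 0xFF   # keep only the high bitplanes
    payload = 0
    for i, bit in enumerate(mark_bits[:planes]):
        payload |= (bit & 1) << i
    return (pixel & mask) | payload

# The four high bits of the pixel survive; the low four carry the mark.
print(bin(embed_in_low_bitplanes(0b10110111, [1, 0, 1, 0])))
```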

Patent
08 Jun 1998
TL;DR: In this article, a base image is produced by an I-frame, and MPEG-2 graphics elements formed either by modifying the I-Frame or overlaying the Iframe with one or more P-frames.
Abstract: Graphics in the same compressed digital format as that of a target video image are combined, on the fly, by an application. Combination is performed by frame modification or overlay techniques. In the preferred embodiment, the compressed video image format conforms to an MPEG-2 compression standard. A base image is produced by an I-frame, and MPEG-2 graphics elements are formed either by modifying the I-frame or by overlaying the I-frame with one or more P-frames.

Patent
Theresa A. Alexander1
05 Jan 1998
TL;DR: In this article, a method for editing a video recording includes receiving (124) a signal including video content and analyzing (128) the video content of the received signal to identify visual attributes which characterize the video contents.
Abstract: A method for editing a video recording includes receiving (124) a signal including video content and analyzing (128) the video content of the received signal to identify visual attributes which characterize the video content. Based, at least in part, on the identified visual attributes of the video content an audio selection (128) with which to augment the received signal is identified from a plurality of available audio selections.