
Showing papers on "Closed captioning" published in 2002


Patent
David H. Sloo1
21 May 2002
TL;DR: In this paper, closed captioning streams of textual data are extracted from video signals received by a client device, and the closed captioning streams may be searched for occurrences of textual data that match one or more search terms.
Abstract: In some implementations, closed captioning streams of textual data are extracted from video signals received by a client device. The closed captioning streams may be searched for occurrences of textual data in the closed captioning streams that match one or more search terms. When the number of matches between the search terms and a particular closed captioning stream exceeds a threshold number, a notification may be sent indicating that content programming determined to be of interest to a viewer has been located and/or the content programming may be recorded.
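A minimal sketch of the search-and-notify logic this abstract describes, assuming case-insensitive counting of term occurrences and an arbitrary threshold; the function names, threshold value, and sample captions are illustrative, not details from the patent.

```python
import re

def count_matches(caption_text: str, search_terms: list) -> int:
    """Count case-insensitive occurrences of any search term in an extracted caption stream."""
    text = caption_text.lower()
    return sum(len(re.findall(re.escape(term.lower()), text)) for term in search_terms)

def check_stream(caption_text: str, search_terms: list, threshold: int = 3):
    """Return a notification message when the number of matches exceeds the threshold."""
    matches = count_matches(caption_text, search_terms)
    if matches > threshold:
        return "Content of interest located (%d matches); notify viewer and/or record." % matches
    return None

# Example: a caption excerpt scanned against viewer-supplied search terms.
captions = "The volcano erupted overnight ... the volcano spewed ash ... lava from the volcano reached the road"
print(check_stream(captions, ["volcano", "lava"], threshold=3))
```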

166 citations


Patent
04 Mar 2002
TL;DR: In this article, a user can override the closed caption presentation format as selected by the originator (e.g., programmer or broadcaster), in order to select alternate presentation attributes based on the user's preference.
Abstract: User customizable advanced closed caption capabilities are provided using closed caption information, such as that described in the Electronic Industries Association (EIA) Television Data Systems Subcommittee standards, EIA-608 or EIA-708. The invention allows the user to override the closed caption presentation format as selected by the originator (e.g., programmer or broadcaster), in order to select alternate presentation attributes based on the user's preference. The invention may also be implemented to customize other forms of text information (e.g., subtitles). The invention also allows for storage and subsequent retrieval and review of text included within the closed caption information, which text serves as a transcript of the program. The methods and apparatus provided are independent of the type of delivery network, content format, and receiver type. In an example embodiment, closed caption information is extracted (e.g., by a closed caption processor 20) from a television signal 10, which television signal 10 also contains corresponding audiovisual programming. The processor 20 determines whether one or more user selected attributes 12 have been set. At least one user selected attribute 12 is applied to at least a portion of the closed caption information (e.g., via a closed caption driver 30). The closed caption information is displayed (e.g., via a display driver 40 and graphics processor 45) on a display device 50 (e.g., a television screen) in accordance with the user selected attributes 12 via a graphical overlay on top of the audiovisual programming. In this manner, user selected advanced closed caption features can be provided using existing closed caption information contained in the television signal.
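A minimal sketch of the attribute-override step described here, assuming captions are rendered from a dictionary of presentation attributes; the attribute names and values are invented for illustration and are not drawn from EIA-608/708.

```python
def apply_user_attributes(originator_attrs: dict, user_attrs: dict) -> dict:
    """User-selected attributes take precedence over the originator's caption presentation format."""
    effective = dict(originator_attrs)
    effective.update({k: v for k, v in user_attrs.items() if v is not None})
    return effective

# Presentation format as selected by the originator (programmer or broadcaster).
originator = {"font_size": 18, "foreground": "white", "background": "black", "opacity": 1.0}
# Attributes the user has explicitly set; unset attributes fall back to the originator's choices.
user = {"font_size": 32, "background": "blue"}

print(apply_user_attributes(originator, user))
# {'font_size': 32, 'foreground': 'white', 'background': 'blue', 'opacity': 1.0}
```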

106 citations


Patent
Tony E. Piotrowski1
11 Jun 2002
TL;DR: In this paper, a system and method are disclosed that allow viewers of video/TV programs to automatically, or by request, receive synchronized supplemental multimedia information related to those programs, using synchronizing information such as keyframes, image triggers extracted using image recognition technology, time codes, or Closed Captioning (CC) and Extended Data Services (EDS) codes.
Abstract: A system and method are disclosed that allow viewers of video/TV programs to automatically, or by request, receive synchronized supplemental multimedia information related to the video/TV programs. The supplemental multimedia information is received as an Internet document, e.g., using Synchronized Multimedia Integration Language (SMIL). Synchronizing information is received/extracted from the video/TV program. The synchronizing information may be in the form of keyframes, image triggers extracted using image recognition technology, time codes or Closed Captioning (CC) and Extended Data Services (EDS) codes. The video/TV program and the supplemental multimedia information are then displayed as a virtual web page.
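A rough sketch of the synchronization idea, assuming the supplemental document has already been reduced to a list of (trigger time, URL) pairs; the trigger times and URLs are invented, and the patent's actual SMIL handling is not reproduced here.

```python
# Hypothetical triggers extracted from a SMIL-like supplemental document:
# (program time in seconds, supplemental page to display).
triggers = [
    (12.0, "http://example.com/recipe.html"),
    (95.5, "http://example.com/cast-bios.html"),
]

def due_supplements(position: float, shown: set) -> list:
    """Return supplemental documents whose trigger time has been reached but not yet displayed."""
    due = [url for t, url in triggers if t <= position and url not in shown]
    shown.update(due)
    return due

shown = set()
for position in (10.0, 30.0, 120.0):  # simulated playback positions derived from time codes or CC/EDS
    for url in due_supplements(position, shown):
        print("t=%6.1fs  display supplement: %s" % (position, url))
```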

73 citations


Patent
21 May 2002
TL;DR: In this article, an exemplary television signal system is described that uses closed caption (CC) data from a standard definition signal, processes the CC data, and overlays the CC data at the video rate of a higher definition signal selected for viewing that does not carry its own embedded closed caption data.
Abstract: A system as described herein enables a user to access auxiliary information when viewing an enhanced performance television signal or program. Particularly, a television signal system is operative, configured, and/or enabled to allow a user to access and/or utilize auxiliary information when viewing a high definition or progressive-scan television signal. Briefly, an exemplary television signal system receives the auxiliary information/data (e.g. closed caption data) on a selected interlaced standard definition input, processes the auxiliary data, and combines or overlays the auxiliary data with a television (video) signal received on a selected input that does not have its own embedded auxiliary information/data. More particularly, an exemplary television signal system such as described herein involves using closed caption (CC) data from a standard definition signal, processing the CC data, and overlaying the CC data at a video rate of a higher definition signal selected for viewing that does not carry its own embedded closed caption data.
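A toy illustration of the overlay idea, with frames modeled as dictionaries; pairing each higher-definition frame with the caption from the nearest-in-time standard definition frame is an assumption made to keep the sketch self-contained, not a description of the actual signal processing.

```python
# Simplified model: the SD input carries embedded caption text, the HD input does not.
sd_frames = [{"t": 0.0, "caption": "Hello."}, {"t": 1.0, "caption": "Welcome back."}]
hd_frames = [{"t": 0.0, "pixels": "..."}, {"t": 1.0, "pixels": "..."}]

def overlay_captions(hd, sd):
    """Attach to each HD frame the caption decoded from the nearest SD frame in time."""
    combined = []
    for frame in hd:
        nearest = min(sd, key=lambda f: abs(f["t"] - frame["t"]))
        combined.append(dict(frame, overlay_text=nearest["caption"]))
    return combined

for frame in overlay_captions(hd_frames, sd_frames):
    print(frame["t"], frame["overlay_text"])
```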

61 citations


Patent
Christopher K. Karstens1
03 Sep 2002
TL;DR: In this article, a system and method for remote audio caption visualizations is presented, where a user uses a personal device during an event to display an enhanced captioning stream corresponding to the event.
Abstract: A system and method for remote audio caption visualizations is presented. A user uses a personal device during an event to display an enhanced captioning stream corresponding to the event. A media-playing device provides a media stream corresponding to the enhanced captioning stream. The media-playing device provides a synchronization signal to the personal device which instructs the personal device to start playing the enhanced captioning stream on the personal device's display. The user views text on the personal display while the media stream plays. The user is able to adjust the timing of the enhanced captioning stream in order to fine-tune the synchronization between the enhanced captioning stream and the media stream.
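A small sketch of the user-adjustable timing described here, assuming the enhanced captioning stream is a list of (start time, text) pairs; the sample captions and offset handling are illustrative only.

```python
captions = [(0.0, "The orchestra begins."), (4.5, "[applause]"), (9.0, "Thank you all for coming.")]

def caption_at(media_time: float, user_offset: float = 0.0) -> str:
    """Return the caption active at media_time, shifted by a user-chosen offset in seconds."""
    shifted = media_time - user_offset
    active = ""
    for start, text in captions:
        if start <= shifted:
            active = text
    return active

# After the synchronization signal starts playback, the user nudges the offset to fine-tune sync.
print(caption_at(5.0))                   # [applause]
print(caption_at(5.0, user_offset=1.0))  # The orchestra begins.
```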

61 citations


Patent
07 Jan 2002
TL;DR: In this paper, a method and apparatus for use in connection with home television receivers is described, involving processing of an electronic signal by audio and video processors; the audio information, including digital representations thereof, is analyzed and modified by comparing words and phrases represented in the audio information with words and phrases stored in electronic memory, in order to eliminate undesirable words or phrases from audible or visible representations of the audio, with options for replacing undesirable words with acceptable words.
Abstract: A method and apparatus for use in connection with home television receivers involving processing an electronic signal, including audio and video processors; the audio information, including digital representations thereof, is analyzed and modified to compare words and phrases represented in the audio information with words and phrases stored in electronic memory for elimination of undesirable words or phrases in audible or visible representations of the audio, with options for replacing undesirable words with acceptable words. The options include varying degrees of selectivity in specifying words as undesirable and control over substitute words which are used to replace undesirable words. The options for control of the method and apparatus for language filtering are selectable from an on-screen menu through use of a conventional television remote transmitter. Full capability of the method and apparatus depends only on the presence of closed caption or similar digitally-encoded language information being received with a television signal. Special instructions transmitted with a television signal may also be responded to, for activating particular language libraries or customizing a library for the program material, as well as for unrelated viewer information and control functions.
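A minimal sketch of the word-substitution step, assuming the library of undesirable words and their replacements lives in a plain dictionary; the words and replacements below are placeholders, and the patent's audio-path processing is not modeled.

```python
import re

# Hypothetical replacement library; in the patent such a library is held in electronic memory
# and may be activated or customized per program by instructions in the signal.
replacements = {"darn": "gosh", "heck": "hey"}

def filter_captions(text: str, library: dict) -> str:
    """Replace undesirable words found in closed caption text with acceptable substitutes."""
    def swap(match):
        word = match.group(0)
        sub = library[word.lower()]
        return sub.capitalize() if word[0].isupper() else sub
    pattern = re.compile(r"\b(" + "|".join(map(re.escape, library)) + r")\b", re.IGNORECASE)
    return pattern.sub(swap, text)

print(filter_captions("Darn, that heck of a storm ruined the picnic.", replacements))
```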

44 citations


Patent
08 Mar 2002
TL;DR: In this paper, a system for performing closed captioning enables a caption prepared remotely by a captioner to be repositioned by someone other than the captioner, such as by a program originator.
Abstract: A system for performing closed captioning enables a caption prepared remotely by a captioner to be repositioned by someone other than the captioner, such as by a program originator. This capability is particularly useful when, for example, the program originator wishes to include a banner in a video but also wishes to avoid having a closed caption interfere with the banner. In one illustrative system, the program originator is a broadcast station that includes a conventional encoder and a broadcast station computer. In one arrangement, control data generated at the station computer is incorporated into the caption data by the station computer. In another arrangement, the control data is sent from the station computer to the captioner computer, which incorporates the control data into the caption data.

36 citations


Patent
Doreen Galli1, Rick Hamilton1, Harry Schatz1
11 Jul 2002
TL;DR: In this article, a guardian review system is disclosed which monitors the television programming that a child watches and reports back to the parent, determining whether the programming contains any of the offensive items specified by the parent.
Abstract: A guardian review system is disclosed which monitors the television programming that a child watches and reports back to the parent. The hardware consists of a logical unit which is used to monitor the television programming. The software is divided into two programs: the monitor program and the report program. The monitor program analyzes the closed captioning signal, the audio signal, the title, and the content ratings of the television programming to determine if the programming contains any of the offensive items specified by the parent. The monitor program creates a record of the offensive language viewed by the child and sends the data to the report program. The report program receives criteria from the parent such as time, channel, duration, rating, and content of a television program. The report program can operate either on the television or on an external computing device, such as the parent's personal computer.
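A compact sketch of the monitor/report split described in this abstract, using only the closed caption text; the offensive-term list, criteria fields, and program metadata are stand-ins invented for the example.

```python
from collections import Counter

offensive_terms = {"term1", "term2"}            # items specified by the parent (placeholders)
criteria = {"channel": "7"}                     # reporting criteria chosen by the parent

def monitor(caption_text: str, program: dict):
    """Monitor program: flag a program whose closed captions contain any listed term."""
    hits = Counter(w for w in caption_text.lower().split() if w in offensive_terms)
    if hits:
        return {"title": program["title"], "channel": program["channel"],
                "time": program["time"], "hits": dict(hits)}
    return None

viewing = {"title": "Late Movie", "channel": "7", "time": "21:15"}
entry = monitor("... term1 ... term2 term1 ...", viewing)

# Report program: pass the monitor's data on when it matches the parent's criteria.
if entry and entry["channel"] == criteria["channel"]:
    print("Report to parent:", entry)
```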

20 citations


Proceedings ArticleDOI
07 Nov 2002
TL;DR: A system which automatically summarizes video by analyzing video color information and utterance information is proposed; important intervals having several color change patterns are discovered by using a probability model.
Abstract: In recent years, digital video has rapidly become important for education, entertainment, and a host of multimedia applications. With the increasing size of video collections, technology is needed to effectively browse a video in a short time without losing any important content. We propose a system which automatically summarizes video by analyzing the video's color information and utterance information. To do so, we use the color histogram of a shot as color information. We discover important intervals having several color change patterns by using a probability model. Furthermore, we extract utterance information by using closed captions. We find the dialogue structure by analyzing the connectivity between utterances. Finally, we integrate the two so as to skim meaningful portions of a video.
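A toy sketch of the color-information half of this approach: comparing coarse histograms of adjacent shots to score color-change patterns. The binning, distance measure, and pixel values are simplifications; the paper's probability model and utterance analysis are not reproduced.

```python
def histogram(pixels, bins=4):
    """Coarse intensity histogram of a shot (stand-in for a full color histogram)."""
    h = [0] * bins
    for p in pixels:
        h[min(p * bins // 256, bins - 1)] += 1
    return [c / len(pixels) for c in h]

def change_score(shot_a, shot_b):
    """L1 distance between shot histograms; large values suggest a color-change boundary."""
    return sum(abs(a - b) for a, b in zip(histogram(shot_a), histogram(shot_b)))

shots = [[10, 20, 30, 40], [12, 25, 28, 45], [200, 210, 220, 230]]
for i in range(len(shots) - 1):
    print("shots %d->%d: change score %.2f" % (i, i + 1, change_score(shots[i], shots[i + 1])))
```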

16 citations


Patent
05 Mar 2002
TL;DR: In this article, a set top box for use with a video signal captioning system is provided, which includes a first port for receiving caption text and a second port for receiving a video signal.
Abstract: A set top box for use with a video signal captioning system is provided. The set top box includes a first port for receiving caption text and a second port for receiving a video signal. The set top box converts the caption text, received from a computer via the first port, into a video image and then combines the video signal from the video source with the video image. The combined signal from the set top box is transmitted as an output video signal.

13 citations


01 Jan 2002
TL;DR: An eTEACH presentation combines a video frame with a slide frame, an external web links frame, a dynamic table of contents that titles the major portions of the lecture and allows jumping to any portion, buttons that allow the lecture to be advanced or rewound 10 or 30 seconds, and fast forward and reverse buttons.
Abstract: An eTEACH presentation combines a video frame (Microsoft MediaPlayer) with a slide frame (Microsoft PowerPoint), an external web links frame, a dynamic table of contents that titles the major portions of the lecture and allows jumping to any portion, buttons that allow the lecture to be advanced or rewound 10 or 30 seconds, and fast forward and reverse buttons; all in an Internet Explorer window. The PowerPoint slides and web links automatically synchronize with the current position in the lecture video. eTEACH supports PowerPoint animation features for viewing in the browser. eTEACH supports accessibility features such as closed captioning and web page readers. eTEACH has been used extensively in reforming a large enrollment computer sciences course.

Journal ArticleDOI
TL;DR: A novel method for summarizing news video based on multimodal analysis of the content is proposed, which exploits the closed caption data to locate semantically meaningful highlights in a news video, and speech signals in an audio stream to align the closed caption data with the video in a time-line.
Abstract: A video summary abstracts the gist from an entire video and also enables efficient access to the desired content. In this paper, we propose a novel method for summarizing news video based on multimodal analysis of the content. The proposed method exploits the closed caption data to locate semantically meaningful highlights in a news video and speech signals in an audio stream to align the closed caption data with the video in a time-line. Then, the detected highlights are described using MPEG-7 Summarization Description Scheme, which allows efficient browsing of the content through such functionalities as multi-level abstracts and navigation guidance. Multimodal search and retrieval are also within the proposed framework. By indexing synchronized closed caption data, the video clips are searchable by inputting a text query. Intensive experiments with prototypical systems are presented to demonstrate the validity and reliability of the proposed method in real applications.
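A minimal sketch of the text-query retrieval that synchronized closed caption data makes possible; the clip boundaries, captions, and all-words matching rule are assumptions for illustration, not the paper's MPEG-7-based implementation.

```python
# Toy index over closed caption text that has been time-aligned with the video.
clips = [
    {"start": 0,  "end": 45,  "caption": "flooding along the river forced evacuations"},
    {"start": 46, "end": 90,  "caption": "the city council approved the new budget"},
    {"start": 91, "end": 130, "caption": "river levels are expected to fall by friday"},
]

def search(query: str):
    """Return clips whose synchronized caption text contains every query word."""
    words = query.lower().split()
    return [c for c in clips if all(w in c["caption"] for w in words)]

for clip in search("river"):
    print("%d-%ds: %s" % (clip["start"], clip["end"], clip["caption"]))
```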

Patent
16 Oct 2002
TL;DR: In this article, a user can override the closed caption presentation format as selected by the originator (e.g., programmer or broadcaster), in order to select alternate presentation attributes based on the user's preference.
Abstract: User customizable advanced closed caption capabilities are provided using closed caption information, such as that described in the Electronic Industries Association (EIA) Television Data Systems Subcommittee standards, EIA-608 or EIA-708. The invention allows the user to override the closed caption presentation format as selected by the originator (e.g., programmer or broadcaster), in order to select alternate presentation attributes based on the user's preference. The invention may also be implemented to customize other forms of text information (e.g., subtitles). The invention also allows for storage and subsequent retrieval and review of text included within the closed caption information, which text serves as a transcript of the program. The methods and apparatus provided are independent of the type of delivery network, content format, and receiver type. In an example embodiment, closed caption information is extracted (e.g., by a closed caption processor 20) from a television signal 10, which television signal 10 also contains corresponding audiovisual programming. The processor 20 determines whether one or more user selected attributes 12 have been set. At least one user selected attribute 12 is applied to at least a portion of the closed caption information (e.g., via a closed caption driver 30). The closed caption information is displayed (e.g., via a display driver 40 and graphics processor 45) on a display device 50 (e.g., a television screen) in accordance with the user selected attributes 12 via a graphical overlay on top of the audiovisual programming. In this manner, user selected advanced closed caption features can be provided using existing closed caption information contained in the television signal.

Proceedings ArticleDOI
10 Dec 2002
TL;DR: An algorithm that takes advantage of both video and closed caption text information for video scene clustering is described, which helps track video from multiple sources for video summarization.
Abstract: Video summarization is receiving increasing attention due to the large amount of video content made available on the Internet. We present an idea to track video from multiple sources for video summarization. An algorithm that takes advantage of both video and closed caption text information for video scene clustering is described. Experimental results are given followed by discussion on future directions.
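A small sketch of clustering scenes by the similarity of their closed caption text; bag-of-words vectors, cosine similarity, and the greedy threshold rule are choices made for the example, not the algorithm reported in the paper (which also uses visual features).

```python
from math import sqrt

def bag_of_words(text):
    words = text.lower().split()
    return {w: words.count(w) for w in set(words)}

def cosine(a, b):
    dot = sum(a[w] * b.get(w, 0) for w in a)
    na = sqrt(sum(v * v for v in a.values()))
    nb = sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

scenes = ["storm damage across the coast", "coast guard rescues after the storm",
          "stock markets rallied today"]
vectors = [bag_of_words(s) for s in scenes]

# Greedy clustering: a scene joins the first cluster whose seed it resembles closely enough.
threshold, clusters = 0.3, []
for i, v in enumerate(vectors):
    for cluster in clusters:
        if cosine(v, vectors[cluster[0]]) >= threshold:
            cluster.append(i)
            break
    else:
        clusters.append([i])
print(clusters)  # [[0, 1], [2]]
```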

Proceedings ArticleDOI
06 Nov 2002
TL;DR: An eTEACH presentation combines a video frame with a slide frame, an external web links frame, a dynamic table of contents that titles the major portions of the lecture and allows jumping to any portion, buttons that allow the lecture to be advanced or rewound 10 or 30 seconds, and fast forward and reverse buttons.
Abstract: An eTEACH presentation combines a video frame (Microsoft MediaPlayer) with a slide frame (Microsoft PowerPoint), an external web links frame, a dynamic table of contents that titles the major portions of the lecture and allows jumping to any portion, buttons that allow the lecture to be advanced or rewound 10 or 30 seconds, and fast forward and reverse buttons; all in an Internet Explorer window. The PowerPoint slides and web links automatically synchronize with the current position in the lecture video. eTEACH supports PowerPoint animation features for viewing in the browser. eTEACH supports accessibility features such as closed captioning and web page readers. eTEACH has been used extensively in reforming a large enrollment computer sciences course.

Patent
19 Aug 2002
TL;DR: In this paper, a color selective material in the form of glasses is provided to a subclass of the viewers in the audience to enable these viewers to distinguish the alphanumeric subtitles from the background.
Abstract: Alphanumeric images are displayed in common view in one color on a background of a different color using a liquid crystal projection device. The two colors are sufficiently similar so they are not distinguished by the unaided eye of the viewer. A color selective material in the form of glasses is provided to a subclass of the viewers in the audience to enable these viewers to distinguish the alphanumeric subtitles from the background. This material extinguishes either the alphanumeric images or the background, thereby either providing darkened letters on a light background or light letters on a darkened background. The projection and viewing apparatus can be utilized as a motion picture closed captioning system for a mixed audience of hearing impaired and non hearing impaired individuals to permit the hearing impaired to view the subtitles while those individuals not wearing such wavelength selective glasses would not see the subtitles.

Book ChapterDOI
15 Jul 2002
TL;DR: Traditional closed captioning provides limited bandwidth for verbatim text and sparse background information to reach the viewer, but the information and richness contained in paralanguage, music, and background sounds is lost to the deaf viewer.
Abstract: Traditional closed captioning provides limited bandwidth for verbatim text and sparse background information to reach the viewer. However, the information and richness contained in paralanguage, music, and background sounds is lost to the deaf viewer. We are exploring ways to fill these gaps with the increased bandwidth offered by new digital media.

Proceedings Article
01 May 2002
TL;DR: The value of multimodal information access is motivated with a vision of multimodal question answering and an example of content-based access to broadcast news video, and a range of applications, required corpora, and associated media are described.
Abstract: This paper considers multimodal systems, resources, and evaluation. We first motivate the value of multimodal information access with a vision of multimodal question answering and an example of content-based access to broadcast news video. We next describe intelligent multimodal interfaces, define terminology, and summarize a range of applications, required corpora, and associated media. We then introduce a jointly created roadmap for multimodality and show an example of an open source multimodal spoken dialogue toolkit. We next describe requirements for and an abstract architecture of multimodal systems. We conclude discussing multimodal collaboration, multimodal instrumentation, and multilevel evaluation.

1. Multimodal Question Answering
A long range vision of ours is to create software that will support natural, multimodal information access. As implied by Figure 1, this suggests transforming the conventional information retrieval strategy of keyword-based document/web page retrieval into one in which multimodal questions spawn multimodal information discovery, multimodal extraction, and personalized multimodal presentation planning. In Figure 1 the user of the future is able to naturally employ a combination of spoken language, gesture, and perhaps even drawing or humming to articulate their information need, which is satisfied using an appropriate coordinated integration of media and modalities, extracted from source media.
Figure 1. Ask Multimodal Questions, Get Multimodal Answers (e.g., a typed query about where Ebola was last reported becomes a multimodal query spoken with a gesture to a map, answered with a fused, tailored multimodal answer rather than a list of text documents)
The inadequacy of the current document retrieval strategy most closely associated with web search engines is underscored by Figure 2. Figure 2 illustrates that while (normalized) computing power doubles every 18 months and storage capacity doubles every 12 months, the fastest changing area of infrastructure is optical networking, where network speed is doubling every 8 months. Coupled with the rapid deployment of wireless devices and infrastructure, the ability to support mobile, multimodal access is becoming reality.
Figure 2. Acceleration of Infrastructure Growth (optical network speed doubles every 8 months; storage capacity every 12 months; computing power every 18 months)

2. Broadcast News Access
As a step toward multimodal question answering, we have been exploring tools to help individuals access vast quantities of non-text multimedia (e.g., imagery, audio, video). Applications that promise on-demand access to multimedia information such as radio and broadcast news on a broad range of computing platforms (e.g. kiosk, mobile phone, PDA) offer new engineering challenges. Synergistic processing of speech, language and image/gesture promises both enhanced interaction at the interface and enhanced understanding of artifacts such as web, radio, and television sources (Maybury 2000). Coupled with user and discourse modeling, new services such as delivery of intelligent instruction and individually tailored personalcasts become possible.
Figure 3 illustrates one such system, the Broadcast News Navigator (BNN) (Merlino et al. 1997). The web-based BNN gives the user the ability to browse, query (using free text or named entities), and view stories or their multimedia summaries. For example, Figure 3 displays all stories about the Russian nuclear submarine disaster from multiple North American broadcasts from 14-18 August 2000. This format is called a Story Skim. For each story, the user can view story details, including a closed caption text transcription, extracted named entities (i.e., people, places, organizations, time, and money), a generated multimedia summary, or the full original video.
Figure 3. Tailored Multimedia News
In empirical studies, Merlino and Maybury (1999) demonstrated (see Figure 4) that users enhanced their retrieval performance (a weighted combination of precision and recall) when utilizing BNN's Story Skim and Story Details presentations instead of mono-media presentations (e.g., text, key frames, video). In addition to performance enhancement, users reported increased satisfaction (8.2 on a scale of 1 (dislike) to 10 (like)) for mixed media display (e.g., story skim, story details).
Figure 4. Relevancy Judgement Performance with Different Multimedia Displays (average performance, F-score = (P+R)/2, versus average time in seconds per story for named entities, all named entities, skim and details, key frame skim, and story details displays)
As illustrated in Figure 5, during system development we utilized annotation tools to mark up a corpus of video for features such as program start/stop as well as commercial and story segments. Using this gold standard, we can apply hidden Markov models to automatically learn a cross modal statistical model for video segmentation and transition detection. Learned models can then detect such video elements as the start of a commercial or the transition from a desk anchor to a reporter in the field (Boykin and Merlino 2000). Rapid creation of such multimodal corpora is essential.
Figure 5. Video Annotation

3. Multimodal Interfaces
Another vision is of intelligent multimodal interfaces that support more sophisticated and natural input and output, enable users to perform complex tasks more quickly, with greater accuracy, and improve user satisfaction. Intelligent multimodal interfaces are becoming more important as users face increasing information overload, system complexity, and mobility as well as an increasing need for systems that are locally adaptive and tailorable to heterogeneous user populations. Intelligent multimodal interfaces are typically characterized by one or more of the following three functions (Maybury and Wahlster 1998, Maybury 1999):
Multimodal input – they process potentially ambiguous, impartial, or imprecise combinations of mixed input such as written text, spoken language, gestures (e.g., mouse, pen, dataglove) and gaze.
Multimodal output – they design coordinated presentations of, e.g., text, speech, graphics, and gestures, which may be presented via conventional displays or animated, life-like agents.
Interaction management – they support mixed initiative interactions that are context-dependent based on system models of the discourse, user, task and media.
(See www.mitre.org/resources/centers/it/maybury/iui99 for an on-line tutorial on intelligent interfaces.)

01 Jan 2002
TL;DR: The UC Berkeley Digital Library project is developing a new "Multivalent" document browser, which it hopes will convince you to throw away your current, limited web browser, as well as "collaborative quality filtering", which provides the value of peer review without deference to prior established authorities, such as journals.
Abstract: Our practice of disseminating, accessing and using information, especially scholarly information, is still largely informed by the nature of pre-electronic media. For example, journals still exist in their traditional forms partly because of the value of the peer review process, which thus far has not yielded to decentralized, distributed and timely alternatives. Similarly, information access is still largely text-based, with other data types relegated to second class citizenship. The UC Berkeley Digital Library project is developing technologies aimed at addressing these impediments, and hence allowing the development of new, more efficient mechanisms of information dissemination and use. In particular, we are developing a new "Multivalent" document browser, which we hope will convince you to throw away your current, limited web browser; "collaborative quality filtering", which provides the value of peer review without deference to prior established authorities, such as journals; and "collection management services", which bring to individual information users services previously available to libraries. Taken together, such mechanisms may provide the benefits of modern communications without sacrificing traditional academic values. In addition, we have been developing techniques for image retrieval based on image content. Recent progress on learning the semantics of image databases using text and pictures suggests that new forms of image-related web services may be possible, including automatic image captioning and automatic illustration, among others.

Patent
24 Jul 2002
TL;DR: In this paper, a method for summarizing a news video based on multimodal features is provided, in which the news video, provided together with closed captions, is divided into news articles and summarized article by article, so that the summarized news video can be used in content-based video browsing and the index of the generated closed caption data can be used in establishing a database for text-based video search.
Abstract: PURPOSE: A method for summarizing a news video based on multimodal features is provided, to divide a news video supplied together with closed captions into news articles, to summarize the news video article by article, to utilize the summarized news video in content-based video browsing, and to utilize the index of the generated closed caption data in establishing a database for text-based video search. CONSTITUTION: A digital contents acquiring step (S110) converts broadcasting signals of the news video into digital signals and stores the converted digital signals. A multimodal feature extracting step (S120) extracts features from each multimodal component of the acquired digital contents. A major section detecting step (S130) detects major sections of the news video based on the extracted multimodal features. A video summary describing step (S140) structurally describes summary information of the news video based on the detected major sections. In the digital contents acquiring step, if the news video is analog, its broadcasting signals are divided into audiovisual signals and closed caption signals; the audiovisual signals are then converted into digital signals, and the closed caption signals are decoded and stored as text.
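A skeletal rendering of the four steps named in the abstract (S110-S140), with toy data standing in for broadcast signals; the cue-word rule used to detect article boundaries is purely illustrative.

```python
def acquire_digital_contents(broadcast):            # S110
    return {"audio_video": broadcast["av"], "closed_caption": broadcast["cc"]}

def extract_multimodal_features(contents):          # S120
    return {"caption_sentences": contents["closed_caption"].split(". "),
            "shot_count": len(contents["audio_video"])}

def detect_major_sections(features):                # S130
    # Toy rule: caption sentences containing a cue word mark article boundaries.
    return [s for s in features["caption_sentences"] if "reporting" in s.lower()]

def describe_video_summary(sections):               # S140
    return {"num_articles": len(sections), "headlines": sections}

broadcast = {"av": ["shot1", "shot2", "shot3"],
             "cc": "Good evening. Reporting from city hall, taxes rise. Reporting from the coast, a storm nears"}
summary = describe_video_summary(detect_major_sections(
    extract_multimodal_features(acquire_digital_contents(broadcast))))
print(summary)
```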


01 Jan 2002
TL;DR: A subtitle decoding process model for the set-top box is designed; demultiplexing and decoding procedures and the memory organization and allocation of a software-based decoder are described; and the means of user control via the digital television Navigator is presented.
Abstract: A new subtitling system will be established along with digital television broadcasting. The system uses MPEG-2 multiplexing and region-based bitmap graphics (with indexed pixel colors) to transmit subtitle data to the set-top box. It provides more interactivity than existing analogue television, although that interactivity is limited. This paper gives a description of Digital Video Broadcasting (DVB) subtitle encoding and transmission technologies. A subtitle decoding process model for the set-top box is designed; it describes demultiplexing and decoding procedures as well as the memory organization and allocation of a software-based decoder. The means of user control via the digital television Navigator (i.e., enabling/disabling subtitles in a program and language setting/selection) is presented. Two approaches for implementation based on the client-server model are proposed to ensure better performance and interoperability. Finally, technical requirements are discussed.

Proceedings Article
01 May 2002
TL;DR: A system that detects and filters commercials from broadcast news data is presented; it uses just closed caption text to perform this task and shows performance comparable with other methods.
Abstract: Story segmentation is an important problem in multimedia indexing and retrieval and includes detection of commercials as one of its component problems. Commercials appear regularly in television data and are usually treated as noise. Hence, filtering of commercials is an important task. This paper presents a system that detects and filters commercials from broadcast news data. While previous work in the area relies largely on features from audio, video and captions, the system described in this paper uses just closed caption text to perform this task. An evaluation of this system is also presented which shows comparable performance with other methods.
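A toy sketch in the same spirit, flagging time windows with unusually little closed caption text as candidate commercial breaks; this low-caption-density heuristic and the sample timestamps are assumptions for illustration and are not the features the paper actually evaluates.

```python
# Caption lines with timestamps (seconds); in this toy example the commercial break carries no captions.
caption_lines = [(2, "top story tonight"), (8, "the mayor announced"), (14, "more after the break"),
                 (75, "welcome back"), (82, "in other news")]

def flag_low_caption_windows(lines, window=30, min_lines=1):
    """Flag fixed-size windows containing too few caption lines as candidate commercials."""
    last = max(t for t, _ in lines)
    flagged = []
    for start in range(0, last, window):
        count = sum(1 for t, _ in lines if start <= t < start + window)
        if count <= min_lines:
            flagged.append((start, start + window))
    return flagged

print(flag_low_caption_windows(caption_lines))  # [(30, 60)]
```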

Patent
13 Mar 2002
TL;DR: In this paper, an audio/video device that is connected to a printer for printing documents is described, allowing the user to print documents such as advertisements, coupons, discounts, rebates, lyrics, scripts, closed captioning, and a list of similar products and/or services associated with the audio/video programming that the user is listening to and/or viewing.
Abstract: This invention relates to an audio/video device that is connected to a printer for printing documents. Structures of this type generally allow the user to print documents, such as advertisements, coupons, discounts, rebates, lyrics, scripts, closed captioning, and a list of similar products and/or services which are associated with the audio/video programming that the user is listening to and/or viewing.


Reference EntryDOI
15 Jan 2002
TL;DR: A simple multimedia object model is presented to provide a common reference point for discussing multimedia indexing in the remainder of this chapter; the model is contained in the definitions that follow.
Abstract: With the continuous evolution of telecommunication and computing technologies, more and more repositories of digital video data are being developed to support a wide range of applications in digital libraries, telemedicine, distance learning, tourism, entertainment, etc. With the rapid proliferation of the Web, these applications are rapidly emerging. Content-based retrieval of video data has been the subject of extensive research since 1990. Because of the huge volume of data, it becomes crucial to develop indexing techniques that will carry out the process of content-based retrieval more efficiently. The problem of video indexing is to create and maintain index structures and algorithms that support the efficient execution of queries about the contents of video presentations. Such queries may ask about features of objects or regions contained within a video, or relationships between objects or regions contained within a video. Additionally, queries may concern these features or relationships in relation to time. The spatial and temporal aspects of video indexing taken separately are nontrivial problems. Each type of indexing has been studied widely, and many research problems remain. Indexing of video, however, must be differentiated from spatial, temporal, and spatiotemporal indexing techniques in that the information to be indexed may include not only spatiotemporal information, but also high-dimensional feature data such as texture, textual closed captioning information, shape, color histograms, and object trajectories or animation operations. A video indexing technique must, therefore, support efficient searches for objects and images on the basis of the three major facets of a video: its spatial, temporal, and feature values. A simple multimedia object model is presented to provide a common reference point for discussing multimedia indexing in the remainder of this chapter. The model is contained in the definitions that follow. Keywords: data modeling; video data; knowledge representation; overview; video indexing; issues; techniques; queries; classification of techniques