
Showing papers on "Interactive video published in 2004"


Journal ArticleDOI
TL;DR: In this paper, the authors show that participants used the interactive features like stopping, replaying, reversing or changing speed to adapt the pace of the video demonstration, which led to an uneven distribution of their attention and cognitive resources across the videos, which was more pronounced for difficult knots.

421 citations


Journal ArticleDOI
TL;DR: The authors believe that their system is the first to deliver fully interactive, photorealistic image-based tours on a personal computer at or above broadcast video resolutions and frame rates. To their knowledge, no other tour provides the same rich set of interactions or visually complex environments.
Abstract: Interactive scene walkthroughs have long been an important computer graphics application area. More recently, researchers have developed techniques for constructing photorealistic 3D architectural models from real-world images. We present an image-based rendering system that brings us a step closer to a compelling sense of being there. Whereas many previous systems have used still photography and 3D scene modeling, we avoid explicit 3D reconstruction because it tends to be brittle. Our system is not the first to propose interactive video-based tours. We believe, however, that our system is the first to deliver fully interactive, photorealistic image-based tours on a personal computer at or above broadcast video resolutions and frame rates. Moreover, to our knowledge, no other tour provides the same rich set of interactions or visually complex environments.

144 citations


Journal ArticleDOI
TL;DR: An overview is presented of the MPEG activity exploring the need for standardization in this area to support these new applications, under the name of 3DAV (for 3-D audio-visual). As an example, a detailed solution for omnidirectional video is presented as one of the application scenarios in 3DAV.
Abstract: New kinds of media are emerging that extend the functionality of available technology. The growth of immersive recording technologies has led to video-based rendering systems for photographing and reproducing environments in motion. This lends itself to new forms of interactivity for the viewer, including the ability to explore a photographic scene and interact with its features. The three-dimensional (3-D) qualities of objects in the scene can be extracted by analysis techniques and displayed by the use of stereo vision. The data types and image bandwidth needed for this type of media experience may require especially efficient formats for representation, coding, and transmission. An overview is presented here of the MPEG activity exploring the need for standardization in this area to support these new applications, under the name of 3DAV (for 3-D audio-visual). As an example, a detailed solution for omnidirectional video is presented as one of the application scenarios in 3DAV.

127 citations


Journal ArticleDOI
TL;DR: Diver can aid collaborative analysis of a broad array of visual data records, including simulations, 2D and 3D animations, and static works of art, photography, and text, which support the tool's fundamentally collaborative, communication-oriented nature.
Abstract: The digital interactive video exploration and reflection (Diver) system lets users create virtual pathways through existing video content using a virtual camera and an annotation window for commentary. Users can post their Dives to the WebDiver server system to generate active collaboration, further repurposing, and discussion. Although our current work focuses on video records in learning research and educational practices, Diver can aid collaborative analysis of a broad array of visual data records, including simulations, 2D and 3D animations, and static works of art, photography, and text. In addition to the social and behavioral sciences, substantive application areas include medical visualization, astronomic data or cosmological models, military satellite intelligence, and ethnology and animal behavior. Diver-style user-centered video repurposing might also prove compelling for popular media with commercial application involving sports events, movies, television shows, and video gaming. Future technical development includes possible enhancements to the interface to support simultaneous display of multiple Dives on the same source content, a more fluid two-way relation between desktop Diver and WebDiver, and solutions to the current limitations on displaying and authoring time/space cropped videos in a browser context. These developments support the tool's fundamentally collaborative, communication-oriented nature.

115 citations


Journal ArticleDOI

88 citations


Patent
25 Oct 2004
TL;DR: In this paper, a plurality of video spots are displayed on the interactive video display system, and data based on interaction with the interactive video display system corresponding to video spots of the plurality of video spots is gathered.
Abstract: A method for managing an interactive video display system. A plurality of video spots are displayed on the interactive video display system. Data based on interaction with the interactive video display system corresponding to video spots of the plurality of video spots is gathered. The data is stored, wherein the data is for use in managing presentation of the video spots. By analyzing data relating to different video spots, popularity and other metrics may be determined for the video spots, providing useful information for managing the presentation of the video spots.

71 citations


Journal ArticleDOI
TL;DR: This paper identified teaching styles of university interactive television instructors, with a strong inclination toward a teacher-centered approach to the distance teaching process and discrepancies between what research and theory suggest is the most appropriate distance teaching style and what is actually being employed in interactive television classrooms.
Abstract: This study identified teaching styles of university interactive television instructors. The instructors (N = 203), representing nine Land Grant universities, completed a demographic survey and the Principles of Adult Learning Scale, a forty-four item teaching-style assessment instrument. Descriptive statistics revealed that interactive television instructors displayed behaviors representative of both learner-centered and teacher-centered styles, with a strong inclination toward a teacher-centered approach to the distance teaching process. Results indicate discrepancies between what research and theory suggest is the most appropriate distance teaching style and what is actually being employed in interactive television classrooms.

70 citations


Journal ArticleDOI
TL;DR: This paper proposes a natural language approach to content-based video indexing and retrieval to identify appropriate video clips that can address users' needs and shows that precision and recall of this approach are better than those of the traditional keyword based approach.
Abstract: As a powerful and expressive nontextual media that can capture and present information, instructional videos are extensively used in e-learning (Web-based distance learning). Since each video may cover many subjects, it is critical for an e-learning environment to have content-based video searching capabilities to meet diverse individual learning needs. In this paper, we present an interactive multimedia-based e-learning environment that enables users to interact with it to obtain knowledge in the form of logically segmented video clips. We propose a natural language approach to content-based video indexing and retrieval to identify appropriate video clips that can address users' needs. The method integrates natural language processing, named entity extraction, frame-based indexing, and information retrieval techniques to explore knowledge-on-demand in a video-based interactive e-learning environment. A preliminary evaluation shows that precision and recall of this approach are better than those of the traditional keyword based approach.

67 citations


Patent
02 Jun 2004
TL;DR: In this paper, a video player (20) plays video using media (10). It causes a menu to display (40), recognizes triggers (50) and, based on user input, skips segments (60) of video, inserts graphics (70) and plays a game.
Abstract: Video player (20) plays video (30) using media (10). It causes a menu to display (40), recognizes triggers (50) and, based on user (15) input, skips segments (60) of video, inserts graphics (70) and plays a game.

66 citations


Patent
Nebojsa Jojic1, Chris Pal1
01 Dec 2004
TL;DR: The Video Browser as discussed by the authors provides interactive browsing of unique events occurring within an overall video recording, such as motion events, security events, or other predefined event types, occurring within all or part of the total period covered by the video.
Abstract: A “Video Browser” provides interactive browsing of unique events occurring within an overall video recording. In particular, the Video Browser processes the video to generate a set of video sprites representing unique events occurring within the overall period of the video. These unique events include, for example, motion events, security events, or other predefined event types, occurring within all or part of the total period covered by the video. Once the video has been processed to identify the sprites, the sprites are then arranged over a background image extracted from the video to create an interactive static video montage. The interactive video montage illustrates all events occurring within the video in a single static frame. User selection of sprites within the montage causes either playback of a portion of the video in which the selected sprites were identified, or concurrent playback of the selected sprites within a dynamic video montage.

56 citations


Patent
30 Apr 2004
TL;DR: An interactive video compositing device as discussed by the authors includes a chroma-key mixer (28), video switcher (30) and control circuitry (20), which generates a composite image by combining a real-time image such as one captured by a video recorder (16), with a pre-recorded video image, such as a movie.
Abstract: An interactive video compositing device (12) includes a chroma-key mixer (28), video switcher (30) and control circuitry (20). The chroma-key mixer (28) generates a composite image by combining a real-time image, such as one captured by a video recorder (16), with a prerecorded video image, such as a movie. The composite image includes the modified real-time image superimposed, or overlaid, onto the prerecorded image. The video switcher (30) automatically selects either the composite image or the prerecorded image to be output to a display (18). The control circuitry (20) controls the video switcher (30) and other outputted signals based on data file information that corresponds to content of the prerecorded image or media. For example, the data files may contain information relating to the presence (or absence) of a particular character in a movie scene, thus allowing for the output and display, at appropriate times, of the real-time composite image instead of the prerecorded image.

Patent
22 Oct 2004
TL;DR: In this paper, a system and method for delivering interactive video and audio content items, e.g., movie clips, music videos, adverts, to a user playback device, such as a television (TV) set, is presented.
Abstract: A system and method for delivering interactive video and audio content items, e.g., movie clips, music videos, adverts, to a user playback device, such as a television (TV) set. In a preferred embodiment, the content items are delivered within a video on demand (VoD) environment. Each content item has associated attributes that detail the navigational properties for that content item. The content items are delivered as entries in a content sequence. Nonlinear navigation of the video content sequence is facilitated by querying the associated attributes for the current content item and enabling navigational actions (e.g. FF/REW/PAUSE/SKIP/Jump to target) for that content item accordingly. The content items thus permit varying degrees of user interaction. The user interaction is not bound by predetermined navigational rules, since the user is free to experience the sequenced VoD content items in any order.

Proceedings ArticleDOI
20 Jun 2004
TL;DR: This paper shows through trace driven simulations that attempting to transmit every frame results in severe performance degradation and shows that the proposed algorithm MC-drop outperforms other algorithms in terms of suitably defined metrics that capture overall video quality.
Abstract: A mobile terminal equipped with multiple interfaces can achieve a much higher bandwidth by aggregating the bandwidth offered by the individual networks. This helps support demanding applications like interactive video. Often, in spite of bandwidth aggregation, the available bandwidth may be too small to avoid frame loss altogether. Under these circumstances, it may be necessary to selectively discard frames to minimize the effect of their loss on the overall video quality. In this paper, we consider different frame discard algorithms and study their performance in the presence of multiple interfaces. We show through trace driven simulations that attempting to transmit every frame results in severe performance degradation. In particular we show that our proposed algorithm MC-drop outperforms other algorithms in terms of suitably defined metrics that capture overall video quality.
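
The idea of selectively discarding frames under a bandwidth budget can be illustrated with a minimal sketch. This is not the MC-drop algorithm, whose details the abstract does not give; it is a generic priority-based discard that assumes an MPEG-style group of pictures (GOP) in which B frames are dropped before P frames, and later P frames before earlier ones, so that no kept frame references a dropped one. The `Frame` type, `DROP_ORDER` table, and `select_frames` function are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Frame:
    index: int  # position inside the GOP
    kind: str   # "I", "P", or "B"
    size: int   # encoded size in bytes


# Assumed priority: lower value is discarded first.
DROP_ORDER = {"B": 0, "P": 1, "I": 2}


def select_frames(gop, budget):
    """Return the frames of one GOP to transmit so that their total size
    does not exceed `budget` bytes, discarding low-priority frames first."""
    # B frames first; within a kind, the latest frame first, so a dropped
    # P frame never has a surviving later P frame that depends on it.
    candidates = sorted(gop, key=lambda f: (DROP_ORDER[f.kind], -f.index))
    total = sum(f.size for f in gop)
    dropped = set()
    for frame in candidates:
        if total <= budget:
            break
        dropped.add(frame.index)
        total -= frame.size
    return [f for f in gop if f.index not in dropped]
```

With a GOP of one 100-byte I frame, two 50-byte P frames, and four 20-byte B frames, a 200-byte budget keeps only the I and P frames; tightening the budget to 150 bytes also drops the last P frame.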

Journal ArticleDOI
TL;DR: In this article, the authors analyze the online exchange of messages in one school district that participated in a video-case-based program of teacher professional development and derive principles that will help facilitators lead grounded online interactions.
Abstract: The use of interactive video cases for teacher professional development is an emergent medium inspired by case study methods used extensively in law, management, and medicine, and by the advent of multimedia technology available to support online discussions. This paper focuses on Web-based “grounded” discussions—in which the participants base their contributions on specific events portrayed in the case—and the role facilitators play in these online interactions. This paper analyzes the online exchange of messages in one school district that participated in a video-case-based program of teacher professional development and derives principles that will help facilitators lead grounded online interactions.

Journal ArticleDOI
TL;DR: An interactive framework for navigating video sequences is presented using an optimal content-based video decomposition scheme and a computationally efficient algorithm is proposed to regulate the degree of detail in case the visual content is not efficiently represented from the user's perceptive view.
Abstract: In this paper, an interactive framework for navigating video sequences is presented using an optimal content-based video decomposition scheme. In particular, each video sequence is analyzed at different content resolution levels, creating a hierarchy from the lowest (coarse) to the highest (fine) resolution. This content hierarchy is represented as a tree structure, each level of which corresponds to a particular content resolution, while the tree nodes indicate the temporal video segments that the sequence content is partitioned at a given resolution. A criterion is introduced to measure the efficiency of the proposed scheme in organizing the video visual content and to compare it with other hierarchical video content representations and navigation schemes. The efficiency is measured as the difficulty for a user to locate a video segment of interest, while moving through different levels of hierarchy. In our case, video is decomposed so that the best efficiency is accomplished. However, the efficiency of a nonlinear video decomposition scheme depends on: 1) the number of paths required for a user to locate a relevant video segment and 2) the number of shot/frame classes (i.e., content representatives) extracted to represent the visual content. Both issues are addressed in this paper. In the first case, the probability of selecting a relevant video segment in the first path is maximized by extracting optimal content representatives through a minimization of a cross-correlation criterion. For the minimization, a genetic algorithm (GA) is adopted, since application of an exhaustive search to obtain the minimum value is too large to be implemented. The cross-correlation criterion is evaluated on the feature domain by extracting appropriate global and object-based descriptors for each video frame so that a better representation of the visual content is achieved. 
The second aspect (e.g., the number of content representatives) is addressed by minimizing the average transmitted information and simultaneously taking into consideration the temporal video segment complexity. More content representatives are extracted for video segments of high complexity, whereas a low number is required for low-complexity segments. In addition, a degree of interest is assigned to each video shot (or frame) to address the fact that, from the user's perception, the visual content of a set of shots (frames) satisfies his/her information needs. Finally, a computationally efficient algorithm is proposed to regulate the degree of detail (i.e., the number of shot/frames representatives) in case the visual content is not efficiently represented from the user's perceptive view. Experimental results on real-life video sequences indicate the performance of the proposed GA-based video decomposition scheme compared to other hierarchical video organization methods.
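
The tree-structured content hierarchy described above can be sketched in a simplified form. The sketch below assumes a uniform temporal split at each level rather than the content-based, GA-optimized decomposition of the paper, and all names (`Segment`, `build`, `locate`) are hypothetical. It only shows how a user's descent from coarse to fine resolution corresponds to a path through the tree.

```python
from dataclasses import dataclass, field


@dataclass
class Segment:
    start: int  # first frame of the segment
    end: int    # one past the last frame
    children: list = field(default_factory=list)


def build(start, end, fanout, min_len):
    """Split [start, end) into `fanout` parts per level until segments
    are at most `min_len` frames long (uniform split: an assumption)."""
    node = Segment(start, end)
    length = end - start
    if length > min_len:
        step = -(-length // fanout)  # ceiling division
        for s in range(start, end, step):
            node.children.append(build(s, min(s + step, end), fanout, min_len))
    return node


def locate(root, frame):
    """Return the coarse-to-fine path of segments that contain `frame`."""
    path = [root]
    node = root
    while node.children:
        node = next(c for c in node.children if c.start <= frame < c.end)
        path.append(node)
    return path
```

The length of the path returned by `locate` is one crude proxy for the navigation effort the abstract calls efficiency; the paper's own criterion is more elaborate.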

Proceedings ArticleDOI
23 Oct 2004
TL;DR: The results suggest that an interactive video mirror can be highly useful in martial arts and other sports. The paper also introduces a new kind of graphical control that floats around the user so that it can be manipulated with gestures regardless of the user's position.
Abstract: This paper studies gesture and speech controlled video for sports training. The goal is to combine the benefits of recording your performance with video equipment and training with a mirror. For example, a delayed camera view projected on a screen can be used to repeatedly perform and evaluate a spin kick, a move that is difficult to practice with a mirror. A video mirror can also be augmented with speech or gesture control for playback, recording and inspecting of individual frames. Three different interface design approaches are evaluated, based on testing with eight users that practice martial arts and acrobatics. The results suggest that an interactive video mirror can be highly useful in martial arts and other sports. The paper also introduces a new kind of graphical control that floats around the user so that it can be manipulated with gestures regardless of the user's position.

Patent
06 Apr 2004
TL;DR: In this paper, a method is proposed comprising the steps of storing a full-length video program, storing a short form program, wherein the short form program is an edited version of the full-length video program, and storing searchable parameters for the full-length video program and the short form program.
Abstract: A method, comprising the steps of storing a full-length video program, storing a short form program, wherein the short form program is an edited version of the full-length video program, storing searchable parameters for the full-length video program and the short form program, receiving a search request from a user, the search request including search parameters and returning search results to the user including the full-length video program and the short form program when the search parameters match the searchable parameters of the full-length video program and the short form program.

Journal ArticleDOI
TL;DR: This paper is about MirrorSpace, an interactive video communication system that was originally conceived as a prototype for the interLiving project (http://interliving.kth.se/) of the European Disappearing Computer initiative (2001-2003).
Abstract: This paper is about MirrorSpace, an interactive video communication system. When you're far away, your own image reflects in it like in a distorting mirror, blurred and imprecise. As you move toward it, however, it becomes clearer and more accurate. By the time you reach it, the reflection is almost perfect, as in a conventional mirror. What you see is not a simple optical reflection but a video image captured, processed, and displayed in real time. MirrorSpace was originally conceived as a prototype for the interLiving project (http://interliving.kth.se/) of the European Disappearing Computer initiative (2001-2003). This project focused on the design of technologies to support communication among family members located in different households.

Journal ArticleDOI
TL;DR: This paper elaborates on a generic view of new media such as Interactive TV, Interactive Video or collaborative Hypervideo, based on the key concepts of combining video content, interactivity and support for communities.

Patent
15 Apr 2004
TL;DR: In this article, a lighted hat is provided that in one embodiment includes a crown (12), a bill (14) extending from the crown, and at least one light source positioned to direct light through a light-transmissive portion (18) of the bill.
Abstract: A lighted hat (10) is provided that in one embodiment includes a crown (12), a bill (14) extending from the crown (12), and at least one light source (16) positioned to direct light (17) through a light-transmissive portion (18) of the bill (14). The light-transmissive portion (18) may include one or more indicia which are highlighted by the light from the light source (16).

Journal Article
TL;DR: Synchronized real-time motion and video acquisition provides a comprehensive assessment solution by combining quantitative motion analysis tools and qualitative targeted video scoring and it is observed that graphs containing fewer sudden velocity peaks are less likely to have erroneous movements.
Abstract: Surgical dexterity in operating theatres has traditionally been assessed subjectively. Electromagnetic (EM) motion tracking systems such as the Imperial College Surgical Assessment Device (ICSAD) have been shown to produce valid and accurate objective measures of surgical skill. To allow for video integration we have modified the data acquisition and built it within the ROVIMAS analysis software. We then used ActiveX 9.0 DirectShow video capturing and the system clock as a time stamp for the synchronized concurrent acquisition of kinematic data and video frames. Interactive video/motion data browsing was implemented to allow the user to concentrate on frames exhibiting certain kinematic properties that could result in operative errors. We exploited video-data synchronization to calculate the camera visual hull by identifying all 3D vertices using the ICSAD electromagnetic sensors. We also concentrated on high velocity peaks as a means of identifying potential erroneous movements to be confirmed by studying the corresponding video frames. The outcome of the study clearly shows that the kinematic data are precisely synchronized with the video frames and that the velocity peaks correspond to large and sudden excursions of the instrument tip. We validated the camera visual hull by both video and geometrical kinematic analysis and we observed that graphs containing fewer sudden velocity peaks are less likely to have erroneous movements. This work presented further developments to the well-established ICSAD dexterity analysis system. Synchronized real-time motion and video acquisition provides a comprehensive assessment solution by combining quantitative motion analysis tools and qualitative targeted video scoring.
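
The velocity-peak idea can be illustrated with a small sketch. It assumes that each kinematic sample is a tuple `(t, x, y, z)` with `t` in seconds on the same clock as the video, and that the video has a constant frame rate; neither the data layout nor the function names come from ICSAD or ROVIMAS, and the threshold is an arbitrary illustrative value.

```python
import math


def speeds(samples):
    """Given time-ordered (t, x, y, z) samples, return (t, speed) pairs,
    where speed is the distance travelled since the previous sample
    divided by the elapsed time."""
    out = []
    for (t0, *p0), (t1, *p1) in zip(samples, samples[1:]):
        dt = t1 - t0
        if dt <= 0:
            continue  # skip duplicate or out-of-order timestamps
        out.append((t1, math.dist(p0, p1) / dt))
    return out


def peak_frames(samples, threshold, fps):
    """Return video frame indices whose timestamps coincide with a speed
    above `threshold`, as candidates for manual review."""
    return [int(t * fps) for t, v in speeds(samples) if v > threshold]
```

A reviewer could then jump straight to the returned frame indices instead of scrubbing through the whole recording, which is the kind of targeted browsing the abstract describes.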

Proceedings ArticleDOI
22 Sep 2004
TL;DR: In this paper, the authors discuss the development of interactive video tutorial-based problems to help introductory physics students learn effective problem solving heuristics, and present problem solving strategies using concrete examples in an interactive environment.
Abstract: We discuss the development of interactive video tutorial‐based problems to help introductory physics students learn effective problem solving heuristics. The video tutorials present problem solving strategies using concrete examples in an interactive environment. They force students to follow a systematic approach to problem solving and students are required to solve sub‐problems (research‐guided multiple choice questions) to show their level of understanding at every stage of problem solving. The tutorials are designed to provide scaffolding support at every stage of problem solving as needed and help students view the problem solving process as an opportunity for knowledge and skill acquisition rather than a “plug and chug” chore. A focus on helping students learn first to analyze a problem qualitatively, and then to plan a solution in terms of the relevant physics principles, can be useful for developing their reasoning skills. The reflection stage of problem solving can help students develop meta‐cognitive skills because they must focus on what they have learned by solving the problem and how it helps them extend and organize their knowledge. Preliminary evaluations show that a majority of students who are unable to solve the tutorial problems without help can solve similar problems after working through the video tutorial. Further evaluation to assess the development of useful skills is underway.

Book ChapterDOI
21 Jul 2004
TL;DR: This work proposes using a truncated object-object similarity matrix as an access structure for interactive video retrieval; the approach offers a scalable solution to retrieval and allows combination of different feature spaces or sources of information.
Abstract: We propose using a truncated object-object similarity matrix as an access structure for interactive video retrieval. The proposed approach offers a scalable solution to retrieval and allows combination of different feature spaces or sources of information. Experiments were performed on TREC Video collections of 2002 and 2003.
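
One way to read "truncated" is that each object keeps only its k most similar neighbours instead of a full row of the similarity matrix, so storage grows linearly with the number of objects. The sketch below follows that reading as an assumption; the abstract does not say which similarity measure or truncation rule the authors use, so cosine similarity and the names `cosine` and `truncated_similarity` are illustrative choices.

```python
import math


def cosine(a, b):
    """Cosine similarity of two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def truncated_similarity(features, k):
    """Map each object id to its k most similar other objects as
    (neighbour_id, similarity) pairs, most similar first."""
    table = {}
    for i, fi in features.items():
        scored = [(cosine(fi, fj), j) for j, fj in features.items() if j != i]
        scored.sort(key=lambda s: (-s[0], s[1]))  # ties broken by id
        table[i] = [(j, s) for s, j in scored[:k]]
    return table
```

During interactive retrieval, a lookup in `table` returns the stored neighbours of a selected shot without recomputing similarities at query time.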

Proceedings ArticleDOI
25 Oct 2004
TL;DR: Simulation results show that the proposed retransmission mechanism maintains the video quality under different loss rates and with less overhead compared to error control methods that depend on controlling the intra-update rate.
Abstract: We propose a mechanism that combines retransmission-based error control with path diversity, to provide different levels of protection to interactive video in ad-hoc networks. The mechanism factors in the importance of the retransmitted packets to the reconstructed video quality as well as the end-to-end latency constraints to minimize the overhead and maximize the reconstructed video quality at the receiver. Simulation results show that the proposed retransmission mechanism maintains the video quality under different loss rates and with less overhead compared to error control methods that depend on controlling the intra-update rate. In addition, the mechanism is shown to be more robust to wireless losses than schemes that combine layered coding with path diversity.

Proceedings ArticleDOI
25 Oct 2004
TL;DR: The proposed proxy caching mechanisms can provide users with continuous and high-quality video distribution in accordance with network conditions, and the quality of cached video data can be adapted appropriately in the proxy to cope with client-to-client heterogeneity.
Abstract: The proxy mechanism widely used in WWW systems offers low-delay and scalable delivery of data by means of a "proxy server". By applying the proxy mechanism to video streaming systems, high-quality and low-delay video distribution can be accomplished without imposing extra load on the system. We have proposed proxy caching mechanisms to accomplish high-quality and highly interactive video streaming services. In our proposed mechanisms, proxies communicate with each other and retrieve missing video data from an appropriate server by taking into account transfer delay and offerable quality. In addition, the quality of cached video data can be adapted appropriately in the proxy to cope with client-to-client heterogeneity in terms of the available bandwidth, end-system performance, and user preferences on the perceived video quality. In this paper, to verify the practicality of our mechanisms, we implemented them on a real system for MPEG-4 video streaming services and conducted experiments. Through evaluations, it was shown that our proxy caching system can provide users with continuous and high-quality video distribution in accordance with network conditions.

Patent
05 Jan 2004
TL;DR: In this paper, the authors define a structure of nodes, wherein a node comprises a data structure containing a link to content, selecting the link, and generating an output that is based on the content.
Abstract: A method, for use in an interactive video interface, includes defining a structure of nodes, wherein a node comprises a data structure containing a link to content, selecting the link, and generating an output that is based on the content. The link is one of plural links to different content accessible via the node, and selecting the link includes selecting among the plural links.

01 Jan 2004
TL;DR: First evaluation results show that the proposed approach facilitates accessing video content in a novel way and is based on the scalable vector graphics standard and the MPEG-7 reference implementation.
Abstract: The paper introduces a novel approach for interactive video browsing that makes video content fully transparent to the user. Video clips are analysed and indexed by two tree structures: a content index tree representing the content of automatically segmented video shots and a time index tree representing the temporal structure. The top levels of the index give an overview of the entire content. Subsequent levels illustrate content relationships in more detail. Every level of both trees is a two-dimensional self-organising map organising media objects by two degrees of freedom. Media objects are represented by content-based visual MPEG-7 descriptions. The implemented navigation scheme allows the user to switch between content index tree and time index tree without losing the overview. Context information (position in the tree, content of next lower level, etc.) is permanently shown in auxiliary panels. The implementation is based on the scalable vector graphics standard (visualisation) and the MPEG-7 reference implementation. First evaluation results show that the proposed approach facilitates accessing video content in a novel way.


Proceedings ArticleDOI
16 Nov 2004
TL;DR: A new segment-based proxy caching algorithm, named popularity-wise caching, is proposed for highly interactive streaming, designed to deal with arbitrary popularity distribution of media content.
Abstract: Most of the current proxy caching algorithms for streaming video media assume that users favor the beginning of the media object. However, this assumption is questionable in highly interactive scenarios, such as e-learning, where some parts of the video other than the prefix can also be popular. A new segment-based proxy caching algorithm, named popularity-wise caching, is proposed for highly interactive streaming. It is designed to deal with arbitrary popularity distribution of media content. Simulations are performed using synthetic traces with different kinds and levels of user interactivity. The results show that the performance of current segment-based caching (such as exponential caching and soccer caching) degrade with increasing user interactivity, while popularity-wise caching can provide the lowest user startup latency for interactive requests and highest bandwidth saving for the backbone network.
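
The contrast with prefix-oriented caching can be shown with a toy cache that ranks segments purely by observed request counts. This is a minimal sketch of the general idea, not the popularity-wise caching algorithm of the paper, whose segmentation and replacement rules are not given in the abstract; the class name and its methods are hypothetical.

```python
from collections import Counter


class PopularitySegmentCache:
    """Keep the `capacity` most requested segments, wherever they lie
    in the video, instead of always favouring the prefix."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.hits = Counter()  # request count per segment id
        self.cached = set()    # segment ids currently held by the proxy

    def request(self, segment):
        """Record a request; return True if it was served from cache."""
        hit = segment in self.cached
        self.hits[segment] += 1
        self._rebalance()
        return hit

    def _rebalance(self):
        # Rank by descending popularity; ties go to the lower segment id.
        ranked = sorted(self.hits, key=lambda s: (-self.hits[s], s))
        self.cached = set(ranked[: self.capacity])
```

With a capacity of two segments and the request sequence 7, 7, 8, 0, 7, 8, the cache ends up holding segments 7 and 8 even though neither is the start of the video.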

Proceedings ArticleDOI
17 May 2004
TL;DR: A novel system that can automatically create an optimal and nonrepetitive summarization and support different user requirements for video browsing and content overview by outputting both the optimal set of key frames and a summarized version of the original video with the user-specified time length is proposed.
Abstract: This paper proposes a novel system that can automatically create an optimal and nonrepetitive summarization and support different user requirements for video browsing and content overview by outputting both the optimal set of key frames and a summarized version of the original video with the user-specified time length. Comparing our approach to video abstraction with another algorithm, we demonstrate that our approach is fast and produces an effective video summary.