
Showing papers in "IEEE MultiMedia in 2005"


Journal ArticleDOI
TL;DR: An extensible event and object ontology expressed in VERL is presented and a detailed example of applying VERL and VEML to the description of a "tailgating" event in surveillance video is discussed.
Abstract: The notion of "events" is extremely important in characterizing the contents of video. An event is typically triggered by some kind of change of state captured in the video, such as when an object starts moving. The ability to reason about events is a critical step toward video understanding. This article describes the findings of a recent workshop series that has produced an ontology framework for representing video events, called Video Event Representation Language (VERL), and a companion annotation framework, called Video Event Markup Language (VEML). One of the key concepts in this work is the modeling of events as composable, whereby complex events are constructed from simpler events by operations such as sequencing, iteration, and alternation. The article presents an extensible event and object ontology expressed in VERL and discusses a detailed example of applying VERL and VEML to the description of a "tailgating" event in surveillance video.

229 citations
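The composable-event idea described above can be sketched in code. The following is a hypothetical illustration of sequencing and alternation over annotated time intervals; the function names and the "tailgating" labels are made up for this sketch and are not VERL syntax.

```python
# Hypothetical sketch of composable event detection in the spirit of VERL.
# Primitive events are labels on annotated time intervals; complex events
# are built by operations such as sequencing and alternation.

def sequence(*events):
    """Match events one after another; succeed if all occur in order."""
    def match(intervals):
        t = 0
        for ev in events:
            hit = next((i for i in intervals if i[2] == ev and i[0] >= t), None)
            if hit is None:
                return False
            t = hit[1]  # the next event must start after this one ends
        return True
    return match

def alternation(*events):
    """Match if any one of the alternative events occurs."""
    def match(intervals):
        return any(any(i[2] == ev for i in intervals) for ev in events)
    return match

# A toy "tailgating" pattern: a vehicle enters, then a second vehicle
# enters afterward (labels are illustrative, not VERL terms).
tailgating = sequence("vehicle_enters", "second_vehicle_enters")

# Annotated intervals: (start, end, label)
timeline = [(0, 2, "vehicle_enters"), (3, 4, "second_vehicle_enters")]
print(tailgating(timeline))   # True
```

Iteration could be added the same way, as a combinator that matches repeated occurrences of a sub-event.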


Journal ArticleDOI
TL;DR: Further features of the human perceptual system like multisensory integration and perceptual stream dissociation are presented with regard to effective interactive sonification.
Abstract: Research shows that presenting stimuli in two modalities simultaneously (for example, in speech perception) enhances users' perception. In our experiments, we use sonification to transform parameters of human movement patterns into sound to enhance perception accuracy. This article also presents further features of the human perceptual system, such as multisensory integration and perceptual stream dissociation, with regard to effective interactive sonification.

155 citations


Journal ArticleDOI
TL;DR: This work presents the multilayer conceptual framework of MIEEs, algorithms for expressive content analysis and processing, and MIEEs-based art applications.
Abstract: Multisensory integrated expressive environments is a framework for mixed reality applications in the performing arts such as interactive dance, music, or video installations. MIEEs address the expressive aspects of nonverbal human communication. We present the multilayer conceptual framework of MIEEs, algorithms for expressive content analysis and processing, and MIEEs-based art applications.

148 citations


Journal ArticleDOI
TL;DR: In this article, the authors describe the most common types of geometric attacks on image watermarking methods and survey proposed solutions.
Abstract: Synchronization errors can lead to significant performance loss in image watermarking methods, as the geometric attacks in the Stirmark benchmark software show. The authors describe the most common types of geometric attacks and survey proposed solutions.

134 citations


Journal ArticleDOI
TL;DR: The authors designed the Ballancer experimental tangible interface to exploit the model-based sonification of the rolling ball to improve the experience and effectiveness of the interaction.
Abstract: Balancing a ball along a tiltable track is a control metaphor for a variety of continuous control tasks. The authors designed the Ballancer experimental tangible interface to exploit such a metaphor. Direct, model-based sonification of the rolling ball improves the experience and effectiveness of the interaction.

105 citations
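The model-based idea behind the Ballancer can be illustrated with a simple physical sketch: a solid ball rolling on a tilted track accelerates at g·sin(θ)/(1 + 2/5), and its speed can drive a sonification pitch. This is an assumption-laden illustration, not the authors' implementation; the pitch mapping is invented here.

```python
import math

# Illustrative sketch (not the Ballancer implementation): simulate a solid
# sphere rolling on a tiltable track and map its speed to a pitch.

G = 9.81                      # gravity, m/s^2
ROLL_FACTOR = 1 + 2.0 / 5.0   # moment-of-inertia factor for a solid sphere

def step(pos, vel, theta, dt=0.01):
    """Advance the rolling-ball state by one time step (Euler integration)."""
    acc = G * math.sin(theta) / ROLL_FACTOR
    vel += acc * dt
    pos += vel * dt
    return pos, vel

def speed_to_pitch(vel, base_hz=220.0, scale=440.0):
    """Hypothetical mapping: faster rolling -> higher pitch."""
    return base_hz + scale * abs(vel)

pos, vel = 0.0, 0.0
for _ in range(100):          # one second of motion at 10 ms steps
    pos, vel = step(pos, vel, theta=math.radians(5))
pitch = speed_to_pitch(vel)
```

A real interface would update the tilt angle from the user's hands every frame and synthesize audio continuously from the model state.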


Journal ArticleDOI
TL;DR: The research field of sonification, a subset of the topic of auditory display, has developed rapidly in recent decades and brings together interests from the areas of data mining, exploratory data analysis, human-computer interfaces, and computer music.
Abstract: The research field of sonification, a subset of the topic of auditory display, has developed rapidly in recent decades. It brings together interests from the areas of data mining, exploratory data analysis, human-computer interfaces, and computer music. Sonification presents information by using sound (particularly nonspeech), so that the user of an auditory display obtains a deeper understanding of the data or processes under investigation by listening.

102 citations
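A common entry point to sonification is parameter mapping, where data values are mapped onto an acoustic parameter such as pitch. The following minimal sketch (illustrative, not from the article; the pitch range is an arbitrary choice) maps a data series linearly onto frequencies so a listener could track its rise and fall by ear.

```python
# Minimal parameter-mapping sonification sketch: each data value is mapped
# linearly onto a pitch range (the range chosen here is arbitrary).

def map_to_pitches(data, low_hz=200.0, high_hz=800.0):
    lo, hi = min(data), max(data)
    span = (hi - lo) or 1.0   # avoid division by zero on flat data
    return [low_hz + (v - lo) / span * (high_hz - low_hz) for v in data]

temperatures = [12.0, 14.5, 19.0, 23.5, 21.0]
pitches = map_to_pitches(temperatures)
print([round(p) for p in pitches])   # [200, 330, 565, 800, 670]
```

The resulting frequencies would then be rendered with an oscillator or MIDI synthesizer; the mapping itself is the part that determines how intelligible the display is.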


Journal ArticleDOI
TL;DR: 3D sound is used to help navigate an immersive virtual environment; user tests with a game-like application show that auditory cues help in navigation and that auditory navigation is possible even without any visual feedback.
Abstract: The authors use 3D sound to help navigate an immersive virtual environment and report results of user tests obtained with a game-like application. The results show that auditory cues help in navigation, and auditory navigation is possible even without any visual feedback. The best performance is obtained in audiovisual navigation where auditory cues indicate the approximate direction and visual cues help in the final approach.

78 citations
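The idea of an auditory cue indicating approximate direction can be sketched with simple amplitude panning: the azimuth from the listener to the target controls left/right gains, so the louder ear indicates the direction to travel. This is an illustrative sketch, not the authors' rendering method, which would use full 3D spatialization.

```python
import math

# Hypothetical directional cue: azimuth to the target drives equal-power
# left/right panning, so the louder ear points toward the goal.

def azimuth(listener, target, heading=0.0):
    """Angle (radians) of target relative to the listener's heading."""
    dx, dy = target[0] - listener[0], target[1] - listener[1]
    return math.atan2(dx, dy) - heading

def pan_gains(az):
    """Equal-power panning: returns (left_gain, right_gain)."""
    az = max(-math.pi / 2, min(math.pi / 2, az))  # clamp to the frontal arc
    angle = (az + math.pi / 2) / 2                # 0 .. pi/2
    return math.cos(angle), math.sin(angle)

left, right = pan_gains(azimuth((0.0, 0.0), (5.0, 5.0)))
print(right > left)   # target is to the right, so the right ear is louder
```

A full 3D audio system would add distance attenuation and head-related filtering, but the directional principle is the same.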


Journal ArticleDOI
TL;DR: This work presents two techniques supporting segment-based proxy caching of streaming media objects and evaluates them in simulations and real systems.

Abstract: The proliferation of multimedia content on the Internet poses challenges to existing content delivery networks. While proxy caching can successfully deliver traditional text-based static objects, it faces difficulty delivering streaming media objects because of the objects' sizes as well as clients' rigorous continuous delivery demands. We present two techniques supporting segment-based proxy caching of streaming media. We evaluated these techniques in simulations and real systems.

73 citations
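One well-known form of segment-based caching splits a media object into exponentially growing segments and preferentially keeps the early ones, so playback can start from the cache while later segments are fetched from the origin. The sketch below illustrates that general idea; the segment sizes and eviction policy here are assumptions, not the authors' exact scheme.

```python
# Illustrative segment-based caching sketch: exponential segmentation plus
# a prefix-first cache policy (sizes and policy are assumptions).

def segment_bounds(total_blocks, base=1):
    """Exponential segmentation: segments of 1, 2, 4, 8, ... blocks."""
    bounds, start, size = [], 0, base
    while start < total_blocks:
        end = min(start + size, total_blocks)
        bounds.append((start, end))
        start, size = end, size * 2
    return bounds

def cache_prefix(bounds, budget_blocks):
    """Keep as many early segments as fit in the cache budget."""
    kept, used = [], 0
    for start, end in bounds:
        if used + (end - start) > budget_blocks:
            break
        kept.append((start, end))
        used += end - start
    return kept

bounds = segment_bounds(20)
print(bounds)                    # [(0, 1), (1, 3), (3, 7), (7, 15), (15, 20)]
print(cache_prefix(bounds, 8))   # the first three segments fit in 8 blocks
```

Exponential segmentation favors the beginning of an object, which matters because many streaming sessions are abandoned early; caching the prefix hides startup latency at low storage cost.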


Journal ArticleDOI
Jelena Tesic1
TL;DR: These metadata schemas provide a standard format for creating, processing, and exchanging digital image metadata and enable image management, analysis, indexing, and search applications.
Abstract: Digital image metadata plays a crucial role in managing digital image repositories. It lets us catalog and maintain large image collections as well as search for and find relevant information. Moreover, describing a digital image with defined metadata schemes lets multiple systems with different platforms and interfaces access and process image metadata. Metadata's wide use in commercial, academic, and educational domains as well as on the Web has propelled the development of new standards for digital image data schemes. The Japan Electronics and Information Technology Industries Association has proposed the Exchangeable Image File Format (EXIF) as a standard for storing administrative metadata in digital image files during acquisition. The International Press Telecommunications Council (IPTC) has developed a standard for storing descriptive metadata information within digital images. These metadata schemas, as well as other emerging standards, provide a standard format for creating, processing, and exchanging digital image metadata and enable image management, analysis, indexing, and search applications.

72 citations
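The split between administrative (EXIF-style) and descriptive (IPTC-style) metadata can be illustrated with two small schemas that different systems could exchange in a common form. The field names below are illustrative only; they are not the normative tag names from either standard.

```python
from dataclasses import dataclass, asdict

# Toy sketch: administrative (EXIF-like) and descriptive (IPTC-like)
# metadata held in defined schemas so systems can exchange them.
# Field names are illustrative, not the standards' actual tag names.

@dataclass
class AdministrativeMetadata:   # EXIF-like acquisition data
    camera_model: str
    exposure_time_s: float
    iso: int

@dataclass
class DescriptiveMetadata:      # IPTC-like editorial data
    headline: str
    keywords: list

record = {
    "exif": asdict(AdministrativeMetadata("ExampleCam 100", 1 / 250, 200)),
    "iptc": asdict(DescriptiveMetadata("Harbor at dawn", ["harbor", "dawn"])),
}
print(sorted(record["exif"]))
```

Serializing such a record to JSON or XML is what lets systems with different platforms and interfaces process the same image metadata, which is the interoperability point the abstract makes.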


Journal ArticleDOI
TL;DR: Subjects' ability to perform a discrimination task with parametric sonification in real time is studied.
Abstract: The authors introduce a device for the parametric sonification of electroencephalographic (EEG) data. The device allows auditory feedback of multiple EEG characteristics in real time. Six frequency bands are assigned as instruments from a MIDI device. The time-dependent parameters modulate the timing, pitch, and volume of the instruments. Using this device, we studied subjects' ability to perform a discrimination task with parametric sonification in real time.

68 citations
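The band-to-instrument mapping can be sketched as follows: each EEG frequency band drives one voice, with the band index spread across MIDI pitches and the band's power mapped to note velocity. The band names, note spacing, and scaling are illustrative assumptions, not the authors' exact parameters.

```python
# Sketch of the parametric-sonification idea: each EEG band is one
# "instrument"; band power drives velocity, band index drives pitch.
# All mappings here are illustrative assumptions.

BANDS = ["delta", "theta", "alpha", "beta1", "beta2", "gamma"]

def band_to_note(index, power, base_note=48):
    """Map a band index to a MIDI note and its power to velocity (0-127)."""
    note = base_note + index * 5                 # spread bands across pitches
    velocity = max(0, min(127, int(power * 127)))
    return note, velocity

powers = [0.1, 0.2, 0.8, 0.4, 0.3, 0.05]         # normalized band powers
events = [band_to_note(i, p) for i, p in enumerate(powers)]
print(events[2])   # alpha band: (58, 101)
```

In a real-time system this mapping would run once per analysis window, emitting MIDI events so the listener hears the relative activity of the six bands as an evolving chord.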


Journal ArticleDOI
TL;DR: The two most prominent approaches to lifelogging, namely MyLifeBits and EyeTap, are introduced and the struggle with the experiential value of the collected data and its durability in those approaches is discussed.
Abstract: Editor's Note: This column introduces the two most prominent approaches to lifelogging, namely MyLifeBits and EyeTap. These approaches deal with the problem of establishing media-based memory structures that address the cognition of audiovisual data with respect to comprehension and, in some aspects, interpretation. Here I discuss my struggle with the experiential value of the collected data and its durability in those approaches. --Frank Nack

Journal ArticleDOI
TL;DR: This work presents an approach for using pictorial artwork as information displays and shows how to combine almost any kind of computer-generated visual information directly with the painted content.
Abstract: We present an approach for using pictorial artwork as information displays and show how to combine almost any kind of computer-generated visual information directly with the painted content. We describe a novel technological approach, a mathematical model, a real-time rendering algorithm, and examples of presentation techniques. Our system displays such information while keeping the observers' attention on the original artifact and doesn't require additional screens.

Journal ArticleDOI
TL;DR: Escritoire uses two overlapping projectors to create a projected display for a personal computer, where a large low-resolution region fills an entire desk while a high-resolution region accommodates the user's focus of attention.
Abstract: We created a system called Escritoire that uses two overlapping projectors to create a projected display for a personal computer. A large low-resolution region fills an entire desk while a high-resolution region accommodates the user's focus of attention. The system works in real time and can be used by one person at a desk or by remote participants to create a shared visual space.

Journal ArticleDOI
TL;DR: Lightweight Application Scene Representation (LASeR) is the Moving Picture Experts Group's solution for delivering rich media services to mobile, resource-constrained devices.
Abstract: Lightweight Application Scene Representation (LASeR) is the Moving Picture Experts Group's solution for delivering rich media services to mobile, resource-constrained devices. LASeR provides easy content creation, optimized rich media data delivery, and enhanced rendering on all devices.

Journal ArticleDOI
TL;DR: The CODAC Project, led by Harald Kosch, implements different multimedia processes and ties them together in the life cycle through metadata; its core multimedia database management system stores content and MPEG-7-based metadata.
Abstract: During its lifetime, multimedia content undergoes different stages or cycles from production to consumption. Content is created, processed or modified in a postproduction stage, delivered to users, and finally, consumed. Metadata, or descriptive data about the multimedia content, pass through similar stages but with different time lines. Metadata may be produced, modified, and consumed by all actors involved in the content production-consumption chain. At each step of the chain, different kinds of metadata may be produced by highly different methods and of substantially different semantic value. Different metadata let us tie the different multimedia processes in a life cycle together. However, to employ these metadata, they must be appropriately generated. The CODAC Project, led by Harald Kosch, implements different multimedia processes and ties them together in the life cycle. CODAC uses distributed systems to implement multimedia processes. The project's core component is a multimedia database management system (MMDBMS), which stores content and MPEG-7-based metadata. It communicates with a streaming server for data delivery. The database is realized in the multimedia data cartridge (MDC), an extension of the Oracle database management system, to handle multimedia content and MPEG-7 metadata.

Journal ArticleDOI
TL;DR: The authors investigate whether there's a relationship between the writing and reading behavior of online students and whether active participation influences learning efficiency; an interesting related result is that the instructor's effort in reading and writing posts is higher than that of the learners themselves.
Abstract: In this article, the authors offer their perspective about online learning. In general, we know that online learning develops through interaction and that it's a collaborative process where students actively engage in writing and reading messages among themselves and with the instructor. However, it's also well known that in any online community, not all users are equally active, and there are indeed people who never take an active part-the so-called lurkers. This article focuses on the lurkers; the authors ran extensive experiments to demonstrate whether there's a relationship between the writing and reading behavior of online students and whether active participation influences learning efficiency. An interesting related result that emerged from the study is that the effort of the instructor in terms of reading and writing posts is higher than that of the learners themselves.

Journal ArticleDOI
TL;DR: This paper focuses on how to use the tools specified within MPEG-21 for interoperable multimedia communication, and explores device and coding format independence issues.
Abstract: The desire to gain access to rich, multimedia-based information anywhere, anytime is growing enormously. To achieve such access, the research and standardization communities have launched an initiative called universal multimedia access (UMA). However, the UMA tools and specifications that have emerged so far concentrate mostly on constraints imposed by terminals and networks along the multimedia delivery chain; the users who consume the content are rarely considered. Researchers have developed a plethora of technologies and standards to address some of the issues, but the big picture of how these different technologies and standards fit together is missing. Thus, the Moving Picture Experts Group (MPEG) decided to standardize the MPEG-21 multimedia framework with the ultimate goal of supporting users during the exchange, access, consumption, trade, or other manipulation of so-called digital items in an efficient, transparent, and interoperable way. Standardization committees such as MPEG, the Internet Engineering Task Force (IETF), and the World Wide Web Consortium (W3C) are exploring device and coding-format independence issues. Here, however, we focus on how we can use the tools specified within MPEG-21 for interoperable multimedia communication.
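The adaptation idea at the heart of UMA can be sketched as a selection problem: given a terminal's declared constraints, pick the best content variation that fits. The variation list and field names below are invented for illustration; they are not MPEG-21 schema elements.

```python
# Hypothetical sketch of content adaptation in the UMA spirit: choose the
# richest variation that satisfies the terminal and network constraints.
# Variations and fields are illustrative, not MPEG-21 descriptors.

VARIATIONS = [
    {"name": "hd",     "width": 1920, "bitrate_kbps": 4000},
    {"name": "sd",     "width": 720,  "bitrate_kbps": 1200},
    {"name": "mobile", "width": 320,  "bitrate_kbps": 300},
]

def adapt(terminal_width, link_kbps):
    """Pick the highest-bitrate variation the terminal and link support."""
    fitting = [v for v in VARIATIONS
               if v["width"] <= terminal_width and v["bitrate_kbps"] <= link_kbps]
    return max(fitting, key=lambda v: v["bitrate_kbps"])["name"] if fitting else None

print(adapt(800, 2000))   # "sd": fits a 720-wide screen over a 2 Mbit/s link
```

MPEG-21 Digital Item Adaptation standardizes how such terminal, network, and user descriptions are expressed so that the selection logic can interoperate across vendors.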

Journal ArticleDOI
TL;DR: Developing the standards and technologies for video blogging requires a combination of approaches from various areas including media representation, information retrieval, multimedia content analysis, and video summarization.
Abstract: The lure of video blogging combines the ubiquitous, grassroots, Web-based journaling of blogging with the richness of expression available in multimedia. Some claim that video blogging is an important force in a future world of video journalism and a powerful technical adjunct to our existing televised news sources. Others point to the huge demands it imposes on networking resources, the lack of hard standards, and the poor usability of current video blogging systems as indicators that it's doomed to fail. Like any nascent technology, video blogging has many unsolved problems. The field, however, is vibrant, the goals are fairly clear, and the challenges they pose to multimedia researchers are exciting indeed. Developing the standards and technologies for video blogging requires a combination of approaches from various areas, including media representation, information retrieval, multimedia content analysis, and video summarization. Like the development of the Web and text blogging before it, video blogging will only come about through open development and collaboration between engineers and researchers from diverse fields. Most strikingly, it is fueled by the passion and enthusiasm of those creating content - those who go to the trouble of recording their lives and opinions within the fledgling medium, shaping it as a lively and useful resource for generations of Internet users to come.

Journal ArticleDOI
TL;DR: MAFs are essentially superformats that combine selected technology components from MPEG (and other) standards to provide greater application interoperability, which helps satisfy users' growing need for better-integrated multimedia solutions.
Abstract: MPEG standards, instrumental in shaping the multimedia landscape since the early 1990s, are taking another step forward with MPEG-A, which specifies multimedia application formats (MAFs). MAFs are essentially superformats that combine selected technology components from MPEG (and other) standards to provide greater application interoperability, which helps satisfy users' growing need for better-integrated multimedia solutions.

Journal ArticleDOI
TL;DR: Methods for using sound to encode georeferenced data patterns and for navigating maps for visually impaired people are compared.
Abstract: Auditory information is an important channel for the visually impaired. Effective sonification (the use of non-speech audio to convey information) promotes equal working opportunities for people with vision impairments by helping them explore data collections for problem solving and decision making. Interactive sonification systems can make georeferenced data accessible to people with vision impairments. The authors compare methods for using sound to encode georeferenced data patterns and for navigating maps.

Journal ArticleDOI
TL;DR: This work examines the use of auditory display for ubiquitous computing to extend the boundaries of human-computer interaction and gathers free-text identification responses from participants to identify possible metaphors and mappings of sound to human action and/or system status.
Abstract: We examine the use of auditory display for ubiquitous computing to extend the boundaries of human-computer interaction (HCI). Our design process is based on listening tests, gathering free-text identification responses from participants. The responses and their classifications indicate how accurately sounds are identified and help us identify possible metaphors and mappings of sound to human action and/or system status.

Journal ArticleDOI
TL;DR: MPEG's most recent effort to progress the state of the art is the MPEG Surround work item, which provides an efficient method for coding multichannel sound via the transmission of a compressed stereophonic audio program plus a low-rate side-information channel.
Abstract: MPEG's most recent effort to progress the state of the art is the MPEG Surround work item. It provides an efficient method for coding multichannel sound via the transmission of a compressed stereophonic (or even monophonic) audio program plus a low-rate side-information channel. Benefits of this approach include backward compatibility with pervasive stereo playback systems while permitting next-generation players to reconstruct high-quality multichannel sound.
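The principle described above can be illustrated with a toy frame-level sketch: downmix a multichannel frame to stereo for legacy players, and keep per-channel level ratios as low-rate side information an enhanced decoder could use to re-spatialize the sound. The downmix coefficients and side-information format here are simplified assumptions, not the MPEG Surround algorithm.

```python
# Simplified sketch of the stereo-downmix-plus-side-information principle.
# Coefficients and side-info format are assumptions, not MPEG Surround.

def downmix_and_side_info(channels):
    """channels: dict of channel name -> sample value for one frame."""
    left = channels["FL"] + 0.707 * channels["C"] + channels["SL"]
    right = channels["FR"] + 0.707 * channels["C"] + channels["SR"]
    total = sum(abs(v) for v in channels.values()) or 1.0
    # Low-rate side info: each channel's share of the total level.
    side_info = {name: abs(v) / total for name, v in channels.items()}
    return (left, right), side_info

frame = {"FL": 0.5, "FR": 0.3, "C": 0.2, "SL": 0.1, "SR": 0.0}
stereo, side = downmix_and_side_info(frame)
print(round(stereo[0], 3), round(side["FL"], 3))
```

A legacy player uses only the stereo pair; a next-generation decoder combines it with the side information to approximate the original channel layout, which is the backward-compatibility benefit the abstract highlights.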

Journal ArticleDOI
TL;DR: This work introduces a piecewise linear parameterization of 3D surfaces that can be used for texture mapping, morphing, remeshing, and geometry imaging and guarantees one-to-one mapping without foldovers in a geometrically intuitive way.
Abstract: Digital geometry is a new data type for multimedia applications. To foster the use of 3D geometry, we introduce a piecewise linear parameterization of 3D surfaces that we can use for texture mapping, morphing, remeshing, and geometry imaging. Our method guarantees one-to-one mapping without foldovers in a geometrically intuitive way.

Journal ArticleDOI
TL;DR: A platform for interactive virtual heritage applications integrates a high-end virtual reality system with wireless, connected portable and wearable computers, facilitating and enhancing user navigation, visualization control, and peer-to-peer information exchange.
Abstract: The rapid evolution of pervasive computing technologies enables bringing virtual heritage applications to a new level of user participation. A platform for interactive virtual heritage applications integrates a high-end virtual reality system with wireless, connected portable and wearable computers, facilitating and enhancing user navigation, visualization control, and peer-to-peer information exchange.

Journal ArticleDOI
TL;DR: The authors describe the techniques and tools for creating a preliminary IEBook, embodying some of the basic concepts of immersive electronic books for surgical training.
Abstract: Immersive electronic books (IEBooks) for surgical training will let surgeons explore previous surgical procedures in 3D. The authors describe the techniques and tools for creating a preliminary IEBook, embodying some of the basic concepts.

Journal ArticleDOI
TL;DR: Sometimes designers just apply the technology available to traditional museum schemes, without paying much attention to the visitor's experience, particularly to the ways they expect users to interact with the system or to the cognitive and aesthetic factors involved.
Abstract: Museums and exhibitions don't communicate. These places are often just a collection of objects, standing deaf in front of visitors. In many cases, objects are accompanied by textual descriptions, usually too short or too long to be useful for the visitor. In the last decade, progress in multimedia has allowed for new, experimental forms of communication (using computer technologies) in public spaces. Implementations have ranged from simply using standard PCs with multimedia applications that show thumbnails of image data integrated with text to large theaters that immerse users in virtual worlds or reproduce and display 3D models of masterpieces. Often, designers just apply the available technology to traditional museum schemes without paying much attention to the visitor's experience, particularly to the ways they expect users to interact with the system or to the cognitive and aesthetic factors involved.

Journal ArticleDOI
TL;DR: The authors describe a method to improve user feedback, specifically the display of time-varying probabilistic information, through asynchronous granular synthesis, and use these displays in mobile, gestural interfaces where visual display is often impractical.
Abstract: We describe a method to improve user feedback, specifically the display of time-varying probabilistic information, through asynchronous granular synthesis. We have applied these techniques to challenging control problems as well as to the sonification of online probabilistic gesture recognition. We're using these displays in mobile, gestural interfaces where visual display is often impractical.
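The core mapping can be sketched simply: in asynchronous granular synthesis, each candidate's recognition probability sets the density of sound grains emitted for it, so more likely gestures are heard as denser textures. The gesture labels, grain rate, and window length below are illustrative assumptions, not the authors' parameters.

```python
import random

# Sketch of probability-to-grain-density mapping for an auditory display
# of gesture recognition. All parameters here are illustrative.

def schedule_grains(probabilities, grains_per_second=200, duration_s=0.1,
                    rng=None):
    """Return (label, onset_time) grain events for one display window."""
    rng = rng or random.Random(42)   # fixed seed for a reproducible sketch
    events = []
    for label, p in probabilities.items():
        n = int(grains_per_second * duration_s * p)   # density ~ probability
        events += [(label, rng.uniform(0, duration_s)) for _ in range(n)]
    return sorted(events, key=lambda e: e[1])

grains = schedule_grains({"circle": 0.7, "swipe": 0.2, "tap": 0.1})
counts = {g: sum(1 for e in grains if e[0] == g)
          for g in ("circle", "swipe", "tap")}
print(counts)   # {'circle': 14, 'swipe': 4, 'tap': 2}
```

Because grain onsets are scattered asynchronously rather than placed on a grid, the display degrades gracefully as probabilities fluctuate, which suits noisy real-time recognizers.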

Journal ArticleDOI
TL;DR: The integration of symbolic music representation (SMR) in a versatile multimedia framework such as MPEG will enable the development of a huge number of new applications in the entertainment, education, and information delivery domains.
Abstract: With the spread of computer technology into the artistic fields, new application scenarios for computer-based applications of symbolic music representation (SMR) have been identified. The integration of SMR in a versatile multimedia framework such as MPEG will enable the development of a huge number of new applications in the entertainment, education, and information delivery domains.

Journal ArticleDOI
TL;DR: This article provides insight into the game industry and discusses the debate between high- and low-resolution graphics in the context of other visual media such as art and film.
Abstract: Advances in graphics have done more than just revolutionize the computer game industry-graphics themselves have come to define the entire industry, surpassing even gameplay as the singular identifying characteristic of all games, regardless of genre. These spectacular developments are even more remarkable when we consider the almost ruthless constraints imposed by an industry needing to deliver high performance to a demanding computer-savvy group of users. A core issue involves low- versus high-resolution graphics. As multimedia technologies continue to mature, the debate between high and low resolution takes on more urgency. What is the most effective way to convey an emotion or information? Is it just pure realism or is it an artistic device? Always searching for the most effective way to provide an immersive experience, game developers still have no consensus on what rendered graphics should really look like. This article provides insight into the game industry as well as discussing this topic in the context of other visual media such as art and film.