
Showing papers in "Multimedia Tools and Applications in 2009"


Journal ArticleDOI
TL;DR: A user study performed, as a first step, for the evaluation of a low-cost VR system using a Head-Mounted Display (HMD), which shows that, although users were generally satisfied with the VR system, and found the HMD interaction intuitive and natural, most performed better with the desktop setup.
Abstract: Virtual Reality (VR) has been constantly evolving since its early days, and is now a fundamental technology in different application areas. User evaluation is a crucial step in the design and development of VR systems that truly respond to users' needs, as well as for identifying applications that genuinely gain from the use of such technology. Yet, little work has been reported concerning usability evaluation and validation of VR systems, compared with the traditional desktop setup. The paper presents a user study performed, as a first step, for the evaluation of a low-cost VR system using a Head-Mounted Display (HMD). That system was compared to a traditional desktop setup through an experiment that assessed user performance when carrying out navigation tasks in a game scenario for a short period. The results show that, although users were generally satisfied with the VR system, and found the HMD interaction intuitive and natural, most performed better with the desktop setup.

217 citations


Journal ArticleDOI
TL;DR: A mechanism which compiles feedback related to the behavioral state of the user in the context of reading an electronic document is presented, achieved using a non-intrusive scheme, which uses a simple web camera to detect and track the head, eye and hand movements.
Abstract: Most e-learning environments which utilize user feedback or profiles collect such information through questionnaires, very often resulting in incomplete answers and sometimes deliberately misleading input. In this work, we present a mechanism which compiles feedback related to the behavioral state of the user (e.g. level of interest) in the context of reading an electronic document; this is achieved using a non-intrusive scheme, which uses a simple web camera to detect and track the head, eye and hand movements, and provides an estimation of the level of interest and engagement using a neuro-fuzzy network initialized with evidence from Theory of Mind and trained on expert-annotated data. The user does not need to interact with the proposed system, and can act as if she were not monitored at all. The proposed scheme is tested in an e-learning environment, in order to adapt the presentation of the content to the user profile and current behavioral state. Experiments show that the proposed system detects reading- and attention-related user states very effectively, in a testbed where children's reading performance is tracked.

167 citations


Journal ArticleDOI
TL;DR: This work breaks player actions down into discrete categories, and shows that each category is distinct in terms of several key metrics, and discusses which categories of actions could be supported on current mobile devices, and presents evidence in form of a user survey demonstrating the demand for such services.
Abstract: Providing Massively Multiplayer Online Role-Playing Games (MMORPGs) is a big challenge for future mobile, IP-based networks. Understanding how the players' actions affect the network parameters, the game platform, and the overall perceived quality is highly relevant for the purposes of game design, as well as for the networking infrastructure and network support for games. We break player actions down into discrete categories, and show that each category is distinct in terms of several key metrics. We discuss which categories of actions could be supported on current mobile devices, and present evidence in the form of a user survey demonstrating the demand for such services. The starting points of the discussion include the networking, session and latency requirements for particular player actions on one side, and the players' interest on the other. Blizzard Entertainment's World of Warcraft (WoW) is used as a case study.

62 citations


Journal ArticleDOI
TL;DR: This article proposes a methodology for classifying the genre of television programmes; the extracted features are used to train a parallel neural network system able to distinguish between seven video genres, reaching a classification accuracy of 95%.
Abstract: Improvements in digital technology have made possible the production and distribution of huge quantities of digital multimedia data. Tools for high-level multimedia documentation are becoming indispensable to efficiently access and retrieve desired content from such data. In this context, automatic genre classification provides a simple and effective solution to describe multimedia contents in a structured and well understandable way. We propose in this article a methodology for classifying the genre of television programmes. Features are extracted from four informative sources, which include visual-perceptual information (colour, texture and motion), structural information (shot length, shot distribution, shot rhythm, shot clusters duration and saturation), cognitive information (face properties, such as number, positions and dimensions) and aural information (transcribed text, sound characteristics). These features are used for training a parallel neural network system able to distinguish between seven video genres: football, cartoons, music, weather forecast, newscast, talk show and commercials. Experiments conducted on more than 100 h of audiovisual material confirm the effectiveness of the proposed method, which reaches a classification accuracy rate of 95%.

60 citations


Journal ArticleDOI
TL;DR: This paper describes techniques for automatic image annotation that take advantage of collaboratively annotated image databases, so-called visual folksonomies, applying two techniques based on image analysis: classification and tag propagation, the latter propagating user-generated folksonomic annotations.
Abstract: Automatic image annotation is an important and challenging task, and becomes increasingly necessary when managing large image collections. This paper describes techniques for automatic image annotation that take advantage of collaboratively annotated image databases, so-called visual folksonomies. Our approach applies two techniques based on image analysis: first, classification, which annotates images with a controlled vocabulary; and second, tag propagation along visually similar images. The latter propagates user-generated, folksonomic annotations and is therefore capable of dealing with an unlimited vocabulary. Experiments with a pool of Flickr images demonstrate the high accuracy and efficiency of the proposed methods in the task of automatic image annotation. Both techniques were applied in the prototypical tag recommender "tagr".
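Tag propagation along visually similar images can be sketched as a k-nearest-neighbour vote over folksonomic tags. A minimal sketch follows; the feature vectors, the Euclidean similarity and the example annotations are illustrative assumptions, not the paper's actual representation.

```python
# Sketch of folksonomic tag propagation: recommend tags for a new image by
# collecting the tags of its k visually most similar annotated images.
from collections import Counter

def euclid(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

def propagate_tags(query_feat, annotated, k=2, n_tags=3):
    """annotated: list of (feature_vector, tag_list) pairs."""
    neighbours = sorted(annotated, key=lambda it: euclid(query_feat, it[0]))[:k]
    votes = Counter(tag for _, tags in neighbours for tag in tags)
    return [tag for tag, _ in votes.most_common(n_tags)]

# Hypothetical annotated pool: two beach-like images and one mountain image.
annotated = [
    ((0.1, 0.9), ["beach", "sea"]),
    ((0.2, 0.8), ["sea", "sunset"]),
    ((0.9, 0.1), ["mountain"]),
]
tags = propagate_tags((0.15, 0.85), annotated)
```

Because the propagated vocabulary is whatever the neighbours carry, the approach is not limited to a controlled vocabulary, which is the property the abstract highlights.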

58 citations


Journal ArticleDOI
TL;DR: Two new approaches for hybrid text-image information processing that can be straightforwardly generalized to the more general multimodal scenario are proposed, both falling into the trans-media pseudo-relevance feedback category.
Abstract: This paper deals with multimedia information access. We propose two new approaches for hybrid text-image information processing that can be straightforwardly generalized to the more general multimodal scenario. Both approaches fall into the trans-media pseudo-relevance feedback category. Our first method uses a mixture model of the aggregate components, considering them as a single relevance concept. In our second approach, we define trans-media similarities as an aggregation of monomodal similarities between the elements of the aggregate and the new multimodal object. We also introduce the monomodal similarity measures for text and images that serve as basic components for both proposed trans-media similarities. We show how one can frame a large variety of problems in order to address them with the proposed techniques: image annotation or captioning, text illustration, and multimedia retrieval and clustering. Finally, we present how these methods can be integrated in two applications: a travel blog assistant system and a tool for browsing Wikipedia that takes into account the multimedia nature of its content.
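The second approach's trans-media similarity can be sketched as a weighted aggregation of monomodal similarities. In the sketch below, the toy word-overlap text similarity and inverse-distance image similarity are our own stand-ins, not the monomodal measures introduced in the paper.

```python
# Sketch of a trans-media similarity: the similarity between a multimodal
# "aggregate" (text + image parts) and a new object is an aggregation
# (here: a weighted mean) of monomodal similarities.

def trans_media_sim(aggregate, obj, text_sim, image_sim, alpha=0.5):
    """aggregate/obj: dicts with 'text' and 'image' parts."""
    return (alpha * text_sim(aggregate["text"], obj["text"])
            + (1 - alpha) * image_sim(aggregate["image"], obj["image"]))

# Toy monomodal similarities: Jaccard word overlap for text,
# inverse squared distance for image feature vectors.
def text_sim(a, b):
    a, b = set(a.split()), set(b.split())
    return len(a & b) / len(a | b)

def image_sim(a, b):
    return 1.0 / (1.0 + sum((x - y) ** 2 for x, y in zip(a, b)))

aggregate = {"text": "sunset on the beach", "image": (0.1, 0.2)}
new_obj = {"text": "beach at sunset", "image": (0.1, 0.2)}
score = trans_media_sim(aggregate, new_obj, text_sim, image_sim)
```

The weight alpha controls how much each modality contributes, which is what makes the same machinery usable for image annotation (text queried by image) and text illustration (image queried by text).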

57 citations


Journal ArticleDOI
TL;DR: This work proposes a balancing scheme with two main goals: allocate load on server nodes proportionally to each one’s power and reduce the inter-server communication overhead, considering the load as the occupied bandwidth of each server.
Abstract: In a distributed MMOG (massively multiplayer online game) server architecture, the server nodes may easily become overloaded by the high demand from players for state updates. Many works propose algorithms to distribute the load on the server nodes, but this load is usually defined as the number of players on each server, which is not an ideal measure. Also, the possible heterogeneity of the system is frequently overlooked. We propose a balancing scheme with two main goals: allocate load on server nodes proportionally to each one's power and reduce the inter-server communication overhead, considering the load as the occupied bandwidth of each server. Four algorithms were proposed, of which ProGReGA is the best for overhead reduction and ProGReGA-KF is the most suited for reducing player migrations between servers. We also review related work and provide comparisons, in which our approach performs better.
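The proportional-allocation goal can be illustrated with a minimal greedy sketch. The helper names and the heaviest-first heuristic below are our own assumptions; the paper's ProGReGA family additionally optimizes inter-server communication, which this sketch ignores.

```python
# Greedy sketch: assign game regions to heterogeneous servers so that
# load (occupied bandwidth) stays roughly proportional to server power.

def assign_regions(region_bw, server_power):
    """region_bw: bandwidth demand per region; server_power: capacity per server.
    Returns (region -> server mapping, resulting load per server)."""
    load = [0.0] * len(server_power)
    assignment = []
    # Place the heaviest regions first for a better greedy result.
    for r in sorted(range(len(region_bw)), key=lambda i: -region_bw[i]):
        # Pick the server whose relative load (load / power) is lowest.
        s = min(range(len(server_power)), key=lambda j: load[j] / server_power[j])
        load[s] += region_bw[r]
        assignment.append((r, s))
    return dict(assignment), load

# Hypothetical scenario: six equal regions, one server twice as powerful.
assignment, load = assign_regions([10, 10, 10, 10, 10, 10], [1, 2])
```

With these inputs the greedy rule ends with the twice-as-powerful server carrying twice the bandwidth, which is exactly the proportionality the scheme aims for.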

55 citations


Journal ArticleDOI
TL;DR: An algorithm for video summarization with a two-level redundancy detection procedure: it removes redundant video content using hierarchical agglomerative clustering at the key-frame level, and a repetitive frame segment detection procedure removes redundant information from the initial video summary.
Abstract: The rapid growth of video information necessitates advances in content-based video analysis techniques. Video summarization, aiming to provide a short summary of the original video document, has drawn much attention in recent years. In this paper, we propose an algorithm for video summarization with a two-level redundancy detection procedure. By video segmentation and cast indexing, the algorithm first constructs story boards to let users know the main scenes and cast (when the video has a cast). Then it removes redundant video content using hierarchical agglomerative clustering at the key-frame level. The impact factors of scenes and key frames are defined, and parts of the key frames are selected to generate the initial video summary. Finally, a repetitive frame segment detection procedure is designed to remove redundant information in the initial video summary. Results of experimental applications on TV series, movies and cartoons are given to illustrate the proposed algorithm.
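The key-frame-level redundancy removal can be sketched as agglomerative clustering over frame features. Below is a minimal single-linkage sketch with made-up 2-D features; the paper's actual features, linkage criterion and stopping rule are not specified here.

```python
# Minimal single-linkage agglomerative clustering: repeatedly merge the two
# closest clusters of key-frame features until the closest pair is farther
# apart than `threshold`; one representative frame per cluster then survives.

def euclid(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

def cluster_keyframes(features, threshold):
    clusters = [[i] for i in range(len(features))]
    while len(clusters) > 1:
        best = None
        # Find the closest pair of clusters under single linkage.
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = min(euclid(features[a], features[b])
                        for a in clusters[i] for b in clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        if best[0] > threshold:
            break  # remaining clusters are mutually distinct key frames
        _, i, j = best
        clusters[i] += clusters.pop(j)
    return clusters
```

Near-duplicate key frames land in the same cluster, so keeping one frame per cluster removes the first level of redundancy that the abstract describes.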

53 citations


Journal ArticleDOI
TL;DR: The implications of the mobility traces collected in Second Life on the design of peer-to-peer architecture, interest management, mobility modeling of avatars, server load balancing and zone partitioning, caching, and prefetching for user-created virtual worlds are discussed.
Abstract: We collected mobility traces of avatars spanning multiple regions in Second Life, a popular user-created virtual world. We analyzed the traces to characterize the dynamics of the avatars' mobility and behavior, both temporally and spatially. We discuss the implications of our findings on the design of peer-to-peer architecture, interest management, mobility modeling of avatars, server load balancing and zone partitioning, caching, and prefetching for user-created virtual worlds.

49 citations


Journal ArticleDOI
TL;DR: The proposed platform uses stereotypes to initialize user models, adapts user profiles dynamically and clusters users into similar interest groups based on a semantic description of content and on information implicitly collected about the users through their interactions with the museum.
Abstract: Presentation of content is an important aspect of today's virtual reality applications, especially in domains such as virtual museums. The large amount and variety of exhibits in such applications raise a need for adaptation and personalization of the environment. This paper presents a content personalization platform for Virtual Museums, which is based on a semantic description of content and on information implicitly collected about the users through their interactions with the museum. The proposed platform uses stereotypes to initialize user models, adapts user profiles dynamically and clusters users into similar interest groups. A science fiction museum has been set up as a case study for this platform and an evaluation has been carried out.

47 citations


Journal ArticleDOI
Shiguo Lian1
TL;DR: Analysis and experiments show that the scheme obtains high perceptual security and time efficiency, and the watermarking and encryption operations can be commutated, which make the scheme a suitable choice for efficient media content distribution.
Abstract: Commutative Watermarking and Encryption (CWE) provides a solution for interoperation between watermarking and encryption. It realizes the challenging operation of embedding a watermark into encrypted multimedia data directly, which avoids the decryption-watermarking-re-encryption triple. To date, few CWE schemes have been reported. They often obtain the commutative property by partitioning multimedia data into independent parts (i.e., the encryption part and the watermarking part). Since the two parts are isolated, such schemes cannot remain secure against replacement attacks. To avoid this disadvantage, a novel quasi-commutative watermarking and encryption (QCWE) scheme based on quasi-commutative operations is proposed in this paper. In the proposed scheme, the encryption operation and the watermarking operation are applied to the same data part. Since the two operations are homogeneous with commutative properties, their order can be exchanged. As an example, a scheme for MPEG-2 video encryption and watermarking is presented. In this example, the DC coefficients in intra macroblocks are encrypted or watermarked based on random modular addition, while the DC coefficients in other macroblocks and the signs of all AC coefficients are encrypted with a stream cipher or block cipher. Analysis and experiments show that the scheme obtains high perceptual security and time efficiency, and that the watermarking and encryption operations can be commutated. These properties make the scheme a suitable choice for efficient media content distribution. Additionally, the paper demonstrates the feasibility of constructing commutative watermarking and encryption schemes from homogeneous operations, which is expected to stimulate this new research topic.
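The quasi-commutative property rests on the fact that modular additions on the same coefficient commute. A toy sketch of this idea follows; the values, modulus and watermark strength are illustrative only, whereas the real scheme uses keyed random sequences over MPEG-2 DC coefficients.

```python
# Toy illustration of quasi-commutative watermarking and encryption:
# both operations are modular additions on the same coefficient, so
# encrypt-then-watermark equals watermark-then-encrypt.

MOD = 256  # range of an 8-bit coefficient in this toy example

def encrypt(coeff, key):
    return (coeff + key) % MOD

def embed(coeff, wm_bit, strength=8):
    return (coeff + wm_bit * strength) % MOD

dc, key, bit = 100, 77, 1
a = embed(encrypt(dc, key), bit)   # encrypt first, then watermark
b = encrypt(embed(dc, bit), key)   # watermark first, then encrypt
assert a == b  # both orders give (100 + 77 + 8) % 256
```

Because both operations act on the same data part, a replacement attack cannot swap out an unwatermarked region without also destroying the encrypted content, which is the weakness of partition-based CWE schemes that the paper targets.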

Journal ArticleDOI
TL;DR: The goal of this article is to define semantic ambient media and discuss the contributions to the Semantic Ambient Media Experience (SAME) workshop, which was held in conjunction with the ACM Multimedia conference in Vancouver in 2008.
Abstract: The medium is the message! And the message was literacy, media democracy and music charts. Mostly one single distinguishable medium such as TV, the Web, the radio, or books transmitted the message. Now, in the age of ubiquitous and pervasive computing, where information flows through a plethora of distributed, interlinked media, what is the message ambient media will tell us? What does semantic mean in this context? Which experiences will it open to us? What is content in the age of ambient media? Ambient media are embedded throughout the natural environment of the consumer: in his home, in his car, in restaurants, and on his mobile device. Typical example services are smart wallpapers in homes, location-based services, RFID-based entertainment services for children, and intelligent homes. The goal of this article is to define semantic ambient media and discuss the contributions to the Semantic Ambient Media Experience (SAME) workshop, which was held in conjunction with the ACM Multimedia conference in Vancouver in 2008. The results of the workshop can be found at: www.ambientmediaassociation.org .

Journal ArticleDOI
TL;DR: The main aim of this paper is to aid educational designers in selecting, designing and evaluating three dimensional collaborative virtual environments in order to gain the pedagogical benefits of Computer Supported Collaborative Learning.
Abstract: E-learning systems have gone through a radical change from the initial text-based environments to more stimulating multimedia systems. One such class of systems is Collaborative Virtual Environments, which can be used to support collaborative e-learning scenarios. The main aim of this paper is to aid educational designers in selecting, designing and evaluating three-dimensional collaborative virtual environments in order to gain the pedagogical benefits of Computer Supported Collaborative Learning. Therefore, this paper initially discusses the potential of three-dimensional networked virtual environments for supporting collaborative learning. Furthermore, based on a two-step platform selection process, this paper (a) presents and compares three-dimensional multi-user virtual environments for supporting collaborative learning and (b) validates the most promising solution against a set of design principles for educational virtual environments. According to these principles, an educational environment has been implemented on top of the selected platform in order to support collaborative e-learning scenarios. The design of this environment is also presented. In addition, this paper presents the results of three small-scale studies, carried out in a tertiary education department, to assess the educational environment. This environment has been evaluated using a hybrid evaluation methodology for uncovering usability problems, collecting further requirements for additional functionality to support collaborative virtual learning environments, and determining the appropriateness of different kinds of learning scenarios.

Journal ArticleDOI
TL;DR: An automatic video summarization technique based on graph theory methodology and the dominant sets clustering algorithm that improves video representation and reveals its structure is presented.
Abstract: The paper presents an automatic video summarization technique based on graph-theoretic methodology and the dominant sets clustering algorithm. The large size of the video data set is handled by exploiting the connectivity information of prototype frames that are extracted from a down-sampled version of the original video sequence. The connectivity information for the prototypes, which is obtained from the whole set of data, improves video representation and reveals its structure. Automatic selection of the optimal number of clusters, and hence keyframes, is then accomplished through the dominant set clustering algorithm. The method is free of user-specified modeling parameters and is evaluated in terms of several metrics that quantify its content representational ability. A comparison of the proposed summarization technique to the Open Video storyboard, the Adaptive clustering algorithm and the Delaunay clustering approach is provided.
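Dominant-set clustering extracts cohesive groups by running replicator dynamics on the similarity matrix: the mass on the simplex concentrates on a maximally cohesive subset of frames. A minimal sketch on a toy 4-frame similarity matrix follows; the matrix values are made up for illustration and the full algorithm iterates peel-off steps to find all clusters.

```python
# Sketch of dominant-set extraction via replicator dynamics.
# A is a symmetric similarity matrix with zero diagonal; the stationary
# weight vector is supported on one maximally cohesive cluster.

def dominant_set(A, iters=200):
    n = len(A)
    x = [1.0 / n] * n  # start at the simplex barycenter
    for _ in range(iters):
        Ax = [sum(A[i][j] * x[j] for j in range(n)) for i in range(n)]
        norm = sum(x[i] * Ax[i] for i in range(n))
        x = [x[i] * Ax[i] / norm for i in range(n)]  # replicator update
    return x

# Frames 0-2 are mutually very similar; frame 3 is an outlier.
A = [[0.0, 0.9, 0.9, 0.1],
     [0.9, 0.0, 0.9, 0.1],
     [0.9, 0.9, 0.0, 0.1],
     [0.1, 0.1, 0.1, 0.0]]
weights = dominant_set(A)
```

The outlier's weight decays to (nearly) zero while the cohesive frames share the mass, so thresholding the weight vector yields the first cluster without any user-specified parameters.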

Journal ArticleDOI
TL;DR: Two proposed features for 3D model retrieval, the ART-based elevation descriptor (ART-ED) and the shell grid descriptor (SGD), are combined in an attempt to improve retrieval.
Abstract: In this paper, we propose a new exterior shape feature, the ART-based elevation descriptor (ART-ED), and a new interior shape feature, the shell grid descriptor (SGD), for 3D model retrieval. ART-ED describes the elevation information of a 3D model from six different angles. Since ART-ED represents only the exterior contour of a 3D model, SGD is proposed for extracting the interior shape information. Finally, these two proposed features, as well as other features, are combined in an attempt to improve retrieval. Experimental results show that the proposed methods are superior to other descriptors.

Journal ArticleDOI
TL;DR: A novel approach to automatically segment scenes and semantically represent them is proposed; promising experimental results show that the proposed method supports efficient retrieval of interesting video content.
Abstract: Grouping video content into semantic segments and classifying semantic scenes into different types are crucial processes for content-based video organization, management and retrieval. In this paper, a novel approach to automatically segment scenes and semantically represent them is proposed. Firstly, video shots are detected using a rough-to-fine algorithm. Secondly, key-frames within each shot are selected adaptively with hybrid features, and redundant key-frames are removed by template matching. Thirdly, spatio-temporally coherent shots are clustered into the same scene based on the temporal constraints of video content and the visual similarity between shot activities. Finally, based on a full analysis of the typical characteristics of continuously recorded videos, scene content is semantically represented to meet human needs in video retrieval. The proposed algorithm has been applied to various genres of films and TV programs. Promising experimental results show that the proposed method supports efficient retrieval of interesting video content.

Journal ArticleDOI
TL;DR: This paper presents a video analysis approach based on concept detection and keyframe extraction employing a visual thesaurus representation, and a framework is proposed for working on very large data sets.
Abstract: This paper presents a video analysis approach based on concept detection and keyframe extraction employing a visual thesaurus representation. Color and texture descriptors are extracted from coarse regions of each frame, and a visual thesaurus is constructed after clustering the regions. The clusters, called region types, are used as a basis for representing local material information through the construction of a model vector for each frame, which reflects the composition of the image in terms of region types. The model vector representation is used for keyframe selection, either within each video shot or across an entire sequence. The selection process ensures that all region types are represented. A number of high-level concept detectors are then trained using global annotation, and Latent Semantic Analysis is applied. To enhance detection performance per shot, detection is employed on the selected keyframes of each shot, and a framework is proposed for working on very large data sets.

Journal ArticleDOI
TL;DR: This work shows how to create a music video automatically, using computable characteristics of the video and music to promote coherent matching, and drastically reduces the skill required to make simple music videos.
Abstract: We show how to create a music video automatically, using computable characteristics of the video and music to promote coherent matching. We analyze the flow of both music and video, and then segment them into sequences of near-uniform flow. We extract features from both the video and music segments, and then find matching pairs. The granularity of the matching process can be adapted by extending the segmentation process to several levels. Our approach drastically reduces the skill required to make simple music videos.

Journal ArticleDOI
TL;DR: An efficient markerless registration algorithm is presented for both outdoor and indoor AR systems; a markerless AR system built on this algorithm is described, and its architecture and workflow are also presented.
Abstract: Accurate 3D registration is a key issue in Augmented Reality (AR) applications, particularly where no markers are placed manually. In this paper, an efficient markerless registration algorithm is presented for both outdoor and indoor AR systems. The algorithm first calculates the correspondences among frames using fixed-region tracking, and then estimates the motion parameters of the projective transformation following the homography of the tracked region. To achieve illumination-insensitive tracking, the illumination parameters are solved jointly with the motion parameters at each step. Based on the perspective motion parameters of the tracked region, the 3D registration, i.e. the camera's pose and position, can be calculated with calibrated intrinsic parameters. A markerless AR system built on this algorithm is described, and its architecture and workflow are also presented. Experimental results with quantitative comparison demonstrate the correctness of the theoretical analysis and the robustness of the registration algorithm.
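The homography-based step maps points of the tracked planar region between frames. A minimal sketch of applying a 3x3 homography in homogeneous coordinates follows; the matrix below is an illustrative pure translation, whereas the paper estimates the homography jointly with illumination parameters from the tracked region.

```python
# Apply a 3x3 planar homography H to a 2D point using homogeneous
# coordinates: (x, y, 1) -> H * (x, y, 1)^T, then divide by w.

def apply_homography(H, x, y):
    xh = H[0][0] * x + H[0][1] * y + H[0][2]
    yh = H[1][0] * x + H[1][1] * y + H[1][2]
    w  = H[2][0] * x + H[2][1] * y + H[2][2]
    return xh / w, yh / w

# A pure translation by (5, -3) expressed as a homography.
H = [[1, 0, 5],
     [0, 1, -3],
     [0, 0, 1]]
```

In the general case the last row of H is not (0, 0, 1), and the division by w is what models the perspective distortion of the tracked planar region.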

Journal ArticleDOI
TL;DR: A mechanism, SMIL State, can be used to add user-defined state to declarative time-based languages such as SMIL or SVG animation, thereby enabling the author to create control flows that are difficult to realize within the temporal containment model of the host languages.
Abstract: In this paper we examine adaptive time-based web applications (or presentations). These are interactive presentations where time dictates which parts of the application are presented (providing the major structuring paradigm), and that require interactivity and other dynamic adaptation. We investigate the current technologies available to create such presentations and their shortcomings, and suggest a mechanism for addressing these shortcomings. This mechanism, SMIL State, can be used to add user-defined state to declarative time-based languages such as SMIL or SVG animation, thereby enabling the author to create control flows that are difficult to realize within the temporal containment model of the host languages. In addition, SMIL State can be used as a bridging mechanism between languages, enabling easy integration of external components into the web application. Finally, SMIL State enables richer expressions for content control. This paper defines SMIL State in terms of an introductory example, followed by a detailed specification of the State model. Next, the implementation of this model is discussed. We conclude with a set of potential use cases, including dynamic content adaptation and delayed insertion of custom content such as advertisements.

Journal ArticleDOI
TL;DR: This paper proposes an ambient media service provisioning framework that incorporates the above requirements while keeping the user at the center of the media selection loop, and shows experimental results for a real-life scenario in a smart home environment.
Abstract: The provisioning of ambient media in the user's environment requires a system that handles the different aspects related to the media selection process. For example, ambient media is delivered to users depending on their context; hence, the system needs to dynamically determine the context and provide media that are relevant therein. To set the premise for ambient media, a system may also need to customize the physical environment, for example by dimming the lighting level or by lowering the volume. Moreover, users' needs for media services change over time and space, which requires mechanisms that continually update their preferences based on their mobility in the environment. In this paper, we propose an ambient media service provisioning framework that incorporates the above requirements while keeping the user at the center of the media selection loop. To demonstrate the usefulness of this framework, we show experimental results for a real-life scenario in a smart home environment.

Journal ArticleDOI
TL;DR: This special issue collects the best papers from the 1st ACM International Workshop on Semantic Ambient Media Experience (NAMU Series), SAME'08; its first article contains a full analysis of the contributions to the workshop as well as its results.
Abstract: It is our great pleasure to welcome you to this special issue, which collects the best papers from the 1st ACM International Workshop on Semantic Ambient Media Experience (NAMU Series), SAME'08. The first article in the special issue contains a full analysis of the contributions to the workshop, as well as its results; the workshop itself was run as a creative think-tank rather than as a series of simple paper presentations.

Journal ArticleDOI
TL;DR: This work explores the use of latency estimation techniques to gather information about player latencies in massively multi-player online games, anticipating a decrease in the aggregate latency for the affected players.
Abstract: Massively multi-player online games (MMOGs) have stringent latency requirements and must support large numbers of concurrent players. To handle these conflicting requirements, it is common to divide the virtual environment into virtual regions. As MMOGs are world-spanning games, it is plausible to disperse these regions on geographically distributed servers. Core selection can then be applied to locate an optimal server for placing a region, based on player latencies. Functionality for migrating objects supports this objective, with a distributed name server ensuring that references to the moved objects are maintained. As a result, we anticipate a decrease in the aggregate latency for the affected players. Core selection relies on a set of servers and measurements of the interacting players' latencies. Measuring these latencies by actively probing the network does not scale to a large number of players. We therefore explore the use of latency estimation techniques to gather this information.
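With estimated latencies in hand, core selection reduces to choosing the server that minimizes an aggregate over the interacting players' latencies. A minimal sketch using the worst-case (maximum) latency as the aggregate follows; the latency values are hypothetical, and the paper does not commit to this particular aggregate.

```python
# Pick the server ("core") for a game region so that the maximum estimated
# latency among the region's interacting players is minimized.

def select_core(latency):
    """latency[p][s]: estimated latency from player p to server s (ms)."""
    n_servers = len(latency[0])
    return min(range(n_servers),
               key=lambda s: max(latency[p][s] for p in range(len(latency))))

# Hypothetical estimated latency matrix for three players and three servers.
latency = [
    [30, 80, 120],   # player 0
    [90, 40, 110],   # player 1
    [60, 70, 50],    # player 2
]
```

Server 1 wins here because its worst player sees 80 ms, versus 90 ms and 120 ms for the alternatives; replacing `max` with `sum` would instead minimize the aggregate latency mentioned in the abstract.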

Journal ArticleDOI
TL;DR: This article presents a hybrid MMOG architecture called MM-VISA (Massively Multiuser VIrtual Simulation Architecture), in which servers and peers are coupled together to take the inherent advantages of the centralized architecture and the scalability of distributed systems.
Abstract: Distributed Virtual Environments are becoming more popular in today's computing and communications. Perhaps the most widely used form of such environments is Massively Multiplayer Online Games (MMOGs), which typically use a client/server architecture that requires considerable server resources to manage a large number of distributed players. Peer-to-peer communication can achieve scalability at lower cost but may introduce other difficulties. Synchronous communication is a prime concern for multi-user collaborative applications like MMOGs, where players need frequent interaction with each other to share their game states. In this article, we present a hybrid MMOG architecture called MM-VISA (Massively Multiuser VIrtual Simulation Architecture). In this architecture, servers and peers are coupled together to take advantage of the inherent strengths of the centralized architecture and the scalability of distributed systems. As the virtual world is decomposed into smaller manageable zones, the players' random movement causes reorganization of the P2P overlay structure. The frequent nature of movements, along with the unintelligent zone-crossing approaches currently implemented in MMOGs, breaks synchronous communication. To limit this problem, we consider players' gaming characteristics to intelligently define routing paths. A graph-theoretic framework is incorporated for overlay-oriented real-time distributed virtual environments. We show that interest-driven zone crossing, dynamic shared regions between adjacent zones, and clustering of entities based on their attributes significantly decrease unstable overlay situations. The effectiveness of the presented system is demonstrated through simulation.

Journal ArticleDOI
TL;DR: In this paper, the authors present the information-theoretic based feature information interaction, a measure that can describe complex feature dependencies in multivariate settings, and compare empirical dependency estimates of correlation, mutual information and 3-way feature interaction.
Abstract: This article presents feature information interaction, an information-theoretic measure that can describe complex feature dependencies in multivariate settings. According to the theoretical development, feature interactions characterize dependencies more accurately than current bivariate dependence measures, owing to their stable and unambiguous definition. In experiments with artificial and real data, we first compare the empirical dependency estimates of correlation, mutual information, and 3-way feature interaction. We then present feature selection and classification experiments that show the superior performance of interactions over bivariate dependence measures on the artificial data; for the real-world data, this goal has not yet been achieved.
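As a concrete illustration of the measure, the 3-way interaction information I(X;Y;Z) = I(X;Y|Z) - I(X;Y) (McGill's sign convention) expands into joint entropies and can be estimated from co-occurrence counts. The sketch below is a minimal plug-in estimator for illustration, not the authors' implementation:

```python
import math
from collections import Counter

def entropy(samples):
    """Plug-in Shannon entropy (bits) from an empirical sample of values/tuples."""
    n = len(samples)
    return -sum((c / n) * math.log2(c / n) for c in Counter(samples).values())

def interaction_information(x, y, z):
    """3-way interaction I(X;Y;Z) = I(X;Y|Z) - I(X;Y), McGill's sign
    convention, expanded into joint entropies and estimated from counts."""
    hx, hy, hz = entropy(x), entropy(y), entropy(z)
    hxy = entropy(list(zip(x, y)))
    hxz = entropy(list(zip(x, z)))
    hyz = entropy(list(zip(y, z)))
    hxyz = entropy(list(zip(x, y, z)))
    return hxy + hxz + hyz - hx - hy - hz - hxyz

# XOR example: Z = X ^ Y. Every pair of variables is independent, so all
# bivariate measures (correlation, pairwise MI) are 0, yet the triple is
# fully dependent -- the interaction is +1 bit of synergy.
x, y = [0, 0, 1, 1], [0, 1, 0, 1]
z = [a ^ b for a, b in zip(x, y)]
print(interaction_information(x, y, z))  # → 1.0
```

The XOR case is exactly the kind of multivariate dependency that the abstract argues bivariate measures cannot see.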

Journal ArticleDOI
TL;DR: A unified method of automatically creating dynamic and static video abstracts that considers the semantic content, targeting especially broadcast sports videos, is proposed, along with an effective display style where a user can easily access target scenes from a list of keyframes by tracing the tree structures of sports games.
Abstract: Video abstraction is defined as creating a video abstract that includes only the important information in the original video streams. There are two general types of video abstracts, namely dynamic and static ones. A dynamic video abstract is a 3-dimensional representation created by temporally arranging important scenes, while a static video abstract is a 2-dimensional representation created by spatially arranging only the keyframes of important scenes. In this paper, we propose a unified method of automatically creating these two types of video abstracts considering the semantic content, targeting especially broadcast sports videos. For both types of video abstracts, the proposed method first determines the significance of scenes. A play scene, which corresponds to a play, is considered the scene unit of sports videos, and the significance of every play scene is determined based on the play ranks, the time the play occurred, and the number of replays. This information is extracted from the metadata, which describes the semantic content of videos and enables us to consider not only the types of plays but also their influence on the game. In addition, users' preferences are considered to personalize the video abstracts. For dynamic video abstracts, we propose three approaches for selecting the play scenes of the highest significance: the basic criterion, the greedy criterion, and the play-cut criterion. For static video abstracts, we also propose an effective display style where a user can easily access target scenes from a list of keyframes by tracing the tree structures of sports games. We experimentally verified the effectiveness of our method by comparing our results with man-made video abstracts as well as by conducting questionnaires.
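To make the scoring step concrete, the sketch below computes an illustrative play-scene significance from the three cues the abstract names (play rank, time of occurrence, replay count) and applies a greedy selection under a duration budget. The score formula, weights, and normalisation are assumptions for illustration only, not the paper's actual criteria:

```python
from dataclasses import dataclass

@dataclass
class PlayScene:
    rank: int      # play rank from the metadata (1 = most important play type)
    minute: float  # when in the game the play occurred
    replays: int   # number of replays broadcast for the play

def significance(s, game_length=90.0, w_rank=0.5, w_time=0.2, w_replay=0.3):
    """Illustrative significance in [0, 1]: higher-ranked plays, later
    plays, and heavily replayed plays score higher. The weights here are
    hypothetical, not the paper's values."""
    return (w_rank / s.rank
            + w_time * s.minute / game_length
            + w_replay * min(s.replays / 3.0, 1.0))

def greedy_abstract(scenes, durations, budget):
    """Greedy-style selection: repeatedly take the most significant play
    scene that still fits in the remaining duration budget."""
    order = sorted(range(len(scenes)), key=lambda i: significance(scenes[i]),
                   reverse=True)
    chosen, used = [], 0.0
    for i in order:
        if used + durations[i] <= budget:
            chosen.append(i)
            used += durations[i]
    return sorted(chosen)  # play the selected scenes in temporal order

scenes = [PlayScene(1, 88, 3), PlayScene(3, 10, 0), PlayScene(2, 45, 1)]
print(greedy_abstract(scenes, durations=[30, 30, 30], budget=60))  # → [0, 2]
```

A late, top-ranked, heavily replayed play dominates the selection, matching the intuition that rank, timing, and replays all signal influence on the game.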

Journal ArticleDOI
TL;DR: Evaluation results are presented to prove the feasibility of the downsized semantic reasoning process in the DTV receivers, supported by a pre-selection of material driven by audience stereotypes in the head-end, and to assess the quality it achieves in comparison with previous ones.
Abstract: Experience has proved that interactive applications delivered through Digital TV must provide personalized information to the viewers in order to be perceived as a valuable service. Due to the limited computational power of DTV receivers (either domestic set-top boxes or mobile devices), most of the existing systems have opted to place the personalization engines in dedicated servers, assuming that a return channel is always available for bidirectional communication. However, in a domain where most of the information is transmitted through broadcast, there are still many cases of intermittent, sporadic or null access to a return channel. In such situations, it is impossible for the servers to learn who is watching TV at the moment, and so the personalization features become unavailable. To solve this problem without sacrificing much personalization quality, this paper introduces solutions to run a downsized semantic reasoning process in the DTV receivers, supported by a pre-selection of material driven by audience stereotypes in the head-end. Evaluation results are presented to prove the feasibility of this approach, and also to assess the quality it achieves in comparison with previous ones.

Journal ArticleDOI
TL;DR: This work proposes a new triangulation algorithm that provides network connectivity to support P2P NVEs while dramatically decreasing maintenance overhead by reducing the number of connection changes due to users’ insertion and movement.
Abstract: Peer-to-peer (P2P) architectures have recently become a popular design choice for building scalable Networked Virtual Environments (NVEs). In P2P-based NVEs, system and data management is distributed among all participating users. Towards this end, a Delaunay Triangulation can be used to provide connectivity between the different NVE users depending on their positions in the virtual world. However, a Delaunay Triangulation clearly suffers from high maintenance cost, as it is subject to a high connection change rate due to continuous user movement. In this paper, we propose a new triangulation algorithm that provides network connectivity to support P2P NVEs while dramatically decreasing maintenance overhead by reducing the number of connection changes due to users' insertion and movement. Performance evaluations show that our solution drastically reduces overlay maintenance cost in highly dynamic NVEs. More importantly, and beyond its quantitative advantages, this work questions the well-accepted Delaunay Triangulation as the reference means for providing connectivity in NVEs, and paves the way for more research towards more practical alternatives for NVE applications.
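For readers unfamiliar with the structure under discussion: in a Delaunay-based overlay, each peer keeps direct links to its Delaunay neighbours in the virtual plane, which is precisely why continuous movement forces connection changes (each move can flip triangles and hence neighbour sets). The sketch below builds the triangulation with the brute-force empty-circumcircle test, an O(n^4) illustration rather than the paper's algorithm:

```python
from itertools import combinations

def orient(a, b, c):
    """Twice the signed area of triangle abc (> 0 if counterclockwise)."""
    return (b[0]-a[0]) * (c[1]-a[1]) - (b[1]-a[1]) * (c[0]-a[0])

def in_circumcircle(a, b, c, d):
    """True if d lies strictly inside the circumcircle of triangle abc."""
    if orient(a, b, c) < 0:
        b, c = c, b  # make abc counterclockwise so the sign test holds
    ax, ay = a[0]-d[0], a[1]-d[1]
    bx, by = b[0]-d[0], b[1]-d[1]
    cx, cy = c[0]-d[0], c[1]-d[1]
    det = ((ax*ax + ay*ay) * (bx*cy - cx*by)
         - (bx*bx + by*by) * (ax*cy - cx*ay)
         + (cx*cx + cy*cy) * (ax*by - bx*ay))
    return det > 0

def delaunay_neighbours(points):
    """Brute-force Delaunay triangulation: a triangle is kept iff no other
    point lies inside its circumcircle; returns each peer's overlay
    neighbours given avatar positions in the virtual plane."""
    n = len(points)
    nbrs = {i: set() for i in range(n)}
    for i, j, k in combinations(range(n), 3):
        if orient(points[i], points[j], points[k]) == 0:
            continue  # degenerate (collinear) triple, no triangle
        if any(in_circumcircle(points[i], points[j], points[k], points[m])
               for m in range(n) if m not in (i, j, k)):
            continue
        nbrs[i] |= {j, k}; nbrs[j] |= {i, k}; nbrs[k] |= {i, j}
    return nbrs

# Peer 0 and peer 2 are not Delaunay neighbours: peer 1 sits between them.
print(sorted(delaunay_neighbours([(0, 0), (1, 0), (2, 0), (1, 1)])[0]))  # → [1, 3]
```

Moving any avatar and rebuilding shows how neighbour sets churn, which is the maintenance cost the paper's algorithm aims to reduce.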

Journal ArticleDOI
TL;DR: It is demonstrated that direct marketing firms can exploit information on visual content to optimize the learning phase, and a two-phase learning strategy is proposed, based on a cascade of regression methods that takes advantage of visual and text features to improve and accelerate the learning process.
Abstract: Traditionally, direct marketing companies have relied on pre-testing to select the best offers to send to their audience. Companies systematically dispatch the offers under consideration to a limited sample of potential buyers, rank them with respect to their performance and, based on this ranking, decide which offers to send to the wider population. Though this pre-testing process is simple and widely used, the industry has recently come under increased pressure to further optimize learning, in particular when facing severe time and learning-space constraints. The main contribution of the present work is to demonstrate that direct marketing firms can exploit information on visual content to optimize the learning phase. This paper proposes a two-phase learning strategy based on a cascade of regression methods that takes advantage of visual and text features to improve and accelerate the learning process. Experiments in the domain of a commercial Multimedia Messaging Service (MMS) show the effectiveness of the proposed methods and a significant improvement over traditional learning techniques. The proposed approach can be used in any multimedia direct marketing domain in which offers comprise both a visual and a text component.
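The two-phase idea can be sketched with plain least-squares models: a cheap visual-only regressor fitted on the pre-test sample shortlists candidate offers, and a richer visual+text regressor then re-ranks the shortlist. The feature encoding and ordinary least-squares models below are illustrative assumptions, not the paper's actual cascade of regression methods:

```python
def fit_linear(X, y):
    """Ordinary least squares via the normal equations, solved with a tiny
    Gaussian elimination (fine for the handful of features used here)."""
    d = len(X[0])
    A = [[sum(r[i] * r[j] for r in X) for j in range(d)] for i in range(d)]
    b = [sum(r[i] * t for r, t in zip(X, y)) for i in range(d)]
    for c in range(d):  # forward elimination with partial pivoting
        p = max(range(c, d), key=lambda r: abs(A[r][c]))
        A[c], A[p], b[c], b[p] = A[p], A[c], b[p], b[c]
        for r in range(c + 1, d):
            f = A[r][c] / A[c][c]
            for k in range(c, d):
                A[r][k] -= f * A[c][k]
            b[r] -= f * b[c]
    w = [0.0] * d
    for c in reversed(range(d)):  # back substitution
        w[c] = (b[c] - sum(A[c][k] * w[k] for k in range(c + 1, d))) / A[c][c]
    return w

def predict(w, x):
    return sum(wi * xi for wi, xi in zip(w, x))

def two_phase_select(pretest, candidates, k, n):
    """pretest: (visual, text, response) triples already observed;
    candidates: (visual, text) offers to rank. Phase 1 shortlists k offers
    with a visual-only model; phase 2 re-ranks them with visual+text."""
    w1 = fit_linear([[1.0, v] for v, t, r in pretest], [r for v, t, r in pretest])
    shortlist = sorted(candidates, key=lambda o: predict(w1, [1.0, o[0]]),
                       reverse=True)[:k]
    w2 = fit_linear([[1.0, v, t] for v, t, r in pretest],
                    [r for v, t, r in pretest])
    return sorted(shortlist, key=lambda o: predict(w2, [1.0, o[0], o[1]]),
                  reverse=True)[:n]

# True response here is 2*visual + text: the visual-only phase favours
# (0.9, 0.1), but the richer second phase picks (0.8, 0.9).
pretest = [(0, 0, 0), (1, 0, 2), (0, 1, 1), (1, 1, 3)]
candidates = [(0.9, 0.1), (0.8, 0.9), (0.2, 0.95), (0.5, 0.5)]
print(two_phase_select(pretest, candidates, k=3, n=1))  # → [(0.8, 0.9)]
```

The cheap first phase narrows the pool so the costlier, more informative model only has to be applied to a few offers, which is the sense in which the cascade accelerates learning.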

Journal ArticleDOI
TL;DR: This paper presents an enhanced version of CCA, which reduces the broadcast latency by up to 50% compared to CCA, and provides an analytical evaluation to show its performance advantage over some existing techniques.
Abstract: Periodic broadcast is a cost-effective solution for large-scale distribution of popular videos. Regardless of the number of video requests, this strategy guarantees a constant worst-case service latency to all clients, making it possible to serve a large community with a minimal amount of broadcast bandwidth. Although many efficient periodic broadcast techniques have been proposed, most of them impose rigid requirements on client receiving bandwidth. They either demand that clients have the same bandwidth as the video server, or limit them to receiving no more than two video streams at any one time. In our previous work, we addressed this problem with a Client-Centric Approach (CCA). This scheme takes into consideration both server broadcast bandwidth and client receiving bandwidth, and allows clients to use all their receiving capability for prefetching broadcast data. As a result, given a fixed broadcast bandwidth, a shorter broadcast period can be achieved with an improved client communication capability. In this paper, we present an enhanced version of CCA to further leverage client bandwidth for more efficient video broadcast. The new scheme reduces the broadcast latency by up to 50% compared to CCA. We prove the correctness of this new technique and provide an analytical evaluation to show its performance advantage over some existing techniques.
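CCA's exact segmentation depends on the client's receiving bandwidth and is not detailed here. But the core property of periodic broadcast, a worst-case startup latency bounded by the first segment's length regardless of how many clients tune in, can be illustrated with the classic Fast Broadcasting segmentation (geometric segment sizes over equal-bandwidth channels), used below purely as a stand-in, not as CCA itself:

```python
def fast_broadcast_segments(video_len, channels):
    """Classic Fast Broadcasting segmentation: channel i repeatedly
    broadcasts a segment of length proportional to 2**i, so the whole
    video is covered by `channels` equal-bandwidth streams."""
    unit = video_len / (2 ** channels - 1)
    return [unit * 2 ** i for i in range(channels)]

def worst_case_latency(video_len, channels):
    """A client waits at most one full period of the first (shortest)
    segment before playback starts, no matter how many clients request
    the video -- this is the constant worst-case latency of the scheme."""
    return video_len / (2 ** channels - 1)

# A 120-minute video: each extra broadcast channel more than halves
# the worst-case startup latency (120, 40, ~17.1, 8 minutes).
for c in (1, 2, 3, 4):
    print(c, round(worst_case_latency(120, c), 2))
```

Schemes like CCA additionally tailor how many of these streams a client prefetches concurrently, which is how client receiving bandwidth shortens the broadcast period further.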