
Papers presented at the ACM SIGMM Conference on Multimedia Systems in 2011


Proceedings ArticleDOI
23 Feb 2011
TL;DR: This paper provides insight and background into the Dynamic Adaptive Streaming over HTTP (DASH) specifications available from 3GPP and, in draft form, from MPEG.
Abstract: In this paper, we provide some insight and background into the Dynamic Adaptive Streaming over HTTP (DASH) specifications as available from 3GPP and in draft version also from MPEG. Specifically, the 3GPP version provides a normative description of a Media Presentation, the formats of a Segment, and the delivery protocol. In addition, it adds an informative description on how a DASH Client may use the provided information to establish a streaming service for the user. The solution supports different service types (e.g., On-Demand, Live, Time-Shift Viewing), different features (e.g., adaptive bitrate switching, multiple language support, ad insertion, trick modes, DRM) and different deployment options. Design principles and examples are provided.

1,203 citations
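The Media Presentation described above advertises multiple representations of the same content so that a client can switch bitrates. The following minimal Python sketch illustrates that selection step under invented assumptions; the bitrate ladder and safety factor are hypothetical and not part of the 3GPP schema:

```python
# Minimal sketch of the adaptive-bitrate idea behind a DASH client.
# The representation list and safety margin are illustrative only.

representations = [250_000, 500_000, 1_000_000, 2_500_000]  # bitrates (bps)

def pick_representation(measured_bps, safety=0.8):
    """Choose the highest representation sustainable at the measured rate."""
    usable = measured_bps * safety
    candidates = [r for r in representations if r <= usable]
    return max(candidates) if candidates else min(representations)

print(pick_representation(1_800_000))  # -> 1000000
```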


Proceedings ArticleDOI
23 Feb 2011
TL;DR: This paper focuses on the rate-adaptation mechanisms of adaptive streaming and experimentally evaluates two major commercial players (Smooth Streaming, Netflix) and one open source player (OSMF).
Abstract: Adaptive (video) streaming over HTTP is gradually being adopted, as it offers significant advantages in terms of both user-perceived quality and resource utilization for content and network service providers. In this paper, we focus on the rate-adaptation mechanisms of adaptive streaming and experimentally evaluate two major commercial players (Smooth Streaming, Netflix) and one open source player (OSMF). Our experiments cover three important operating conditions. First, how does an adaptive video player react to either persistent or short-term changes in the underlying network's available bandwidth? Can the player quickly converge to the maximum sustainable bitrate? Second, what happens when two adaptive video players compete for available bandwidth in the bottleneck link? Can they share the resources in a stable and fair manner? And third, how does adaptive streaming perform with live content? Is the player able to sustain a short playback delay? We identify major differences between the three players, and significant inefficiencies in each of them.

729 citations


Proceedings ArticleDOI
23 Feb 2011
TL;DR: A receiver-driven rate adaptation method for HTTP/TCP streaming is presented that deploys a step-wise increase/aggressive decrease method to switch up/down between the different representations of the content, which are encoded at different bitrates.
Abstract: Recently, HTTP has been widely used for the delivery of real-time multimedia content over the Internet, such as in video streaming applications. To combat the varying network resources of the Internet, rate adaptation is used to adapt the transmission rate to the varying network capacity. A key research problem of rate adaptation is to identify network congestion early enough and to probe the spare network capacity. In adaptive HTTP streaming, this problem becomes challenging because of the difficulties in differentiating between the short-term throughput variations, incurred by the TCP congestion control, and the throughput changes due to more persistent bandwidth changes. In this paper, we propose a novel rate adaptation algorithm for adaptive HTTP streaming that detects bandwidth changes using a smoothed HTTP throughput measured based on the segment fetch time (SFT). The smoothed HTTP throughput, instead of the instantaneous TCP transmission rate, is used to determine whether the bitrate of the current media matches the end-to-end network bandwidth capacity. Based on the smoothed throughput measurement, this paper presents a receiver-driven rate adaptation method for HTTP/TCP streaming that deploys a step-wise increase/aggressive decrease method to switch up/down between the different representations of the content, which are encoded at different bitrates. Our rate adaptation method does not require any transport layer information, such as round trip time (RTT) and packet loss rates, which are available at the TCP layer. Simulation results show that the proposed rate adaptation algorithm quickly adapts to match the end-to-end network capacity and also effectively controls buffer underflow and overflow.

455 citations
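The abstract pins down the two mechanisms: a throughput estimate smoothed from segment fetch times, and an asymmetric switching rule. Here is a minimal Python sketch of that combination; the smoothing constant, bitrate ladder, and 1.2 switch-up margin are illustrative guesses, not the paper's values:

```python
# Sketch of step-wise increase / aggressive decrease driven by a
# throughput estimate smoothed from segment fetch times (SFT).

BITRATES = [300, 700, 1500, 3000]  # kbps representations (hypothetical)

class SftAdapter:
    def __init__(self, alpha=0.3):
        self.alpha = alpha
        self.smoothed_kbps = 0.0
        self.level = 0

    def on_segment(self, segment_kbits, fetch_time_s):
        inst = segment_kbits / fetch_time_s  # instantaneous HTTP throughput
        self.smoothed_kbps = (self.alpha * inst + (1 - self.alpha) * self.smoothed_kbps
                              if self.smoothed_kbps else inst)
        if self.smoothed_kbps < BITRATES[self.level]:
            # aggressive decrease: jump straight down to a sustainable level
            while self.level > 0 and self.smoothed_kbps < BITRATES[self.level]:
                self.level -= 1
        elif (self.level + 1 < len(BITRATES)
              and self.smoothed_kbps > 1.2 * BITRATES[self.level + 1]):
            self.level += 1  # step-wise increase: one level at a time
        return BITRATES[self.level]

adapter = SftAdapter()
print(adapter.on_segment(segment_kbits=4000, fetch_time_s=2.0))  # ~2000 kbps -> 700
```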


Proceedings ArticleDOI
23 Feb 2011
TL;DR: A Quality Adaptation Controller (QAC) for live adaptive video streaming is designed using feedback control theory; it throttles the video quality to match the available bandwidth, with a transient of less than 30 s, while ensuring continuous video reproduction.
Abstract: Multimedia content feeds an ever increasing fraction of the Internet traffic. Video streaming is one of the most important applications driving this trend. Adaptive video streaming is a relevant advancement with respect to classic progressive download streaming such as the one employed by YouTube. It consists in dynamically adapting the content bitrate in order to provide the maximum Quality of Experience, given the current available bandwidth, while ensuring a continuous reproduction. In this paper we propose a Quality Adaptation Controller (QAC) for live adaptive video streaming designed by employing feedback control theory. An experimental comparison with Akamai adaptive video streaming has been carried out. We have found the following main results: 1) QAC is able to throttle the video quality to match the available bandwidth with a transient of less than 30s while ensuring a continuous video reproduction; 2) QAC fairly shares the available bandwidth both in the cases of a concurrent TCP greedy connection or a concurrent video streaming flow; 3) Akamai underutilizes the available bandwidth due to the conservativeness of its heuristic algorithm; moreover, when abrupt available bandwidth reductions occur, the video reproduction is affected by interruptions.

236 citations
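The abstract does not disclose QAC's control law, but the feedback-control flavor can be conveyed with a toy proportional controller that steers the playout buffer toward a set-point; the gain and set-point below are invented for illustration:

```python
# Toy feedback-control flavor of live quality adaptation: a proportional
# controller scales the requested bitrate based on buffer error. This is
# NOT QAC's actual control law, which the abstract does not specify.

def control_bitrate(buffer_s, measured_bps, target_s=10.0, kp=0.1):
    error = buffer_s - target_s          # surplus buffer lets us request more
    return max(0.0, measured_bps * (1.0 + kp * error))

print(control_bitrate(12.0, 2_000_000))  # buffer above target -> 2400000.0
print(control_bitrate(6.0, 2_000_000))   # buffer below target -> 1200000.0
```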


Proceedings ArticleDOI
23 Feb 2011
TL;DR: The proposed Stanford Mobile Visual Search data set contains camera-phone images of products, CDs, books, outdoor landmarks, business cards, text documents, museum paintings and video clips, and query data collected from heterogeneous low and high-end camera phones.
Abstract: We survey popular data sets used in computer vision literature and point out their limitations for mobile visual search applications. To overcome many of the limitations, we propose the Stanford Mobile Visual Search data set. The data set contains camera-phone images of products, CDs, books, outdoor landmarks, business cards, text documents, museum paintings and video clips. The data set has several key characteristics lacking in existing data sets: rigid objects, widely varying lighting conditions, perspective distortion, foreground and background clutter, realistic ground-truth reference data, and query data collected from heterogeneous low and high-end camera phones. We hope that the data set will help push research forward in the field of mobile visual search.

141 citations


Proceedings ArticleDOI
23 Feb 2011
TL;DR: The benefits of using Scalable Video Coding (SVC) in a DASH environment are shown: SVC helps video clients dynamically adapt the requested video quality of ongoing video flows to match their current download rate as closely as possible.
Abstract: HTTP-based delivery for Video on Demand (VoD) has been gaining popularity in recent years. Progressive Download over HTTP, typically used in VoD, takes advantage of the widely deployed network caches to relieve video servers from sending the same content to a high number of users in the same access network. However, due to a sharp increase in requests at peak hours or due to cross-traffic within the network, congestion may arise in the cache feeder link or access link, respectively. Since the connection characteristics may vary over time, with Dynamic Adaptive Streaming over HTTP (DASH), a recently proposed technique, video clients may dynamically adapt the requested video quality for ongoing video flows to match their current download rate as closely as possible. In this work we show the benefits of using Scalable Video Coding (SVC) in such a DASH environment.

126 citations
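With SVC, adaptation becomes a question of how many enhancement layers to request on top of the base layer. A minimal sketch of that decision is shown below; the cumulative layer rates and the 0.9 margin are hypothetical:

```python
# Sketch of SVC layer selection for DASH: request the base layer plus as
# many enhancement layers as the current download rate sustains.

LAYER_KBPS = [400, 700, 1200, 2200]  # cumulative rate with 0..3 enhancement layers

def layers_to_request(download_kbps, margin=0.9):
    n = 0
    for i, rate in enumerate(LAYER_KBPS):
        if rate <= download_kbps * margin:
            n = i
    return n  # enhancement layers on top of the always-requested base layer

print(layers_to_request(1500))  # usable ~1350 kbps -> 2 enhancement layers
```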


Proceedings ArticleDOI
23 Feb 2011
TL;DR: A client-side request scheduler is presented that distributes requests for the video over multiple heterogeneous interfaces simultaneously; it reduces the number of playback interruptions and improves video quality significantly in all cases where the earlier approach struggled.
Abstract: Devices capable of connecting to multiple, overlapping networks simultaneously are becoming increasingly common. For example, most laptops are equipped with LAN- and WLAN-interfaces, and smart phones can typically connect to both WLANs and 3G mobile networks. At the same time, streaming high-quality video is becoming increasingly popular. However, due to bandwidth limitations or the unreliable and unpredictable nature of some types of networks, streaming video can be subject to frequent periods of rebuffering and characterised by a low picture quality. In this paper, we present a client-side request scheduler that distributes requests for the video over multiple heterogeneous interfaces simultaneously. Each video is divided into independent segments with constant duration, enabling segments to be requested over separate links, utilizing all the available bandwidth. To increase performance even further, the segments are divided into smaller subsegments, and the sizes are dynamically calculated on the fly, based on the throughput of the different links. This is an improvement over our earlier subsegment approach, which divided segments into fixed-size subsegments. Both subsegment approaches were evaluated with on-demand streaming and quasi-live streaming. The new subsegment approach reduces the number of playback interruptions and improves video quality significantly in all cases where the earlier approach struggled. Otherwise, they show similar performance.

79 citations
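The core of the dynamic subsegment idea is to size each link's share of a segment in proportion to its measured throughput, so that all parts finish at roughly the same time. A minimal sketch with invented numbers:

```python
# Sketch of dynamic subsegment sizing: split one video segment across
# heterogeneous links proportionally to their measured throughput.

def split_segment(segment_bytes, link_throughputs_bps):
    total = sum(link_throughputs_bps)
    sizes = [int(segment_bytes * t / total) for t in link_throughputs_bps]
    sizes[-1] += segment_bytes - sum(sizes)  # give rounding remainder to last link
    return sizes

# e.g. a 4 MB segment over an 8 Mbps WLAN link and a 2 Mbps 3G link
print(split_segment(4_000_000, [8_000_000, 2_000_000]))  # [3200000, 800000]
```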


Proceedings ArticleDOI
23 Feb 2011
TL;DR: This paper presents the World of Warcraft Avatar History (WoWAH) dataset, which comprises the records of 91,065 avatars, including their game play times and a number of attributes, over a 1,107-day period between Jan. 2006 and Jan. 2009.
Abstract: From the perspective of game system designers, players' behavior is one of the most important factors they must consider when designing game systems. To gain a fundamental understanding of the game play behavior of online gamers, exploring users' game play time provides a good starting point. This is because the concept of game play time is applicable to all genres of games and it enables us to model the system workload as well as the impact of system and network QoS on users' behavior. It can even help us predict players' loyalty to specific games. In this paper, we present the World of Warcraft Avatar History (WoWAH) dataset, which comprises the records of 91,065 avatars. The data includes the avatars' game play times and a number of attributes, such as their race, profession, current level, and in-game locations, during a 1,107-day period between Jan. 2006 and Jan. 2009. We believe the WoWAH dataset could be used for various creative purposes, now that it is a public asset of the research community. It is available for free download at http://mmnet.iis.sinica.edu.tw/dl/wowah/.

69 citations


Proceedings ArticleDOI
23 Feb 2011
TL;DR: A multimedia test-bed enabling session mobility in the context of the emerging ISO/IEC MPEG standard, Dynamic Adaptive Streaming over HTTP (DASH), is presented; it shows that interoperability is achieved by adopting existing standards, while the performance of the system does not depend on these standards.
Abstract: In this paper, we present a multimedia test-bed enabling session mobility in the context of the emerging ISO/IEC MPEG standard, Dynamic Adaptive Streaming over HTTP (DASH). In general, session mobility is defined as the transfer of a running streaming session from one device to another device, where it may need to be consumed in an adaptive way. The two main challenges are: (1) taking into account the new context of the device (e.g., capabilities) to which the session is transferred, and (2) performing the actual transfer in a seamless and interoperable way. Our system addresses both challenges, supported by a prototype implementation integrated into VLC. From the results we can conclude that interoperability is achieved by adopting existing standards, while the performance of the system does not depend on these standards. That is, the modules responsible for the performance are usually not defined within such standards and are left open to competition. However, our system is designed in an extensible way and is able to accommodate this fact.

64 citations


Proceedings ArticleDOI
23 Feb 2011
TL;DR: An in-depth experimental analysis of the use of HTTP-based request-response streams for video streaming, finding that request-response streams are able to scale with the available bandwidth by increasing the chunk size or the number of concurrent streams.
Abstract: Adaptive video streaming based on TCP/HTTP is becoming popular because of its ability to adapt to changing network conditions. We present an in-depth experimental analysis of the use of HTTP-based request-response streams for video streaming. In this scheme, video fragments are fetched by a client from the server, in smaller units called chunks, potentially via multiple parallel HTTP requests (TCP connections). A model for the achievable throughput is formulated. The model is validated by a broad range of streaming experiments, including an evaluation of TCP-friendliness. Our findings include that request-response streams are able to scale with the available bandwidth by increasing the chunk size or the number of concurrent streams. Several combinations of system parameters exhibiting TCP-friendliness are presented. We also evaluate the video streaming performance in terms of video quality in the presence of packet loss. Multiple request-response streams are able to maintain satisfactory performance, while a single TCP connection deteriorates rapidly with increasing packet loss. The results provide experimental evidence that HTTP-based request-response streams are a good alternative to classical TCP streaming.

63 citations
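In the request-response scheme described above, a client pulls fixed-size chunks, possibly over several TCP connections in parallel. Below is a minimal Python sketch of such a fetcher using HTTP Range requests; the URL and chunk size are placeholders, and the paper's throughput model itself is not reproduced here:

```python
# Sketch of chunked request-response streaming over parallel connections.
from concurrent.futures import ThreadPoolExecutor
import urllib.request

URL = "http://example.com/video.mp4"  # placeholder
CHUNK = 256 * 1024                    # bytes per request (a tunable parameter)

def fetch_chunk(offset):
    req = urllib.request.Request(URL)
    req.add_header("Range", f"bytes={offset}-{offset + CHUNK - 1}")
    with urllib.request.urlopen(req) as resp:
        return offset, resp.read()

def fetch_parallel(offsets, streams=4):
    # more concurrent streams or larger chunks -> higher achievable throughput
    with ThreadPoolExecutor(max_workers=streams) as pool:
        return dict(pool.map(fetch_chunk, offsets))

# chunks = fetch_parallel(range(0, 4 * CHUNK, CHUNK))
```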


Proceedings ArticleDOI
23 Feb 2011
TL;DR: It is found that the recommendation-aware prefetching approach can achieve an overall hit ratio of up to 81%, while the hit ratio achieved by the caching scheme can only reach 40%.
Abstract: Even though user-generated video sharing sites are tremendously popular, the experience of the user watching videos is often unsatisfactory. Delays due to buffering before and during a video playback at a client are quite common. In this paper, we present a prefetching approach for user-generated video sharing sites like YouTube. We motivate the need for prefetching by showing that playback of YouTube videos is often unsatisfactory, and introduce a series of prefetching schemes: the conventional caching scheme, the search result-based prefetching scheme, and the recommendation-aware prefetching scheme. We evaluate and compare the proposed schemes using user browsing pattern data collected from network measurement. We find that the recommendation-aware prefetching approach can achieve an overall hit ratio of up to 81%, while the hit ratio achieved by the caching scheme can only reach 40%. Thus, the recommendation-aware prefetching approach demonstrates a strong potential for improving the playback quality at the client. We also explore the trade-offs and feasibility of implementing recommendation-aware prefetching.
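The recommendation-aware scheme boils down to: when a video starts playing, speculatively fetch the prefixes of its recommended videos, since the next click most likely lands there. A minimal sketch, where get_recommendations() and fetch_prefix() are hypothetical helpers:

```python
# Sketch of recommendation-aware prefetching for a video sharing site.

cache = {}  # video_id -> prefetched prefix bytes

def on_playback_start(video_id, get_recommendations, fetch_prefix, top_n=5):
    # Prefetch the first seconds of the top recommended videos.
    for rec_id in get_recommendations(video_id)[:top_n]:
        if rec_id not in cache:
            cache[rec_id] = fetch_prefix(rec_id)

def on_request(video_id):
    # A hit avoids the startup buffering delay; None means a cache miss.
    return cache.pop(video_id, None)
```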

Proceedings ArticleDOI
23 Feb 2011
TL;DR: This paper explores the impact and trade-offs of SVC-based quality adaptation, focusing on the SVC layer selection algorithms performed at different streaming stages, and gives multimedia providers insights on how to design and fine-tune their VoD systems to achieve the best performance.
Abstract: P2P Video-on-Demand (VoD) based on Scalable Video Coding (SVC) (the scalable extension of the H.264/AVC standard) is gaining momentum in the research community, as it provides elegant adaptation to heterogeneous resources and network dynamics. The major question is, how do the adaptation algorithms and designs affect the overall perceived performance of the system? Better yet, how can the performance of an SVC-based VoD system be defined? This paper explores the impact and trade-offs of SVC-based quality adaptation with a focus on the SVC layer selection algorithms, which are performed at different streaming stages. We carry out extensive experiments to evaluate the performance in terms of session quality (start-up delay, video stalls) and delivered SVC video quality (layer switches, received layers), and find that these two metrics exhibit a trade-off. Our analysis and conclusions give multimedia providers insights on how to design and fine-tune their VoD systems in order to achieve the best performance.

Proceedings ArticleDOI
23 Feb 2011
TL;DR: This publication analyzes what is missing in the current DASH standards with regards to content protection, and proposes changes and extensions to DASH in order to enable the application of DRM.
Abstract: Dynamic adaptive HTTP streaming (DASH) is a new concept for video streaming using consecutive downloads of short video segments. 3GPP has developed the basic DASH standard, which is further extended by the Open IPTV Forum (OIPF) and MPEG. In all versions available to date, only very simple content protection use cases are enabled. Extensions are needed to enable important advanced use cases like pay-per-view and license change in an ongoing video channel. In this publication, we analyze what is missing in the current DASH standards with regards to content protection, and propose changes and extensions to DASH in order to enable the application of DRM. This includes changes to the Media Presentation Description (MPD) and the file format. With a suitable key and license structure used together with DASH, even complex use cases like pay-per-maximum-quality are possible. Besides the analysis of required changes to DASH for content protection, and the description of suitable key and license structures applied to DASH, we also present a proof-of-concept implementation of the proposed concepts.

Proceedings ArticleDOI
23 Feb 2011
TL;DR: This paper considers how the bandwidth needed to transmit the RoIs can be reduced by carefully encoding the source video, and shows that the encoding method can reduce the expected bandwidth by up to 27%.
Abstract: Zoomable video allows users to selectively zoom and pan into regions of interest (RoIs) within the video for viewing at higher resolutions. Such interaction requires dynamic cropping of RoIs on the source video. In this paper, we consider how the bandwidth needed to transmit the RoIs can be reduced by carefully encoding the source video. The key idea is to exploit user access patterns to the RoIs, and encode different regions of the video with different encoding parameters based on the popularity of the region. We show that our encoding method can reduce the expected bandwidth by up to 27%.
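One way to realize popularity-driven encoding is to map each region's access frequency to a quantization parameter, spending bits where users actually zoom. The sketch below is an illustrative mapping, not the paper's method; the QP range is a common H.264-style choice:

```python
# Sketch of popularity-weighted tile encoding: frequently accessed tiles
# get a lower QP (higher quality, more bits), rarely viewed ones a higher QP.

def qp_for_tile(access_count, max_count, qp_best=22, qp_worst=38):
    popularity = access_count / max_count if max_count else 0.0
    return round(qp_worst - popularity * (qp_worst - qp_best))

print(qp_for_tile(90, 100))  # popular tile   -> 24
print(qp_for_tile(5, 100))   # unpopular tile -> 37
```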

Proceedings ArticleDOI
23 Feb 2011
TL;DR: The proposed SyncCast is a multi-stream multicast dissemination scheme that takes into account the bandwidth constraint, as well as the inter-stream, inter-sender and inter-receiver synchronization, and is compared with the previous ViewCast algorithm.
Abstract: An ideal interactive 3D tele-immersion (3DTI) system is expected to disseminate and synchronize multi-streams with the shortest possible latency among participating sites, achieve inter-stream synchronization, and bound both inter-sender and inter-receiver skews. This is, however, a key challenge because of (1) the coexistence of multi-modal, correlated, bandwidth-savvy streams from multiple source media, (2) the bounded bandwidth resources at each site, (3) the heterogeneous transmission end-to-end delays (EED) between sites, and (4) the diversity of 3D views requested by multiple users. Our study of the existing content dissemination topologies reveals their inadequacy in handling the complications and dynamics present in 3DTI systems. In this paper, we propose SyncCast, a multi-stream multicast dissemination scheme that takes into account the bandwidth constraint, as well as inter-stream, inter-sender and inter-receiver synchronization. We classify the 3DTI media streams into different service classes based on the users' visual interests. SyncCast is designed to address the interactions among EED, synchronization and bandwidth in real-world Internet settings. We compare SyncCast and our previous ViewCast algorithm. The simulation results show the improvement in synchronization performance and the implementation feasibility of SyncCast in supporting multi-site interactive 3DTI systems.

Proceedings ArticleDOI
23 Feb 2011
TL;DR: A study on usages of DASH for Rich Media Services is presented, as a presentation layer for the audio-visual content first, but also as a dedicated media in the DASH content for real-time media-synchronized interactive services.
Abstract: In recent years, audio-visual distribution over the Internet has witnessed the growing usage of HTTP-based delivery systems. While these systems have their drawbacks for some use-cases, they also have many advantages, the most important one being the reuse of the existing delivery infrastructure, such as HTTP servers, proxies and caches. The MPEG group has started the standardization of Dynamic Adaptive Streaming over HTTP (DASH) for the major transport formats, MPEG-2 TS and ISO Base Media File, and mostly focuses on audio, video and subtitle formats. We believe Rich Media services have a role to play in this landscape, as a presentation layer for the audio-visual content first, but also as a dedicated media in the DASH content for real-time media-synchronized interactive services. In this paper, we present a study on usages of DASH for Rich Media Services.

Proceedings ArticleDOI
23 Feb 2011
TL;DR: A design framework is presented that brings together several key ideas to enable energy-efficient mobile video management applications; it substantially prolongs the battery life of mobile devices while only slightly increasing the search latency.
Abstract: Mobile devices are increasingly popular for the versatile capture and delivery of video content. However, the acquisition and transmission of large amounts of video data on mobile devices face fundamental challenges such as power and wireless bandwidth constraints. To support diverse mobile video applications, it is critical to overcome these challenges. We present a design framework that brings together several key ideas to enable energy-efficient mobile video management applications. First, we leverage off-the-shelf smartphones as mobile video sensors. Second, concurrently with video recordings we acquire geospatial sensor meta-data to describe the videos. Third, we immediately upload the meta-data to a server to enable low-latency video search. This last step allows for very energy-efficient transmissions, as the sensor data sets are small and the bulky video data can be uploaded on demand, if and when needed. We present the design, a simulation study, and a preliminary prototype of the proposed system. Experimental results show that our approach substantially prolongs the battery life of mobile devices while only slightly increasing the search latency.
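The energy win comes from the asymmetry the abstract describes: a geospatial meta-data record is a few hundred bytes, while the video it describes is megabytes. A minimal sketch of the meta-data-first upload, with hypothetical field names and a stubbed send function:

```python
# Sketch of the meta-data-first design: tiny geospatial records are
# uploaded immediately for indexing; the bulky video stays on the phone
# until requested. Field names are illustrative, not the paper's schema.
import json, time

def make_metadata(video_id, lat, lon, heading_deg, start_ts, duration_s):
    return {"video_id": video_id, "lat": lat, "lon": lon,
            "heading_deg": heading_deg, "start_ts": start_ts,
            "duration_s": duration_s}

def upload_metadata(record, send):
    # A few hundred bytes instead of megabytes of video: this asymmetry
    # is what makes immediate upload cheap for the battery budget.
    send(json.dumps(record).encode())

upload_metadata(make_metadata("clip42", 34.02, -118.28, 270.0, time.time(), 30),
                send=lambda b: print(len(b), "bytes"))
```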

Proceedings ArticleDOI
23 Feb 2011
TL;DR: This paper systematically study a large number of Internet paths between popular video destinations and clients to create an empirical understanding of location, persistence, and recurrence of failures, and shows that SIFR outperforms IP-path selection by providing higher on-screen perceptual quality.
Abstract: This paper presents large-scale Internet measurements to understand and improve the effects of Internet path selection on perceived video quality. We systematically study a large number of Internet paths between popular video destinations and clients to create an empirical understanding of the location, persistence and recurrence of failures. We map these failures to perceptual quality by reconstructing video clips obtained from the trace to quantify both the perceptual degradations from these failures as well as the fraction of such failures that can be recovered. We then investigate ways to recover from QoE degradation by choosing one-hop detour paths that preserve application-specific policies. We seek simple, scalable path selection strategies without the need for background path monitoring or a priori path knowledge of any kind. To do this, we deployed five measurement overlays: one each in the US, Europe, and Asia-Pacific, and two spread across the globe. We used these to stream IP-traces of a variety of clips between source-destination pairs while probing alternate paths for an entire week. Our results indicate that a source can recover from up to 90% of the degradations by attempting to restore QoE with any five randomly chosen nodes in an overlay. We argue that our results are robust across datasets. Finally, we design and implement a prototype packet forwarding module called source initiated frame restoration (SIFR). We deployed SIFR on PlanetLab nodes and compared the performance of SIFR with the default Internet routing. We show that SIFR outperforms IP-path selection by providing higher on-screen perceptual quality.
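The recovery strategy suggested by the 90% result is strikingly simple: probe a handful of random overlay nodes and relay through the best one. A sketch under that reading, where probe_via() is a hypothetical helper returning a quality score for a one-hop detour:

```python
# Sketch of random one-hop detour selection for QoE recovery: sample a
# few overlay nodes, probe each as a relay, and pick the best scorer.
import random

def pick_detour(overlay_nodes, probe_via, candidates=5):
    sample = random.sample(overlay_nodes, min(candidates, len(overlay_nodes)))
    return max(sample, key=probe_via)  # relay through the best-scoring detour
```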

Proceedings ArticleDOI
23 Feb 2011
TL;DR: Measurements of streaming real-time UDP traffic to a number of residential users are presented, and the basic characteristics of the data are discussed.
Abstract: Little performance data currently exists for streaming high-quality Internet video to residential users. Data on streaming performance will provide valuable input to the design of new protocols and applications, such as congestion control and error-correction schemes, and sizing playout buffers in video receivers. This paper presents measurements of streaming real-time UDP traffic to a number of residential users, and discusses the basic characteristics of the data.

Proceedings ArticleDOI
23 Feb 2011
TL;DR: This paper analyzes the tradeoffs and potential impact that flash memory SSD can have for a VoD server, claims that interval caching cannot be used with it, and proposes file-level Least Frequently Used (LFU) caching, due to the highly skewed video access pattern of the VoD workload.
Abstract: There is no doubt that video-on-demand (VoD) services are very popular these days. However, disk storage is a serious bottleneck limiting the scalability of a VoD server. Disk throughput degrades dramatically due to seek time overhead when the server is called upon to serve a large number of simultaneous video streams. To address the performance problem of disks, buffer cache algorithms that utilize RAM have been proposed. Interval caching is a state-of-the-art caching algorithm for a VoD server. Flash Memory Solid-State Drive (SSD) is a relatively new storage technology. Its excellent random read performance, low power consumption, and sharply dropping cost per gigabyte are opening new opportunities to efficiently use the device for enterprise systems. On the other hand, it has deficiencies such as poor small random write performance and a limited number of erase operations. In this paper, we analyze the tradeoffs and potential impact that flash memory SSD can have for a VoD server. The performance of various commercially available flash memory SSD models is studied. We find that a low-end flash memory SSD provides better performance than the high-end one, while costing less, when the I/O request size is large, which is typical for a VoD server. Because of the wear problem and the asymmetric read/write performance of flash memory SSD, we claim that interval caching cannot be used with it. Instead, we propose using file-level Least Frequently Used (LFU) caching, due to the highly skewed video access pattern of the VoD workload. We compare the performance of interval caching with RAM and file-level LFU with flash memory by simulation experiments. In addition, from a cost-effectiveness analysis of three different storage configurations, we find that flash memory with a hard disk drive is the most cost-effective solution, compared to DRAM with a hard disk drive or a hard disk drive only.
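The proposed file-level LFU policy caches whole video files and evicts the least popular one, which matches the skewed VoD access pattern and avoids the small random writes that hurt flash. A minimal sketch of such a cache:

```python
# Sketch of file-level LFU for a flash-based VoD cache. Whole files are
# cached; the least frequently accessed file is evicted. Sizes in bytes.

class FileLfuCache:
    def __init__(self, capacity):
        self.capacity = capacity
        self.used = 0
        self.files = {}  # name -> (size, access_count)

    def access(self, name, size):
        if name in self.files:
            sz, cnt = self.files[name]
            self.files[name] = (sz, cnt + 1)
            return True                      # cache hit
        while self.used + size > self.capacity and self.files:
            victim = min(self.files, key=lambda f: self.files[f][1])
            self.used -= self.files.pop(victim)[0]
        if self.used + size <= self.capacity:
            self.files[name] = (size, 1)
            self.used += size
        return False                         # cache miss

cache = FileLfuCache(capacity=10_000)
print(cache.access("a.mp4", 6000))  # False: first access is a miss
print(cache.access("a.mp4", 6000))  # True: now cached
```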

Proceedings ArticleDOI
23 Feb 2011
TL;DR: Two key issues that arise from the impact of TCP-based data transfers on real-time traffic (such as VoIP or online games) sharing a broadband link are illustrated.
Abstract: Consumer broadband services are increasingly a mix of TCP-based and UDP-based applications, often with quite distinct requirements for interactivity and network performance. Consumers can experience degraded service when application traffic collides at a congestion point between home LANs, service provider edge networks and fractional-Mbit/sec 'broadband' links. We illustrate two key issues that arise from the impact of TCP-based data transfers on real-time traffic (such as VoIP or online games) sharing a broadband link. First, well-intentioned modifications to traditional TCP congestion control can noticeably increase the latencies experienced by VoIP or online games. Second, superficially-similar packet dropping rules in broadband gateways can induce distinctly different packet loss rates in VoIP and online game traffic. Our observations provide cautionary guidance to researchers who model such traffic mixes, and to vendors implementing equipment at either end of consumer links.

Proceedings ArticleDOI
23 Feb 2011
TL;DR: This paper proposes a general, cheat-proof framework that enables researchers to systematically quantify the minimum QoS needs for real-time networked multimedia services and demonstrates the usefulness of the derived QoS thresholds.
Abstract: Despite all the efforts devoted to improving the QoS of networked multimedia services, the baseline for such improvements has yet to be defined. In other words, although it is well recognized that better network conditions generally yield better service quality, the exact minimum level of network QoS required to ensure a satisfactory user experience remains an open question. In this paper, we propose a general, cheat-proof framework that enables researchers to systematically quantify the minimum QoS needs for real-time networked multimedia services. Our framework has two major features: 1) it measures the quality of a service that users find intolerable by intuitive responses and therefore reduces the burden on experiment participants; and 2) it is cheat-proof because it supports systematic verification of the participants' inputs. Via a pilot study involving 38 participants, we verify the efficacy of our framework by proving that even inexperienced participants can easily produce consistent judgments. In addition, by cross-application and cross-service comparative analysis, we demonstrate the usefulness of the derived QoS thresholds. Such knowledge will serve as an important reference in the evaluation of competitive applications, application recommendation, network planning, and resource arbitration.

Proceedings ArticleDOI
23 Feb 2011
TL;DR: This work designs MultiSense to enable fine-grained multiplexing by exposing a virtual sensor to each application and optimizing the time to context-switch between virtual sensors and satisfy requests.
Abstract: Steerable sensors, such as pan-tilt-zoom video cameras, expose programmable actuators to applications, which steer them in different directions based on their goals. Despite being expensive to deploy and maintain, existing steerable sensor networks allow only a single application to control them due to the slow speed of their mechanical actuators. To address the problem, we design MultiSense to enable fine-grained multiplexing by (i) exposing a virtual sensor to each application and (ii) optimizing the time to context-switch between virtual sensors and satisfy requests. We implement MultiSense in Xen and explore how well proportional share scheduling, along with extensions for state restoration and request batching, satisfies the unique requirements of steerable sensors in the form of pan-tilt-zoom video cameras. We present experiments that show MultiSense efficiently isolates the performance of virtual cameras, allowing concurrent applications to satisfy conflicting goals. As one example, we enable a tracking application to photograph an object moving at nearly 3 mph every 23 ft along its trajectory at a distance of 300 ft, while supporting a security application that photographs a fixed point every 3 seconds.

Proceedings ArticleDOI
23 Feb 2011
TL;DR: A recognition-based user tracking and augmented reality system is presented that works in extremely large-scale areas, simulates a "soft handoff" feature, and avoids frequent memory swaps.
Abstract: We present a recognition-based user tracking and augmented reality system that works in extremely large-scale areas. The system provides a user who captures an image of a building facade with the precise location of the building and augmented information about it. While GPS cannot provide information about camera poses, it helps reduce the search range in the image database. A patch-retrieval method is used for efficient computation and real-time camera pose recovery. With the patch matching as prior information, the whole-image matching can be done efficiently through propagation, so that a more stable camera pose can be generated. Augmented information such as building names and locations is then delivered to the user. The proposed system contains two main parts: offline database building and online user tracking. The database is composed of images for different locations of interest. The locations are clustered into groups according to their UTM coordinates. An overlapped clustering method is used to cluster these locations in order to restrict the retrieval range and avoid ping-pong effects. For each cluster, a vocabulary tree is built for searching the most similar view. For tracking, the rough location of the user is obtained from the GPS, and the exact location and camera pose are calculated by querying patches of the captured image. The patch property makes the tracking robust to occlusions and dynamics in the scenes. Moreover, due to the overlapped clusters, the system simulates a "soft handoff" feature and avoids frequent memory swaps. Experiments show that the proposed tracking and augmented reality system is efficient and robust in many cases.
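The overlapped clustering step can be pictured as follows: a location (or a user's rough GPS fix) joins every cluster whose center lies within the cluster radius plus an overlap margin, so a user near a boundary stays in both clusters instead of ping-ponging between vocabulary trees. A sketch on UTM easting/northing coordinates, with invented distances:

```python
# Sketch of overlapped clustering on UTM coordinates: membership within
# radius + overlap keeps boundary users in both clusters ("soft handoff").
import math

def assign_clusters(location, centers, radius=500.0, overlap=100.0):
    e, n = location  # UTM easting/northing, meters
    return [i for i, (ce, cn) in enumerate(centers)
            if math.hypot(e - ce, n - cn) <= radius + overlap]

centers = [(0.0, 0.0), (900.0, 0.0)]
print(assign_clusters((450.0, 0.0), centers))  # near the boundary -> [0, 1]
```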

Proceedings ArticleDOI
23 Feb 2011
TL;DR: This corpus is an extension of the UIUC Affect corpus of children's stories and includes new automatic annotations using Natural Language Processing toolkits as well as new manual annotations for affect magnitude detection and anaphora resolution.
Abstract: Improvement in human computer interaction requires effective and rapid development of multimedia systems that can understand and interact with humans. These systems need resources to train and learn how to interpret human emotions. Currently, there is a relatively small number of existing resources, such as annotated corpora, that can be used for affect and multimodal content detection. In this paper, an extension of the UIUC Affect corpus of children's stories is presented. The extended corpus includes new automatic annotations produced with Natural Language Processing toolkits, as well as new manual annotations for affect magnitude detection and anaphora resolution. The format of the collected data is presented, along with the annotation methodology, basic statistics, suggestions for possible uses, and future work. Results of an inter-annotator agreement analysis on a subset of the corpus are also presented.

Proceedings ArticleDOI
23 Feb 2011
TL;DR: A complete testbed is implemented to efficiently collect and analyze network traces from a popular virtual world, Second Life; accurate traffic models can be derived from the trace files, which can guide developers toward better virtual world designs.
Abstract: Although network traces of virtual worlds are valuable to ISPs (Internet service providers), virtual world software developers, and research communities, they do not exist in the public domain. In this work, we implement a complete testbed to efficiently collect and analyze network traces from a popular virtual world: Second Life. We use the testbed to gather traces from 100 regions with diverse characteristics. The network traces represent more than 60 hours of virtual world traffic, and the trace files are created in a well-structured and concise format. Our preliminary analysis of the collected traces is consistent with previous work in the literature. It also reveals some new insights: for example, local avatar/object density has clear implications for traffic patterns. The developed testbed and released trace files can be leveraged by research communities for various studies on virtual worlds. For example, accurate traffic models can be derived from our trace files, which in turn can guide developers toward better virtual world designs.

Proceedings ArticleDOI
23 Feb 2011
TL;DR: This paper presents an aspect-oriented programming approach that allows developers to easily extend existing multimedia web services with the capability of efficient data transmission without modifying the implementations of the original services, while at the same time the advantages of SOAP web services are still maintained.
Abstract: The number of web services capable of processing multimedia data is growing. Typically, a multimedia web service realizes only a specific algorithmic processing step, such as video decoding. Thus, it is desirable to compose several web services hosted on different sites into a new value-added workflow. However, the transfer of large amounts of multimedia data within workflows based on SOAP as the prevalent communication paradigm between web services induces redundant data transfers. In previous work, we have presented a reference technique called Flex-SwA that solves this problem. However, its usage is accompanied by additional software development efforts that have to be repeated when a new service or client is implemented. In this paper, we present an aspect-oriented programming approach that significantly reduces these software development efforts. The solution allows developers to easily extend existing multimedia web services with the capability of efficient data transmission without modifying the implementations of the original services, while at the same time the advantages of SOAP web services are still maintained. Experimental results for a distributed video analysis workflow demonstrate the feasibility of the presented approach.

Proceedings ArticleDOI
23 Feb 2011
TL;DR: A method to produce low bit-rate visual quality feedback is introduced and the added ability to detect and correct large drift errors significantly reduces the resulting visual quality fluctuations.
Abstract: Effective adaptive streaming systems need informative feedback that supports selection of appropriate actions. Packet level timing and reception statistics are already widely reported in feedback. In this paper, we introduce a method to produce low bit-rate visual quality feedback and evaluate its effectiveness in controlling errors in live video multicast. The visual quality feedback is a digest of picture content, and allows localized comparison in time and space on a continuous basis. This conveniently allows detection and localization of significant errors that may have originated from earlier irrecoverable losses, a task that is typically challenging with packet level feedback only. Our visual quality feedback has low bit overhead, at about 1% for high-definition video encoded at typical rates. For live video multicast with 10 clients, our experimental results show that the added ability to detect and correct large drift errors significantly reduces the resulting visual quality fluctuations.
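A picture digest of the kind described, compact enough for roughly 1% overhead yet localizable in space, can be approximated by per-block luma means. The 8x8 block-mean digest below is an illustrative stand-in for the paper's actual digest, not its algorithm:

```python
# Sketch of a low bit-rate visual quality digest: per-block luma means
# let sender and receiver compare pictures and localize drift errors.

def digest(luma, width, height, grid=8):
    """luma: flat row-major list of 0-255 values; returns grid*grid block means."""
    bw, bh = width // grid, height // grid
    out = []
    for gy in range(grid):
        for gx in range(grid):
            acc = sum(luma[(gy * bh + y) * width + gx * bw + x]
                      for y in range(bh) for x in range(bw))
            out.append(acc // (bw * bh))
    return out

def drift(d_sender, d_receiver, threshold=16):
    """Indices of blocks whose means differ enough to flag a visible error."""
    return [i for i, (a, b) in enumerate(zip(d_sender, d_receiver))
            if abs(a - b) > threshold]
```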

Proceedings ArticleDOI
23 Feb 2011
TL;DR: A new scheme based on the IEEE 802.11e quality of service (QoS) mechanism where a tradeoff between codec quality and priority is exploited to improve the number of calls that can be supported is proposed.
Abstract: Research on Voice over IP (VoIP) in infrastructure wireless local area networks (WLANs) has shown that, because of the access mechanism, the access point severely limits the number of voice calls a WLAN can support. To address this problem we propose a new scheme based on the IEEE 802.11e quality of service (QoS) mechanism where a tradeoff between codec quality and priority is exploited to improve the number of calls that can be supported. In particular, we propose certain priority settings at the access point (AP) to encourage users to switch to a lower quality codec during periods of high contention and thus enable them to maintain the call. We develop a detailed analytical model to show the benefit of this scheme. Our analytical results are validated by extensive simulation and show that the voice capacity in the network can be significantly improved. Furthermore, by using the ITU-T E-model to assess the voice quality, we show that users with a lower quality codec can still maintain an acceptable level of quality using the proposed scheme.
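The incentive scheme can be caricatured as a contention-triggered codec downgrade: under high contention, switching from a high-rate codec to a lower-rate one keeps the call alive at an acceptable quality. The codec table and threshold below are illustrative, not the paper's settings:

```python
# Sketch of codec downgrade under contention: more active calls at the AP
# push a client toward a lower-rate codec so the call can be maintained.

CODECS = [("G.711", 64), ("G.726", 32), ("G.729", 8)]  # (name, kbps)

def choose_codec(active_calls, high_contention_at=10):
    idx = min(active_calls // high_contention_at, len(CODECS) - 1)
    return CODECS[idx]

print(choose_codec(4))   # low contention  -> ('G.711', 64)
print(choose_codec(25))  # high contention -> ('G.729', 8)
```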