Proceedings ArticleDOI

Viewport-adaptive navigable 360-degree video delivery

TL;DR: The impact of various spherical-to-plane projections and quality arrangements on the video quality displayed to the user is investigated, showing that the cube map layout offers the best quality for the given bit-rate budget.
Abstract: The delivery and display of 360-degree videos on Head-Mounted Displays (HMDs) presents many technical challenges. 360-degree videos are ultra-high-resolution spherical videos that contain an omnidirectional view of the scene; however, only a portion of this scene is displayed on the HMD. Moreover, HMDs need to respond to head movements within 10 ms, which prevents the server from sending only the currently displayed video portion based on client feedback. To reduce this bandwidth waste while still providing an immersive experience, a viewport-adaptive 360-degree video streaming system is proposed. The server prepares multiple video representations, which differ not only in their bit-rate but also in the qualities of different scene regions. The client chooses a representation for the next segment such that its bit-rate fits the available throughput and its full-quality region matches the viewing direction. We investigate the impact of various spherical-to-plane projections and quality arrangements on the video quality displayed to the user, showing that the cube map layout offers the best quality for the given bit-rate budget. An evaluation with a dataset of users navigating 360-degree videos demonstrates that segments need to be short enough to enable frequent view switches.
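The client-side selection rule described in the abstract can be sketched as follows. The representation catalogue, bit-rates, and quality-emphasis centers below are hypothetical and used only for illustration; the real system also accounts for segment timing and projection layout.

```python
def angular_distance(a, b):
    """Smallest absolute difference between two yaw angles, in degrees."""
    d = abs(a - b) % 360
    return min(d, 360 - d)

def choose_representation(representations, throughput_bps, viewing_yaw_deg):
    """Pick a representation whose bit-rate fits the available throughput and
    whose full-quality region is closest to the current viewing direction."""
    feasible = [r for r in representations if r["bitrate"] <= throughput_bps]
    if not feasible:  # fall back to the cheapest representation
        return min(representations, key=lambda r: r["bitrate"])
    # Prefer a close quality-emphasis center; break ties with a higher bit-rate.
    return min(feasible,
               key=lambda r: (angular_distance(r["center_yaw"], viewing_yaw_deg),
                              -r["bitrate"]))

# Hypothetical catalogue: four quality-emphasis centers at two bit-rates each.
reps = [{"bitrate": b, "center_yaw": c}
        for b in (2_000_000, 6_000_000) for c in (0, 90, 180, 270)]
best = choose_representation(reps, throughput_bps=5_000_000, viewing_yaw_deg=100)
print(best)  # {'bitrate': 2000000, 'center_yaw': 90}
```

With a 5 Mb/s budget, the 6 Mb/s variants are infeasible, so the client takes the 2 Mb/s representation whose full-quality region (centered at 90°) best covers the 100° viewing direction.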
Citations
Proceedings ArticleDOI
20 Jun 2017
TL;DR: A dataset of head movements of users watching 360-degree videos on a Head-Mounted Display (HMD) is introduced and some examples of statistics that can be extracted from the collected data, for a content-dependent analysis of users' navigation patterns.
Abstract: While Virtual Reality applications are increasingly attracting the attention of developers and business analysts, the behaviour of users watching 360-degree (i.e., omnidirectional) videos has not been thoroughly studied yet. This paper introduces a dataset of head movements of users watching 360-degree videos on a Head-Mounted Display (HMD). The dataset includes data collected from 59 users watching five 70 s-long 360-degree videos on the Razer OSVR HDK2 HMD. The selected videos span a wide range of 360-degree content for which different levels of viewer involvement, and thus different navigation patterns, could be expected. We describe the open-source software developed to produce the dataset and present the test material and viewing conditions considered during the data acquisition. Finally, we show some examples of statistics that can be extracted from the collected data, for a content-dependent analysis of users' navigation patterns. The source code of the software used to collect the data has been made publicly available, together with the entire dataset, to enable the community to extend the dataset.

238 citations


Cites background from "Viewport-adaptive navigable 360-deg..."

  • ...Instead of streaming the entire spherical content at each instant in time, viewport adaptive streaming solutions [6, 8, 13, 20] have been proposed, where the streamed content depends not only on the available bandwidth between the client and the server but also on the user’s instantaneous viewing direction....


Proceedings ArticleDOI
15 Oct 2018
TL;DR: This work conducts an IRB-approved user study and develops novel online algorithms that determine which spatial portions to fetch and their corresponding qualities for Flare, a practical system for streaming 360-degree videos on commodity mobile devices.
Abstract: Flare is a practical system for streaming 360-degree videos on commodity mobile devices. It takes a viewport-adaptive approach, which fetches only portions of a panoramic scene that cover what a viewer is about to perceive. We conduct an IRB-approved user study where we collect head movement traces from 130 diverse users to gain insights on how to design the viewport prediction mechanism for Flare. We then develop novel online algorithms that determine which spatial portions to fetch and their corresponding qualities. We also innovate other components in the streaming pipeline such as decoding and server-side transmission. Through extensive evaluations (~400 hours' playback on WiFi and ~100 hours over LTE), we show that Flare significantly improves the QoE in real-world settings. Compared to non-viewport-adaptive approaches, Flare yields up to 18x quality level improvement on WiFi, and achieves high bandwidth reduction (up to 35%) and video quality enhancement (up to 4.9x) on LTE.

201 citations


Cites background or methods from "Viewport-adaptive navigable 360-deg..."

  • ...Within different viewport-adaptive approaches [23, 26, 42, 47, 48, 52, 62], we adopt the tile-based solution [42, 48] due to its conceptual simplicity and potential benefits....


  • ...Other studies along this direction include [26, 54, 60], to name a few....


Proceedings ArticleDOI
20 Jun 2017
TL;DR: This paper develops fixation prediction networks that concurrently leverage sensor- and content-related features to predict the viewer fixation in the future, which is quite different from the solutions in the literature.
Abstract: We study the problem of predicting the Field-of-Views (FoVs) of viewers watching 360° videos using commodity Head-Mounted Displays (HMDs). Existing solutions either use the viewer's current orientation to approximate the FoVs in the future, or extrapolate future FoVs using the historical orientations and dead-reckoning algorithms. In this paper, we develop fixation prediction networks that concurrently leverage sensor- and content-related features to predict the viewer fixation in the future, which is quite different from the solutions in the literature. The sensor-related features include HMD orientations, while the content-related features include image saliency maps and motion maps. We build a 360° video streaming testbed to HMDs, and recruit twenty-five viewers to watch ten 360° videos. We then train and validate two design alternatives of our proposed networks, which allows us to identify the better-performing design with the optimal parameter settings. Trace-driven simulation results show the merits of our proposed fixation prediction networks compared to the existing solutions, including: (i) lower consumed bandwidth, (ii) shorter initial buffering time, and (iii) short running time.
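The paper's fixation predictors are learned networks and are not reproduced here. As a much-simplified illustration of fusing sensor-related and content-related features, the sketch below blends a dead-reckoned yaw estimate with a per-tile saliency score; the tile layout, blend weight, and saliency values are hypothetical.

```python
import numpy as np

def predict_fixation(yaw_history, tile_saliency, alpha=0.7, horizon=1.0):
    """Score candidate viewing directions (one per tile column) by blending a
    dead-reckoned yaw estimate (sensor feature) with saliency (content feature)."""
    n_tiles = tile_saliency.shape[0]
    velocity = yaw_history[-1] - yaw_history[-2]           # degrees per sample
    predicted_yaw = (yaw_history[-1] + velocity * horizon) % 360
    centers = np.arange(n_tiles) * (360.0 / n_tiles)
    dist = np.abs((centers - predicted_yaw + 180) % 360 - 180)
    sensor_score = 1 - dist / 180                          # 1 at the predicted yaw
    content_score = tile_saliency / tile_saliency.max()    # normalized saliency
    return alpha * sensor_score + (1 - alpha) * content_score

# Four tile columns centered at 0°, 90°, 180°, 270°; head moving from 80° to 90°.
scores = predict_fixation([80.0, 90.0], np.array([0.1, 0.9, 0.3, 0.2]))
print(int(scores.argmax()))  # 1
```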

200 citations


Cites background or methods from "Viewport-adaptive navigable 360-deg..."

  • ...Another open issue is to understand how 360° video mapping models, including cube and rhombic dodecahedron [9, 27], affect the coding efficiency and saliency map quality....


  • ...At the time of writing, there exist no public HMD sensor datasets; the ones used in the literature [19, 27] are proprietary....


  • ...There are several mapping models for 360° videos [9, 27], including: (i) equirectangular, (ii) cube, and (iii) rhombic dodecahedron mapping....


Proceedings ArticleDOI
20 Jun 2017
TL;DR: This paper presents datasets of both content data (such as image saliency maps and motion maps derived from 360° videos) and sensor data ( such as viewer head positions and orientations derived from HMD sensors) that can be used to optimize existing 360° video streaming applications and novel applications (like crowd-driven camera movements).
Abstract: 360° videos and Head-Mounted Displays (HMDs) are getting increasingly popular. However, streaming 360° videos to HMDs is challenging. This is because only video content in viewers' Field-of-Views (FoVs) is rendered, and thus sending complete 360° videos wastes resources, including network bandwidth, storage space, and processing power. Optimizing 360° video streaming to HMDs is, however, highly data and viewer dependent, and thus demands real datasets. However, to the best of our knowledge, such datasets are not available in the literature. In this paper, we present our datasets of both content data (such as image saliency maps and motion maps derived from 360° videos) and sensor data (such as viewer head positions and orientations derived from HMD sensors). We put extra effort into aligning the content and sensor data using the timestamps in the raw log files. The resulting datasets can be used by researchers, engineers, and hobbyists either to optimize existing 360° video streaming applications (like rate-distortion optimization) or to build novel applications (like crowd-driven camera movements). We believe that our dataset will stimulate more research activities along this exciting new research direction.

193 citations


Cites background from "Viewport-adaptive navigable 360-deg..."

  • ..., datasets used in [11, 24] are from the industry and proprietary....


Proceedings ArticleDOI
20 Jun 2017
TL;DR: This paper presents a head tracking dataset composed of 48 users watching 18 sphere videos from 5 categories, and shows that people share certain common patterns in VR spherical video streaming, which are different from conventional video streaming.
Abstract: With Virtual Reality (VR) devices and content getting increasingly popular, understanding user behaviors in virtual environments is important not only for VR product design but also for user experience improvement. In VR applications, head movement is one of the most important user behaviors, as it can reflect a user's visual attention, preference, and even unique motion pattern. However, to the best of our knowledge, no dataset containing this information is publicly available. In this paper, we present a head tracking dataset composed of 48 users (24 males and 24 females) watching 18 sphere videos from 5 categories. We carefully record how users watch the videos, how their heads move in each session, what directions they focus on, and what content they can remember after each session. Based on this dataset, we show that people share certain common patterns in VR spherical video streaming, which are different from conventional video streaming. We believe the dataset can serve as a good resource for exploring user behavior patterns in VR applications.

191 citations


Cites background from "Viewport-adaptive navigable 360-deg..."

  • ...Yu et al. [10] and Corbillon et al. [2] used a small dataset of user behavior during spherical video viewing provided by Jaunt Inc....


References
Proceedings ArticleDOI
09 Nov 2003
TL;DR: This paper proposes a multiscale structural similarity method, which supplies more flexibility than previous single-scale methods in incorporating the variations of viewing conditions, and develops an image synthesis method to calibrate the parameters that define the relative importance of different scales.
Abstract: The structural similarity image quality paradigm is based on the assumption that the human visual system is highly adapted for extracting structural information from the scene, and therefore a measure of structural similarity can provide a good approximation to perceived image quality. This paper proposes a multiscale structural similarity method, which supplies more flexibility than previous single-scale methods in incorporating the variations of viewing conditions. We develop an image synthesis method to calibrate the parameters that define the relative importance of different scales. Experimental comparisons demonstrate the effectiveness of the proposed method.
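The multiscale computation can be illustrated as follows. This sketch uses global image statistics at each scale rather than the paper's 11x11 sliding Gaussian window and per-scale exponent structure, so it is only an approximation of the method; the scale weights are the commonly cited calibrated values.

```python
import numpy as np

def ssim_global(x, y, L=255.0):
    """Simplified single-scale SSIM from global image statistics
    (the original uses a sliding Gaussian window instead)."""
    C1, C2 = (0.01 * L) ** 2, (0.03 * L) ** 2
    mx, my = x.mean(), y.mean()
    vx, vy = x.var(), y.var()
    cov = ((x - mx) * (y - my)).mean()
    return ((2 * mx * my + C1) * (2 * cov + C2)) / \
           ((mx ** 2 + my ** 2 + C1) * (vx + vy + C2))

def ms_ssim(x, y, weights=(0.0448, 0.2856, 0.3001, 0.2363, 0.1333)):
    """Multiscale SSIM: evaluate at successive 2x downsamplings and
    combine the per-scale scores with calibrated exponents."""
    score = 1.0
    for w in weights:
        score *= ssim_global(x, y) ** w
        # 2x2 average pooling produces the next, coarser scale
        x = (x[0::2, 0::2] + x[1::2, 0::2] + x[0::2, 1::2] + x[1::2, 1::2]) / 4
        y = (y[0::2, 0::2] + y[1::2, 0::2] + y[0::2, 1::2] + y[1::2, 1::2]) / 4
    return score

img = np.tile(np.arange(256, dtype=float), (256, 1))
print(round(ms_ssim(img, img), 3))  # 1.0
```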

4,333 citations


"Viewport-adaptive navigable 360-deg..." refers methods in this paper

  • ...We used two objective video quality metrics to measure the quality of the extracted viewport compared to the original full-quality viewport: Multiscale Structural Similarity (MS-SSIM) [24] and Peak Signal-to-Noise Ratio (PSNR)....


Proceedings ArticleDOI
23 Feb 2011
TL;DR: A receiver-driven rate adaptation method for HTTP/TCP streaming is presented that deploys a step-wise increase/aggressive decrease method to switch up/down between the different representations of the content that are encoded at different bitrates.
Abstract: Recently, HTTP has been widely used for the delivery of real-time multimedia content over the Internet, such as in video streaming applications. To combat the varying network resources of the Internet, rate adaptation is used to adapt the transmission rate to the varying network capacity. A key research problem of rate adaptation is to identify network congestion early enough and to probe the spare network capacity. In adaptive HTTP streaming, this problem becomes challenging because of the difficulties in differentiating between the short-term throughput variations, incurred by the TCP congestion control, and the throughput changes due to more persistent bandwidth changes. In this paper, we propose a novel rate adaptation algorithm for adaptive HTTP streaming that detects bandwidth changes using a smoothed HTTP throughput measured based on the segment fetch time (SFT). The smoothed HTTP throughput, instead of the instantaneous TCP transmission rate, is used to determine whether the bitrate of the current media matches the end-to-end network bandwidth capacity. Based on the smoothed throughput measurement, this paper presents a receiver-driven rate adaptation method for HTTP/TCP streaming that deploys a step-wise increase/aggressive decrease method to switch up/down between the different representations of the content that are encoded at different bitrates. Our rate adaptation method does not require any transport-layer information such as round-trip time (RTT) and packet loss rates, which are available at the TCP layer. Simulation results show that the proposed rate adaptation algorithm quickly adapts to match the end-to-end network capacity and also effectively controls buffer underflow and overflow.
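The smoothed-throughput update and the step-wise increase/aggressive decrease switching logic can be sketched as below; the EWMA factor, thresholds, and bitrate ladder are illustrative values, not those from the paper.

```python
def update_throughput(smoothed, segment_bits, fetch_time_s, delta=0.3):
    """EWMA of per-segment throughput measured from the segment fetch time (SFT)."""
    return (1 - delta) * smoothed + delta * (segment_bits / fetch_time_s)

def adapt(current, bitrates, smoothed_tput, eps_up=1.2, eps_down=1.0):
    """Step-wise increase / aggressive decrease between representations."""
    if smoothed_tput > eps_up * bitrates[current] and current + 1 < len(bitrates):
        return current + 1                       # step up one level at a time
    if smoothed_tput < eps_down * bitrates[current]:
        # drop straight to the highest level the smoothed throughput sustains
        return max(i for i, b in enumerate(bitrates) if b <= smoothed_tput or i == 0)
    return current

ladder = [500_000, 1_000_000, 2_000_000, 4_000_000]   # bits per second
print(adapt(3, ladder, smoothed_tput=900_000))   # 0 (aggressive decrease)
print(adapt(0, ladder, smoothed_tput=1_300_000)) # 1 (step-wise increase)
```

The asymmetry is the point: cautious single-step upgrades avoid oscillation from short-term TCP throughput spikes, while an immediate multi-level drop protects the playback buffer when a persistent bandwidth decrease is detected.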

455 citations


"Viewport-adaptive navigable 360-deg..." refers background in this paper

  • ...Rate adaptation algorithms are developed [11, 20] to reduce the mismatch between the requested bit-rate and the throughput....


Proceedings ArticleDOI
03 Oct 2016
TL;DR: This paper proposes a cellular-friendly streaming scheme that delivers only 360 videos' visible portion based on head movement prediction, which can reduce bandwidth consumption by up to 80% based on a trace-driven simulation.
Abstract: As an important component of the virtual reality (VR) technology, 360-degree videos provide users with panoramic view and allow them to freely control their viewing direction during video playback. Usually, a player displays only the visible portion of a 360 video. Thus, fetching the entire raw video frame wastes bandwidth. In this paper, we consider the problem of optimizing 360 video delivery over cellular networks. We first conduct a measurement study on commercial 360 video platforms. We then propose a cellular-friendly streaming scheme that delivers only 360 videos' visible portion based on head movement prediction. Using viewing data collected from real users, we demonstrate the feasibility of our approach, which can reduce bandwidth consumption by up to 80% based on a trace-driven simulation.
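The "fetch only the visible portion" idea can be sketched with a simple dead-reckoning head-movement predictor followed by a tile-selection step. The tile grid, field of view, and safety margin below are illustrative assumptions, not parameters from the paper.

```python
def predict_yaw(samples, horizon):
    """Dead-reckoning: linearly extrapolate yaw (degrees) from the last
    two (timestamp, yaw) samples, `horizon` seconds ahead."""
    (t0, y0), (t1, y1) = samples[-2], samples[-1]
    return (y1 + (y1 - y0) / (t1 - t0) * horizon) % 360

def tiles_to_fetch(pred_yaw, n_tiles=12, fov=100.0, margin=20.0):
    """Indices of tile columns overlapping the predicted FoV plus a margin."""
    width = 360.0 / n_tiles
    keep = []
    for i in range(n_tiles):
        center = i * width + width / 2
        dist = abs((center - pred_yaw + 180) % 360 - 180)
        if dist <= (fov + margin) / 2 + width / 2:
            keep.append(i)
    return keep

# Head turned from 30° to 45° in 0.5 s; predict 1 s ahead and select tiles.
yaw = predict_yaw([(0.0, 30.0), (0.5, 45.0)], horizon=1.0)
sel = tiles_to_fetch(yaw)
print(yaw, sel)  # 75.0 [0, 1, 2, 3, 4]
```

Fetching 5 of 12 tile columns instead of the whole frame is where the bandwidth saving comes from; the achievable reduction depends on prediction accuracy, which is why the margin exists.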

391 citations


"Viewport-adaptive navigable 360-deg..." refers background in this paper

  • ...[17], who showed that prediction accuracy drops for time periods greater than 2 s....


  • ...[17] also propose the delivery of tiles based on a prediction of the head movements....


  • ...[17] have recently made a first attempt in this direction....


Proceedings ArticleDOI
29 Sep 2015
TL;DR: This paper extract viewport based head motion trajectories, and compares the original and coded videos on the viewport, and shows that the average viewport quality can be approximated by a weighted spherical PSNR.
Abstract: Omnidirectional videos of real world environments viewed on head-mounted displays with real-time head motion tracking can offer immersive visual experiences. For live streaming applications, compression is critical to reduce the bitrate. Omnidirectional videos, which are spherical in nature, are mapped onto one or more planes before encoding to interface with modern video coding standards. In this paper, we consider the problem of evaluating the coding efficiency in the context of viewing with a head-mounted display. We extract viewport based head motion trajectories, and compare the original and coded videos on the viewport. With this approach, we compare different sphere-to-plane mappings. We show that the average viewport quality can be approximated by a weighted spherical PSNR.
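A weighted spherical PSNR over an equirectangular frame can be sketched as below, weighting each pixel row by the cosine of its latitude to discount the over-sampled polar regions. This is one common weighting scheme, not necessarily the exact viewport-derived weights of the paper; the frame sizes and noise levels are illustrative.

```python
import numpy as np

def weighted_spherical_psnr(ref, dist, peak=255.0):
    """PSNR over an equirectangular frame, weighting each pixel row by the
    cosine of its latitude to compensate for over-sampling near the poles."""
    h, w = ref.shape
    lat = (np.arange(h) + 0.5) / h * np.pi - np.pi / 2   # row-center latitudes
    weights = np.repeat(np.cos(lat)[:, None], w, axis=1)
    wmse = (weights * (ref - dist) ** 2).sum() / weights.sum()
    return 10 * np.log10(peak ** 2 / wmse)

rng = np.random.default_rng(0)
ref = rng.uniform(0, 255, size=(90, 180))
noisy = ref + rng.normal(0, 2.0, size=ref.shape)
print(f"{weighted_spherical_psnr(ref, noisy):.1f} dB")
```

Because the weights vanish toward the poles, coding errors concentrated there barely move the score, mirroring the observation that viewers rarely look straight up or down.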

387 citations


"Viewport-adaptive navigable 360-deg..." refers background or methods in this paper

  • ...This over-sampling degrades the performance of traditional video encoders [25]....


  • ...In this paper, we consider the four projections that are the most discussed for 360-degree video encoding [25]....


  • ...[25] studied the most frequent head positions of users....


Proceedings ArticleDOI
10 Dec 2012
TL;DR: This paper develops a fully-functional DASH system, develops novel video rate control algorithms that balance the needs for video rate smoothness and high bandwidth utilization, and shows that a small video rate margin can lead to much improved smoothness in video rate and buffer size.
Abstract: Dynamic Adaptive Streaming over HTTP (DASH) is widely deployed on the Internet for live and on-demand video streaming services. Video adaptation algorithms in existing DASH systems are either too sluggish to respond to congestion level shifts or too sensitive to short-term network bandwidth variations. Both degrade user video experience. In this paper, we formally study the responsiveness and smoothness trade-off in DASH through analysis and experiments. We show that client-side buffered video time is a good feedback signal to guide video adaptation. We then propose novel video rate control algorithms that balance the needs for video rate smoothness and high bandwidth utilization. We show that a small video rate margin can lead to much improved smoothness in video rate and buffer size. The proposed DASH designs are also extended to work with multiple CDN servers. We develop a fully-functional DASH system and evaluate its performance through extensive experiments on a network testbed and the Internet. We demonstrate that our DASH designs are highly efficient and robust in realistic network environment.
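Using buffered video time as the feedback signal, with a small rate margin for smoothness, can be sketched as follows; the buffer thresholds, margin, and bitrate ladder are illustrative assumptions rather than the paper's tuned values.

```python
def select_bitrate(buffer_s, bitrates, est_bw, low=10.0, high=30.0, margin=0.1):
    """Buffered video time as feedback: protect the buffer when it is low,
    track (1 - margin) * bandwidth in the target band, go greedy when high."""
    if buffer_s < low:
        return bitrates[0]                       # guard against buffer underflow
    candidates = [b for b in bitrates if b <= (1 - margin) * est_bw]
    if buffer_s > high:                          # a large buffer absorbs variations
        candidates = [b for b in bitrates if b <= est_bw] or candidates
    return max(candidates) if candidates else bitrates[0]

ladder = [500_000, 1_000_000, 2_000_000, 4_000_000]   # bits per second
print(select_bitrate(5.0, ladder, est_bw=3_000_000))   # 500000
print(select_bitrate(20.0, ladder, est_bw=3_000_000))  # 2000000
```

The margin is the responsiveness/smoothness trade-off in miniature: requesting slightly below the estimated bandwidth lets the buffer grow during small throughput dips instead of forcing an immediate rate switch.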

313 citations


"Viewport-adaptive navigable 360-deg..." refers background in this paper

  • ...Rate adaptation algorithms are developed [11, 20] to reduce the mismatch between the requested bit-rate and the throughput....
