Showing papers by "Christian Timmerer" published in 2023


Journal ArticleDOI
TL;DR: In this article, a multi-layer architecture and a centralized optimization model are proposed to minimize the serving time and network cost of HAS clients, considering the available resources and all possible serving actions; three heuristic approaches are introduced to cope with the high time complexity of the centralized model.
Abstract: With the ever-increasing demands for high-definition and low-latency video streaming applications, network-assisted video streaming schemes have become a promising complementary solution in the HTTP Adaptive Streaming (HAS) context to improve users’ Quality of Experience (QoE) as well as network utilization. Edge computing is considered one of the leading networking paradigms for designing such systems by providing video processing and caching close to the end-users. Despite the wide usage of this technology, designing network-assisted HAS architectures that support low-latency and high-quality video streaming, including edge collaboration, is still a challenge. To address these issues, this article leverages the Software-Defined Networking (SDN), Network Function Virtualization (NFV), and edge computing paradigms to propose A collaboRative edge-Assisted framewoRk for HTTP Adaptive video sTreaming (ARARAT). Aiming at minimizing HAS clients’ serving time and network cost, while considering available resources and all possible serving actions, we design a multi-layer architecture and formulate the problem as a centralized optimization model executed by the SDN controller. However, to cope with the high time complexity of the centralized model, we introduce three heuristic approaches that produce near-optimal solutions through efficient collaboration between the SDN controller and edge servers. Finally, we implement the ARARAT framework, conduct our experiments on a large-scale cloud-based testbed including 250 HAS players, and compare its effectiveness with state-of-the-art systems within comprehensive scenarios. The experimental results illustrate that the proposed ARARAT methods (i) improve users’ QoE by at least 47%, (ii) decrease the streaming cost, including bandwidth and computational costs, by at least 47%, and (iii) enhance network utilization by at least 48% compared to state-of-the-art approaches.
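The ARARAT optimization model is not reproduced in this listing. As a rough illustration only, the sketch below greedily picks, per client request, the serving action with the lowest weighted serving-time/cost score; the three actions, their costs, and the weights are invented for this sketch and are not the paper's formulation.

```python
# Illustrative greedy selection of a serving action per HAS request.
# Actions, costs, and weights are assumptions, not the ARARAT model itself.

ACTIONS = {
    # action: (serving_time_s, network_cost_units)
    "edge_cache_hit": (0.05, 1.0),
    "edge_transcode": (0.30, 2.5),
    "fetch_from_origin": (0.80, 4.0),
}

def pick_action(available, w_time=0.6, w_cost=0.4):
    """Return the feasible action with the lowest weighted score."""
    best, best_score = None, float("inf")
    for action in available:
        t, c = ACTIONS[action]
        score = w_time * t + w_cost * c
        if score < best_score:
            best, best_score = action, score
    return best

# Example: the requested representation is not cached at the edge.
print(pick_action(["edge_transcode", "fetch_from_origin"]))
```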

5 citations


Journal ArticleDOI
TL;DR: In this article, the authors propose a low-latency live streaming architecture based on HTTP Adaptive Streaming (HAS), which is agnostic to the protocol and codecs and works equally well with existing HAS-based approaches.
Abstract: While most of the HTTP adaptive streaming (HAS) traffic continues to be video-on-demand (VoD), more users have started generating and delivering live streams with high quality through popular online streaming platforms. Typically, the video content is generated by streamers and watched by large audiences that are geographically distributed far away from the streamers’ locations. The locations of streamers and audiences create a significant challenge in delivering HAS-based live streams with low latency and high quality. Any problem in the delivery paths will result in a reduced viewer experience. In this paper, we propose HxL3, a novel architecture for low-latency live streaming. HxL3 is agnostic to the protocol and codecs and can work equally well with existing HAS-based approaches. By holding the minimum number of live media segments at the edge through efficient caching and prefetching policies, improved transmissions, as well as transcoding capabilities, HxL3 is able to achieve a high viewer experience across the Internet by alleviating rebuffering and substantially reducing initial startup delay and live stream latency. HxL3 can be easily deployed and used. Its performance has been evaluated using real live stream sources and entities that are distributed worldwide. Experimental results show the superiority of the proposed architecture and give good insights into how low-latency live streaming works.
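As a minimal sketch of the "hold only the newest live segments at the edge and prefetch the next one" idea described above: the class name, capacity, and eviction rule below are assumptions for illustration, not the HxL3 policies.

```python
from collections import OrderedDict

class LiveEdgeCache:
    """Toy edge cache holding only the newest `capacity` live segments
    (an assumption for this sketch, not the HxL3 policy itself)."""

    def __init__(self, capacity=3):
        self.capacity = capacity
        self.segments = OrderedDict()   # segment_number -> bytes

    def put(self, seg_no, data):
        self.segments[seg_no] = data
        while len(self.segments) > self.capacity:
            self.segments.popitem(last=False)   # evict the oldest segment

    def get(self, seg_no, fetch_from_origin):
        if seg_no in self.segments:
            return self.segments[seg_no]        # cache hit: low latency
        data = fetch_from_origin(seg_no)        # cache miss: go to origin
        self.put(seg_no, data)
        return data

    def prefetch(self, last_served, fetch_from_origin):
        nxt = last_served + 1                   # fetch the next expected segment
        if nxt not in self.segments:
            self.put(nxt, fetch_from_origin(nxt))
```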

4 citations


Journal ArticleDOI
TL;DR: In this paper, a Discrete Cosine Transform (DCT)-energy-based VQA with texture information fusion (VQ-TIF) model is proposed for video streaming applications that predicts VMAF for the reconstructed video compared to the original video.
Abstract: The rise of video streaming applications has increased the demand for Video Quality Assessment (VQA). In 2016, Netflix introduced VMAF, a full-reference VQA metric that strongly correlates with perceptual quality, but its computation is time-intensive. This paper proposes a Discrete Cosine Transform (DCT)-energy-based VQA with texture information fusion (VQ-TIF) model for video streaming applications that predicts VMAF for the reconstructed video compared to the original video. VQ-TIF extracts Structural Similarity (SSIM) and spatio-temporal features of the frames from the original and reconstructed videos and fuses them using a Long Short-Term Memory (LSTM)-based model to estimate VMAF. Experimental results show that VQ-TIF estimates VMAF with a Pearson Correlation Coefficient (PCC) of 0.96 and a Mean Absolute Error (MAE) of 2.71, on average, compared to the ground truth VMAF scores. Additionally, VQ-TIF estimates VMAF 9.14 times faster than the state-of-the-art VMAF implementation, with an 89.44% reduction in energy consumption, assuming an Ultra HD (2160p) display resolution.
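A minimal PyTorch sketch of the kind of LSTM-based fusion described above: per-frame features go through an LSTM and a linear head that outputs one VMAF estimate per clip. The feature dimension, hidden size, and class name are assumptions, not the published VQ-TIF architecture.

```python
import torch
import torch.nn as nn

class VQTIFSketch(nn.Module):
    """Hypothetical LSTM-based fusion of per-frame SSIM and spatio-temporal
    features into a single VMAF estimate (dimensions are assumptions)."""

    def __init__(self, feat_dim=4, hidden_dim=32):
        super().__init__()
        self.lstm = nn.LSTM(feat_dim, hidden_dim, batch_first=True)
        self.head = nn.Linear(hidden_dim, 1)

    def forward(self, x):                    # x: (batch, frames, feat_dim)
        _, (h_n, _) = self.lstm(x)           # h_n: (1, batch, hidden_dim)
        return self.head(h_n[-1]).squeeze(-1)   # predicted VMAF per clip

model = VQTIFSketch()
frames = torch.rand(2, 120, 4)               # e.g., SSIM + three DCT-energy features
print(model(frames).shape)                   # torch.Size([2])
```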

3 citations


Proceedings ArticleDOI
20 Apr 2023
TL;DR: In this article, a reduced-reference Transcoding Quality Prediction Model (TQPM) is proposed to determine the visual quality score of a video possibly transcoded in multiple stages, where the quality is predicted using Discrete Cosine Transform (DCT)-energy-based features of the video (i.e., the video's brightness, spatial texture information, and temporal activity) and the target bitrate representation of each transcoding stage.
Abstract: In recent years, video streaming applications have increased the demand for Video Quality Assessment (VQA). Reduced reference video quality assessment (RR-VQA) is a category of VQA where certain features (e.g., texture, edges) of the original video are provided for quality assessment. It is a popular research area for various applications such as social media, online games, and video streaming. This paper introduces a reduced reference Transcoding Quality Prediction Model (TQPM) to determine the visual quality score of the video possibly transcoded in multiple stages. The quality is predicted using Discrete Cosine Transform (DCT)-energy-based features of the video (i.e., the video's brightness, spatial texture information, and temporal activity) and the target bitrate representation of each transcoding stage. To do that, the problem is formulated, and a Long Short-Term Memory (LSTM)-based quality prediction model is presented. Experimental results illustrate that, on average, TQPM yields PSNR, SSIM, and VMAF predictions with an R2 score of 0.83, 0.85, and 0.87, respectively, and Mean Absolute Error (MAE) of 1.31 dB, 1.19 dB, and 3.01, respectively, for single-stage transcoding. Furthermore, an R2 score of 0.84, 0.86, and 0.91, respectively, and MAE of 1.32 dB, 1.33 dB, and 3.25, respectively, are observed for a two-stage transcoding scenario. Moreover, the average processing time of TQPM for 4s segments is 0.328s, making it a practical VQA method in online streaming applications.
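The LSTM model itself is not shown here; as a toy illustration of the reduced-reference workflow (per-segment complexity features plus per-stage target bitrates regressed onto a quality score, evaluated with R2 and MAE), the sketch below uses random placeholder data and a linear regressor standing in for the paper's LSTM. All feature names and values are assumptions.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import r2_score, mean_absolute_error

# Hypothetical per-segment features: brightness, spatial texture, temporal
# activity, plus the target bitrates of two transcoding stages (placeholders).
X = np.random.rand(200, 5)          # placeholder feature matrix
y_vmaf = 60 + 30 * X[:, 3]          # placeholder ground-truth VMAF

model = LinearRegression().fit(X[:150], y_vmaf[:150])   # stand-in for the LSTM
pred = model.predict(X[150:])

print("R2 :", round(r2_score(y_vmaf[150:], pred), 2))
print("MAE:", round(mean_absolute_error(y_vmaf[150:], pred), 2))
```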

2 citations


Journal ArticleDOI
TL;DR: In this article, the authors propose and analyze different segment prefetching policies that differ in resource utilization, required player and radio metrics, and deployment complexity, evaluating their performance and feasibility using metrics such as QoE characteristics, computing times, prefetching hits, and link bitrate consumption.
Abstract: Multi-access Edge Computing (MEC) is a new paradigm that brings storage and computing close to the clients. MEC enables the deployment of complex network-assisted mechanisms for video streaming that improve clients’ Quality of Experience (QoE). One of these mechanisms is segment prefetching, which transmits future video segments in advance closer to the client to serve content with lower latency. In this work, for HTTP Adaptive Streaming (HAS)-based video streaming and specifically considering a cellular (e.g., 5G) network edge, we present our approach Segment Prefetching and Caching at the Edge for Adaptive Video Streaming (SPACE). We propose and analyze different segment prefetching policies that differ in resource utilization, player and radio metrics needed, and deployment complexity. This variety of policies can dynamically adapt to the network’s current conditions and the service provider’s needs. We present segment prefetching policies based on diverse approaches and techniques: past segment requests, segment transrating (i.e., reducing segment bitrate/quality), a Markov prediction model, machine learning to predict future segment requests, and super-resolution. We study their performance and feasibility using metrics such as QoE characteristics, computing times, prefetching hits, and link bitrate consumption. We analyze and discuss which segment prefetching policy is better under which circumstances, as well as the influence of the client-side Adaptive Bit Rate (ABR) algorithm and the set of available representations (“bitrate ladder”) on segment prefetching. Moreover, we examine the impact on segment prefetching of different caching policies for (pre-)fetched segments, including Least Recently Used (LRU), Least Frequently Used (LFU), and our proposed popularity-based caching policy Least Popular Used (LPU).
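As a toy sketch of the popularity-based caching idea mentioned above: on overflow, evict the segment with the fewest recorded requests. This is one plausible reading of "Least Popular Used"; the class, capacity, and bookkeeping are assumptions, not the paper's LPU implementation.

```python
class LPUCacheSketch:
    """Toy popularity-based cache: on overflow, evict the segment with the
    fewest recorded requests (an illustrative 'Least Popular Used' reading)."""

    def __init__(self, capacity=4):
        self.capacity = capacity
        self.store = {}        # segment_id -> payload
        self.popularity = {}   # segment_id -> request count

    def request(self, seg_id, fetch):
        self.popularity[seg_id] = self.popularity.get(seg_id, 0) + 1
        if seg_id not in self.store:
            if len(self.store) >= self.capacity:
                # evict the least popular cached segment
                victim = min(self.store, key=lambda s: self.popularity[s])
                del self.store[victim]
            self.store[seg_id] = fetch(seg_id)
        return self.store[seg_id]
```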

2 citations


Proceedings ArticleDOI
20 Jun 2023
TL;DR: In this paper, the authors perform a subjective study of four factors impacting the QoE of PC video sequences in MR conditions, including quality switches, viewing distance, and content characteristics.
Abstract: Point Cloud (PC) streaming has recently attracted research attention as it has the potential to provide six degrees of freedom (6DoF), which is essential for truly immersive media. PCs require high-bandwidth connections, and adaptive streaming is a promising solution to cope with fluctuating bandwidth conditions. Thus, understanding the impact of different factors in adaptive streaming on the Quality of Experience (QoE) becomes fundamental. Mixed Reality (MR) is a novel technology and has recently become popular. However, quality evaluations of PCs in MR environments are still limited to static images. In this paper, we perform a subjective study on four impact factors on the QoE of PC video sequences in MR conditions, including quality switches, viewing distance, and content characteristics. The experimental results show that these factors significantly impact QoE. The QoE decreases if the sequence switches to lower quality and/or is viewed at a shorter distance, and vice versa. Additionally, the end user might not distinguish the quality differences between two quality levels at a specific viewing distance. Regarding content characteristics, objects with lower contrast seem to provide better quality scores.

1 citation


Proceedings ArticleDOI
20 Jun 2023
TL;DR: In this paper, the authors present a demonstrator for subjective quality assessment of dynamic point cloud and mesh objects under different conditions in MR environments, including encoding parameters, quality switching, viewing distance, and content characteristics.
Abstract: 3D objects are important components in Mixed Reality (MR) environments as they allow users to inspect and interact with them in a six degrees of freedom (6DoF) system. Point clouds (PCs) and meshes are two common 3D object representations that can be compressed to reduce the delivered data at the cost of quality degradation. In addition, as the end users can move around in 6DoF applications, the viewing distance can vary. Quality assessment is necessary to evaluate the impact of the compressed representation and viewing distance on the Quality of Experience (QoE) of end users. This paper presents a demonstrator for subjective quality assessment of dynamic PC and mesh objects under different conditions in MR environments. Our platform allows conducting subjective tests to evaluate various QoE influence factors, including encoding parameters, quality switching, viewing distance, and content characteristics, with configurable settings for these factors.

1 citation


Book ChapterDOI
TL;DR: In this paper, a Mixed-Binary Linear Programming (MBLP) model called Multi-Codec Optimization Model at the edge for Live streaming (MCOM-Live) is proposed to jointly optimize the overall streaming costs and the visual quality of the content played out by the end-users by efficiently enabling multi-codec content delivery.
Abstract: HTTP Adaptive Streaming (HAS) is the predominant technique to deliver video content across the Internet, and the demand for its applications keeps increasing. As videos evolve to deliver more immersive experiences, for instance through higher resolutions and framerates, highly efficient video compression schemes are required to ease the burden on the delivery process. While AVC/H.264 still represents the most adopted codec, we are experiencing an increase in the usage of new-generation codecs (HEVC/H.265, VP9, AV1, VVC/H.266, etc.). Compared to AVC/H.264, these codecs can either achieve the same quality at a reduced bitrate or improve the quality while targeting the same bitrate. In this paper, we propose a Mixed-Binary Linear Programming (MBLP) model called Multi-Codec Optimization Model at the edge for Live streaming (MCOM-Live) to jointly optimize (i) the overall streaming costs and (ii) the visual quality of the content played out by the end-users by efficiently enabling multi-codec content delivery. Given video content encoded with multiple codecs according to a fixed bitrate ladder, the model chooses, among three available policies, i.e., fetch, transcode, or skip, the best option to handle the representations. We compare the proposed model with traditional approaches used in the industry. The experimental results show that our proposed method can reduce the additional latency by up to 23% and the streaming costs by up to 78%, besides improving the visual quality of the delivered segments by up to 0.5 dB in terms of PSNR.
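The MBLP formulation is not reproduced here; the toy sketch below only illustrates the decision space of picking fetch, transcode, or skip per representation under a latency budget, using exhaustive enumeration instead of a solver. The costs, latencies, quality penalties, and budget are invented for the sketch.

```python
from itertools import product

# Illustrative per-representation options; numbers are made up and do not
# come from the MCOM-Live model.
POLICIES = {
    "fetch":     {"cost": 3.0, "latency": 0.2, "quality_penalty": 0.0},
    "transcode": {"cost": 1.5, "latency": 0.5, "quality_penalty": 0.0},
    "skip":      {"cost": 0.0, "latency": 0.0, "quality_penalty": 2.0},
}

def best_assignment(n_reps, latency_budget=1.0):
    """Exhaustively pick one policy per representation (tiny instances only);
    a real MBLP solver would replace this enumeration."""
    best, best_score = None, float("inf")
    for combo in product(POLICIES, repeat=n_reps):
        cost = sum(POLICIES[p]["cost"] for p in combo)
        latency = sum(POLICIES[p]["latency"] for p in combo)
        penalty = sum(POLICIES[p]["quality_penalty"] for p in combo)
        if latency <= latency_budget and cost + penalty < best_score:
            best, best_score = combo, cost + penalty
    return best

print(best_assignment(3))
```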

1 citation


Proceedings ArticleDOI
24 Apr 2023
TL;DR: In this paper, the authors present optimizations to VCA for faster and more energy-efficient video complexity analysis, using eight CPU threads, Single Instruction Multiple Data (SIMD) instructions, and a low-pass DCT optimization.
Abstract: For adaptive streaming applications, low-complexity and accurate video complexity features are necessary to analyze the video content in real time, which ensures fast and compression-efficient video streaming without disruptions. The state-of-the-art video complexity features are the Spatial Information (SI) and Temporal Information (TI) features, which do not correlate well with the encoding parameters in adaptive streaming applications. In this light, the Video Complexity Analyzer (VCA) was introduced, determining features based on Discrete Cosine Transform (DCT) energy. This paper presents optimizations of VCA for faster and more energy-efficient video complexity analysis. Experimental results show that VCA v2.0, using eight CPU threads, Single Instruction Multiple Data (SIMD) instructions, and a low-pass DCT optimization, determines seven complexity features of Ultra High Definition 8-bit videos with better accuracy at a speed of up to 292.68 fps and with 97.06% lower energy consumption than the reference SITI implementation.
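To make the notion of a DCT-energy-based complexity feature concrete, here is a rough sketch that averages the high-frequency DCT energy over fixed-size blocks of a luma frame. The block size, weighting, and function name are assumptions; this is not VCA's optimized implementation.

```python
import numpy as np
from scipy.fft import dctn

def block_dct_energy(frame, block=32):
    """Average high-frequency DCT energy over non-overlapping blocks of a
    luma frame -- a rough DCT-energy texture sketch, not the VCA feature."""
    h, w = frame.shape
    energies = []
    for y in range(0, h - block + 1, block):
        for x in range(0, w - block + 1, block):
            coeffs = dctn(frame[y:y + block, x:x + block], norm="ortho")
            coeffs[0, 0] = 0.0                 # drop the DC coefficient
            energies.append(np.abs(coeffs).sum())
    return float(np.mean(energies))

frame = np.random.rand(216, 384).astype(np.float32)   # toy luma plane
print(block_dct_energy(frame))
```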

1 citation


Journal ArticleDOI
TL;DR: In this paper, a Just Noticeable Difference (JND)-aware constrained Variable Bitrate (cVBR) Two-pass Per-title encoding Scheme (JTPS) is proposed for live video streaming.
Abstract: Adaptive live video streaming applications utilize a predefined collection of bitrate-resolution pairs, known as a bitrate ladder, for simplicity and efficiency, eliminating the need for additional runtime to determine the optimal pairs during the live streaming session. These applications do not incorporate two-pass encoding methods due to increased latency. However, an optimized bitrate ladder could result in lower storage and delivery costs and improved Quality of Experience (QoE). This paper presents a Just Noticeable Difference (JND)-aware constrained Variable Bitrate (cVBR) Two-pass Per-title encoding Scheme (JTPS) designed specifically for live video streaming. JTPS predicts a content- and JND-aware bitrate ladder using low-complexity features based on Discrete Cosine Transform (DCT) energy and optimizes the constant rate factor (CRF) for each representation using random forest-based models. The effectiveness of JTPS is demonstrated using the open-source video encoder x265, with an average bitrate reduction of 18.80% and 32.59% for the same PSNR and VMAF, respectively, compared to the standard HTTP Live Streaming (HLS) bitrate ladder using Constant Bitrate (CBR) encoding. The implementation of JTPS also resulted in a 68.96% reduction in storage space and an 18.58% reduction in encoding time for a JND of six VMAF points.
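A minimal sketch of the random-forest CRF prediction step described above, using scikit-learn on placeholder data: the feature layout, labels, and hyperparameters are assumptions, not the trained JTPS models.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

# Placeholder training data: DCT-energy complexity features plus the target
# bitrate of a representation, mapped to a CRF value. Purely illustrative.
rng = np.random.default_rng(0)
X_train = rng.random((500, 4))       # e.g., [E_spatial, E_temporal, brightness, bitrate]
y_crf = 18 + 14 * X_train[:, 3]      # fake "optimal" CRF labels

crf_model = RandomForestRegressor(n_estimators=100, random_state=0)
crf_model.fit(X_train, y_crf)

segment_features = [[0.4, 0.7, 0.5, 0.3]]   # one live segment, one target bitrate
print(round(float(crf_model.predict(segment_features)[0]), 1))
```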

1 citation


Journal ArticleDOI
TL;DR: In this article, a Just Noticeable Difference (JND)-aware per-scene bitrate ladder prediction scheme (JASLA) is proposed for adaptive video-on-demand streaming applications.
Abstract: In video streaming applications, a fixed set of bitrate-resolution pairs (known as a bitrate ladder) is typically used during the entire streaming session. However, an optimized bitrate ladder per scene may result in (i) decreased storage or delivery costs or/and (ii) increased Quality of Experience. This paper introduces a Just Noticeable Difference (JND)-aware per-scene bitrate ladder prediction scheme (JASLA) for adaptive video-on-demand streaming applications. JASLA predicts jointly optimized resolutions and corresponding constant rate factors (CRFs) using spatial and temporal complexity features for a given set of target bitrates for every scene, which yields an efficient constrained Variable Bitrate encoding. Moreover, bitrate-resolution pairs that yield distortion lower than one JND are eliminated. Experimental results show that, on average, JASLA yields bitrate savings of 34.42% and 42.67% to maintain the same PSNR and VMAF, respectively, compared to the reference HTTP Live Streaming (HLS) bitrate ladder with Constant Bitrate encoding using the x265 HEVC encoder, where the maximum resolution of streaming is Full HD (1080p). Moreover, a 54.34% average cumulative decrease in storage space is observed.
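To illustrate the JND-based pruning step in isolation: the sketch below keeps only ladder rungs whose VMAF differs from the previously kept rung by at least one JND. The JND value (six VMAF points), the ladder entries, and the VMAF numbers are assumptions for the example, not JASLA's results.

```python
def prune_ladder(ladder, jnd_vmaf=6.0):
    """Keep only rungs whose VMAF differs from the previously kept rung by at
    least one JND. `ladder` is a list of (bitrate_kbps, resolution, vmaf)
    sorted by bitrate; the JND value and entries are illustrative."""
    kept = [ladder[0]]
    for rung in ladder[1:]:
        if rung[2] - kept[-1][2] >= jnd_vmaf:
            kept.append(rung)
    return kept

ladder = [
    (145,  "360p",  55.0),
    (300,  "432p",  63.0),
    (600,  "540p",  67.0),   # less than one JND above the previous rung -> dropped
    (1600, "720p",  78.0),
    (3400, "1080p", 89.0),
]
print(prune_ladder(ladder))
```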

Journal ArticleDOI
TL;DR: In this paper, the authors propose a novel first pass for two-pass rate control in the all-intra configuration, using low-complexity video analysis and a Random Forest (RF)-based machine learning model to derive the data required for driving the second pass.
Abstract: Versatile Video Coding (VVC) allows for large compression efficiency gains over its predecessor, High Efficiency Video Coding (HEVC). The added efficiency comes at the cost of increased runtime complexity, especially for encoding. It is thus highly relevant to explore all available runtime reduction options. This paper proposes a novel first pass for two-pass rate control in the all-intra configuration, using low-complexity video analysis and a Random Forest (RF)-based machine learning model to derive the data required for driving the second pass. The proposed method is validated using VVenC, an open and optimized VVC encoder. Compared to the default two-pass rate control algorithm in VVenC, the proposed method achieves around 32% reduction in encoding time for the faster preset, while on average only causing 2% BD-rate increase and achieving similar rate control accuracy.

Proceedings ArticleDOI
20 Jun 2023
TL;DR: In this article, a subjective assessment was conducted to investigate the impact of video resolution, different types of luminance, and different end devices on the QoE and energy consumption of video streaming services.
Abstract: The increasing use of ICT has raised concerns about its negative impact on energy consumption and CO2 emissions. To address this issue, there is a need to better understand the trade-off between Quality of Experience (QoE) and sustainable video streaming services. In this study, we designed and conducted a subjective assessment to investigate the impact of video resolution, different types of luminance, and different end devices on the QoE and energy consumption of video streaming services. Then, we applied statistical models (Analysis of Variance and t-test) to the subjective data to find out which factors influence the QoE the most and which consume more energy. The obtained results suggest that under specific conditions (e.g., dark or bright ambient lighting, low device backlight luminance, a small-screen device) the users could be encouraged towards a trade-off between acceptable QoE and sustainable (green) choices because spending more energy (e.g., streaming higher-quality video) would not provide noticeable QoE enhancement.
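As a small sketch of the statistical tests named above, applied with scipy to invented mean-opinion-score samples: only the choice of tests mirrors the abstract; the data and condition names are placeholders.

```python
from scipy import stats

# Placeholder mean-opinion-score samples for three viewing conditions
# (values are invented; only the tests mirror the abstract).
mos_bright_ambient = [3.8, 4.0, 4.2, 3.9, 4.1, 4.3]
mos_dark_ambient   = [3.2, 3.5, 3.4, 3.1, 3.6, 3.3]
mos_small_screen   = [3.6, 3.7, 3.9, 3.5, 3.8, 3.6]

t_stat, p_ttest = stats.ttest_ind(mos_bright_ambient, mos_dark_ambient)
f_stat, p_anova = stats.f_oneway(mos_bright_ambient, mos_dark_ambient, mos_small_screen)

print(f"t-test p = {p_ttest:.4f}")   # pairwise condition comparison
print(f"ANOVA  p = {p_anova:.4f}")   # effect across all three conditions
```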

Proceedings ArticleDOI
07 Jun 2023
TL;DR: In this paper, a matching-based method is proposed to schedule video encoding applications on both cloud and edge resources to optimize costs and energy consumption, achieving 17%-78% lower costs in the cost-optimized scenarios compared to the energy-optimized and cost-energy tradeoff scenarios.
Abstract: The considerable surge in energy consumption within data centers can be attributed to the exponential rise in demand for complex computing workflows and storage resources. Video streaming applications are both compute- and storage-intensive and account for the majority of today's internet services. In this work, we design a video encoding application consisting of a codec, bitrate, and resolution set for encoding a video segment. We then propose VE-Match, a matching-based method to schedule video encoding applications on both Cloud and Edge resources to optimize costs and energy consumption. Evaluation results on a real computing testbed federated between Amazon Web Services (AWS) EC2 Cloud instances and the Alpen-Adria University (AAU) Edge server reveal that VE-Match achieves 17%-78% lower costs in the cost-optimized scenarios compared to the energy-optimized and cost-energy tradeoff scenarios. Moreover, VE-Match reduces video encoding energy consumption by 38%-45% and gCO2 emissions by up to 80% in the energy-optimized scenarios compared to the cost-optimized and cost-energy tradeoff scenarios.
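VE-Match's matching algorithm is not reproduced in this listing; the toy function below only shows how a weighted cost/energy score could drive a cloud-vs-edge placement decision. The resource names, prices, energy figures, and scoring rule are assumptions.

```python
# Toy cloud-vs-edge placement of encoding tasks by weighted cost and energy.
# Prices, energy figures, and the scoring rule are assumptions for the sketch.
RESOURCES = {
    "aws_ec2":  {"cost_per_task": 0.12, "energy_wh": 35.0},
    "aau_edge": {"cost_per_task": 0.04, "energy_wh": 55.0},
}

def place_task(w_cost=0.5, w_energy=0.5):
    """Pick the resource minimizing a normalized cost/energy blend."""
    max_cost = max(r["cost_per_task"] for r in RESOURCES.values())
    max_energy = max(r["energy_wh"] for r in RESOURCES.values())
    def score(r):
        return (w_cost * r["cost_per_task"] / max_cost
                + w_energy * r["energy_wh"] / max_energy)
    return min(RESOURCES, key=lambda name: score(RESOURCES[name]))

print(place_task(w_cost=1.0, w_energy=0.0))   # cost-optimized   -> aau_edge
print(place_task(w_cost=0.0, w_energy=1.0))   # energy-optimized -> aws_ec2
```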

Proceedings ArticleDOI
08 May 2023
TL;DR: In this paper, the authors propose LALISA, an efficient framework for dynamic bitrate ladder optimization in live HTTP Adaptive Streaming (HAS), which leverages the latest developments in video analytics to collect statistics from video players, content delivery networks, and video encoders.
Abstract: Video content in Live HTTP Adaptive Streaming (HAS) is typically encoded using a pre-defined, fixed set of bitrate-resolution pairs (termed Bitrate Ladder), allowing playback devices to adapt to changing network conditions using an adaptive bitrate (ABR) algorithm. However, using a fixed one-size-fits-all solution when faced with various content complexities, heterogeneous network conditions, viewer device resolutions and locations, does not result in an overall maximal viewer quality of experience (QoE). Here, we consider these factors and design LALISA, an efficient framework for dynamic bitrate ladder optimization in live HAS. LALISA dynamically changes a live video session’s bitrate ladder, allowing improvements in viewer QoE and savings in encoding, storage, and bandwidth costs. LALISA is independent of ABR algorithms and codecs, and is deployed along the path between viewers and the origin server. In particular, it leverages the latest developments in video analytics to collect statistics from video players, content delivery networks, and video encoders, to perform bitrate ladder tuning. We evaluate the performance of LALISA against existing solutions in various video streaming scenarios using a trace-driven testbed. Evaluation results demonstrate significant improvements in encoding computation (24.4%) and bandwidth (18.2%) costs with an acceptable QoE.
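LALISA's actual tuning logic is not described in this listing; the toy function below only illustrates the general idea of analytics-driven ladder tuning, dropping rungs that player statistics show are rarely requested. The threshold, request shares, and ladder values are invented.

```python
def tune_ladder(ladder, request_share, min_share=0.05):
    """Drop rungs whose share of player requests is below `min_share`, always
    keeping the lowest rung as a safety fallback. Purely illustrative; not the
    LALISA optimization itself."""
    lowest = min(ladder)
    kept = [b for b in ladder if request_share.get(b, 0.0) >= min_share]
    if lowest not in kept:
        kept.insert(0, lowest)
    return sorted(kept)

ladder = [145, 300, 600, 1600, 2400, 3400, 4500]          # kbps
request_share = {145: 0.02, 300: 0.10, 600: 0.25, 1600: 0.30,
                 2400: 0.03, 3400: 0.25, 4500: 0.05}
print(tune_ladder(ladder, request_share))   # the 2400 kbps rung is pruned
```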

Journal ArticleDOI
TL;DR: In this paper, the authors propose the Live Low-Latency Cloud-based Adaptive Video Streaming Evaluation (LLL-CAdViSE) framework for evaluating low-latency live streaming sessions, supporting the two major HTTP Adaptive Streaming (HAS) formats, MPEG-DASH and HTTP Live Streaming (HLS).
Abstract: Live media streaming is a challenging task by itself, and for use cases that define low latency as a must, the complexity rises considerably. In a typical media streaming session, the main goal is to provide the highest possible Quality of Experience (QoE), which has proven to be measurable using quality models and various metrics. In a low-latency media streaming session, the requirement is to provide the lowest possible delay between the moment a video frame is captured and the moment it is rendered on the client screen, also known as end-to-end (E2E) latency, while maintaining the QoE. As its primary contribution, this paper proposes a sophisticated cloud-based and open-source testbed that facilitates evaluating low-latency live streaming sessions. The Live Low-Latency Cloud-based Adaptive Video Streaming Evaluation (LLL-CAdViSE) framework can assess live streaming systems running on the two major HTTP Adaptive Streaming (HAS) formats, Dynamic Adaptive Streaming over HTTP (MPEG-DASH) and HTTP Live Streaming (HLS). We use Chunked Transfer Encoding (CTE) to deliver Common Media Application Format (CMAF) chunks to the media players. Our testbed generates the test content (audiovisual streams); therefore, no test sequence is required, and the encoding parameters (e.g., encoder, bitrate, resolution, latency) are defined separately for each experiment. We have integrated the ITU-T P.1203 quality model into our testbed. To demonstrate the flexibility and power of LLL-CAdViSE, we present a secondary contribution in this paper: a set of experiments with different network traces, media players, ABR algorithms, and various requirements (e.g., E2E latency (typical/reduced/low/ultra-low), diverse bitrate ladders, and catch-up logic), together with the essential findings and experimental results.
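To make the per-experiment parameterization concrete, here is a hypothetical experiment description of the kind such a testbed could consume; the keys and values are illustrative assumptions, not the actual LLL-CAdViSE configuration schema.

```python
# Hypothetical experiment description for a low-latency live streaming test;
# keys and values are illustrative, not the LLL-CAdViSE configuration schema.
experiment = {
    "format": "MPEG-DASH",              # or "HLS"
    "delivery": "CMAF chunks over Chunked Transfer Encoding",
    "encoder": {"codec": "h264", "bitrate_ladder_kbps": [300, 1200, 3000],
                "segment_duration_s": 4, "chunk_duration_s": 0.5},
    "target_e2e_latency_s": 3.0,        # "low" latency class
    "network_trace": "lte_trace_01",
    "players": [{"name": "dash.js", "abr": "default", "count": 5}],
    "qoe_model": "ITU-T P.1203",
}
print(experiment["format"], experiment["target_e2e_latency_s"])
```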

Journal ArticleDOI
TL;DR: In this paper, the authors provide an elaborate introduction to the creation, streaming, and evaluation of immersive video, share lessons learned, and point at promising research paths to enable truly interactive immersive video applications toward holography.
Abstract: Video services are evolving from traditional two-dimensional video to virtual reality and holograms, which offer six degrees of freedom to users, enabling them to freely move around in a scene and change focus as desired. However, this increase in freedom translates into stringent requirements in terms of ultra-high bandwidth (in the order of Gigabits per second) and minimal latency (in the order of milliseconds). To realize such immersive services, the network transport, as well as the video representation and encoding, have to be fundamentally enhanced. The purpose of this tutorial article is to provide an elaborate introduction to the creation, streaming, and evaluation of immersive video. Moreover, it aims to provide lessons learned and to point at promising research paths to enable truly interactive immersive video applications toward holography.