Author

Hadi Amirpour

Bio: Hadi Amirpour is an academic researcher from Alpen-Adria-Universität Klagenfurt. The author has contributed to research in topics: Computer science & Encoding (memory). The author has an h-index of 7 and has co-authored 32 publications receiving 105 citations. Previous affiliations of Hadi Amirpour include University of Beira Interior & K.N. Toosi University of Technology.

Papers published on a yearly basis

Papers
Proceedings ArticleDOI
14 Jun 2022
TL;DR: The Video Complexity Analyzer (VCA) project aims to provide an efficient spatial and temporal complexity analysis of each video (segment) which can be used in various applications to find the optimal encoding decisions.
Abstract: For online analysis of the video content complexity in live streaming applications, selecting low-complexity features is critical to ensure low-latency video streaming without disruptions. To this end, for each video (segment), two features, i.e., the average texture energy and the average gradient of the texture energy, are determined. A DCT-based energy function is introduced to determine the block-wise texture of each frame. The spatial and temporal features of the video (segment) are derived from this DCT-based energy function. The Video Complexity Analyzer (VCA) project aims to provide an efficient spatial and temporal complexity analysis of each video (segment), which can be used in various applications to find optimal encoding decisions. VCA leverages x86 Single Instruction Multiple Data (SIMD) optimizations for Intel CPUs and multi-threading optimizations to achieve increased performance. VCA is an open-source library published under the GNU GPLv3 license. GitHub: https://github.com/cd-athena/VCA Online documentation: https://cd-athena.github.io/VCA/ Website: https://vca.itec.aau.at/
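A minimal NumPy/SciPy sketch of the kind of features the abstract describes: a block-wise DCT-based texture energy per frame (spatial feature) and the change of that energy between consecutive frames (temporal feature). The exact coefficient weighting and normalization used by VCA are not reproduced here; this only illustrates the idea.

```python
import numpy as np
from scipy.fft import dctn  # 2-D DCT-II

def block_texture_energy(block: np.ndarray) -> float:
    """Texture energy of one luma block: sum of absolute AC DCT coefficients
    (the DC coefficient is zeroed so flat blocks score near zero)."""
    coeffs = dctn(block.astype(np.float64), norm="ortho")
    coeffs[0, 0] = 0.0  # drop the DC term
    return float(np.abs(coeffs).sum())

def frame_features(prev_luma, luma, block_size=32):
    """Return (E, h): the average block texture energy of the frame and the
    average absolute change of block energies w.r.t. the previous frame."""
    energies, deltas = [], []
    for ys in range(0, luma.shape[0] - block_size + 1, block_size):
        for xs in range(0, luma.shape[1] - block_size + 1, block_size):
            e = block_texture_energy(luma[ys:ys + block_size, xs:xs + block_size])
            energies.append(e)
            if prev_luma is not None:
                e_prev = block_texture_energy(prev_luma[ys:ys + block_size, xs:xs + block_size])
                deltas.append(abs(e - e_prev))
    E = float(np.mean(energies))
    h = float(np.mean(deltas)) if deltas else 0.0
    return E, h
```

VCA itself computes these features with SIMD-optimized kernels and its own DCT weighting; E and h above merely stand in for the spatial and temporal features mentioned in the abstract.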

27 citations

Proceedings ArticleDOI
27 Mar 2018
TL;DR: Light field raw image data is decomposed into multi-views and used as a pseudo-sequence input for state-of-the-art codecs such as High Efficiency Video Coding (HEVC).
Abstract: Light fields capture a large number of samples of light rays in both intensity and direction, which allows post-processing applications such as refocusing, shifting the viewpoint, and depth estimation. However, they are represented by a huge amount of data and require a highly efficient coding scheme for compression. In this paper, light field raw image data is decomposed into multiple views and used as a pseudo-sequence input for state-of-the-art codecs such as High Efficiency Video Coding (HEVC). In order to better exploit redundancy between neighboring views and to decrease the distance between the current view and its references, instead of using conventional orders, the views are divided into four smaller regions and each region is scanned in a snake order. Furthermore, according to this ordering, an appropriate referencing structure is defined that selects only adjacent views as references. Simulation results show that the rate-distortion performance of the proposed method achieves higher gains than other state-of-the-art light field compression methods.
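A small Python sketch of the view ordering the abstract describes: the grid of sub-aperture views is split into four quadrants and each quadrant is traversed in a serpentine ("snake") order, so that consecutive views in the pseudo-sequence stay spatially close. The quadrant traversal order below is an assumption for illustration; the paper's exact ordering and referencing structure are not reproduced.

```python
def snake_order(rows, cols, row0=0, col0=0):
    """Serpentine scan of a rows x cols block of views, as (row, col) pairs."""
    order = []
    for r in range(rows):
        cs = range(cols) if r % 2 == 0 else range(cols - 1, -1, -1)
        order.extend((row0 + r, col0 + c) for c in cs)
    return order

def pseudo_sequence_order(rows=8, cols=8):
    """Split the view grid into four quadrants and snake-scan each one,
    yielding the coding order of views for the pseudo-sequence input."""
    hr, hc = rows // 2, cols // 2
    quadrants = [(0, 0), (0, hc), (hr, 0), (hr, hc)]  # assumed quadrant order
    order = []
    for r0, c0 in quadrants:
        order.extend(snake_order(hr, hc, r0, c0))
    return order

print(pseudo_sequence_order(4, 4))
```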

23 citations

Proceedings ArticleDOI
23 May 2022
TL;DR: An online per-title encoding scheme (OPTE) for live video streaming applications that predicts each target bitrate’s optimal resolution from any pre-defined set of resolutions using Discrete Cosine Transform (DCT)-energy-based low-complexity spatial and temporal features for each video segment.
Abstract: Current per-title encoding schemes encode the same video content at various bitrates and spatial resolutions to find an optimized bitrate ladder for each video content in Video on Demand (VoD) applications. However, in live streaming applications, a bitrate ladder with fixed bitrate-resolution pairs is used to avoid the additional latency incurred by finding optimal bitrate-resolution pairs for every video content. This paper introduces an online per-title encoding scheme (OPTE) for live video streaming applications. In this scheme, each target bitrate's optimal resolution is predicted from any pre-defined set of resolutions using Discrete Cosine Transform (DCT)-energy-based low-complexity spatial and temporal features for each video segment. Experimental results show that, on average, OPTE yields bitrate savings of 20.45% and 28.45% to maintain the same PSNR and VMAF, respectively, compared to a fixed bitrate ladder scheme (as adopted in current live streaming deployments), without any noticeable additional latency in streaming.
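A toy sketch of the decision step the abstract describes: for each target bitrate of the ladder, a pre-trained model predicts which resolution from the pre-defined set maximizes quality for a segment with given spatial complexity E and temporal complexity h. The resolution set, bitrate values, and the quality model below are placeholders, not the predictor trained in the paper.

```python
from typing import Callable, List, Tuple

RESOLUTIONS = [2160, 1440, 1080, 720, 540, 360]            # candidate heights (example set)
TARGET_BITRATES = [145, 300, 600, 1200, 2400, 4800, 7800]  # kbps rungs (example values)

def build_ladder(E: float, h: float,
                 predict_quality: Callable[[float, float, int, int], float]
                 ) -> List[Tuple[int, int]]:
    """For each target bitrate, pick the resolution with the highest predicted
    quality for a segment with spatial complexity E and temporal complexity h."""
    ladder = []
    for bitrate in TARGET_BITRATES:
        best = max(RESOLUTIONS, key=lambda res: predict_quality(E, h, bitrate, res))
        ladder.append((bitrate, best))
    return ladder

def toy_quality_model(E, h, bitrate, res):
    """Purely illustrative stand-in for a trained predictor: map the bitrate,
    scaled by content complexity, onto a preferred resolution."""
    return -abs(0.3 * bitrate / (1.0 + h / (E + 1.0)) - res)

print(build_ladder(E=50.0, h=5.0, predict_quality=toy_quality_model))
```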

21 citations

Proceedings ArticleDOI
12 May 2019
TL;DR: Six objective quality metrics are assessed for five state-of-the-art codecs at various bit-rates and it is concluded that the average FSIM-Y is the most reliable metric.
Abstract: Light field imaging is a promising technology for 3D computational photography. As light field images are represented by multiple views, their subjective evaluation is a very demanding task. Hence, identifying reliable objective quality assessment methodologies plays a very important role. In this paper, six objective quality metrics (PSNR-Y, PSNR-YUV, SSIM-Y, MS-SSIM-Y, FSIM-Y, and HDR-VDP2-Y) are assessed for five state-of-the-art codecs at various bit-rates. Moreover, the metrics are computed in the linear, perceptually uniform, and perceptual quantizer spaces. The results are compared against those of a subjective study, and it is concluded that the average FSIM-Y is the most reliable metric. The paper also introduces maps of the objective metrics to evaluate the quality dispersion among the different light field image views.
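The metric maps mentioned at the end of the abstract can be pictured as a per-view score laid out on the view grid; averaging the map gives a single score, while its spread shows the quality dispersion among views. A minimal sketch using PSNR on the luma channel (one of the six metrics assessed) as the example:

```python
import numpy as np

def psnr_y(ref: np.ndarray, dist: np.ndarray, peak: float = 255.0) -> float:
    """PSNR of a single luma view."""
    mse = np.mean((ref.astype(np.float64) - dist.astype(np.float64)) ** 2)
    return float("inf") if mse == 0 else 10.0 * np.log10(peak ** 2 / mse)

def metric_map(ref_views: np.ndarray, dist_views: np.ndarray) -> np.ndarray:
    """Per-view quality map for a light field stored as (rows, cols, H, W) luma views."""
    rows, cols = ref_views.shape[:2]
    scores = np.empty((rows, cols))
    for r in range(rows):
        for c in range(cols):
            scores[r, c] = psnr_y(ref_views[r, c], dist_views[r, c])
    return scores  # mean -> average score; std or min -> dispersion across views
```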

19 citations

Proceedings ArticleDOI
01 Nov 2013
TL;DR: This work pursues two objectives using Artificial Neural Networks: channel estimation in a predictive-modeling scenario and in a multi-secondary-user scenario, where a channel status predictor is configured for each secondary user, enabling them to identify the best available channel.
Abstract: Sensing the spectrum and accessing it are two important challenges for any secondary user who wants to use an available communication channel in a cognitive radio system. Spectrum utilization can be improved by secondary users through using free licensed channels in the absence of the primary user. In this work, we pursue two objectives using Artificial Neural Networks: channel estimation in a predictive-modeling scenario and in a multi-secondary-user scenario. A Time Delay Neural Network (TDNN) and a Recurrent Neural Network (RNN) have been selected to design the predictor. The accuracy of this forecasting can easily improve spectrum utilization. In the second scenario, a channel status predictor is configured for each secondary user, enabling them to identify the best available channel. Simulation results show that the prediction error has been reduced to less than 14% on average; in some cases, the next channel status is predicted correctly, with zero prediction error.
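A time-delay predictor of the kind described above can be approximated by feeding a sliding window of past channel occupancy states into a small feed-forward network. Below is a minimal scikit-learn sketch; the window length, network size, and synthetic occupancy trace are illustrative placeholders, not the configurations or data used in the paper.

```python
import numpy as np
from sklearn.neural_network import MLPClassifier

def lagged_dataset(states: np.ndarray, window: int = 8):
    """Turn a binary occupancy sequence into (past-window, next-state) pairs,
    which is how a time-delay network sees the channel history."""
    X = np.array([states[i:i + window] for i in range(len(states) - window)])
    y = states[window:]
    return X, y

rng = np.random.default_rng(0)
# Synthetic primary-user activity with some temporal structure (illustrative only).
states = (np.sin(np.arange(2000) / 5.0) + rng.normal(0, 0.4, 2000) > 0).astype(int)

X, y = lagged_dataset(states)
split = int(0.8 * len(X))
model = MLPClassifier(hidden_layer_sizes=(16,), max_iter=500, random_state=0)
model.fit(X[:split], y[:split])
print("prediction error:", 1.0 - model.score(X[split:], y[split:]))
```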

19 citations


Cited by
Journal ArticleDOI
TL;DR: A comprehensive survey of the most relevant LF coding solutions proposed in the literature, focusing on angularly dense LFs, and comprehensive insights are presented into open research challenges and future research directions for LF coding.
Abstract: Light Field (LF) imaging is a promising solution for providing more immersive and closer to reality multimedia experiences to end-users with unprecedented creative freedom and flexibility for applications in different areas, such as virtual and augmented reality. Due to the recent technological advances in optics, sensor manufacturing and available transmission bandwidth, as well as the investment of many tech giants in this area, it is expected that soon many LF transmission systems will be available to both consumers and professionals. Recognizing this, novel standardization initiatives have recently emerged in both the Joint Photographic Experts Group (JPEG) and the Moving Picture Experts Group (MPEG), triggering the discussion on the deployment of LF coding solutions to efficiently handle the massive amount of data involved in such systems. Since then, the topic of LF content coding has become a booming research area, attracting the attention of many researchers worldwide. In this context, this paper provides a comprehensive survey of the most relevant LF coding solutions proposed in the literature, focusing on angularly dense LFs. Special attention is placed on a thorough description of the different LF coding methods and on the main concepts related to this relevant area. Moreover, comprehensive insights are presented into open research challenges and future research directions for LF coding.

48 citations

Journal ArticleDOI
TL;DR: A No-Reference Light Field image Quality Assessment (NR-LFQA) scheme, where the main idea is to quantify the LFI quality degradation through evaluating the spatial quality and angular consistency, and the weighted local binary pattern is proposed to capture the characteristics of local angular consistency degradation.
Abstract: Light field image quality assessment (LFI-QA) is a significant and challenging research problem. It helps to better guide light field acquisition, processing, and applications. However, only a few objective models have been proposed, and none of them completely consider the intrinsic factors affecting LFI quality. In this paper, we propose a No-Reference Light Field image Quality Assessment (NR-LFQA) scheme, where the main idea is to quantify the LFI quality degradation by evaluating the spatial quality and angular consistency. We first measure the spatial quality deterioration by capturing the naturalness distribution of the light field cyclopean image array, which is formed when a human observes the LFI. Then, as a transformed representation of the LFI, the Epipolar Plane Image (EPI) contains the slopes of lines and thus carries the angular information. Therefore, the EPI is utilized to extract global and local features from the LFI to measure angular consistency degradation. Specifically, the distribution of the gradient direction map of the EPI is proposed to measure the global angular consistency distortion in the LFI. We further propose the weighted local binary pattern to capture the characteristics of local angular consistency degradation. Extensive experimental results on four publicly available LFI quality datasets demonstrate that the proposed method outperforms state-of-the-art 2D, 3D, multi-view, and LFI quality assessment algorithms.
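An epipolar plane image (EPI), on which this method builds its angular features, is simply a 2-D slice of the 4-D light field taken across one angular and one spatial dimension; lines in the EPI have slopes related to scene depth. A minimal sketch of extracting a horizontal EPI and a gradient-direction histogram as a global feature (the weighting and binning of the actual NR-LFQA features are not reproduced here):

```python
import numpy as np

def horizontal_epi(lf: np.ndarray, v: int, y: int) -> np.ndarray:
    """EPI slice of a light field stored as (v, u, H, W) luma views:
    fix the vertical view index v and the image row y, stack the row over u."""
    return lf[v, :, y, :]  # shape: (num_u_views, W)

def gradient_direction_hist(epi: np.ndarray, bins: int = 16) -> np.ndarray:
    """Histogram of gradient directions over the EPI; the distribution of line
    slopes reflects how consistent the views are with each other."""
    gy, gx = np.gradient(epi.astype(np.float64))
    angles = np.arctan2(gy, gx)
    hist, _ = np.histogram(angles, bins=bins, range=(-np.pi, np.pi), density=True)
    return hist
```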

43 citations

Patent
Li Xiang, Ying Chen, Li Zhang, Hongbin Liu, Chen Jianle, Marta Karczewicz
25 Mar 2016
TL;DR: In this paper, a method of decoding video data includes selecting a motion information derivation mode from a plurality of motion-information derivation modes for determining motion information for a current block, where the motion information indicates motion of the current block relative to reference video data.
Abstract: In an example, a method of decoding video data includes selecting a motion information derivation mode from a plurality of motion information derivation modes for determining motion information for a current block, where each motion information derivation mode of the plurality comprises performing a motion search for a first set of reference data that corresponds to a second set of reference data outside of the current block, and where the motion information indicates motion of the current block relative to reference video data. The method also includes determining the motion information for the current block using the selected motion information derivation mode. The method also includes decoding the current block using the determined motion information and without decoding syntax elements representative of the motion information.
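The derivation the claim describes amounts to the decoder running its own motion search over already-reconstructed reference data instead of parsing motion vectors from the bitstream. A toy template-matching sketch of that general idea (a simplification for illustration, not taken from the patent or any specific codec):

```python
import numpy as np

def derive_motion(template: np.ndarray, ref_frame: np.ndarray,
                  center: tuple, search_range: int = 8) -> tuple:
    """Return the motion vector whose patch in the reference frame best matches
    the template reconstructed next to the current block (SAD criterion)."""
    th, tw = template.shape
    cy, cx = center
    best_mv, best_cost = (0, 0), float("inf")
    for dy in range(-search_range, search_range + 1):
        for dx in range(-search_range, search_range + 1):
            y, x = cy + dy, cx + dx
            if y < 0 or x < 0 or y + th > ref_frame.shape[0] or x + tw > ref_frame.shape[1]:
                continue
            cost = np.abs(ref_frame[y:y + th, x:x + tw].astype(np.int64)
                          - template.astype(np.int64)).sum()
            if cost < best_cost:
                best_cost, best_mv = cost, (dy, dx)
    return best_mv
```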

39 citations

Proceedings ArticleDOI
19 Apr 2021
TL;DR: In this article, the authors conduct a comparative study of QUIC and TCP against production endpoints hosted by Google, Facebook, and Cloudflare under various dimensions: network conditions, workloads, and client implementations.
Abstract: IETF QUIC, the standardized version of Google’s UDP-based layer-4 network protocol, has seen increasing adoption from large Internet companies for its benefits over TCP. Yet despite its rapid adoption, performance analysis of QUIC in production is scarce. Most existing analyses have only used unoptimized open-source QUIC servers on non-tuned kernels: these analyses are unrepresentative of production deployments, which raises the question of whether QUIC actually outperforms TCP in practice. In this paper, we conduct one of the first comparative studies on the performance of QUIC and TCP against production endpoints hosted by Google, Facebook, and Cloudflare across several dimensions: network conditions, workloads, and client implementations. To understand our results, we create a tool to systematically visualize the root causes of performance differences between the two protocols. Using our tool, we make several key observations. First, while QUIC has some inherent advantages over TCP, such as worst-case 1-RTT handshakes, its overall performance is largely determined by the server’s choice of congestion-control algorithm and the robustness of its congestion-control implementation under edge-case network scenarios. Second, we find that some QUIC clients require non-trivial configuration tuning in order to achieve optimal performance. Lastly, we demonstrate that QUIC’s removal of head-of-line (HOL) blocking has little impact on web-page performance in practice. Taken together, our observations illustrate that QUIC’s performance is inherently tied to implementation design choices, bugs, and configurations, which implies that QUIC measurements are not always a reflection of the protocol and often do not generalize across deployments.

20 citations