Author

Glenn Van Wallendael

Other affiliations: Netflix, Ericsson
Bio: Glenn Van Wallendael is an academic researcher at Ghent University. He has contributed to research on topics including video quality and video encoding, has an h-index of 15, and has co-authored 96 publications receiving 769 citations. His previous affiliations include Netflix and Ericsson.


Papers
Journal ArticleDOI
28 Mar 2013
TL;DR: Bitstream elements that remain HEVC-compliant after encryption are listed, their impact on video adaptation is described, and three elements are selected: the intra prediction mode difference, the motion vector difference sign, and the residual sign.
Abstract: Video encryption techniques enable applications like digital rights management and video scrambling. Encrypting the entire video stream can be computationally costly and prevents advanced video modifications by an untrusted middlebox in the network, such as splicing, quality monitoring, watermarking, and transcoding. Therefore, encryption techniques are proposed that affect only a small part of the video stream while keeping the video compliant with its compression standard, High Efficiency Video Coding (HEVC). Because encryption that guarantees standard compliance can degrade compression efficiency, syntax elements should be selected for encryption according to their bitrate impact. Each element also affects the quality seen by untrusted decoders differently, so this aspect should be considered as well. In this paper, multiple techniques for partial video encryption are investigated; most have a low impact on rate-distortion performance while covering a broad range of scrambling performance.
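The sign-based syntax elements named above lend themselves to a simple illustration: flipping signs with a keystream leaves magnitudes intact, so the bitstream stays standard-compliant. The sketch below is a hypothetical toy (SHA-256 as keystream derivation, `scramble_signs` as a made-up helper), not the paper's implementation:

```python
import hashlib

def keystream_bits(key: bytes, n: int):
    """Derive n pseudo-random bits from a key (illustrative only; a real
    system would use a proper stream cipher such as AES-CTR)."""
    if n == 0:
        return []
    bits = []
    counter = 0
    while True:
        digest = hashlib.sha256(key + counter.to_bytes(4, "big")).digest()
        for byte in digest:
            for i in range(8):
                bits.append((byte >> i) & 1)
                if len(bits) == n:
                    return bits
        counter += 1

def scramble_signs(coeffs, key: bytes):
    """Flip the sign of each residual coefficient according to a keystream
    bit (zeros are unaffected). Magnitudes stay intact, and applying the
    same key a second time restores the original signs."""
    bits = keystream_bits(key, len(coeffs))
    return [(-c if b else c) for c, b in zip(coeffs, bits)]

residuals = [5, -3, 0, 2, -1]
encrypted = scramble_signs(residuals, b"secret")
assert scramble_signs(encrypted, b"secret") == residuals  # involution
```

Sign bits are a natural target because HEVC bypass-codes them with near-equal probability, so flipping them costs no bitrate, which is the "low impact on rate-distortion performance" the abstract refers to.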

69 citations

Journal ArticleDOI
TL;DR: The results show that both video content and the range of quality switches significantly influence end-users' rating behavior, and that video stalls should be avoided during playback at all times.
Abstract: HTTP adaptive streaming facilitates video streaming to mobile devices connected through heterogeneous networks without the need for a dedicated streaming infrastructure. By splitting different encoded versions of the same video into small segments, clients can continuously decide which segments to download based on available network resources and device characteristics. These encoded versions can, for example, differ in terms of bitrate and spatial or temporal resolution. However, as a result of dynamically selecting video segments, perceived video quality can fluctuate during playback, which impacts end-users' quality of experience. Subjective studies have already been conducted to assess the influence of video delivery to mobile devices using HTTP adaptive streaming. Nevertheless, existing studies are limited to the evaluation of short video sequences in controlled environments. Research has already shown that video duration and assessment environment influence quality perception. Therefore, in this article, we go beyond the traditional approaches to subjective quality evaluation by conducting novel experiments on tablet devices in more ecologically valid testing environments using longer video sequences. As such, we want to mimic realistic viewing behavior as much as possible. Our results show that both video content and the range of quality switches significantly influence end-users' rating behavior. In general, quality level switches are only perceived in high motion sequences or when switching occurs between high and low quality video segments. Moreover, we also found that video stalls should be avoided during playback at all times.
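The segment-selection loop described above can be sketched with a simple rate-based heuristic plus a switch limiter reflecting the finding that large quality jumps are what viewers notice. The names (`choose_bitrate`, `limit_switch`) and the 0.8 safety factor are hypothetical, not from the paper; real players such as dash.js blend rate- and buffer-based rules:

```python
def choose_bitrate(ladder, throughput_kbps, safety=0.8):
    """Pick the highest rung of the bitrate ladder that fits within a
    safety margin of the measured throughput; fall back to the lowest
    rung if nothing fits."""
    budget = throughput_kbps * safety
    feasible = [r for r in ladder if r <= budget]
    return max(feasible) if feasible else min(ladder)

def limit_switch(ladder, prev, target):
    """Move at most one rung per segment, keeping quality switches
    gradual rather than jumping straight between high and low."""
    i, j = ladder.index(prev), ladder.index(target)
    step = (j > i) - (j < i)               # -1, 0, or +1
    return ladder[i + step]

ladder = [400, 800, 1600, 3200, 6400]      # kbps renditions
target = choose_bitrate(ladder, 2500)      # 2500 * 0.8 = 2000 -> 1600
nxt = limit_switch(ladder, 3200, target)   # one rung down, not a hard drop
```

The limiter trades a slower reaction to bandwidth drops for fewer perceptible switches; a real client would also watch its playout buffer to avoid the stalls the study flags as the worst degradation.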

52 citations

Proceedings ArticleDOI
03 Jun 2015
TL;DR: This work shows how Reservoir Computing Networks (RCNs) can be used for detection directly on raw images, and how RCNs, with their simple yet robust training procedure, can be practically applied to real surveillance tasks using very low resolution camera sensors.
Abstract: Among the various types of artificial neural networks used for event detection in visual content, those able to process temporal information, such as recurrent neural networks, have proven more effective. However, training such networks is often difficult and time consuming. In this work, we show how Reservoir Computing Networks (RCNs) can be used for detection directly on raw images. The applicability of RCNs is illustrated on two example challenges: isolated handwritten digit recognition on the MNIST dataset, and detecting the status of a door using self-recorded footage from a surveillance camera. Achieving an error rate of 0.92 percent on MNIST, we show that RCNs can be a serious competitor to the state of the art. Moreover, we show how RCNs, with their simple yet robust training procedure, can be practically used for real surveillance tasks with very low resolution camera sensors.
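For readers unfamiliar with reservoir computing: a fixed random recurrent layer maps the input sequence to a rich state trajectory, and only a linear readout is trained, which is what makes the training procedure "simple yet robust". A minimal echo state network sketch, with illustrative hyperparameters that are not the paper's:

```python
import numpy as np

class Reservoir:
    """Minimal echo state network: fixed random input and recurrent
    weights, with the recurrent matrix rescaled so its spectral radius
    stays below 1 (the usual echo-state stability condition)."""
    def __init__(self, n_in, n_res=100, spectral_radius=0.9, leak=0.3, seed=0):
        rng = np.random.default_rng(seed)
        self.W_in = rng.uniform(-0.5, 0.5, (n_res, n_in))
        W = rng.uniform(-0.5, 0.5, (n_res, n_res))
        W *= spectral_radius / max(abs(np.linalg.eigvals(W)))
        self.W, self.leak = W, leak

    def states(self, inputs):
        """Run a sequence through the fixed reservoir; these states are
        all a readout ever sees."""
        x = np.zeros(self.W.shape[0])
        out = []
        for u in inputs:
            x = (1 - self.leak) * x + self.leak * np.tanh(self.W_in @ u + self.W @ x)
            out.append(x.copy())
        return np.array(out)

def train_readout(states, targets, alpha=1e-2):
    """Closed-form ridge regression -- the only trained part, hence the
    cheap training compared to backpropagation through time."""
    S = states
    return np.linalg.solve(S.T @ S + alpha * np.eye(S.shape[1]), S.T @ targets)
```

For a detection task like the door-status example, `inputs` would be flattened low-resolution frames and `targets` the per-frame labels.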

50 citations

Proceedings ArticleDOI
01 Jul 2017
TL;DR: The proposed framework, called Steered Mixture-of-Experts (SMoE), enables a multitude of processing tasks on light fields using a single unified Bayesian model that takes into account different regions of the scene, their edges, and their development along the spatial and disparity dimensions.
Abstract: The proposed framework, called Steered Mixture-of-Experts (SMoE), enables a multitude of processing tasks on light fields using a single unified Bayesian model. The underlying assumption is that light field rays are instantiations of a non-linear or non-stationary random process that can be modeled by piecewise stationary processes in the spatial domain. As such, it is modeled as a space-continuous Gaussian Mixture Model. Consequently, the model takes into account different regions of the scene, their edges, and their development along the spatial and disparity dimensions. Applications presented include light field coding, depth estimation, edge detection, segmentation, and view interpolation. The representation is compact, which allows for very efficient compression yielding state-of-the-art coding results for low bit-rates. Furthermore, due to the statistical representation, a vast amount of information can be queried from the model even without having to analyze the pixel values. This allows for “blind” light field processing and classification.
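The gating idea behind such a mixture-of-experts regression can be shown in one dimension: Gaussian kernels softly partition the coordinate axis and blend per-expert values. This toy uses constant experts on a 1-D signal, whereas SMoE uses steered multivariate kernels with experts over space and disparity; `smoe_reconstruct` is a hypothetical name:

```python
import numpy as np

def smoe_reconstruct(x, centers, variances, expert_values):
    """Soft mixture-of-experts regression in 1-D: each expert holds a
    constant value, gated by a Gaussian kernel over the coordinate.
    Normalized kernel responsibilities blend the experts, so the model
    is smooth inside a region and sharp where two kernels meet."""
    x = np.asarray(x, dtype=float)[:, None]
    w = np.exp(-0.5 * (x - centers) ** 2 / variances)  # unnormalized gates
    w /= w.sum(axis=1, keepdims=True)
    return w @ expert_values

# two experts: a dark region around 0.2 and a bright one around 0.8
xs = np.linspace(0.0, 1.0, 5)
y = smoe_reconstruct(xs,
                     centers=np.array([0.2, 0.8]),
                     variances=np.array([0.02, 0.02]),
                     expert_values=np.array([0.0, 1.0]))
```

The compactness claim follows from this picture: the scene is stored as a handful of kernel parameters per region rather than as pixels, and queries like "where are the edges?" can be answered from the kernel geometry alone.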

37 citations

Journal ArticleDOI
TL;DR: This paper proposes transcoding techniques that reduce transcoding complexity at both the CU and PU optimization levels, along with a complexity-scalable transrating scheme that can be effectively controlled by a machine-learning-based approach.
Abstract: High efficiency video coding (HEVC) shows a significant advance in compression efficiency and is considered to be the successor of H.264/AVC. To incorporate the HEVC standard into real-life network applications and a diversity of other applications, efficient bit rate adaptation (transrating) algorithms are required. A current problem of transrating for HEVC is the high computational complexity associated with the encoder part of such a cascaded pixel domain transcoder. This paper focuses on deriving an optimal strategy for reducing the transcoding complexity with a complexity-scalable scheme. We propose different transcoding techniques which are able to reduce the transcoding complexity in both CU and PU optimization levels. At the CU level, CUs can be evaluated in top-to-bottom or bottom-to-top flows, in which the coding information of the input video stream is utilized to reduce the number of evaluations or to early terminate certain evaluations. At the PU level, the PU candidates are adaptively selected based on the probability of PU sizes and the co-located input PU partitioning. Moreover, with the use of different proposed methods, a complexity-scalable transrating scheme can be achieved. Furthermore, the transcoding complexity can be effectively controlled by the machine learning based approach. Simulations show that the proposed techniques provide a superior transcoding performance compared to the state-of-the-art related works. Additionally, the proposed methods can achieve a range of trade-offs between transrating complexity and coding performance. From the proposed schemes, the fastest approach is able to reduce the complexity by 82% while keeping the bitrate loss below 3%.
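The CU-level pruning can be illustrated as a depth-candidate policy driven by the co-located depth decoded from the input stream. The three modes below are a hypothetical simplification of the paper's top-down/bottom-up variants and its ML-controlled scheme, shown only to make the complexity/quality trade-off concrete:

```python
def candidate_depths(input_depth, mode):
    """Candidate CU depths (0 = 64x64 ... 3 = 8x8) to re-evaluate in the
    transcoder's encoder, pruned with the co-located depth from the
    decoded input stream. Modes trade complexity for coding quality:
      'full'    -> evaluate every depth (cascaded transcoder baseline)
      'fast'    -> the input depth and its immediate neighbors
      'fastest' -> reuse the input depth directly
    """
    if mode == "full":
        return list(range(4))
    if mode == "fast":
        return sorted({max(0, input_depth - 1),
                       input_depth,
                       min(3, input_depth + 1)})
    return [input_depth]

# a scalable controller could pick the mode per frame from a complexity budget
plan = {cu: candidate_depths(d, "fast") for cu, d in [("cu0", 0), ("cu1", 2)]}
```

Reusing input decisions works because a moderate bitrate reduction rarely changes the optimal partitioning much, which is why the fastest variant in the paper keeps the bitrate loss below 3% at an 82% complexity saving.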

35 citations


Cited by
Journal ArticleDOI
TL;DR: An overview of recent advances in physical reservoir computing, classified according to reservoir type, aimed at expanding its practical applications and developing next-generation machine learning systems.

959 citations

Patent
30 Sep 2013
TL;DR: An image-processing device and method that make it possible to suppress block noise: a β value for the deblocking filter is generated by an existing-β generator for clipped QP inputs of 51 or less and by an expanded-β generator above 51, then supplied to a filter-process determination unit.
Abstract: The present disclosure pertains to an image-processing device and method that make it possible to suppress block noise. A βLUT_input calculator and a clip processor determine βLUT_input, the value supplied to an existing-β generator and an expanded-β generator. When the βLUT_input qp value from the clip processor is 51 or less, the existing-β generator determines β using a LUT conforming to the HEVC standard and supplies it to a filter-process determination unit. When the βLUT_input qp value is larger than 51, the expanded-β generator determines the expanded β and supplies it to the same unit. This disclosure can be applied, for example, to an image-processing device.
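The clip-then-lookup control flow of the abstract can be sketched as follows. Note that `beta_std` is a placeholder formula, not the normative HEVC β table, and the extension slope is invented purely for illustration:

```python
def beta_for_qp(qp, qp_max_std=51, qp_max_ext=63):
    """Sketch of the patent's control flow: clip the input QP, use the
    standard lookup for values up to 51 (existing-beta generator), and a
    linear extension above it (expanded-beta generator). The stand-in
    beta_std is NOT the normative HEVC beta table."""
    def beta_std(q):
        return max(0, 2 * q - 26)              # illustrative values only
    qp = max(0, min(qp, qp_max_ext))           # clip processor
    if qp <= qp_max_std:
        return beta_std(qp)                    # existing-beta generator
    # expanded-beta generator: extrapolate beyond the standard range
    return beta_std(qp_max_std) + 4 * (qp - qp_max_std)

threshold = beta_for_qp(54)   # falls in the expanded range
```

The point of the expansion is that high-bit-depth or extended-QP operation pushes qp past 51, where the standard LUT ends; the clip keeps the index bounded in both generators.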

340 citations

Journal ArticleDOI
TL;DR: A new video database, CVD2014 (Camera Video Database), which uses real cameras rather than introducing distortions via post-processing, resulting in a complex distortion space with respect to the video acquisition process.
Abstract: This paper presents a new database, CID2013, to address the issue of using no-reference (NR) image quality assessment algorithms on images with multiple distortions. Current NR algorithms struggle to handle images with many concurrent distortion types, such as real photographic images captured by different digital cameras. The database consists of six image sets; on average, 30 subjects have evaluated 12–14 devices depicting eight different scenes for a total of 79 different cameras, 480 images, and 188 subjects (67% female). The subjective evaluation method was a hybrid absolute category rating-pair comparison developed for the study and presented in this paper. This method utilizes a slideshow of all images within a scene to allow the test images to work as references to each other. In addition to mean opinion score value, the images are also rated using sharpness, graininess, lightness, and color saturation scales. The CID2013 database contains images used in the experiments with the full subjective data plus extensive background information from the subjects. The database is made freely available for the research community.
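On the mean opinion score values mentioned above: a MOS with a normal-approximation confidence interval is the usual per-image summary of subjective ratings. This sketch does not capture the hybrid ACR/pair-comparison part of the CID2013 protocol:

```python
import math
import statistics

def mos_ci(ratings, z=1.96):
    """Mean opinion score plus a 95% confidence interval under the
    normal approximation (z = 1.96). Needs at least two ratings; real
    studies also screen outlier subjects before averaging."""
    m = statistics.mean(ratings)
    half = z * statistics.stdev(ratings) / math.sqrt(len(ratings))
    return m, (m - half, m + half)

# four subjects rating one image on a 1-5 ACR scale
score, (low, high) = mos_ci([3.0, 5.0, 4.0, 4.0])
```

Reporting the interval alongside the MOS matters for a database like this, since per-image subject counts (around 30 here) bound how finely two cameras can be distinguished.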

203 citations

Journal ArticleDOI
TL;DR: Advancing over previous work, this system is able to reproduce challenging content such as view-dependent reflections, semi-transparent surfaces, and near-field objects as close as 34 cm to the surface of the camera rig.
Abstract: We present a system for capturing, reconstructing, compressing, and rendering high quality immersive light field video. We accomplish this by leveraging the recently introduced DeepView view interpolation algorithm, replacing its underlying multi-plane image (MPI) scene representation with a collection of spherical shells that are better suited for representing panoramic light field content. We further process this data to reduce the large number of shell layers to a small, fixed number of RGBA+depth layers without significant loss in visual quality. The resulting RGB, alpha, and depth channels in these layers are then compressed using conventional texture atlasing and video compression techniques. The final compressed representation is lightweight and can be rendered on mobile VR/AR platforms or in a web browser. We demonstrate light field video results using data from the 16-camera rig of [Pozo et al. 2019] as well as a new low-cost hemispherical array made from 46 synchronized action sports cameras. From this data we produce 6 degree of freedom volumetric videos with a wide 70 cm viewing baseline, 10 pixels per degree angular resolution, and a wide field of view, at 30 frames per second video frame rates. Advancing over previous work, we show that our system is able to reproduce challenging content such as view-dependent reflections, semi-transparent surfaces, and near-field objects as close as 34 cm to the surface of the camera rig.
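Rendering a small fixed set of RGBA+depth layers, as described above, ultimately reduces to back-to-front "over" compositing. A sketch with premultiplied alpha (depth tests, texture atlasing, and per-pixel loops omitted; this is the standard operator, not the paper's actual renderer):

```python
def composite(layers):
    """Composite premultiplied-RGBA layers ordered far -> near with the
    'over' operator: each nearer layer contributes its own color plus
    whatever its transparency lets through from behind."""
    r = g = b = a = 0.0
    for sr, sg, sb, sa in layers:          # far to near
        r = sr + (1.0 - sa) * r
        g = sg + (1.0 - sa) * g
        b = sb + (1.0 - sa) * b
        a = sa + (1.0 - sa) * a
    return (r, g, b, a)

# opaque red layer behind a half-transparent white layer
far = (1.0, 0.0, 0.0, 1.0)
near = (0.5, 0.5, 0.5, 0.5)               # premultiplied white at alpha 0.5
pixel = composite([far, near])
```

Because this per-pixel math is trivial for a small, fixed number of layers, the representation stays renderable on mobile VR/AR hardware and in a browser, as the abstract notes.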

179 citations