
Showing papers by "Mohamed-Chaker Larabi" published in 2014


Book ChapterDOI
01 Jan 2014
TL;DR: This chapter discusses the different factors that may influence Quality of Experience (QoE) in the context of media consumption, networked services, and other electronic communication services and applications.
Abstract: In this chapter, the different factors that may influence Quality of Experience (QoE) in the context of media consumption, networked services, and other electronic communication services and applications are discussed. QoE can be subject to a range of complex and strongly interrelated factors, falling into three categories: human, system and context influence factors (IFs). With respect to Human IFs, we discuss variant and stable factors that may potentially bear an influence on QoE, either for low-level (bottom-up) or higher-level (top-down) cognitive processing. System IFs are classified into four distinct categories, namely content-, media-, network- and device-related IFs. Finally, the broad category of possible Context IFs is decomposed into factors linked to the physical, temporal, social, economic, task and technical information context. The overview given here illustrates the complexity of QoE and the broad range of aspects that potentially have a major influence on it.

163 citations


Journal ArticleDOI
TL;DR: A preprocessing method based on an improved histogram matching (HM) algorithm that uses only the regions common across views to handle the occlusion problem; once the correction is performed, the color of real and rendered views is harmonized and looks very consistent as a whole.
Abstract: Multi-View Video (MVV) consists of capturing the same scene with multiple cameras from different viewpoints. Therefore, substantial illumination and color inconsistencies can be observed between the different views. These color mismatches can significantly reduce compression efficiency and rendering quality. In this paper, we propose a preprocessing method for correcting these color discrepancies in MVV. To handle the occlusion problem, our method is based on an improved Histogram Matching (HM) algorithm that uses only the regions common across views. These regions are defined by an invariant feature detector (SIFT), followed by the RANSAC algorithm to increase matching robustness. In addition, to maintain temporal correlation, the HM algorithm is applied on a temporal sliding window, allowing it to cope with time-varying acquisition systems, moving-camera capture and real-time broadcasting. Moreover, instead of always choosing the center view as the reference by default, we propose an automatic selection algorithm based on both view statistics and quality. Experimental results show that the proposed method increases coding efficiency with gains of up to 1.1 dB and 2.2 dB for the luminance and chrominance components, respectively. Furthermore, once the correction is performed, the color of real and rendered views is harmonized and looks very consistent as a whole.
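
To make the pipeline above concrete, here is a minimal Python sketch (not the authors' implementation): histogram matching is restricted to pixels sampled around SIFT matches kept as RANSAC inliers. The patch size, Lowe ratio threshold and the use of OpenCV are illustrative assumptions, and the temporal sliding window and automatic reference-view selection are omitted.

```python
# Sketch only: histogram matching restricted to SIFT/RANSAC-matched regions.
# Patch size, ratio threshold and reference-view choice are illustrative assumptions.
import cv2
import numpy as np

def common_region_pixels(ref, tgt, patch=16):
    """Collect pixel samples around SIFT matches kept as RANSAC inliers."""
    sift = cv2.SIFT_create()
    k1, d1 = sift.detectAndCompute(cv2.cvtColor(ref, cv2.COLOR_BGR2GRAY), None)
    k2, d2 = sift.detectAndCompute(cv2.cvtColor(tgt, cv2.COLOR_BGR2GRAY), None)
    knn = cv2.BFMatcher().knnMatch(d1, d2, k=2)
    good = [m for m, n in knn if m.distance < 0.75 * n.distance]   # Lowe ratio test
    p1 = np.float32([k1[m.queryIdx].pt for m in good])
    p2 = np.float32([k2[m.trainIdx].pt for m in good])
    _, inliers = cv2.findHomography(p2, p1, cv2.RANSAC, 5.0)       # RANSAC pruning
    ref_px, tgt_px = [], []
    for (x1, y1), (x2, y2), keep in zip(p1, p2, inliers.ravel()):
        if not keep:
            continue
        x1, y1, x2, y2 = int(x1), int(y1), int(x2), int(y2)
        ref_px.append(ref[max(0, y1 - patch):y1 + patch, max(0, x1 - patch):x1 + patch])
        tgt_px.append(tgt[max(0, y2 - patch):y2 + patch, max(0, x2 - patch):x2 + patch])
    ref_px = np.concatenate([p.reshape(-1, 3) for p in ref_px])
    tgt_px = np.concatenate([p.reshape(-1, 3) for p in tgt_px])
    return ref_px, tgt_px

def harmonize_colors(ref, tgt):
    """Remap each channel of `tgt` so its CDF over the common regions matches `ref`."""
    ref_px, tgt_px = common_region_pixels(ref, tgt)
    out = tgt.copy()
    for c in range(3):
        ref_cdf = np.cumsum(np.bincount(ref_px[:, c], minlength=256)) / len(ref_px)
        tgt_cdf = np.cumsum(np.bincount(tgt_px[:, c], minlength=256)) / len(tgt_px)
        lut = np.interp(tgt_cdf, ref_cdf, np.arange(256)).astype(np.uint8)
        out[:, :, c] = cv2.LUT(tgt[:, :, c], lut)
    return out
```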

26 citations


Proceedings ArticleDOI
27 Oct 2014
TL;DR: The proposed metric effectively handles the asymmetric distortions of stereoscopic images by incorporating human visual system (HVS) characteristics, and it correlates better with human perception than state-of-the-art metrics.
Abstract: Developing a metric that can reliably predict perceptual 3D quality as perceived by the end user is a challenging issue and a necessary tool for the success of 3D multimedia applications. The various attempts at predicting the 3D quality of experience as the combination of the 2D quality of the left and right images have shown their limitations, particularly in the case of asymmetric distortions. In this paper, we propose a full-reference quality assessment metric for stereoscopic images based on perceptual binocular characteristics. The proposed metric effectively handles the asymmetric distortions of stereoscopic images by incorporating human visual system (HVS) characteristics. Our approach is motivated by the fact that, in the case of asymmetric quality, 3D perception mechanisms favor the view providing the most important and contrasted information. To achieve that, weighting factors are defined for the quality of each view according to the local information content. In addition, to take into account the sensitivity of the HVS, the quality score of each region is modulated based on the Binocular Just Noticeable Difference (BJND). Experimental results show that the proposed metric correlates better with human perception than state-of-the-art metrics.
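
A hedged sketch of the view-weighting idea only (the BJND modulation is omitted): local variance stands in for the paper's "local information content", and the function names and window size are assumptions.

```python
# Sketch: combine per-view local quality maps using information-content weights,
# so that the more contrasted view dominates the final 3D score.
import numpy as np
from scipy.ndimage import uniform_filter

def information_content(img, size=9):
    """Local variance, used here as a simple proxy for local information content."""
    img = img.astype(float)
    mean = uniform_filter(img, size)
    return np.maximum(uniform_filter(img ** 2, size) - mean ** 2, 0.0)

def stereo_quality(q_left, q_right, ref_left, ref_right):
    """q_left / q_right: local 2D quality maps (e.g., SSIM maps) of each view."""
    w_left = information_content(ref_left)
    w_right = information_content(ref_right)
    weighted = (w_left * q_left + w_right * q_right) / (w_left + w_right + 1e-12)
    return float(weighted.mean())
```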

25 citations


Proceedings ArticleDOI
03 Dec 2014
TL;DR: The problem of inpainting detection is investigated and an efficient and reliable detection method is proposed; its performance is demonstrated on several images inpainted with different inpainting techniques.
Abstract: The increasing use of digital images in our daily life and the availability of powerful software for processing and editing images open new challenges regarding illegal or unauthorized image manipulation. It therefore becomes essential to authenticate digital copies, validate their content, and detect possible forgeries. In this paper, we focus on the detection of a specific type of digital forgery, inpainting, where an object is removed using pixels coming from the same image. The problem of inpainting detection is investigated and an efficient and reliable detection method is proposed. The performance of the proposed method is demonstrated on several images inpainted with different inpainting techniques.

17 citations


Proceedings ArticleDOI
07 Sep 2014
TL;DR: Experimental results show that the proposed metric correlates better with human judgement than the state-of-the-art metrics.
Abstract: This paper presents a full-reference quality assessment metric for stereoscopic images based on perceptual binocular characteristics. To ensure that the predicted 3D quality of experience is as reliable and as close as possible to 3D human perception, the proposed stereoscopic image quality assessment (SIQA) method relies on the cyclopean image. Our approach is motivated by the fact that, in the case of asymmetric quality, 3D perception mechanisms place more emphasis on the view providing the most important and contrasted information. We integrated these psychophysical findings into the proposed 3D-IQA framework through a weighting factor based on local information content. In addition, to take into account the disparity/depth masking effect, we modulate the obtained quality score of each pixel of the cyclopean image according to its location in the scene. Experimental results show that the proposed metric correlates better with human judgement than state-of-the-art metrics.

14 citations


Proceedings ArticleDOI
10 Dec 2014
TL;DR: This work evaluates the mutual interaction between audio and video and their influence on the perceived quality of streaming video applications, and shows that audio quality plays a crucial role in the judgment of perceived quality, especially when it is poor.
Abstract: Objective and subjective quality assessment of images and videos has been an active research topic in recent years. Multimedia technologies require new quality metrics and methodologies that take into account the fundamental differences in human visual perception and the typical distortions of both the video and audio modalities. With the multiplication of multimedia content platforms (streaming, IPTV, OTT, etc.) delivering video content over the open Internet to a variety of devices, from TVs to tablets and smartphones, the Quality of Experience (QoE) may change. In this work, we evaluate the mutual interaction between audio and video and their influence on the perceived quality in the case of streaming video applications. On the one hand, we carried out subjective experiments assessing audio-only, video-only and audiovisual quality in order to create an audiovisual database that contains a wide range of degradations. On the other hand, a statistical analysis was performed to investigate the influence of video resolution, viewing device and audio quality on the perceived audiovisual quality. The results show that audio quality plays a crucial role in the judgment of perceived quality, especially when it is poor.

13 citations


Journal ArticleDOI
TL;DR: An offline quality monitoring system for extracting the most suitable legal-evidence images in video-surveillance applications, built on a robust tracking algorithm based on foveal wavelets and mean shift, a no-reference sharpness metric, and a super-resolution algorithm that increases the size of the tracked object without using any information outside the image itself.
Abstract: Video surveillance has attracted an important research effort in the last few years. Many works are dedicated to the design of efficient systems and the development of robust algorithms. Video compression is a very important stage for ensuring the viability of video-surveillance systems. However, it introduces distortions that significantly degrade the detection, recognition and identification tasks of legal investigators. Fortunately, an important standardization effort is being made for video surveillance in order to achieve complete interoperability. However, quality issues are still not addressed in an appropriate way. Investigators often face the dilemma of selecting the best match (legal evidence) of the targeted object in the video sequence. In this paper, we propose an offline quality monitoring system for the extraction of the most suitable legal-evidence images in video-surveillance applications. This system is constructed around three innovative parts: first, a robust tracking algorithm based on foveal wavelets and mean shift; second, a no-reference quality metric based on a sharpness feature; finally, a super-resolution algorithm that increases the size of the tracked object without using any information outside the image itself. The combination of the proposed algorithms allows the construction of a quality monitoring system that significantly increases the efficiency of legal-evidence image extraction.
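
For illustration, the frame-ranking step could look like the sketch below, where a generic variance-of-Laplacian focus measure stands in for the paper's sharpness-based no-reference metric; the tracking and super-resolution stages are assumed to have already produced the candidate crops.

```python
# Sketch: pick the "best evidence" crop among tracked frames using a simple
# no-reference sharpness score (variance of the Laplacian). Stand-in only.
import cv2
import numpy as np

def sharpness(gray):
    """Higher value = sharper content (generic focus measure, not the paper's)."""
    return float(cv2.Laplacian(gray.astype(np.float64), cv2.CV_64F).var())

def best_evidence(crops):
    """crops: list of grayscale patches of the tracked object, one per frame."""
    scores = [sharpness(c) for c in crops]
    return int(np.argmax(scores)), scores
```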

10 citations


Journal ArticleDOI
TL;DR: A novel, efficient approach for high-quality and fast image restoration, combining a greedy strategy and a global optimization strategy based on a pyramidal representation of the image, outperforms state-of-the-art methods both objectively and subjectively.

10 citations


Proceedings ArticleDOI
03 Dec 2014
TL;DR: The method takes advantage of the strong correlation between the RGB color channels and relies on the observation that the RGB components of an image are often not over-exposed at the same position; it operates on line profiles, making it ideally suited for hardware implementations.
Abstract: We present a new method for correcting over-exposed areas in images. The method takes advantage of the strong correlation between the RGB color channels and relies on the observation that in images the RGB components are often not over-exposed at the same position. Our solution operates on line profiles, making it ideally suited for hardware implementations. In addition to its low computational complexity, our method can accurately recover information in areas where one or two channels are over-exposed, while it reconstructs information in areas that are fully clipped. We show that our method outperforms previous algorithms through quantitative analysis and we demonstrate an important application of this type of solution in the context of high dynamic range image reconstruction.
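
A minimal sketch of the underlying idea, under assumed thresholds and a median-ratio estimator (the paper's actual line-profile algorithm is more elaborate): wherever a channel is clipped on a scanline, it is re-estimated from an unclipped channel using their ratio on nearby valid pixels.

```python
# Sketch only: per-scanline recovery of a clipped channel from an unclipped one.
# The saturation threshold and the median-ratio estimator are assumptions.
import numpy as np

CLIP = 250  # assumed saturation level for 8-bit data

def correct_scanline(line):
    """line: (W, 3) RGB scanline. Returns a float copy with clipped values re-estimated."""
    out = line.astype(float).copy()
    for c in range(3):                      # channel to repair
        clipped = line[:, c] >= CLIP
        if not clipped.any() or clipped.all():
            continue
        for g in range(3):                  # candidate guide channel
            if g == c:
                continue
            both_valid = ~clipped & (line[:, g] < CLIP) & (line[:, g] > 0)
            usable = clipped & (line[:, g] < CLIP)
            if both_valid.sum() < 2 or not usable.any():
                continue
            ratio = np.median(out[both_valid, c] / out[both_valid, g])
            out[usable, c] = np.maximum(out[usable, c], ratio * out[usable, g])
    return out
```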

9 citations


Proceedings ArticleDOI
04 May 2014
TL;DR: A fully automated model that dynamically determines the best bounds of asymmetry for each region of the image while outperforming the widely used asymmetric coding approaches in terms of 3D visual quality.
Abstract: The problem of determining the best level of asymmetry has been addressed by several recent works with the aim of guaranteeing optimal binocular perception while keeping the minimum required information. To do so, subjective experiments have been conducted to define an appropriate threshold. However, such an approach lacks generalization because of content variability. Moreover, using a fixed threshold does not allow adaptation to the content and to the image quality. Traditional asymmetric stereoscopic coding methods apply a uniform asymmetry by considering that all regions of an image have the same perceptual relevance, which is not consistent with the characteristics of the human visual system (HVS). Consequently, this paper describes a fully automated model that dynamically determines the best bounds of asymmetry for each region of the image. Based on the Binocular Just Noticeable Difference (BJND) and the depth level in the scene, the proposed method achieves a non-uniform reduction of the spatial resolution of one view of the stereo pair with the aim of reducing the bandwidth requirement. Experimental results show that the proposed method yields up to 43% bitrate savings while outperforming the widely used asymmetric coding approaches in terms of 3D visual quality.
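
As an illustrative, non-authoritative sketch of the non-uniform resolution reduction, the snippet below blurs each block of one view according to a per-block asymmetry tolerance; in the paper this tolerance is derived from the BJND and the depth level, whereas here it is simply passed in as a map in [0, 1].

```python
# Sketch: down-sample/up-sample each block of one view proportionally to a per-block
# asymmetry tolerance. The tolerance map is a placeholder for the BJND/depth model.
import cv2
import numpy as np

def reduce_one_view(view, tolerance, block=32):
    """view: one image of the stereo pair; tolerance: (H/block, W/block) map in [0, 1]."""
    out = view.copy()
    h, w = view.shape[:2]
    for y in range(0, h, block):
        for x in range(0, w, block):
            t = float(tolerance[y // block, x // block])
            factor = 1 + int(round(3 * t))         # higher tolerance -> coarser block
            if factor <= 1:
                continue
            roi = view[y:y + block, x:x + block]
            small = cv2.resize(roi, (max(1, roi.shape[1] // factor),
                                     max(1, roi.shape[0] // factor)),
                               interpolation=cv2.INTER_AREA)
            out[y:y + block, x:x + block] = cv2.resize(
                small, (roi.shape[1], roi.shape[0]), interpolation=cv2.INTER_LINEAR)
    return out
```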

6 citations


Proceedings ArticleDOI
04 May 2014
TL;DR: A stereoscopic 3D saliency model relying on 2D saliency features jointly with depth obtained from monocular cues is proposed; its validation against attention maps, using state-of-the-art procedures including Kullback-Leibler divergence (KLD), area under the curve (AUC) and correlation coefficient (CC), shows very good performance.
Abstract: Saliency is one of the most important features in human visual perception. It is widely used nowadays for perceptually optimizing image processing algorithms. Several models have been proposed for 2D images, but only a few attempts can be observed for 3D ones. In this paper, we propose a stereoscopic 3D saliency model relying on 2D saliency features jointly with depth obtained from monocular cues. On the one hand, the use of 2D saliency features is justified psychophysically by the similarity observed between 2D and 3D attention maps. On the other hand, 3D perception is significantly based on monocular cues. The validation of our model against attention maps, using state-of-the-art procedures including the Kullback-Leibler divergence (KLD), the area under the curve (AUC) and the correlation coefficient (CC), showed very good performance.
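
For reference, the three validation measures mentioned above have standard formulations; the sketch below follows those common definitions and is not necessarily the exact implementation used in the paper.

```python
# Standard formulations of CC, KLD and AUC between a predicted saliency map
# and ground-truth attention/fixation data.
import numpy as np
from sklearn.metrics import roc_auc_score

def cc(sal, attention):
    """Pearson correlation between the normalized maps."""
    s = (sal - sal.mean()) / (sal.std() + 1e-12)
    a = (attention - attention.mean()) / (attention.std() + 1e-12)
    return float((s * a).mean())

def kld(sal, attention, eps=1e-12):
    """KL divergence of the saliency distribution from the attention distribution."""
    p = attention / (attention.sum() + eps)
    q = sal / (sal.sum() + eps)
    return float(np.sum(p * np.log((p + eps) / (q + eps))))

def auc(sal, fixations):
    """fixations: binary map of fixated pixels; saliency values act as scores."""
    return float(roc_auc_score(fixations.ravel().astype(int), sal.ravel()))
```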

Proceedings ArticleDOI
05 Nov 2014
TL;DR: Results indicate that the proposed method outperforms Suthaharan's technique and is more effective than MSU's technique for the H.264 AVC and MPEG2 standards in terms of the Spearman Rank Order Correlation Coefficient.
Abstract: In this paper, we propose a new perceptually significant video quality metric to estimate the effect of block coding for the H.264 AVC and MPEG2 standards. Our method operates in the spatial domain and does not require high computational complexity. We compare it with Suthaharan's and MSU's techniques using the LIVE database. Results indicate that the proposed method outperforms Suthaharan's technique. They also indicate that our method is more effective than MSU's technique for the H.264 AVC and MPEG2 standards in terms of the Spearman Rank Order Correlation Coefficient.
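
As an illustration of what a low-complexity spatial-domain blockiness measure can look like (a generic formulation, not the proposed metric), one can compare the luminance gradients across 8x8 block boundaries with the gradients inside blocks:

```python
# Generic spatial-domain blockiness estimate: compare gradient magnitude across
# 8x8 block boundaries with the gradient magnitude inside blocks.
import numpy as np

def blockiness(y, block=8):
    """y: luminance frame as a float array; a larger output means more visible blocking."""
    dh = np.abs(np.diff(y, axis=1))                       # horizontal differences
    dv = np.abs(np.diff(y, axis=0))                       # vertical differences
    h_edge = dh[:, block - 1::block].mean()               # across vertical block edges
    h_inner = np.delete(dh, np.s_[block - 1::block], axis=1).mean()
    v_edge = dv[block - 1::block, :].mean()               # across horizontal block edges
    v_inner = np.delete(dv, np.s_[block - 1::block], axis=0).mean()
    return 0.5 * (h_edge / (h_inner + 1e-12) + v_edge / (v_inner + 1e-12))
```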

Proceedings ArticleDOI
27 Oct 2014
TL;DR: This paper proposes a new non-uniform asymmetric stereoscopic coding scheme that dynamically adjusts the level of asymmetry for each image region to ensure unaltered binocular perception, and it provides better 3D visual quality than state-of-the-art asymmetric coding methods.
Abstract: Asymmetric stereoscopic coding is a very promising technique for decreasing the bandwidth required for stereoscopic 3D delivery. However, one large obstacle is determining the limit of asymmetric coding, i.e., the just-noticeable threshold of asymmetry below which the 3D viewing experience is not altered. By way of subjective experiments, recent works have attempted to identify this asymmetry threshold. However, fixed thresholds, highly dependent on the experiment design, do not allow adaptation to the quality and content variations of the image. In this paper, we propose a new non-uniform asymmetric stereoscopic coding scheme that dynamically adjusts the level of asymmetry for each image region to ensure unaltered binocular perception. This is achieved by exploiting several HVS-inspired models; specifically, we use the Binocular Just Noticeable Difference (BJND) combined with a visual saliency map and depth information to precisely quantify the asymmetry threshold. Simulation results show that the proposed method yields up to 44% bitrate savings and provides better 3D visual quality than state-of-the-art asymmetric coding methods.

Proceedings ArticleDOI
TL;DR: In this paper, the authors evaluated and compared the visual fatigue caused by watching 2D and S3D content, and found that watching stereoscopic 3D content induced a stronger feeling of visual fatigue than conventional 2D, and that the nature of the video has an important effect on its increase.
Abstract: The change of TV systems from 2D to 3D mode is the next expected step in the telecommunication world. Some work has already been done to make this transition technically possible, but the interaction of the third dimension with humans is not yet clear. Previously, it was found that any increased load on the visual system, such as prolonged TV watching, computer work or video gaming, can create visual fatigue. Watching S3D, however, can cause visual fatigue of a different nature, since all S3D technologies create the illusion of a third dimension based on the characteristics of binocular vision. In this work, we evaluate and compare the visual fatigue caused by watching 2D and S3D content, and show the difference in the accumulation of visual fatigue and in its assessment for the two types of content. To perform this comparison, eye-tracking experiments using six commercially available movies were conducted. Healthy naive participants took part in the test and gave their answers through subjective evaluation. It was found that watching stereoscopic 3D content induces a stronger feeling of visual fatigue than conventional 2D, and that the nature of the video has an important effect on its increase. Visual characteristics obtained by eye-tracking were investigated with regard to their relation to visual fatigue.

Proceedings ArticleDOI
09 Dec 2014
TL;DR: Experimental results show that the proposed metric correlates better with human perception than state-of-the-art metrics.
Abstract: In this paper, we propose a no-reference perceptual blur metric for 3D stereoscopic images. The proposed approach relies on computing a perceptual local blurriness map for each image of the stereo pair. To take into account the disparity/depth masking effect, we modulate the obtained perceptual score at each position of the blurriness maps according to its location in the scene. Under the assumption that, in the case of asymmetric stereoscopic image quality, 3D perception mechanisms place more emphasis on the view providing the most important and contrasted information, the two derived local blurriness maps are combined using weighting factors based on local information content. Thanks to the inclusion of these psychophysical findings, the proposed metric efficiently handles symmetric as well as asymmetric distortions. Experimental results show that the proposed metric correlates better with human perception than state-of-the-art metrics.

Proceedings ArticleDOI
09 Dec 2014
TL;DR: A saliency model for stereoscopic 3D video extracts information from three dimensions of the content, i.e. spatial, temporal and depth, and benefits from the property of interest points of being close to human fixations in order to build spatial saliency features.
Abstract: Visual attention is one of the most important mechanisms in human visual perception. Recently, its modeling has become a principal requirement for the optimization of image processing systems. Numerous algorithms have already been designed for 2D saliency prediction. However, only a few works can be found for 3D content. In this study, we propose a saliency model for stereoscopic 3D video. The algorithm extracts information from three dimensions of the content, i.e. spatial, temporal and depth. The model benefits from the property of interest points of being close to human fixations in order to build the spatial saliency features. Besides, as the perception of depth relies strongly on monocular cues, our model extracts the depth saliency features using pictorial depth sources. Since the weights of the fusion strategy are often selected in an ad-hoc manner, we propose in this work a machine-learning approach: an artificial neural network defines adaptive weights based on eye-tracking data. The results of the proposed algorithm are tested against ground-truth information using state-of-the-art techniques.
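
A hedged sketch of the learned-fusion step, in which a small scikit-learn regressor stands in for the paper's artificial neural network; the feature layout and network size are assumptions.

```python
# Sketch: learn how to fuse spatial, temporal and depth saliency maps from
# eye-tracking fixation maps. An MLP regressor replaces the paper's ANN here.
import numpy as np
from sklearn.neural_network import MLPRegressor

def fit_fusion(spatial, temporal, depth, fixation_maps):
    """Each argument is a list of same-sized 2D maps, one per training frame."""
    X = np.concatenate([np.stack([s.ravel(), t.ravel(), d.ravel()], axis=1)
                        for s, t, d in zip(spatial, temporal, depth)])
    y = np.concatenate([f.ravel() for f in fixation_maps])
    model = MLPRegressor(hidden_layer_sizes=(8,), max_iter=500)
    return model.fit(X, y)

def fuse(model, s, t, d):
    """Predict a fused saliency map for one frame from its three feature maps."""
    X = np.stack([s.ravel(), t.ravel(), d.ravel()], axis=1)
    return model.predict(X).reshape(s.shape)
```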

Proceedings ArticleDOI
23 Nov 2014
TL;DR: Investigation of the influence of audio on visual attention showed that audio significantly affects the viewer's attention and consequently must be taken into consideration in the development of any multimedia saliency model.
Abstract: Visual attention is an important mechanism of the human visual system. It reduces the amount of information to be processed and accelerates the overall process of vision. Several models for images and videos have been proposed in the literature with encouraging results. However, most existing saliency models do not take into account the multimodal aspect of video (audio and image). In this paper, we investigate the influence of audio on visual attention. On the one hand, we carried out eye-tracking experiments to record subjects' eye movements while watching videos. On the other hand, eye positions (fixation durations) were used to compute and compare video attention maps with and without audio, using state-of-the-art measures. The results show that audio significantly affects the viewer's attention and consequently must be taken into consideration in the development of any multimedia saliency model. The findings of these experiments were then used to develop an audiovisual saliency model based on talking faces. Results obtained with our model were assessed using the usual measures and showed good performance with regard to the ground truth.

Proceedings ArticleDOI
01 Oct 2014
TL;DR: This work proposes a saliency model for stereoscopic 3D video that is based on the fusion of three maps, i.e. spatial, temporal and depth, and relies on interest-point features known for being close to human visual attention.
Abstract: Modeling visual attention is nowadays an important stage in the optimization of image processing systems. Several models have already been developed for 2D static and dynamic content, but only a few attempts can be found for stereoscopic 3D content. In this work, we propose a saliency model for stereoscopic 3D video. The model is based on the fusion of three maps, i.e. spatial, temporal and depth. It relies on interest-point features, known for being close to human visual attention. Moreover, since 3D perception is mostly based on monocular cues, depth information is obtained using a monocular model predicting the depth position of objects. Several fusion strategies were tested in order to determine the best match for our model. Finally, our approach was validated using state-of-the-art metrics in comparison with attention maps obtained by eye-tracking experiments, and showed good performance.

Book ChapterDOI
01 Jan 2014
TL;DR: A novel efficient approach for high-quality and fast image restoration by combining a greedy strategy and a global optimization strategy based on a pyramidal representation of the image is proposed.
Abstract: Image inpainting is not only the art of restoring damaged images but also a powerful technique for image editing, e.g. removing undesired objects, recomposing images, etc. Recently, it has become an active research topic in image processing because of its challenging aspect and its extensive use in various real-world applications. In this paper, we propose a novel, efficient approach for high-quality and fast image restoration that combines a greedy strategy and a global optimization strategy based on a pyramidal representation of the image. The proposed approach is validated on different state-of-the-art images. Moreover, a comparative validation shows that the proposed approach outperforms methods from the literature while maintaining a very low complexity.
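
To give a flavour of a coarse-to-fine, pyramid-based restoration (an assumption-laden sketch: OpenCV's Telea inpainting stands in for the paper's greedy and global-optimization steps):

```python
# Sketch: inpaint at the coarsest pyramid level, then use each coarse result to
# initialize the holes at the next finer level. Telea inpainting is a stand-in.
import cv2
import numpy as np

def pyramidal_inpaint(img, mask, levels=3):
    """img: BGR uint8 image; mask: uint8 map, non-zero where pixels must be filled."""
    imgs, masks = [img], [mask]
    for _ in range(levels - 1):                       # Gaussian pyramids of image and mask
        imgs.append(cv2.pyrDown(imgs[-1]))
        masks.append(cv2.pyrDown(masks[-1]))
    hole = (masks[-1] > 0).astype(np.uint8)
    result = cv2.inpaint(imgs[-1], hole, 5, cv2.INPAINT_TELEA)
    for lvl in range(levels - 2, -1, -1):             # refine level by level
        up = cv2.pyrUp(result, dstsize=(imgs[lvl].shape[1], imgs[lvl].shape[0]))
        hole = masks[lvl] > 0
        init = imgs[lvl].copy()
        init[hole] = up[hole]                         # seed holes with the coarse solution
        result = cv2.inpaint(init, hole.astype(np.uint8), 5, cv2.INPAINT_TELEA)
    return result
```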

Proceedings ArticleDOI
TL;DR: Results indicate that for the CIF format, the VP8 codec provides better image quality than the H.264 codec at low bitrates; regarding audio, better results are globally achieved with wideband codecs offering good quality, except for the Opus codec at 12.2 kbps.
Abstract: During the last decade, the important advances and widespread availability of mobile technology (operating systems, GPUs, terminal resolution and so on) have encouraged the fast development of voice and video services such as video calling. While multimedia services have largely grown on mobile devices, the resulting increase in data consumption is leading to the saturation of mobile networks. In order to provide data at high bit-rates and maintain performance as close as possible to that of traditional networks, the 3GPP (3rd Generation Partnership Project) worked on a high-performance mobile standard called Long Term Evolution (LTE). In this paper, we aim at providing recommendations related to audio and video media profiles (selection of audio and video codecs, bit-rates, frame-rates, audio and video formats) for a typical video-calling service delivered over LTE/4G mobile networks. These profiles are defined according to the targeted devices (smartphones, tablets), so as to ensure the best possible quality of experience (QoE). The obtained results indicate that for the CIF format (352 x 288 pixels), which is usually used for smartphones, the VP8 codec provides better image quality than the H.264 codec at low bitrates (from 128 to 384 kbps). However, for sequences with high motion, H.264 in slow mode is preferred. Regarding audio, better results are globally achieved using wideband codecs offering good quality, with the exception of the Opus codec at 12.2 kbps.


Proceedings ArticleDOI
TL;DR: An audiovisual database containing different talking scenarios, together with subjective quality scores obtained using a tailored single-stimulus test method (ACR), which can be used for the comparison and the design of new models.
Abstract: While objective and subjective quality assessment of images and video have been an active research topic in the recent years, multimedia technologies require new quality metrics and methodologies taking into account the fundamental differences in the human visual perception and the typical distortions of both video and audio modalities. Because of the importance of faces and especially the talking faces in the video sequences, this paper presents an audiovisual database that contains a different talking scenario. In addition to the video, the database also provides subjective quality scores obtained using a tailored single-stimulus test method (ACR). The resulting mean opinion scores (MOS) can be used to evaluate the performance of audiovisual quality metrics as well as for the comparison and for the design of new models.